Files
ubicloud/model.rb
Jeremy Evans fe2d670793 Avoid explicitly casting string arrays used for = ANY/!= ALL
This uses a new feature I developed for Sequel with Ubicloud in mind.
I found that explicit casts for array parameters used in this case
are not required, as long as PostgreSQL can implicitly cast elements of
the array to the type of the LHS of the expression.

Taking an example query from the query parameterization analysis:

```sql
DELETE FROM "applied_action_tag" WHERE (("tag_id" = $1) AND ("action_id" = ANY($2)))
```

In this case, $2 is a PostgreSQL array containing uuids.  As action_id is a
uuid type, PostgreSQL will assume $2 is uuid[], and things will work correctly.

It took me a while to figure this out, because PostgreSQL's behavior is
different when using ARRAY.  When not auto parameterizing, Sequel literalizes
arrays using ARRAY.  Initial work on auto parameterization used ARRAY and
parameterized the elements, and this let me to believe that explicitly casting
was necessary for arrays, when it turns out not to be.

As an example, this doesn't work (assuming $2 is a uuid string):

```sql
DELETE FROM "applied_action_tag" WHERE (("tag_id" = $1) AND ("action_id" = ANY(ARRAY[$2])))
```

For some reason, PostgreSQL thinks the array is text[] instead of
uuid[], even though the LHS is uuid.

The previous work around was to assume string arrays could be represented as
PostgreSQL text[] types.  I've used this successfully in other applications,
but in those, I wasn't using uuids or enum types.

As the majority of Ubicloud's usage in these cases is either uuid[] type or
enum array types, the assumption of text[] was not a good one for Ubicloud.
I previously added the Sequel.any_{type,uuid} methods to Ubicloud to make
it easier to fix the failing cases.  However, this is really a leaky
abstraction, and it's likely something that would trip up other developers.

By avoiding the explicit cast for string arrays used as parameters,
we can remove all usage of Sequel.any_{type,uuid} and have everything
work correctly.  This approach should be safe to run in all
environments.  I've made it so that the conversion is performed for
arrays with one or more elements only in frozen testing mode, since that is
how the query parameterization analysis is run and it can result in
a reduced number of distinct queries.  It can also catch potential issues
if the tests only test with single element arrays, but we are using multiple
element arrays in production.

By default, only arrays of two elements or more are use this conversion,
because PostgreSQL will use a more optimized query plan for a single element
value list than for a multiple element value list.

Over 10% of Ubicloud's distinct parameterized queries use this new feature
(126/1130).
2025-04-30 06:05:53 +09:00

91 lines
2.8 KiB
Ruby

# frozen_string_literal: true
require_relative "db"
require "sequel/model"
if ENV["RACK_ENV"] == "development"
Sequel::Model.cache_associations = false
end
Sequel::Model.plugin :auto_validations, skip_invalid: true
Sequel::Model.plugin :require_valid_schema
Sequel::Model.plugin :singular_table_names
Sequel::Model.plugin :subclasses unless ENV["RACK_ENV"] == "development"
Sequel::Model.plugin :column_encryption do |enc|
enc.key 0, Config.clover_column_encryption_key
end
Sequel::Model.plugin :many_through_many
Sequel::Model.plugin :insert_conflict
Sequel::Model.plugin :inspect_pk
Sequel::Model.plugin :static_cache_cache, "cache/static_cache.cache"
Sequel::Model.plugin :pg_auto_constraint_validations, cache_file: "cache/pg_auto_constraint_validations.cache"
Sequel::Model.plugin :pg_auto_validate_enums, message: proc { |valid_values| "is not one of the supported values (#{valid_values.sort.join(", ")})" }
Sequel::Model.plugin :pg_eager_any_typed_array
if (level = Config.database_logger_level) || Config.test?
require "logger"
LOGGER = Logger.new($stdout, level: level || "fatal")
DB.loggers << LOGGER
end
if ENV["CHECK_LOGGED_SQL"]
require "logger"
File.unlink("sql.log") if File.file?("sql.log")
f = File.open("sql.log", "ab")
# Remove optimization that does not use parameterization
def (Sequel::Model).reset_fast_pk_lookup_sql = nil
# Hack to make specs pass that mock Time.now and depend
# on certain number of Time.now calls
time = Time.now
def time.now
self
end
Logger.const_set(:Time, time)
sql_logger = Logger.new(f, level: :INFO)
sql_logger.formatter = proc do |sev, _, _, msg|
"#{sev} -- : #{msg}\0"
end
DB.loggers << sql_logger
end
module SequelExtensions
def delete(force: false, &)
# Do not error if this is a plain dataset that does not respond to destroy
return super(&) unless respond_to?(:destroy)
caller_lines = caller
rodauth_in_callstack = !caller_lines.grep(/rodauth/).empty?
destroy_in_callstack = !caller_lines.grep(/sequel\/model\/base.*_destroy_delete/).empty?
# This can happen when fast instance deletes are disabled (when CHECK_LOGGED_SQL
# environment variable is set)
callee_in_callstack = !caller_lines.grep(/#{Regexp.escape(__FILE__)}.*delete/).empty?
unless rodauth_in_callstack || destroy_in_callstack || callee_in_callstack || force
raise "Calling delete is discouraged as it skips hooks such as before_destroy, which " \
"we use to archive records. Use destroy instead. If you know what you are doing " \
"and still want to use delete, you can pass force: true to trigger delete."
end
if is_a?(Sequel::Dataset)
super(&)
else
super()
end
end
end
module Sequel
class Dataset
prepend SequelExtensions
end
class Model
prepend SequelExtensions
end
end