ActiveRecord::Core "#find" now reuses "#find_by" cache key
source link: https://blog.saeloun.com/2022/02/09/rails-prevent-duplicates-in-find_by-cache
When you query with #find or #find_by, Active Record caches the statement it builds for that query.
This lets Rails reuse an already-built query on later calls instead of constructing it from scratch each time.
Each method is responsible for generating its own cache key and storing its entry under that key, which leads to some irregularities.
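The get-or-build pattern behind a cache like this can be sketched in plain Ruby. This is only a toy, not Active Record's implementation: `ToyStatementCache` and its `fetch` method are hypothetical names, and the cached value is a plain string standing in for a built query.

```ruby
# Toy get-or-build cache; hypothetical, not Active Record's StatementCache.
class ToyStatementCache
  attr_reader :builds

  def initialize
    @store = {}
    @builds = 0
  end

  # Returns the entry stored under +key+, running the caller's block
  # only the first time that key is seen.
  def fetch(key)
    @store.fetch(key) do
      @builds += 1
      @store[key] = yield
    end
  end
end

cache = ToyStatementCache.new
first  = cache.fetch(["email"]) { "SELECT * FROM users WHERE email = ? LIMIT 1" }
second = cache.fetch(["email"]) { raise "not reached: the entry already exists" }
```

The second call never runs its block, so `cache.builds` stays at 1. Whether two callers share an entry depends entirely on whether they pass equal keys.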
Before
One small oversight was that #find and #find_by(id: ...) used different cache keys.
Both queries return the same record, but they did not share the same cache entry.
Let’s look into how ActiveRecord Core works:
    def find(*ids) # :nodoc:
      # We don't have cache keys for this stuff yet
      return super unless ids.length == 1
      return super if block_given? ||
                      primary_key.nil? ||
                      scope_attributes? ||
                      columns_hash.key?(inheritance_column) && !base_class?

      id = ids.first

      return super if StatementCache.unsupported_value?(id)

      key = primary_key

      statement = cached_find_by_statement(key) { |params|
        where(key => params.bind).limit(1)
      }

      record = statement.execute([id], connection)&.first
      unless record
        raise RecordNotFound.new("Couldn't find #{name} with '#{key}'=#{id}", name, key, id)
      end
      record
    end
We can see here that the cache key is just primary_key (which in most scenarios is "id").
Let’s go through the #find_by method, which accepts a hash of attributes.
    def find_by(*args) # :nodoc:
      return super if scope_attributes? || reflect_on_all_aggregations.any? ||
                      columns_hash.key?(inheritance_column) && !base_class?

      hash = args.first

      return super if !(Hash === hash) || hash.values.any? { |v|
        StatementCache.unsupported_value?(v)
      }

      return super unless hash.keys.all? { |k| columns_hash.has_key?(k.to_s) }

      keys = hash.keys

      statement = cached_find_by_statement(keys) { |params|
        wheres = keys.each_with_object({}) { |param, o|
          o[param] = params.bind
        }
        where(wheres).limit(1)
      }

      begin
        statement.execute(hash.values, connection)&.first
      rescue TypeError
        raise ActiveRecord::StatementInvalid
      end
    end
The cache key here is set to hash.keys, which returns an array of the columns that find_by searches with.
This is where the ambiguity arises: while #find uses the cache key "id", #find_by uses the cache key ["id"].
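The mismatch is easy to reproduce with a plain Hash standing in for the cache. This is a minimal sketch, not Rails code: the `cache` Hash and the string values are hypothetical stand-ins, and string column names are assumed for simplicity.

```ruby
# Hypothetical stand-in for the per-model cache: a plain Hash.
cache = {}

find_key    = "id"                 # what #find used: primary_key itself
find_by_key = { "id" => 123 }.keys # what #find_by used: hash.keys, i.e. ["id"]

cache[find_key]    ||= "statement built by find"
cache[find_by_key] ||= "statement built by find_by"

same_key    = find_key == find_by_key # false: a String never equals an Array
entry_count = cache.size              # 2: the same lookup is cached twice
```

Because `"id"` and `["id"]` are different Hash keys, the two methods each build and store their own entry for what is effectively the same query.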
After
ActiveRecord::Core#find now reuses the #find_by cache key, so both queries use the same cache location.
    Query                          Cache Key
    find(123)                      ["id"]
    find_by(id: 123)               ["id"]
    find_by(id: 123, foo: true)    ["id", "foo"]
It was a simple fix to the #find method, which now wraps primary_key in an array.
    def find(*ids) # :nodoc:
      # We don't have cache keys for this stuff yet
      return super unless ids.length == 1
      return super if block_given? || primary_key.nil? || scope_attributes?

      id = ids.first

      return super if StatementCache.unsupported_value?(id)

      cached_find_by([primary_key], [id]) ||
        raise(RecordNotFound.new("Couldn't find #{name} with '#{primary_key}'=#{id}", name, primary_key, id))
    end
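The effect of routing both methods through one helper can be sketched without Rails. Everything here is a hypothetical stand-in: `ToyModel`, its `statement_cache`, the in-memory `ROWS`, and the lambdas that play the role of cached statements. Only the shape of `find` delegating to `cached_find_by([primary_key], [id])` mirrors the code above.

```ruby
# Toy model; names and storage are illustrative, not Active Record internals.
class ToyModel
  ROWS = [{ "id" => 1, "name" => "Ada" }, { "id" => 2, "name" => "Linus" }].freeze

  def self.statement_cache
    @statement_cache ||= {}
  end

  def self.primary_key
    "id"
  end

  # Shared lookup: one cache entry per list of column names.
  def self.cached_find_by(keys, values)
    statement = statement_cache[keys] ||= ->(vals) {
      ROWS.find { |row| keys.zip(vals).all? { |k, v| row[k] == v } }
    }
    statement.call(values)
  end

  def self.find_by(hash)
    cached_find_by(hash.keys, hash.values)
  end

  # Wrapping primary_key in an array makes find share find_by's key.
  def self.find(id)
    cached_find_by([primary_key], [id]) ||
      raise(KeyError, "Couldn't find ToyModel with '#{primary_key}'=#{id}")
  end
end

ToyModel.find(1)
ToyModel.find_by("id" => 2)
ToyModel.statement_cache.keys # => [["id"]]
```

After both calls the toy cache holds a single entry under `["id"]`, which is the behavior the table above describes.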
Minor tweaks to core libraries can lead to huge benefits across applications!