ActiveRecord::Core "#find" now reuses "#find_by" cache key

Queries made with #find or #find_by go through a statement cache. This lets Rails reuse a previously built query instead of constructing it again on every call. Each method is responsible for generating its own cache key and storing its entry under that key, which leaves room for inconsistencies.

Before

One small oversight was that #find and #find_by(id: ...) used different cache keys. Both calls return the same record, but their cache entries did not end up in the same cache location.

Here is how ActiveRecord::Core implemented #find before the change:

def find(*ids) # :nodoc:
  # We don't have cache keys for this stuff yet
  return super unless ids.length == 1
  return super if block_given? ||
                  primary_key.nil? ||
                  scope_attributes? ||
                  columns_hash.key?(inheritance_column) && !base_class?

  id = ids.first

  return super if StatementCache.unsupported_value?(id)

  key = primary_key

  statement = cached_find_by_statement(key) { |params|
    where(key => params.bind).limit(1)
  }

  record = statement.execute([id], connection)&.first
  unless record
    raise RecordNotFound.new("Couldn't find #{name} with '#{key}'=#{id}", name, key, id)
  end
  record
end

The cache key here is just primary_key (which in most scenarios is "id").

Let’s go through the #find_by method that accepts a hash of attributes.

def find_by(*args) # :nodoc:
  return super if scope_attributes? || reflect_on_all_aggregations.any? ||
                  columns_hash.key?(inheritance_column) && !base_class?

  hash = args.first

  return super if !(Hash === hash) || hash.values.any? { |v|
    StatementCache.unsupported_value?(v)
  }

  return super unless hash.keys.all? { |k| columns_hash.has_key?(k.to_s) }

  keys = hash.keys

  statement = cached_find_by_statement(keys) { |params|
    wheres = keys.each_with_object({}) { |param, o|
      o[param] = params.bind
    }
    where(wheres).limit(1)
  }
  begin
    statement.execute(hash.values, connection)&.first
  rescue TypeError
    raise ActiveRecord::StatementInvalid
  end
end

The cache key here is set to hash.keys, an array of the columns that #find_by searches on.

This is where the mismatch arises: #find uses the cache key "id", while #find_by uses the cache key ["id"].
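To see why the two keys miss each other, here is a minimal sketch using a toy cache backed by a plain Ruby Hash. ToyStatementCache and its methods are hypothetical names made up for illustration. This is not the Rails implementation; it only relies on the fact that a String and a one-element Array are different Hash keys.

```ruby
# Toy stand-in for a statement cache: a plain Hash keyed by whatever
# the caller passes in. Hypothetical class, not part of Rails.
class ToyStatementCache
  attr_reader :builds

  def initialize
    @store = {}
    @builds = 0
  end

  # Returns the cached entry for key, building it with the block on a miss.
  def fetch(key)
    @store.fetch(key) do
      @builds += 1
      @store[key] = yield
    end
  end

  def size
    @store.size
  end
end

cache = ToyStatementCache.new

# Key shape used by the old #find: a bare string.
cache.fetch("id")   { "WHERE id = ? LIMIT 1" }
# Key shape used by #find_by(id: ...): an array of column names.
cache.fetch(["id"]) { "WHERE id = ? LIMIT 1" }

cache.size    # 2 entries for what is logically one query
cache.builds  # the query was built twice
```

Because "id" and ["id"] are not equal, the same query is built and stored twice.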

After

After the change, #find reuses the cache key of #find_by, so both queries share the same cache location.

Query                          Cache key
find(123)                      ["id"]
find_by(id: 123)               ["id"]
find_by(id: 123, foo: true)    ["id", "foo"]

The fix was simple: #find now wraps primary_key in an array before passing it to cached_find_by.

def find(*ids) # :nodoc:
  # We don't have cache keys for this stuff yet
  return super unless ids.length == 1
  return super if block_given? || primary_key.nil? || scope_attributes?

  id = ids.first

  return super if StatementCache.unsupported_value?(id)

  cached_find_by([primary_key], [id]) ||
    raise(RecordNotFound.new("Couldn't find #{name} with '#{primary_key}'=#{id}", name, primary_key, id))
end
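The effect of routing both methods through one helper can be sketched with a miniature model. Everything prefixed with toy_ (and ToyModel itself) is hypothetical and only loosely modelled on the code above. The sketch also normalizes keys to strings so that id: and "id" land on the same entry, which is a simplification of this example and not a claim about Rails internals.

```ruby
# Hypothetical miniature model; not the Rails implementation.
class ToyModel
  RECORDS = { 123 => { "id" => 123, "foo" => true } }.freeze

  class << self
    def statement_cache
      @statement_cache ||= {}
    end

    # Shared path: the cache key is always an array of column names.
    def toy_cached_find_by(keys, values)
      matcher = statement_cache[keys] ||= lambda { |row, vals|
        keys.zip(vals).all? { |k, v| row[k] == v }
      }
      RECORDS.values.find { |row| matcher.call(row, values) }
    end

    # Mirrors the new #find: wrap the primary key in an array.
    def toy_find(id)
      toy_cached_find_by(["id"], [id]) ||
        raise(KeyError, "Couldn't find record with 'id'=#{id}")
    end

    # Mirrors #find_by: the key is the list of searched columns.
    def toy_find_by(hash)
      toy_cached_find_by(hash.keys.map(&:to_s), hash.values)
    end
  end
end

ToyModel.toy_find(123)
ToyModel.toy_find_by(id: 123)
ToyModel.toy_find_by(id: 123, foo: true)

ToyModel.statement_cache.keys  # => [["id"], ["id", "foo"]]
```

The first two calls share the ["id"] entry, and only the two-column lookup adds a second one.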

Minor tweaks to core libraries can lead to huge benefits across applications!
