Fixes namespaced UUID generation for namespace IDs represented as strings by eri...
source link: https://github.com/rails/rails/pull/37682
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Summary
Updates the Digest::UUID.uuid_from_hash
implementation to return the correct UUID value if the uuid_namespace
parameter is provided as a String
instead of using one of the pre-defined constants.
Other Information
The patch is backwards-compatible with the previous behaviour of using the constants defined in Digest::UUID
as the namespace ID.
With the new behaviour, all the calls below should return the same value:
require "digest" require "active_support/core_ext/digest" Digest::UUID.uuid_v5(Digest::UUID::DNS_NAMESPACE, "www.widgets.com") => "21f7f8de-8051-5b89-8680-0195ef798b6a" Digest::UUID.uuid_v5("6BA7B810-9DAD-11D1-80B4-00C04FD430C8", "www.widgets.com") => "21f7f8de-8051-5b89-8680-0195ef798b6a" Digest::UUID.uuid_v5("6ba7b810-9dad-11d1-80b4-00c04fd430c8", "www.widgets.com") => "21f7f8de-8051-5b89-8680-0195ef798b6a"
Benchmarks
# frozen_string_literal: true require "bundler/inline" gemfile(true) do source "https://rubygems.org" git_source(:github) { |repo| "https://github.com/#{repo}.git" } gem "activesupport", "6.0.0" gem "benchmark-ips" end require "digest" require "active_support/core_ext/digest" # Your patch goes here. module Digest module UUID def self.new_uuid_from_hash(hash_class, uuid_namespace, name) if hash_class == Digest::MD5 version = 3 elsif hash_class == Digest::SHA1 version = 5 else raise ArgumentError, "Expected Digest::SHA1 or Digest::MD5, got #{hash_class.name}." end packed_namespace_uuid = uuid_namespace.length == 36 ? pack_uuid(uuid_namespace) : uuid_namespace hash = hash_class.new hash.update(packed_namespace_uuid) hash.update(name) ary = hash.digest.unpack("NnnnnN") ary[2] = (ary[2] & 0x0FFF) | (version << 12) ary[3] = (ary[3] & 0x3FFF) | 0x8000 "%08x-%04x-%04x-%04x-%04x%08x" % ary end def self.pack_uuid(uuid) uuid.scan(/(\h{8})-(\h{4})-(\h{4})-(\h{4})-(\h{4})(\h{8})/).flatten.map { |s| s.to_i(16) }.pack("NnnnnN") end end end # Enumerate some representative scenarios here. # # It is very easy to make an optimization that improves performance for a # specific scenario you care about but regresses on other common cases. # Therefore, you should test your change against a list of representative # scenarios. Ideally, they should be based on real-world scenarios extracted # from production applications. SCENARIOS = { "Binary namespace ID" => Digest::UUID::DNS_NAMESPACE, "Lower-cased namespace ID" => "6ba7b810-9dad-11d1-80b4-00c04fd430c8", "Upper-cased namespace ID" => "6BA7B810-9DAD-11D1-80B4-00C04FD430C8" } VALUE = "www.widgets.com" SCENARIOS.each_pair do |name, uuid_namespace| puts puts " #{name} ".center(80, "=") puts Benchmark.ips do |x| x.report("old_uuid_from_hash_md5") { Digest::UUID.uuid_from_hash(Digest::MD5, uuid_namespace, VALUE) } x.report("new_uuid_from_hash_md5") { Digest::UUID.new_uuid_from_hash(Digest::MD5, uuid_namespace, VALUE) } x.compare! end Benchmark.ips do |x| x.report("old_uuid_from_hash_sha1") { Digest::UUID.uuid_from_hash(Digest::SHA1, uuid_namespace, VALUE) } x.report("new_uuid_from_hash_sha1") { Digest::UUID.new_uuid_from_hash(Digest::SHA1, uuid_namespace, VALUE) } x.compare! end end
Results
ruby benchmark.rb
Fetching gem metadata from https://rubygems.org/..............
Resolving dependencies...
Using concurrent-ruby 1.1.5
Using i18n 1.7.0
Using minitest 5.13.0
Using thread_safe 0.3.6
Using tzinfo 1.2.5
Using zeitwerk 2.2.1
Using activesupport 6.0.0
Using benchmark-ips 2.7.2
Using bundler 2.0.2
============================= Binary namespace ID ==============================
Warming up --------------------------------------
old_uuid_from_hash_md5
14.605k i/100ms
new_uuid_from_hash_md5
14.328k i/100ms
Calculating -------------------------------------
old_uuid_from_hash_md5
166.129k (±10.3%) i/s - 832.485k in 5.074024s
new_uuid_from_hash_md5
163.218k (±10.5%) i/s - 816.696k in 5.080808s
Comparison:
old_uuid_from_hash_md5: 166128.7 i/s
new_uuid_from_hash_md5: 163218.5 i/s - same-ish: difference falls within error
Warming up --------------------------------------
old_uuid_from_hash_sha1
14.660k i/100ms
new_uuid_from_hash_sha1
14.050k i/100ms
Calculating -------------------------------------
old_uuid_from_hash_sha1
163.423k (±11.6%) i/s - 806.300k in 5.021001s
new_uuid_from_hash_sha1
159.354k (± 9.9%) i/s - 800.850k in 5.090466s
Comparison:
old_uuid_from_hash_sha1: 163422.6 i/s
new_uuid_from_hash_sha1: 159353.9 i/s - same-ish: difference falls within error
=========================== Lower-cased namespace ID ===========================
Warming up --------------------------------------
old_uuid_from_hash_md5
14.681k i/100ms
new_uuid_from_hash_md5
6.007k i/100ms
Calculating -------------------------------------
old_uuid_from_hash_md5
168.224k (± 8.2%) i/s - 836.817k in 5.012074s
new_uuid_from_hash_md5
68.417k (±19.1%) i/s - 324.378k in 5.033972s
Comparison:
old_uuid_from_hash_md5: 168224.3 i/s
new_uuid_from_hash_md5: 68417.4 i/s - 2.46x slower
Warming up --------------------------------------
old_uuid_from_hash_sha1
14.321k i/100ms
new_uuid_from_hash_sha1
5.992k i/100ms
Calculating -------------------------------------
old_uuid_from_hash_sha1
165.424k (± 7.6%) i/s - 830.618k in 5.055094s
new_uuid_from_hash_sha1
67.677k (±19.0%) i/s - 323.568k in 5.070097s
Comparison:
old_uuid_from_hash_sha1: 165423.8 i/s
new_uuid_from_hash_sha1: 67676.6 i/s - 2.44x slower
=========================== Upper-cased namespace ID ===========================
Warming up --------------------------------------
old_uuid_from_hash_md5
14.970k i/100ms
new_uuid_from_hash_md5
6.060k i/100ms
Calculating -------------------------------------
old_uuid_from_hash_md5
168.793k (± 8.5%) i/s - 838.320k in 5.012564s
new_uuid_from_hash_md5
68.246k (±19.0%) i/s - 327.240k in 5.082545s
Comparison:
old_uuid_from_hash_md5: 168793.3 i/s
new_uuid_from_hash_md5: 68245.5 i/s - 2.47x slower
Warming up --------------------------------------
old_uuid_from_hash_sha1
14.683k i/100ms
new_uuid_from_hash_sha1
5.972k i/100ms
Calculating -------------------------------------
old_uuid_from_hash_sha1
164.037k (± 9.5%) i/s - 822.248k in 5.063807s
new_uuid_from_hash_sha1
67.914k (±18.9%) i/s - 322.488k in 5.033210s
Comparison:
old_uuid_from_hash_sha1: 164037.2 i/s
new_uuid_from_hash_sha1: 67913.9 i/s - 2.42x slower
As can be seen from the benchmarks above, performance remains roughly the same for binary representations of namespace IDs, so no impact should be noted in this case.
For the String
version of namespace IDs, there's a performance impact due to the extra work required to process them, but given that the current implementation returns the wrong value back in those cases, it seems like it's a fair trade-off.
Closes #37681
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK