5

Caching multimethod default value dispatch

 3 years ago
source link: https://insideclojure.org/2014/12/13/multimethod-default/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Caching multimethod default value dispatch

Ticket CLJ-1429 started from a tweet by Anton Logvinenko (@piecalculus) indicating that a heavily used multimethod was showing a cold cache.

I have seen this problem in my own code in the past (before I was working on Clojure itself) so I immediately saw the chance to find and kill a problem that has affected me. Anton did a phenomenal job feeding me everything I needed to quickly reproduce and detect the problem.

Backing up slightly, multimethods are functions that can dispatch to many possible method implementations based on the result of an initial dispatch function. For example:

;; Define the multimethod mm that dispatches on the function class
(defmulti mm class)

;; Define some method implementations based on different classes
(defmethod mm String [s] s)
(defmethod mm Long [l] l)

;; You can also define a fallthrough default for when nothing matches
(defmethod mm :default [v] v)

Invoking a multimethod involves evaluating two functions: the dispatch function (class in the example above) and then using the return value to select the matching method implementation to actually invoke. The backing MultiFn class maintains a methodTable map from dispatch value to IFn (the method). However, finding a match is not exactly a simple lookup. There are several other features of multimethods:

  1. Inheritance hierarchy matching - using either custom hierarchies or Java class hierarchies
  2. Preferences - in the case of multiple possible matches, a preference may be stated for resolution
  3. Default dispatch - a method to use if no match is found (marked with :default by default)

Due to the possibilities of multiple matches resolved by preferences and falling through to a default, there is some complicated logic that implements this decision. For performance, the decision reached is cached at the end. Because multimethods are open (new methods or preferences may be added at any time), this logic must deal with the concurrency issues and possibility of a change happening during invocation.

I’ve annotated the key method here outlining what happens (note this is prior to the fix):

// annotated version of part of MultiFn (as of Clojure 1.6.0)

final ReentrantReadWriteLock rw; // Protects the volatile state volatile IPersistentMap methodTable; // dispatch value -> IFn (altered by new defmethod) volatile IPersistentMap preferTable; // dispatch value -> IPersistentSet of dispatch value volatile IPersistentMap methodCache; // dispatch value -> IFn volatile Object cachedHierarchy;

// this method is called with the dispatchVal returned from the dispatch function and // is responsible for finding the correct function to call and caching that decision. private IFn findAndCacheBestMethod(Object dispatchVal) { rw.readLock().lock(); // get (concurrent) read lock on the state of the MultiFn Map.Entry bestEntry; // track best entry found so far // Remember the instances we see at start - we'll check these later to detect // that a new method has been added IPersistentMap mt = methodTable; IPersistentMap pt = preferTable; Object ch = cachedHierarchy; try { bestEntry = null; for(Object o : getMethodTable()) // walk through every entry in the method table { Map.Entry e = (Map.Entry) o; if(isA(dispatchVal, e.getKey())) // is this a possible match for dispatchVal? { // yes, this is a possibility // if we don't have a match yet or the new one dominates, keep it if(bestEntry == null || dominates(e.getKey(), bestEntry.getKey())) bestEntry = e; // if there are multiple matches and none dominate, this is an error if(!dominates(bestEntry.getKey(), e.getKey())) throw new IllegalArgumentException( String.format( "Multiple methods in multimethod '%s' match dispatch value: %s -> %s and %s, and neither is preferred", name, dispatchVal, e.getKey(), bestEntry.getKey())); } } // if no matching entry was found, return null indicating the :default // THIS IS THE BUG - we have not cached the result of this process so we must do it every time!! if(bestEntry == null) return null; } finally // undo read lock in any condition { rw.readLock().unlock(); }

//ensure basis has stayed stable throughout, else redo rw.writeLock().lock(); // now take *exclusive* write lock try { // check whether the methodTable/preferTable/cachedHierarchy are the same as when we // started this decision process. If so, we can proceed. Else we have to dump the cache and start over. if( mt == methodTable && pt == preferTable && ch == cachedHierarchy && cachedHierarchy == hierarchy.deref()) { // place decision in cache and return the IFn we chose methodCache = methodCache.assoc(dispatchVal, bestEntry.getValue()); return (IFn) bestEntry.getValue(); } else { resetCache(); return findAndCacheBestMethod(dispatchVal); } } finally { rw.writeLock().unlock(); // release the exclusive write lock } }

In the gist above you can see where I’ve noted the bug - if no matching method is found we do not alter the cache. That means that in the fallthrough case, the matching logic is done every time, which requires walking through every entry in the method table.

One situation where I see this come up as a frequent performance issue is when using clojure.walk with an update function, where there is often a default fall-through case.

Now here is the same method after the patch (now included in Clojure 1.7):

private IFn findAndCacheBestMethod(Object dispatchVal) { rw.readLock().lock(); Object bestValue; // remember the best match IFn IPersistentMap mt = methodTable; IPersistentMap pt = preferTable; Object ch = cachedHierarchy; try { Map.Entry bestEntry = null; for(Object o : getMethodTable()) { Map.Entry e = (Map.Entry) o; if(isA(dispatchVal, e.getKey())) { if(bestEntry == null || dominates(e.getKey(), bestEntry.getKey())) bestEntry = e; if(!dominates(bestEntry.getKey(), e.getKey())) throw new IllegalArgumentException( String.format( "Multiple methods in multimethod '%s' match dispatch value: %s -> %s and %s, and neither is preferred", name, dispatchVal, e.getKey(), bestEntry.getKey())); } } if(bestEntry == null) { // instead of returning null, look up the default dispatch IFn, // based on the default dispatch value (which is usually :default) bestValue = methodTable.valAt(defaultDispatchVal); if(bestValue == null) // if there is no default dispatch value, then we do really need to return null return null; } else bestValue = bestEntry.getValue(); } finally { rw.readLock().unlock(); }

//ensure basis has stayed stable throughout, else redo rw.writeLock().lock(); try { if( mt == methodTable && pt == preferTable && ch == cachedHierarchy && cachedHierarchy == hierarchy.deref()) { //place in cache methodCache = methodCache.assoc(dispatchVal, bestValue); return (IFn) bestValue; } else { resetCache(); return findAndCacheBestMethod(dispatchVal); } } finally { rw.writeLock().unlock(); } }

We now look up and alter the cache when the default branch is taken. This can make a dramatic difference in performance when you are using the :default case. Here is a simple test using a mixture of default and non-default values:

;; build a cycle of strings, longs, and keywords of requested size
;; the multimethod handles strings and longs but not keywords
;; time each rep where the multimethod is mapped over the cycle
(defn perf [reps size]
  (let [data (take size (cycle ["abc" 5 :k]))]
    (dotimes [_ reps]
      (time (doall (map mm data))))))

;; Before patch:
user=> (perf 5 100000)
"Elapsed time: 1301.262 msecs"
"Elapsed time: 928.888 msecs"
"Elapsed time: 942.905 msecs"
"Elapsed time: 858.513 msecs"
"Elapsed time: 832.314 msecs"

;; After patch:
user=> (perf 5 100000)
"Elapsed time: 134.169 msecs"
"Elapsed time: 28.859 msecs"
"Elapsed time: 45.452 msecs"
"Elapsed time: 13.189 msecs"
"Elapsed time: 13.42 msecs"

The JIT warms up in both cases but you can see that there is a dramatic performance boost here. This change was added in Clojure 1.7.0-alpha2.

Written on December 13, 2014

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK