
The code review bug that gives me nightmares – the issue

source link: https://ayende.com/blog/195202-A/challenge-the-code-review-bug-that-gives-me-nightmares-the-issue?Key=e64cb4bb-d601-4e1d-8018-583c2f8aad4d&utm_campaign=Feed%3A+AyendeRahien+%28Ayende+%40+Rahien%29

Comments

Sebastiaan

Change the public interface to accept a span, and the caller can then manage memory as it wishes. However, that still wouldn't solve the issue, because you can still have a race between the moment the cache entry expires and the moment the returned array is copied into the span.

In this case, the option might be not to solve it at all and simply allocate an array yourself. ArrayPools are primarily meant to solve short-term buffer allocation, for instance when processing streams of information. Here, you're holding on to the allocated buffers, possibly exhausting the pool (depending on the underlying implementation).

You can, of course, still use the ArrayPool while you're computing the hash.
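A sketch of that suggestion, assuming SHA-256 and .NET's IncrementalHash (the method shape is hypothetical, not the blog's actual code): the pooled buffer lives only for the duration of the I/O and hashing, while the value that ends up in the cache is a small, freshly allocated array that the cache owns outright, so eviction can never race with a reader.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Security.Cryptography;

static class HashHelper
{
    // Hypothetical sketch: rent a scratch buffer only while reading and
    // hashing the file, then return it immediately. The 32-byte result is
    // an owned allocation, safe to hand out from a cache.
    public static byte[] ComputeHash(string path)
    {
        using var stream = File.OpenRead(path);
        byte[] scratch = ArrayPool<byte>.Shared.Rent(81920); // transient only
        try
        {
            using var sha = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);
            int read;
            while ((read = stream.Read(scratch, 0, scratch.Length)) > 0)
                sha.AppendData(scratch, 0, read);
            return sha.GetHashAndReset(); // owned allocation, safe to cache
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(scratch);
        }
    }
}
```

The trade-off is exactly the one discussed below: one small allocation per cached file instead of zero.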

Kaotis
02 Nov 2021
17:49

One option would be to wrap it in a class with a finalizer and cache that class instead. But you have to be careful not to expose the array directly.

Dejan Grujić
02 Nov 2021
18:15

You can wrap it in a class/struct and add a usage counter, but then you have to decrease the counter once you're done with the hash value. The memory is released not on eviction, but when the counter reaches 0, which can happen either on eviction or after a manual release. There's a risk of a memory leak if you forget to release it, but it's much better than the "ghost" bug you described.
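A minimal sketch of that reference-counting idea (the class name and API are hypothetical): acquisition uses a compare-and-swap loop that refuses to revive a counter that has already reached zero, which is the part that guards against handing out a buffer that has already gone back to the pool.

```csharp
using System;
using System.Buffers;
using System.Threading;

// Hypothetical sketch: a cache entry whose pooled buffer is returned only
// when the last holder releases it, whether that is a caller or the cache
// itself on eviction. The counter starts at 1 for the cache's own reference.
public sealed class CountedBuffer
{
    private readonly byte[] _buffer;
    private int _refCount = 1; // the cache's own reference

    public CountedBuffer(byte[] rentedBuffer) => _buffer = rentedBuffer;

    public ReadOnlySpan<byte> Span => _buffer;

    // Callers must acquire before use. Returns false if the buffer already
    // went back to the pool, so a dead entry can never be resurrected.
    public bool TryAcquire()
    {
        while (true)
        {
            int current = Volatile.Read(ref _refCount);
            if (current == 0)
                return false; // already returned to the pool
            if (Interlocked.CompareExchange(ref _refCount, current + 1, current) == current)
                return true;
        }
    }

    // Called by users when done, and by the cache on eviction.
    public void Release()
    {
        if (Interlocked.Decrement(ref _refCount) == 0)
            ArrayPool<byte>.Shared.Return(_buffer);
    }
}
```

The risk mentioned above remains: a caller that forgets Release leaks the buffer, but a "ghost" read of a recycled array is prevented.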

Scooletz
02 Nov 2021
23:14

An array with 32 bytes has its own overhead (object header plus length field). Depending on the eviction rate and the load, a first approach could be to turn it into an object with four longs, which are sufficient to carry the payload but don't have the length overhead. Then, once they are no longer used, just discard them.

Add a counter to each hash value, similar to the slot sequence numbers in ConcurrentQueue, representing both the state and the epoch. Then, with volatile access, assert before and after the access that the value that was read was valid. The read must be a copy, to preserve the ordering and the value. This could be done with the array itself (given alignment), but also with the helper class described above; with the latter, the cache would be storing values.

Unless this is for building a database with hundreds of files, or an upload service, I probably wouldn't cache it at all, though.

Scooletz
02 Nov 2021
23:18

@dejan Who bumps the counter up, and when? How would you protect it from the ABA problem? It would be interesting if you could elaborate.

Dalibor Čarapić
03 Nov 2021
07:42

I see several potential fixes for this issue, depending on the required optimization:
1) Make a copy of the byte[] array before putting it into the cache, or before returning the cached version to the caller.
Pros: minor changes to existing code.
Cons: extra allocations on every call.
2) Change the signature of ComputeHash to accept a callback taking a ReadOnlySpan<byte> (and returning void). The caller cannot modify the array and cannot keep a reference to it. This also requires pausing evictions while inside the method.
Pros: prevents the extra allocations.
Cons: more difficult to use, and pausing evictions might not be easy to implement.
3) Don't use the ArrayPool; just store byte[] values in the cache. ComputeHash would return the same byte[] array, wrapped in a ReadOnlyMemory<byte> struct.
Pros: only one allocation per file.
Cons: you still have one allocation per file instead of a single pooled buffer.
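Option 2 could look roughly like this (a sketch; HashCache, WithHash, and the contents are assumptions, with a plain Dictionary plus a lock standing in for the real cache and its eviction pause). Because ReadOnlySpan<byte> is a ref struct, the callback cannot store the hash in a field, so no reference to the cached array can outlive the call.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical sketch of option 2: callers only ever see the hash inside
// a callback that receives a ReadOnlySpan<byte>.
public delegate void HashConsumer(ReadOnlySpan<byte> hash);

public sealed class HashCache
{
    private readonly Dictionary<string, byte[]> _cache = new();
    private readonly object _sync = new();

    public void WithHash(string key, HashConsumer consumer)
    {
        // Crude stand-in for "pause evictions while inside this method":
        // eviction (not shown) would take the same lock.
        lock (_sync)
        {
            if (!_cache.TryGetValue(key, out var hash))
            {
                hash = SHA256.HashData(Encoding.UTF8.GetBytes(key)); // placeholder content
                _cache[key] = hash;
            }
            consumer(hash); // the span cannot escape this call
        }
    }
}
```

The cons listed above apply directly: the lock (or whatever pauses eviction) is paid on every access.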

Dejan Grujić
03 Nov 2021
09:03

@Scooletz

For a start, I'm not sure the whole wrapper-class idea is a good one in this context, because of the extra allocations. If we go that way, IMO ComputeHash should be renamed to suggest that we're borrowing something which has to be released.

We can then bump the counter in ComputeHash. We'll also need some ReleaseHash, and both it and the evict method will call some ReturnHashToPoolIfNotUsed, which decreases the counter. But we'll have to take a lock on _cache to prevent getting a value while we're releasing it. So: extra allocations plus locking, not that great.

We can avoid the locking and counters if we know for sure that borrows are short-lived, i.e. that whoever needs these hash values will be done with them in, let's say, under a second. If that can be guaranteed, we can remove values from _cache on eviction, preventing new borrowers from getting them, but not return the bytes to the pool immediately; the actual return happens a few seconds later. We just need a background thread for cleanup and some way to keep the release timestamps together with the byte arrays.

But I'd ask these questions first:

  • How many hashes are we talking about?
  • Is the size predictable, or does it grow while the app is running? If we won't go over a few thousand entries, we can skip releasing altogether.
  • Do callers need the hash values for a short period, or do they keep them longer, potentially for the entire app lifetime? If it's a long lease, a shared pool isn't that compatible with cache eviction anyway; creating copies might be better.

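The delayed-return variant described above might be sketched like this (class and method names are hypothetical, and a Timer stands in for the background cleanup thread): evicted buffers are queued with a timestamp and only go back to the pool after a grace period, so borrowers that are guaranteed to finish quickly never observe a recycled array.

```csharp
using System;
using System.Buffers;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical sketch: the cache's eviction callback calls OnEvicted instead
// of ArrayPool.Return; actual returns happen only after the grace period.
public sealed class DelayedReturner : IDisposable
{
    private readonly ConcurrentQueue<(byte[] Buffer, DateTime EvictedAt)> _pending = new();
    private readonly TimeSpan _grace;
    private readonly Timer _timer;

    public DelayedReturner(TimeSpan grace)
    {
        _grace = grace;
        _timer = new Timer(_ => Drain(), null, grace, grace);
    }

    public int PendingCount => _pending.Count;

    public void OnEvicted(byte[] buffer) =>
        _pending.Enqueue((buffer, DateTime.UtcNow));

    private void Drain()
    {
        // Entries are queued in eviction order, so we can stop at the first
        // one that is still inside the grace period.
        while (_pending.TryPeek(out var entry) &&
               DateTime.UtcNow - entry.EvictedAt >= _grace &&
               _pending.TryDequeue(out entry))
        {
            ArrayPool<byte>.Shared.Return(entry.Buffer);
        }
    }

    public void Dispose() => _timer.Dispose();
}
```

Correctness here rests entirely on the "borrows are short-lived" guarantee; if a borrower can outlive the grace period, the ghost-read bug returns.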
Oren Eini
04 Nov 2021
08:33

Sebastiaan,

The actual issue here is that I'm holding the arrays for a long time, so they are likely to be in Gen2. Not holding on to them would mean repeatedly allocating buffers that get promoted to Gen2, where they are rarely cleaned up. This is especially true when you are operating at or near the cache limits.

At that point, certain behaviors (an access pattern larger than the cache) mean that you keep filling and dropping entries from the cache, each surviving just long enough to be promoted to Gen2, which then makes overall memory usage far more problematic.

Oren Eini
04 Nov 2021
08:37

Kaotis,

Yes, that is an option, for sure. However, that leads to a far more subtle issue.

public class ReleaseArray
{
   public byte[] Buffer;
   
   ~ReleaseArray()
   {
       ArrayPool<byte>.Shared.Return(Buffer);
   }
}

public void DoSomething(string key)
{
   ReleaseArray ra = cache.Get(key);
   DoSomethingWithBuffer(ra.Buffer);
}

Note that in this case, we are exposed to the JIT deciding that we aren't using ra after the call to DoSomethingWithBuffer, at which point it can be collected and finalized, returning the buffer to the pool while it is still in use.

That ties in to your "not expose the buffer" point, but that is hard to do. Even if we make DoSomethingWithBuffer an instance method of ReleaseArray, the JIT may decide to inline it, resulting in the same behavior. A non-trivial issue.
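For reference, one known mitigation for that lifetime hazard is GC.KeepAlive, which keeps the wrapper reachable until after the buffer's last use. This is a sketch, not a full fix, because every caller has to remember to do it (the Demo class and the DoSomethingWithBuffer body are placeholders):

```csharp
using System;
using System.Buffers;

public class ReleaseArray
{
    public byte[] Buffer = ArrayPool<byte>.Shared.Rent(32);

    // Finalizer returns the buffer to the pool when the wrapper dies.
    ~ReleaseArray() => ArrayPool<byte>.Shared.Return(Buffer);
}

public static class Demo
{
    public static void DoSomething(ReleaseArray ra)
    {
        DoSomethingWithBuffer(ra.Buffer);
        // ra is guaranteed reachable until this line, so the finalizer
        // cannot reclaim the buffer while the call above is still running.
        GC.KeepAlive(ra);
    }

    private static void DoSomethingWithBuffer(byte[] buffer)
    {
        // placeholder for real work on the pooled buffer
    }
}
```

This addresses the early-finalization race, but not the "JIT inlines the instance method" concern, and forgetting the KeepAlive at a single call site silently reintroduces the bug.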

Oren Eini
04 Nov 2021
08:39

Dejan,

That leads to the ABA problem, however, because the act of getting the instance to increment the counter may race with decrementing the counter.

Oren Eini
04 Nov 2021
08:42

Scooletz,

The scenario I actually have is caching buffers that may be in the MB range, and I don't quite follow your approach. Four longs vs. an array is the exact same scenario. You are now not using a buffer pool, and you'll have a lot of allocations, which is also something we want to avoid.

Oren Eini
04 Nov 2021
08:46

Dalibor,

1 & 2 - those are really complex solutions. In both cases, you need a way to prevent evictions while we are either copying the values or calling the callback. That means you are always paying that cost, which is something we want to avoid.

3 - That would expose us to a situation where the working set is greater than the cache size, which would mean that the cache makes things a LOT harder for us, instead of better.

Dejan Grujić
04 Nov 2021
12:58

I did say locking would be needed, and more specifically locking on the _cache object. Although my sixth sense always says "danger" when locking is involved.

lock (_cache) { ... _cache.Get(file) ... counter++ } ...

ReturnHashToPoolIfNotUsed(OurHashClass ...) { lock (_cache) { counter-- ... etc. } }

Oren Eini
04 Nov 2021
13:06

Dejan,

Yep, and now you are paying the locking cost on every access, which isn't nice to have.
