7

Optimizing Stacked Borrows (part 1?): Cache locations of Tags in a Borrow Stack...

 2 years ago
source link: https://github.com/rust-lang/miri/pull/1935
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Contributor

@saethlin saethlin commented on Dec 12, 2021

edited

Before this PR, a profile of Miri under almost any workload points quite squarely at these regions of code as being incredibly hot (each being ~40% of cycles):

This code is one of at least three reasons that stacked borrows analysis is super-linear: These are both linear in the number of borrows in the stack and they are positioned along the most commonly-taken paths.

I'm addressing the first loop (which is in Stack::find_granting) by adding a very very simple sort of LRU cache implemented on a VecDeque, which maps recently-looked-up tags to their position in the stack. For Untagged access we fall back to the same sort of linear search. But as far as I can tell there are never enough Untagged items to be significant.

I'm addressing the second loop by keeping track of the region of stack where there could be items granting Permission::Unique. This optimization is incredibly effective because Read access tends to dominate and many trips through this code path now skip the loop entirely.

These optimizations result in pretty enormous improvements:
Without raw pointer tagging, mse 34.5s -> 2.4s, serde1 5.6s -> 3.6s
With raw pointer tagging, mse 35.3s -> 2.4s, serde1 5.7s -> 3.6s

And there is hardly any impact on memory usage:
Memory usage on mse 844 MB -> 848 MB, serde1 184 MB -> 184 MB (jitter on these is a few MB).

cbeuw reacted with rocket emojiDrMeepster, ibraheemdev, Nilstrieb, cbeuw, and matthieu-m reacted with eyes emoji All reactions

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK