Optimizing Stacked Borrows (part 1?): Cache locations of Tags in a Borrow Stack...

Contributor

saethlin commented on Dec 12, 2021 •

edited

Before this PR, a profile of Miri under almost any workload points quite squarely at these regions of code as being incredibly hot (each being ~40% of cycles):

This code is one of at least three reasons that stacked borrows analysis is super-linear: These are both linear in the number of borrows in the stack and they are positioned along the most commonly-taken paths.

I'm addressing the first loop (which is in Stack::find_granting) by adding a very very simple sort of LRU cache implemented on a VecDeque, which maps recently-looked-up tags to their position in the stack. For Untagged access we fall back to the same sort of linear search. But as far as I can tell there are never enough Untagged items to be significant.

I'm addressing the second loop by keeping track of the region of stack where there could be items granting Permission::Unique. This optimization is incredibly effective because Read access tends to dominate and many trips through this code path now skip the loop entirely.

These optimizations result in pretty enormous improvements:
Without raw pointer tagging, mse 34.5s -> 2.4s, serde1 5.6s -> 3.6s
With raw pointer tagging, mse 35.3s -> 2.4s, serde1 5.7s -> 3.6s

And there is hardly any impact on memory usage:
Memory usage on mse 844 MB -> 848 MB, serde1 184 MB -> 184 MB (jitter on these is a few MB).

cbeuw reacted with rocket emojiDrMeepster, ibraheemdev, Nilstrieb, cbeuw, and matthieu-m reacted with eyes emoji All reactions

saethlin commented on Dec 12, 2021 •

Recommend

资源分享| Tapdata 的 2.0 版，开源的 Live Data Platform 现已发布

Plantuml encoding in Rust using TDD

Guilty on all 12 charges against former Theranos exec Sunny Balwani

I write about tech for a living and these are the 3 Prime Day deals on my wish l...

Designing user interface aesthetics that sync with the natural environment

Intel acknowledges Chrome and Edge issues on Alder Lake processors, offers fixes...

Linux常用命令-83：chroot

config_type: add unstable_variant attribute by tommilligan · Pull Request #5379...

Fornjot - Weekly Dev Log - 2022-W26

Meta VR headsets to stop requiring Facebook accounts | TechSpot

About Joyk