![](/style/images/good.png)
![](/style/images/bad.png)
Faster parsing for lower numbers for radix up to 16 (cont.) by gilescope · Pull...
source link: https://github.com/rust-lang/rust/pull/95399
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
New issue
Faster parsing for lower numbers for radix up to 16 (cont.) #95399
Merged
Conversation
( Continuation of #83371 )
With LingMan's change I think this is potentially ready.
r? @scottmcm
(rust-highfive has picked a reviewer for you, use r? to override)
added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label
A mate points out that we could pull that condition out into a separate function and slather it in tests (assuming that we slap an inline always on it). I really like that idea.
The idea here sounds good; I've left some details comments.
One additional request: Please make sure there are a bunch of #[test]
s in the core tests that hit around the edges of the unsafe-using paths here. We run those tests in MIRI, which should help get extra checking that the unchecked usage is sound.
A mate points out that we could pull that condition out into a separate function and slather it in tests (assuming that we slap an inline always on it). I really like that idea.
Are you planning on doing this before requesting final approval?
added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
labels
This comment has been minimized.
Yeah let's get those additional tests done before we hit merge. I want to be absolutely sure about this.
Have extracted can_not_overflow
into a fn without it hurting performance.
We don't want to make can_not_overflow
public do we? But I don't see a way of testing it if it's not a public fn as all the testing seems to be integration tests. Would like to make it public just for testing - is that a thing yet?
This comment has been minimized.
I guess I could make it a public function but add something like this:
#[doc(hidden)] #[unstable(issue = "none", feature = "std_internals")]
I realised we can use default() to get zero which seems a bit nicer.
Have added tests that test every type except u128 (as that's a tricky type to add 1 to MAX unless I pull in bignum).
Right I feel happier with that now - if anyone optimised that condition in an unsafe way the test would likely catch it.
So my assumption here is that Add
trait is doing an unchecked add in release mode (and a checked add in debug mode).
Arguably this latest change means you don't get a speedup in debug mode anymore, but you still do get it in release mode.
It does mean that should the tests not have caught some edge case it would create a panic in debug so this feels structurally safer. (and does not use the unsafe
keyword so must be safer right? ... right?).
This comment has been minimized.
Great build fails because it's really hard to figure out if a type is signed in rust... maybe it's not even possible at the moment without using FromStrRadixHelper ...
Arguably this latest change means you don't get a speedup in debug mode anymore, but you still do get it in release mode.
std
is always shipped in release mode, so because you're not using tricks like rustc_inherit_overflow_checks
they'll get the optimized parsing even in debug, so that's good.
It's just that there's a few CI builders that enable debug/overflow checks to check for mistakes in the implementations (otherwise the debug_assert!
s would be completely useless, for example).
So that test failure is concerning. What subtraction is overflowing in the "it shouldn't overflow" path?
H/T to @koute for pointing out an easier way to do is signed. Test failure is no more. It was just a poor is signed implementation (in the test only).
Thanks! This looks good, and the final iteration unsafe
(as some of the intermediate ones did) makes me that much more confident in it. Parsing not overflowing ought to be the normal case, so optimizing for it seems reasonable.
@bors r+
Commit 3ee7bb1 has been approved by
scottmcm
added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
labels
Test successful - checks-actions
Approved by: scottmcm
Pushing 4e1927d to master...
Finished benchmarking commit (4e1927d): comparison url.
Summary:
- Primary benchmarks: no relevant changes found
- Secondary benchmarks:
relevant regressions found
Regressions
(primary)
Regressions
(secondary)
Improvements
(primary)
Improvements
(secondary)
All
(primary)
count1 0 14 0 0 0
mean2 N/A 0.3% N/A N/A N/A
max N/A 0.5% N/A N/A N/A
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.
Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged
along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged
label to this PR.
@rustbot label: +perf-regression
Footnotes
-
number of relevant changes
-
the arithmetic mean of the percent change
Is there somewhere posted benches and\or godbolt where changes actually visible? Reading this PR, i didn't find anything, plus description don't help.
test name |original| new | diff
num::bench_i16_from_str | 34,523 | 32,617 | 1,906
num::bench_i16_from_str_radix_10 | 35,473 | 33,684 | 1,789
num::bench_i16_from_str_radix_16 | 36,120 | 36,503 | -383
num::bench_i16_from_str_radix_2 | 32,306 | 29,424 | 2,882
num::bench_i16_from_str_radix_36 | 35,363 | 37,374 | -2,011
num::bench_i32_from_str | 38,624 | 35,394 | 3,230
num::bench_i32_from_str_radix_10 | 38,111 | 35,580 | 2,531
num::bench_i32_from_str_radix_16 | 41,010 | 36,663 | 4,347
num::bench_i32_from_str_radix_2 | 33,377 | 30,813 | 2,564
num::bench_i32_from_str_radix_36 | 40,601 | 38,090 | 2,511
num::bench_i64_from_str | 37,498 | 33,493 | 4,005
num::bench_i64_from_str_radix_10 | 39,617 | 34,532 | 5,085
num::bench_i64_from_str_radix_16 | 43,148 | 36,469 | 6,679
num::bench_i64_from_str_radix_2 | 32,941 | 32,684 | 257
num::bench_i64_from_str_radix_36 | 43,826 | 43,657 | 169
num::bench_i8_from_str | 33,765 | 35,326 | -1,561
num::bench_i8_from_str_radix_10 | 34,707 | 37,216 | -2,509
num::bench_i8_from_str_radix_16 | 35,972 | 38,750 | -2,778
num::bench_i8_from_str_radix_2 | 31,983 | 32,801 | -818
num::bench_i8_from_str_radix_36 | 34,638 | 39,400 | -4,762
num::bench_u16_from_str | 35,388 | 33,973 | 1,415
num::bench_u16_from_str_radix_10 | 34,344 | 34,400 | -56
num::bench_u16_from_str_radix_16 | 37,189 | 37,744 | -555
num::bench_u16_from_str_radix_2 | 32,678 | 30,647 | 2,031
num::bench_u16_from_str_radix_36 | 37,825 | 37,711 | 114
num::bench_u32_from_str | 36,448 | 33,178 | 3,270
num::bench_u32_from_str_radix_10 | 36,605 | 33,275 | 3,330
num::bench_u32_from_str_radix_16 | 41,255 | 36,363 | 4,892
num::bench_u32_from_str_radix_2 | 30,849 | 31,820 | -971
num::bench_u32_from_str_radix_36 | 40,925 | 38,920 | 2,005
num::bench_u64_from_str | 38,061 | 30,945 | 7,116
num::bench_u64_from_str_radix_10 | 36,959 | 31,290 | 5,669
num::bench_u64_from_str_radix_16 | 42,033 | 37,195 | 4,838
num::bench_u64_from_str_radix_2 | 31,095 | 29,326 | 1,769
num::bench_u64_from_str_radix_36 | 42,702 | 40,697 | 2,005
num::bench_u8_from_str | 34,868 | 32,880 | 1,988
num::bench_u8_from_str_radix_10 | 35,923 | 33,983 | 1,940
num::bench_u8_from_str_radix_16 | 40,102 | 36,449 | 3,653
num::bench_u8_from_str_radix_2 | 33,904 | 33,384 | 520
num::bench_u8_from_str_radix_36 | 37,975 | 36,783 | 1,192
It's retrograde for radix_36 and i8 but everything else benefits.
We did have a godbolt a long while ago. I can try and set one up again. There's a bit of munging of the code to do to get something equivalent so it will take me a little time.
There's a followup PR being drafted to address the regression - I had an idea before going to sleep (we are wasting a mul).
Will link it once I raise it (should be ready in a day or two).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
merged-by-bors This PR was explicitly merged by bors perf-regression Performance regressions S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
None yet
Successfully merging this pull request may close these issues.
None yet
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK