
improve case conversion happy path by conradludgate · Pull Request #97046 · rust...

 2 years ago
source link: https://github.com/rust-lang/rust/pull/97046

Conversation

Contributor

@conradludgate conradludgate commented 19 days ago

edited

Someone shared the source code for Go's string case conversion.

It features a hot path for ASCII-only strings (although, I assume for reasons specific to Go, they've opted for a read-safe hot loop).

I've borrowed these ideas and also kept our existing code, to provide a fast path plus a seamless, UTF-8-correct fallback path.
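The fast-path-plus-fallback idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the code merged in this PR: the name `to_lowercase_sketch` is hypothetical, it uses only safe std APIs, and it ignores the context-sensitive final-sigma rule that `str::to_lowercase` implements.

```rust
/// Hypothetical sketch, not the PR's implementation.
/// Lowercases the leading ASCII run cheaply, then falls back to
/// per-char Unicode case mapping for the rest of the string.
fn to_lowercase_sketch(s: &str) -> String {
    // Length of the leading ASCII-only prefix. The first non-ASCII byte of
    // valid UTF-8 is always a lead byte, so this is a char boundary.
    let prefix_len = s.bytes().position(|b| !b.is_ascii()).unwrap_or(s.len());
    let (ascii, rest) = s.split_at(prefix_len);

    let mut out = String::with_capacity(s.len());
    // Fast path: ASCII lowercasing is a simple per-byte operation.
    out.push_str(&ascii.to_ascii_lowercase());
    // Fallback: Unicode-aware mapping, where one char may expand to several.
    // Simplification: the final-sigma special case is not handled here.
    for c in rest.chars() {
        out.extend(c.to_lowercase());
    }
    out
}
```

For pure ASCII input the fallback loop never runs. When a non-ASCII character appears early, almost all the work happens in the fallback, which matches the observation that such inputs perform like the original code.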

(Naive) benchmarks can be found here: https://github.com/conradludgate/case-conv

For cases where non-ASCII is found near the start, the performance of this algorithm falls back to the original speed, with no measurable slowdown.

est31, lunabunn, Nilstrieb, 95ulisse, ibraheemdev, and Globidev reacted with heart emoji

rustbot

added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label

19 days ago

Collaborator

rust-highfive commented 19 days ago

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with r? rust-lang/libs-api @rustbot label +T-libs-api -T-libs to request review from a libs-api team reviewer. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

Collaborator

rust-highfive commented 19 days ago

r? @Mark-Simulacrum

(rust-highfive has picked a reviewer for you, use r? to override)

rust-highfive

added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label

19 days ago


Contributor

Author

conradludgate commented 17 days ago

I have benchmarked this on Unicode-heavy texts too (a 3MB Russian text file) and have found marginally better performance with this change:

lowercase/russian       time:   [27.309 ms 27.339 ms 27.370 ms] *
lowercase/russian_std   time:   [28.642 ms 28.672 ms 28.703 ms]
uppercase/russian       time:   [40.350 ms 40.389 ms 40.429 ms] *
uppercase/russian_std   time:   [40.931 ms 41.055 ms 41.178 ms] 

So I am not worried about the initial ASCII checks hurting performance in strictly non-ASCII contexts.

Could you maybe try German or something like that, and put umlauts somewhere at the end? (Or just use random garbled text where the start is only ASCII and the end contains Unicode.)

Contributor

Author

conradludgate commented 12 days ago

edited

And put umlauts somewhere at the end? (Or just use random garbled text where the start is only ASCII and the end contains Unicode.)

This is basically what my original benchmarks do. They use two copies of Macbeth: one untouched and ASCII-only, the other with 16 bytes of non-ASCII at the end. This was to test how well it handles pure ASCII with the seamless break-out:

lowercase/ascii         time:   [18.715 us 18.755 us 18.814 us] *
lowercase/unicode       time:   [18.735 us 18.756 us 18.784 us] *

lowercase/ascii_std     time:   [283.71 us 284.09 us 284.53 us]
lowercase/unicode_std   time:   [285.11 us 285.61 us 286.21 us]

Performance is expected to be somewhere in the middle if the text contains a Unicode character halfway through.

Which is exactly what we see in the following results:

lowercase/ascii         time:   [19.084 us 19.341 us 19.700 us] *
lowercase/unicode       time:   [122.33 us 122.40 us 122.49 us] *

lowercase/ascii_std     time:   [285.41 us 288.32 us 292.74 us]
lowercase/unicode_std   time:   [284.82 us 285.10 us 285.52 us]
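Benchmark inputs of this shape (ASCII text with non-ASCII placed at the end or halfway through) could be built with a small helper like the one below. This is only a sketch: `insert_non_ascii` and its parameters are hypothetical names, and the actual setup in the linked benchmark repository may differ.

```rust
/// Hypothetical helper: copies `ascii` and inserts `marker` at byte offset `at`.
/// With `at == ascii.len()` the marker lands at the end; with
/// `at == ascii.len() / 2` it lands halfway through.
fn insert_non_ascii(ascii: &str, at: usize, marker: &str) -> String {
    // Requiring ASCII input guarantees every byte offset is a char boundary.
    assert!(ascii.is_ascii(), "input must be ASCII-only");
    assert!(at <= ascii.len(), "offset out of range");

    let mut s = String::with_capacity(ascii.len() + marker.len());
    s.push_str(&ascii[..at]);
    s.push_str(marker);
    s.push_str(&ascii[at..]);
    s
}
```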
mohe2015 reacted with thumbs up emoji

Contributor

Author

conradludgate commented 11 days ago

r? @thomcc

Contributor

klensy commented 11 days ago

edited

Nice benches, but what about short strings?
I'd expect this to be run on short strings more often than on strings over a few kilobytes.

Contributor

Author

conradludgate commented 11 days ago

edited

Nice benches, but what about short strings? I'd expect this to be run on short strings more often than on strings over a few kilobytes.

Fair point. While still maybe unnatural, I ran the following benchmark:

ascii.split('.').map(str::to_lowercase).collect::<Vec<_>>()

over my large text files, so that it's tested more against shorter strings of random length. I got the following results:

current_std: average 591.62ms
this change: average 269.67ms
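A rough way to reproduce this kind of measurement without a benchmark harness is sketched below. The function name `bench_short_strings` and the use of `std::time::Instant` are assumptions for illustration; the numbers above were not produced by this code.

```rust
use std::time::{Duration, Instant};

/// Hypothetical timing helper: splits `text` on '.' and lowercases each
/// piece, repeating `iterations` times and returning the mean time per run.
fn bench_short_strings(text: &str, iterations: u32) -> Duration {
    assert!(iterations > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iterations {
        let lowered: Vec<String> = text.split('.').map(str::to_lowercase).collect();
        // Keep the optimizer from discarding the unused result.
        std::hint::black_box(lowered);
    }
    start.elapsed() / iterations
}
```

A wall-clock loop like this is much noisier than a statistical harness, so it is only suitable for spotting large differences such as the roughly 2x gap reported above.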

This would most likely improve further by reducing the magic number.


Reducing the magic number from 16 to 8, I got minimal improvements (-25%) on the small string bench, but a significant regression on the large string bench (+90%).
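The trade-off is easier to see with the chunk size as a parameter. The sketch below is an illustration, not the PR's implementation: `lowercase_ascii_prefix` and the const parameter `N` (standing in for the "magic number") are hypothetical, and it uses safe byte-wise checks rather than any word-sized reads.

```rust
/// Hypothetical helper: lowercases the leading ASCII part of `s` in chunks
/// of `N` bytes. Returns the converted prefix and the unprocessed remainder,
/// which a caller would hand to the Unicode-aware fallback.
fn lowercase_ascii_prefix<const N: usize>(s: &str) -> (String, &str) {
    assert!(N > 0, "chunk size must be non-zero");
    let bytes = s.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i + N <= bytes.len() {
        let chunk = &bytes[i..i + N];
        // OR all bytes together: the high bit is set if any byte is non-ASCII.
        if chunk.iter().fold(0u8, |acc, &b| acc | b) & 0x80 != 0 {
            break;
        }
        out.extend(chunk.iter().map(|b| b.to_ascii_lowercase()));
        i += N;
    }
    // `out` holds only ASCII bytes, so it is valid UTF-8.
    let prefix = String::from_utf8(out).expect("ASCII is valid UTF-8");
    // Everything before `i` is ASCII, so `i` is a char boundary.
    (prefix, &s[i..])
}
```

In this sketch, a string shorter than `N` never enters the loop: `lowercase_ascii_prefix::<16>("Hi")` returns an empty prefix and leaves `"Hi"` for the fallback. A smaller `N` lets more short strings use the fast path, while a larger `N` amortizes the per-chunk check over more bytes on long inputs.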

Contributor

@thomcc thomcc left a comment

This mostly looks okay to me, and I expect it to be an improvement, but I'm not sure that uppercasing strings over 100 bytes long is actually the common case.

Contributor

thomcc commented 7 days ago

Do you have benchmarks of the current version?

Contributor

Author

conradludgate commented 7 days ago

Do you have benchmarks of the current version?

Long ascii string

new  time:   [35.890 us 35.972 us 36.063 us]
old  time:   [522.71 us 523.65 us 524.69 us]

Long string with unicode half way through

new  time:   [228.51 us 229.41 us 230.73 us]
old  time:   [546.95 us 547.98 us 549.07 us]     

Short ascii strings

new  time:   [144.01 us 144.34 us 144.73 us]
old  time:   [595.05 us 596.77 us 598.67 us]

Contributor

thomcc commented 7 days ago

edited

This looks good to me, then. No rollup because of possible perf impact if things change cases.

@bors r+ rollup=never

Contributor

klensy commented 7 days ago

edited

Wait, maybe squash 10 commits?

Contributor

klensy commented 7 days ago

This looks good to me, then. No rollup because of possible perf impact if things change cases.

@bors r+ rollup=never

bors sleep.

Contributor

thomcc commented 7 days ago

@bors r+ rollup=never

Contributor

bors commented 7 days ago

📌 Commit d0f9930 has been approved by thomcc

bors

added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.

labels

7 days ago

Contributor

bors commented 7 days ago

⌛ Testing commit d0f9930 with merge 1851f08...

Contributor

bors commented 7 days ago

☀️ Test successful - checks-actions
Approved by: thomcc
Pushing 1851f08 to master...

bors

added the merged-by-bors This PR was explicitly merged by bors label

7 days ago

bors

merged commit 1851f08 into

rust-lang:master 7 days ago

11 checks passed

rustbot

added this to the 1.63.0 milestone

7 days ago

Collaborator

rust-timer commented 6 days ago

Finished benchmarking commit (1851f08): comparison url.

Instruction count

  • Primary benchmarks: no relevant changes found
  • Secondary benchmarks: mixed results

                            mean [1]   max      count [2]
Regressions (primary)       N/A        N/A      0
Regressions (secondary)     0.1%       0.1%     1
Improvements (primary)      N/A        N/A      0
Improvements (secondary)    -1.0%      -1.0%    3
All (primary)               N/A        N/A      0

Max RSS (memory usage)

Cycles

This benchmark run did not return any relevant results for this metric.

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

Footnotes

  1. the arithmetic mean of the percent change

  2. number of relevant changes

Assignees

thomcc

Labels
merged-by-bors This PR was explicitly merged by bors S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Milestone

1.63.0

12 participants
