
Add midpoint function for all integers and floating numbers by Urgau · Pull Request #92048 · rust-lang/rust

source link: https://github.com/rust-lang/rust/pull/92048

Contributor @Urgau commented on Dec 17, 2021 · edited

This pull request adds a midpoint function to {u,i}{8,16,32,64,128,size}, NonZeroU{8,16,32,64,size} and f{32,64}.

This new function is analogous to the C++ midpoint function, and essentially computes (a + b) / 2 with rounding towards negative infinity in the case of integers. Put simply: midpoint(a, b) is (a + b) >> 1 as if it were performed in a sufficiently-large signed integral type.
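For illustration, here is a hypothetical usage sketch, assuming the API lands as described here and (for signed integers) with the rounding towards negative infinity described above:

```rust
// Hypothetical usage, assuming the proposed midpoint methods exist and
// round towards negative infinity for signed integers as described above.
fn main() {
    assert_eq!(0u8.midpoint(7), 3);                  // 3.5 rounds down to 3
    assert_eq!(u8::MAX.midpoint(u8::MAX), u8::MAX);  // no intermediate overflow
    assert_eq!((-3i32).midpoint(0), -2);             // -1.5 rounds towards -inf
}
```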

Note that unlike the C++ function, this pull request does not implement midpoint on pointers (*const T or *mut T). This could be added in a future pull request if desired.

Implementation

For f32 and f64 the implementation is based on the libcxx one. I originally tried many different approaches, but all of them either failed or left me with a poorer version of the libcxx one. Note that libstdc++ has a very similar implementation, and the Microsoft STL implementation is also essentially the same as libcxx's. Unfortunately, it doesn't seem like a better way exists.
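For reference, a minimal sketch of that libcxx-style approach (assumed, not the exact code added in this PR): add-then-halve when both magnitudes are small enough that the sum cannot overflow, otherwise halve before adding, taking care not to halve values that are already near the subnormal range.

```rust
/// Sketch of the libcxx-style floating-point midpoint for f32 (an
/// illustration of the approach, not the exact code merged in std).
fn midpoint_f32(a: f32, b: f32) -> f32 {
    const LO: f32 = f32::MIN_POSITIVE * 2.0; // below this, halving may lose precision
    const HI: f32 = f32::MAX / 2.0;          // above this, a + b could overflow

    let (abs_a, abs_b) = (a.abs(), b.abs());
    if abs_a <= HI && abs_b <= HI {
        (a + b) / 2.0     // common fast path: overflow is impossible
    } else if abs_a < LO {
        a + b / 2.0       // not safe to halve a
    } else if abs_b < LO {
        a / 2.0 + b       // not safe to halve b
    } else {
        a / 2.0 + b / 2.0 // both are large: halve each, then add
    }
}

fn main() {
    assert_eq!(midpoint_f32(1.0, 3.0), 2.0);
    assert_eq!(midpoint_f32(f32::MAX, f32::MAX), f32::MAX); // no overflow to infinity
}
```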

For unsigned integers I created the macro midpoint_impl!, which has two branches:

  • The first branch takes only $SelfT and is used when there is no unsigned integer type with at least double the bits. The code simply uses the formula a + (b - a) / 2, with the arguments put in the right order so the subtraction cannot underflow and the rounding comes out correctly.
  • The second branch is used when a $WideT (with at least double the bits of $SelfT) is provided. Using a wider type means no overflow can occur, which greatly improves the codegen (no branches and fewer instructions). A sketch of both branches follows the list below.
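As an illustration, here is a hedged sketch of what the two branches compute, written as hypothetical free functions rather than the actual macro expansion:

```rust
// Narrow branch (no wider type available, e.g. u128): order the operands so
// the subtraction cannot underflow; the result is floor((a + b) / 2).
fn midpoint_narrow(a: u128, b: u128) -> u128 {
    let (lo, hi) = if a <= b { (a, b) } else { (b, a) };
    lo + (hi - lo) / 2
}

// Wide branch (a wider type exists, e.g. u16 for u8): the sum cannot
// overflow in the wider type, so the obvious formula works directly.
fn midpoint_wide(a: u8, b: u8) -> u8 {
    ((a as u16 + b as u16) / 2) as u8
}

fn main() {
    assert_eq!(midpoint_narrow(u128::MAX, u128::MAX - 2), u128::MAX - 1);
    assert_eq!(midpoint_wide(u8::MAX, 0), 127);
}
```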

For signed integers the code basically forwards to the unsigned version of midpoint by mapping the signed numbers to unsigned ones (e.g. i8's range [-128, 127] maps to u8's [0, 255]) and back again.
I originally wrote a version that worked directly on the signed numbers, but the code was "ugly" and hard to understand. Despite this mapping "overhead", the codegen is better than my most optimized version operating directly on signed integers. A sketch of the mapping is shown below.
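Here is a hedged sketch of the mapping idea for i8 (hypothetical, not the actual macro output): flipping the sign bit maps the signed range onto the unsigned one while preserving order, so the unsigned midpoint can be reused and the result mapped back.

```rust
// Hypothetical sketch of the signed-to-unsigned mapping for i8: XOR-ing
// with 0x80 flips the sign bit, mapping [-128, 127] onto [0, 255] while
// preserving order, so the unsigned midpoint yields the signed answer
// (with the rounding towards negative infinity described above).
fn midpoint_i8(a: i8, b: i8) -> i8 {
    const BIAS: u8 = 1 << 7; // 0x80
    let ua = (a as u8) ^ BIAS;
    let ub = (b as u8) ^ BIAS;
    let m = ((ua as u16 + ub as u16) / 2) as u8; // unsigned midpoint, cannot overflow
    (m ^ BIAS) as i8 // map back into the signed range
}

fn main() {
    assert_eq!(midpoint_i8(-128, 127), -1); // floor(-0.5) = -1
    assert_eq!(midpoint_i8(-1, 2), 0);      // floor(0.5) = 0
}
```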

Note that in the case of unsigned numbers I tried to be smart and used #[cfg(target_pointer_width = "64")] to decide whether the wide version was actually better, by comparing the generated assembly on Godbolt. This was applied to u32, u64 and usize and doesn't change the behavior, only the generated assembly.
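A hedged sketch of what that target-dependent selection looks like in practice (hypothetical free functions, not the actual midpoint_impl! invocations):

```rust
// On 64-bit targets, u32 can afford to widen to u64 (branch-free, fewer
// instructions); on narrower targets the no-overflow formula is used
// instead. Behavior is identical either way, only the codegen differs.
#[cfg(target_pointer_width = "64")]
fn midpoint_u32(a: u32, b: u32) -> u32 {
    ((a as u64 + b as u64) / 2) as u32
}

#[cfg(not(target_pointer_width = "64"))]
fn midpoint_u32(a: u32, b: u32) -> u32 {
    let (lo, hi) = if a <= b { (a, b) } else { (b, a) };
    lo + (hi - lo) / 2
}

fn main() {
    assert_eq!(midpoint_u32(u32::MAX, 1), 1 << 31);
}
```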

