Moving via relocate-and-construct
source link: https://quuxplusone.github.io/blog/2023/12/06/relocate-and-construct/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Moving via relocate-and-construct
A slow-moving StackOverflow discussion thread
alerted me to this interesting application of P1144 std::relocate
(Godbolt):
On libc++ and MSVC, std::array<std::string, 1000>
is a trivially relocatable type;
but it’s not trivially move-constructible, because string
’s move-constructor has to null
out the source object. Moving a single string is tantamount to a memcpy
plus a memset
(to null out the source). So, move-constructing an array<string, 1000>
is tantamount
to a 1000-times-larger memcpy
plus a 1000-times-larger memset
. But no mainstream compiler
is smart enough to fission the loop body in that way.
P1144 std::relocate
gives us a name for array<string>
’s operation that is tantamount to memcpy
; and we already have a name
(“default constructor”) for array<string>
’s operation that is tantamount to memset
.
So — here’s the cool part — we can help our insufficiently-smart compiler see the right way
to factor this code by writing the move-construction as a relocate
plus a placement-new
.
using A = std::array<std::string, 1000>;
A *move_to_heap(A& a) {
// Compiler fails to fission the loop
return new A(std::move(a));
}
A *move_to_heap(A& a) {
// Compiler generates memcpy + memset
auto *p = new A(std::relocate(&a));
::new (&a) A;
return p;
}
Of course this trick isn’t general-purpose. It does work for non-trivially relocatable types
(for which std::relocate
rightly won’t generate a memcpy
); but it doesn’t compile for types that aren’t default-constructible,
nor does it work for types whose default constructor has unwanted side effects, or whose moved-from state is distinguishable
from their default-constructed state. (As usual, std::pmr::string
is an example.) Therefore
I wouldn’t expect to see this trick in generic code.
But this is a neat application of the P1144 high-level primitives (is that an oxymoron?)
to help the compiler see how to generate optimal code, without throwing up our hands and descending
into undefined-behavior-land. I can’t think of a “standard-compliant” way to get the memcpy
+memset
codegen in today’s C++ — but with P1144’s vocabulary, it’s straightforward.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK