The new C++17 standard brings many exciting new features. A smaller, more subtle improvement it brings is guaranteed copy elision. The keyword is guaranteed, as copy elision itself has always been part of the standard. Although it might not be a change as radical as, say, structured bindings, I’m very happy to see it made it into the standard.
Copy Elision
Before discussing what changed in the latest version of the standard, it might be useful to revisit the basics of copy elision as they are currently defined by the C++14 standard. If you already know what it is and how it works, feel free to skip this part.
Consider the little class below that I will use throughout the examples. It prints some information each time a constructor or destructor is invoked.
#include <iostream>

struct Foo {
  Foo() { std::cout << "Constructed" << std::endl; }
  Foo(const Foo &) { std::cout << "Copy-constructed" << std::endl; }
  Foo(Foo &&) { std::cout << "Move-constructed" << std::endl; }
  ~Foo() { std::cout << "Destructed" << std::endl; }
};
Conceptually there are three[^1] cases where the compiler is allowed to omit the copy/move construction of a class object, even if the constructor and destructor involved have side effects.
Return Value Optimization
The most common technique for copy elision is return value optimization. If you return an object by value, the compiler is allowed to elide the copy.
[…] in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function’s return value.
Named Return Value Optimization (NRVO)
Sometimes the distinction is made between regular and named RVO, though the optimization applies either way. The example below returns a named value, foo.
Foo f() {
  Foo foo;
  return foo;
}

int main() { Foo foo = f(); }
Return Value Optimization (RVO)
The regular case of RVO, as shown below, simply returns a temporary value.
Foo f() {
  return Foo();
}
With the option -fno-elide-constructors it is possible to tell the compiler not to perform copy elision. I’m using clang, but gcc accepts the same flag. The sample’s output is given below and is identical for both RVO and NRVO.
$ clang++ foo.cpp -std=c++11 -fno-elide-constructors && ./a.out
Constructed
Move-constructed
Destructed
Move-constructed
Destructed
Destructed
$ clang++ foo.cpp -std=c++11 && ./a.out
Constructed
Destructed
The optimization saves two calls to the move constructor. The first is the move of the local object foo into the temporary that holds the return value of f(). The second is the move of that temporary into the object foo in the main function.
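The quoted rule explicitly excludes function parameters, and that exception is easy to observe. The sketch below is my own illustration, not code from the standard or from the samples above: Tracker is a hypothetical stand-in for Foo that counts constructor calls instead of printing them. Assuming default compiler settings (so no -fno-elide-constructors), returning a by-value parameter costs one move that the optimization is not allowed to remove.

```cpp
#include <cassert>

// Hypothetical helper (not the article's Foo): counts copies and moves.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() = default;
    Tracker(const Tracker &) { ++copies; }
    Tracker(Tracker &&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// Returning a by-value parameter: the rule excludes it from elision,
// but overload resolution still treats it as an rvalue, so it is moved.
Tracker pass_through(Tracker t) { return t; }

// Number of move constructions caused by one call to pass_through().
int moves_when_returning_parameter() {
    Tracker::copies = 0;
    Tracker::moves = 0;
    Tracker result = pass_through(Tracker());
    (void)result;
    assert(Tracker::copies == 0);  // never falls back to the copy constructor
    return Tracker::moves;
}
```

Calling moves_when_returning_parameter() from main should report a single move: the temporary argument and the returned value are both elided, only the step from the parameter into the return value remains.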
Passing a Temporary by Value
A second common case is passing a temporary by value.
[…] when a temporary class object that has not been bound to a reference would be copied/moved to a class object with the same cv-unqualified type, the copy/move operation can be omitted by constructing the temporary object directly into the target of the omitted copy/move
void f(Foo f) { std::cout << "Fn" << std::endl; }

int main() {
  f(Foo());
}
$ clang++ foo.cpp -fno-elide-constructors -std=c++11 && ./a.out
Constructed
Move-constructed
Fn
Destructed
Destructed
$ clang++ foo.cpp -std=c++11 && ./a.out
Constructed
Fn
Destructed
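To put the temporary case next to its alternatives, here is a small sketch that compares three ways of calling a by-value function. Tracker, Counts, and the pass_* functions are hypothetical names introduced for this illustration, and the numbers assume that elision is not disabled with -fno-elide-constructors.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not the article's Foo): counts copies and moves.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() = default;
    Tracker(const Tracker &) { ++copies; }
    Tracker(Tracker &&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

struct Counts {
    int copies;
    int moves;
};

void reset() { Tracker::copies = 0; Tracker::moves = 0; }
Counts snapshot() { return Counts{Tracker::copies, Tracker::moves}; }

// The function under test takes its argument by value.
void consume(Tracker) {}

// A temporary argument is constructed directly in the parameter.
Counts pass_temporary() {
    reset();
    consume(Tracker());
    return snapshot();
}

// A named object has to be copied into the parameter.
Counts pass_lvalue() {
    reset();
    Tracker t;
    consume(t);
    return snapshot();
}

// std::move turns the copy into a move, but does not remove it.
Counts pass_moved() {
    reset();
    Tracker t;
    consume(std::move(t));
    return snapshot();
}

// Sanity checks for the three call styles; call this from main().
void run_checks() {
    assert(pass_temporary().copies == 0 && pass_temporary().moves == 0);
    assert(pass_lvalue().copies == 1 && pass_lvalue().moves == 0);
    assert(pass_moved().copies == 0 && pass_moved().moves == 1);
}
```

Only the temporary is free. Wrapping a named object in std::move is cheaper than copying it, but it still runs a constructor.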
Throwing and Catching Exceptions by Value
As of C++11, there are two more cases that involve exceptions. Copy elision is permitted for both throwing and catching exceptions by value.
[…] in a throw-expression, when the operand is the name of a non-volatile automatic object (other than a function or catch-clause parameter) whose scope does not extend beyond the end of the innermost enclosing try-block (if there is one), the copy/move operation from the operand to the exception object can be omitted by constructing the automatic object directly into the exception object.
[…] when the exception-declaration of an exception handler declares an object of the same type (except for cv-qualification) as the exception object, the copy/move operation can be omitted by treating the exception-declaration as an alias for the exception object if the meaning of the program will be unchanged except for the execution of constructors and destructors for the object declared by the exception-declaration.
void f() {
  Foo foo;
  throw foo;
}

int main() {
  try {
    f();
  } catch (Foo foo) {
    std::cout << "Catch" << std::endl;
  }
}
To my surprise, however, neither clang nor gcc actually performed copy elision for this case. So far I haven’t figured out why…
$ clang++ foo.cpp -std=c++11 && ./a.out
Constructed
Move-constructed
Destructed
Copy-constructed
Catch
Destructed
Destructed
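Since the elision in the handler is only permitted and not required, the usual way to avoid the copy is to catch by reference. The sketch below is my own illustration with a hypothetical Tracker type that counts copy constructions. It makes no claim about why the compilers skip the optimization, it only shows that a reference handler never needs the copy in the first place.

```cpp
#include <cassert>

// Hypothetical helper (not the article's Foo): counts copy constructions.
struct Tracker {
    static int copies;
    Tracker() = default;
    Tracker(const Tracker &) { ++copies; }
    Tracker(Tracker &&) noexcept {}
};
int Tracker::copies = 0;

void thrower() {
    Tracker t;
    throw t;  // a local object: moved (or elided) into the exception object
}

// Handler declares an object: it may be copy-constructed from the exception.
int copies_when_caught_by_value() {
    Tracker::copies = 0;
    try {
        thrower();
    } catch (Tracker caught) {
        (void)caught;
    }
    return Tracker::copies;
}

// Handler declares a reference: it refers to the exception object itself.
int copies_when_caught_by_reference() {
    Tracker::copies = 0;
    try {
        thrower();
    } catch (const Tracker &) {
        // No new Tracker is created here.
    }
    return Tracker::copies;
}
```

The by-value handler may report zero or one copy depending on the compiler, while the by-reference handler reports zero.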
Guaranteed Copy Elision
The proposal for guaranteed copy elision highlights a few of the issues with the current state of affairs. The main problem is that, without guaranteed elision, you cannot get rid of the copy and move constructors, because elision might not take place. This prevents non-movable types from having functions that return by value, such as factories.
The sample below does not compile, even though we’ve seen in the previous section that with copy elision we don’t need the copy or the move constructor.
struct Foo {
  Foo() { std::cout << "Constructed" << std::endl; }
  Foo(const Foo &) = delete;
  Foo(Foo &&) = delete;
  ~Foo() { std::cout << "Destructed" << std::endl; }
};

Foo f() {
  return Foo();
}

int main() {
  Foo foo = f();
}
In C++17, the sample compiles just fine[^2] and has the same output as if the move and copy constructors were there. How this is done is explained next.
Value Categories
The complete name of the proposal is “Guaranteed copy elision through simplified value categories”. To achieve guaranteed copy elision, the proposal suggests distinguishing between prvalue (pure rvalue) expressions and the temporary objects initialized by them. More specifically, a glvalue (generalized lvalue) is defined as the location of an object and a prvalue is defined as the initializer of the object.
If a prvalue is used as the initializer of an object with the same type, it initializes it directly. As a consequence, initializing the return value of a function with a temporary causes the value to be initialized directly, without a copy or a move. This means that the copy or move constructor of the object no longer needs to be accessible.
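One way to make "initializes it directly" tangible is to let an object remember the address at which its constructor ran. The sketch below is an illustration under assumptions: Pinned, make_pinned, and make_default_pinned are hypothetical names, and the code needs to be compiled as C++17 or later, since the factories would be rejected by earlier standards.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical non-movable type that records where its constructor ran.
struct Pinned {
    explicit Pinned(int v) : value(v), constructed_at(this) {}
    Pinned(const Pinned &) = delete;
    Pinned(Pinned &&) = delete;

    int value;
    const Pinned *constructed_at;
};

static_assert(!std::is_copy_constructible<Pinned>::value, "not copyable");
static_assert(!std::is_move_constructible<Pinned>::value, "not movable");

// Each return statement yields a prvalue, so no copy or move constructor
// is needed anywhere between the factory and the caller's variable.
Pinned make_pinned(int v) { return Pinned(v); }
Pinned make_default_pinned() { return make_pinned(42); }

// True if the object was constructed directly in the caller's variable.
bool constructed_in_place() {
    Pinned p = make_default_pinned();
    return p.value == 42 && p.constructed_at == &p;
}
```

Even though the value passes through two functions, the constructor runs exactly once, and it runs at the address of p.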
A second consequence is that nothing changes for NRVO in C++17 with guaranteed copy elision. This is because, as mentioned before, the change only involves prvalues. With NRVO, the named value is a glvalue. The authors of the paper acknowledge this but chose to leave it out of scope.
While we believe that reliable NRVO is an important feature to allow reasoning about performance, the cases where NRVO is possible are subtle and a simple guarantee is difficult to give.
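To give an idea of how subtle those cases can be, here is a sketch of two functions that look almost the same but behave differently. Tracker and both pick_* functions are hypothetical names of mine, and the exact number of moves in the first function is up to the compiler, so the code only relies on an upper bound there.

```cpp
#include <cassert>

// Hypothetical helper (not the article's Foo): counts copies and moves.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() = default;
    Tracker(const Tracker &) { ++copies; }
    Tracker(Tracker &&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

void reset() { Tracker::copies = 0; Tracker::moves = 0; }

// Two candidates for the return slot: each return names a local object,
// so the result is elided or moved, but never copied.
Tracker pick_with_branches(bool first) {
    Tracker a;
    Tracker b;
    if (first) {
        return a;
    }
    return b;
}

// The conditional expression is an lvalue, not the name of a local object,
// so neither elision nor the implicit move applies: the result is copied.
Tracker pick_with_conditional(bool first) {
    Tracker a;
    Tracker b;
    return first ? a : b;
}

// Sanity checks for both variants; call this from main().
void run_checks() {
    reset();
    Tracker x = pick_with_branches(true);
    (void)x;
    assert(Tracker::copies == 0 && Tracker::moves <= 1);

    reset();
    Tracker y = pick_with_conditional(true);
    (void)y;
    assert(Tracker::copies == 1 && Tracker::moves == 0);
}
```

A guarantee would have to spell out which of these shapes qualify, which is presumably part of what makes a simple rule hard to state.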
Addendum: Translation Units
In the comments on this post on reddit, someone asked whether copy elision is limited to the same translation unit. User flitterio then showed with an example that copy elision happens regardless of translation unit boundaries. The return value is constructed directly in the caller’s stack memory, even when the calling function is not present in the translation unit.
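The reason this can work across translation units is that, on common ABIs such as the Itanium C++ ABI, a class with a non-trivial copy constructor or destructor is returned through a hidden pointer to memory provided by the caller. The sketch below emulates that convention by hand. It is an assumption-laden illustration, not what any particular compiler emits: f_into and value_via_return_slot are hypothetical names, and this Foo carries a value instead of printing.

```cpp
#include <cassert>
#include <new>

// Simplified stand-in for the article's Foo, with a non-trivial
// copy constructor and destructor.
struct Foo {
    Foo() : value(7) {}
    Foo(const Foo &other) : value(other.value) {}
    ~Foo() {}
    int value;
};

// Hand-written stand-in for `Foo f()`: the caller supplies the storage
// and the callee constructs the result into it.
Foo *f_into(void *return_slot) { return new (return_slot) Foo(); }

// The caller reserves stack memory, passes its address, and later
// destroys the object that was constructed there.
int value_via_return_slot() {
    alignas(Foo) unsigned char slot[sizeof(Foo)];
    Foo *foo = f_into(slot);
    int value = foo->value;
    foo->~Foo();
    return value;
}
```

Because the callee only needs the address of the slot, it does not matter whether the caller lives in the same translation unit.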
Addendum: Copy-List-Initialization
Evgeny Panasyuk pointed out that it’s actually possible to return non-movable values from functions in C++11. Provided that the object has a non-explicit constructor, copy-list-initialization guarantees that no copy or move takes place.
Foo f() {
  return {};
}

int main() {
  auto &&foo = f();
}
This works completely independently of, and has nothing to do with, copy elision.
$ clang++ foo.cpp -fno-elide-constructors -std=c++11 && ./a.out
Constructed
Destructed
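The same trick also works with constructor arguments, as long as the selected constructor is not explicit. Here is a minimal sketch, where NonMovable, make_non_movable, and sum_from_factory are hypothetical names of mine. It should compile as C++11.

```cpp
#include <cassert>

// Hypothetical non-movable type with a non-explicit constructor.
struct NonMovable {
    NonMovable(int a, int b) : sum(a + b) {}  // deliberately not explicit
    NonMovable(const NonMovable &) = delete;
    NonMovable(NonMovable &&) = delete;
    int sum;
};

// The braced list initializes the return object directly, so neither
// the copy nor the move constructor is needed.
NonMovable make_non_movable() { return {1, 2}; }

int sum_from_factory() {
    // Binding to a reference keeps the returned temporary alive
    // until the end of this scope.
    auto &&object = make_non_movable();
    return object.sum;
}
```

Marking the constructor explicit makes the return statement ill-formed, because copy-list-initialization is not allowed to select an explicit constructor.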