How true is "Want Speed? Pass by value"

C++, C++11

C++ Problem Overview


Correct me if I'm wrong. Say I have:

#include <string>
#include <utility>

struct X
{
    std::string mem_name;

    X(std::string name)
        : mem_name(std::move(name))
    {}
    // ...
};

struct Y
{
    std::string mem_name;

    Y(const std::string &name)
        : mem_name(name)
    {}
    // ...
};

In X's ctor, name is obviously a copy of whatever argument got passed to X. X then invokes the move ctor of std::string to initialize mem_name, right?

Let's call that a copy-then-move on X: two operations, COPY and MOVE.

In Y's ctor, name is a const ref, which means there's no actual copy of the element because we're dealing directly with the argument passed from wherever Y's object needs to be created. But then we copied name to initialise mem_name in Y; one operation: COPY. Surely it should therefore be a lot faster (and preferable to me)?

In Scott Meyers's GN13 talk (around 8:10 and 8:56), he talks about "Want speed? Pass by value", and I was wondering whether there is any performance difference or loss between passing arguments (strings, to be precise) by reference and passing them by value "in order to gain speed".

I'm aware of the fact that passing arguments by value can be expensive, especially when dealing with large data.

Maybe (clearly?) there's something I'm missing from his talk?

C++ Solutions


Solution 1 - C++

The idea of "Want speed? Pass by value"(1) is that sometimes the copy can be elided. Taking your classes X and Y, consider this use case:

#include <string>
#include <utility>

// Simulating a complex operation returning a temporary:
std::string foo() { return "a" + std::string("b"); }


struct X
{
  std::string mem_name;
  X(std::string name): mem_name(std::move(name)) {}
};

struct Y
{
  std::string mem_name;
  Y(const std::string &name): mem_name(name) {}
};


int main()
{
  // Named objects: a bare `X(foo());` would be parsed as a declaration of foo.
  X x(foo());
  Y y(foo());
}

Now let's analyse both the construction cases.

X first. foo() returns a temporary, which is used to initialise the object name. That object is then moved into mem_name. Notice that the compiler can apply Return Value Optimisation and construct the return value of foo() (actually even the return value of operator+) directly in the space of name. So no copying actually happens, only a move.

Now let's analyse Y. foo() returns a temporary again, which is bound to the reference name. Now there's no "externally supplied" space for the return value, so it has to be constructed in its own space and bound to the reference. It is then copied into mem_name. So we are doing a copy, no way around it.

In short, the outcome is:

  • If an lvalue is being passed in, both X and Y will perform a copy (X when initialising name, Y when initialising mem_name). In addition, X will perform a move (when initialising mem_name).

  • If an rvalue is being passed in, X will potentially only perform a move, while Y has to perform a copy.
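The two cases above can be checked with a small instrumented sketch. `Tracked`, `make`, `measure`, `Counts` and the `*_from_*` helpers are hypothetical names introduced only for this illustration; `Tracked` stands in for std::string so that copies and moves can be counted. The rvalue counts assume the compiler elides the temporary, which is guaranteed from C++17 and is what mainstream compilers do by default in earlier standards.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical instrumentation type (not part of the original post): it wraps
// a std::string and counts its own copy and move constructions.
struct Tracked
{
  static int copies;
  static int moves;
  std::string value;

  explicit Tracked(std::string v) : value(std::move(v)) {}
  Tracked(const Tracked &other) : value(other.value) { ++copies; }
  Tracked(Tracked &&other) noexcept : value(std::move(other.value)) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Same shape as X and Y above, with Tracked in place of std::string.
struct X
{
  Tracked mem_name;
  X(Tracked name) : mem_name(std::move(name)) {}
};

struct Y
{
  Tracked mem_name;
  Y(const Tracked &name) : mem_name(name) {}
};

// Stand-in for a function that returns a temporary.
Tracked make() { return Tracked("ab"); }

struct Counts { int copies; int moves; };

// Resets the counters, runs one construction, and reports what it cost.
template <typename F>
Counts measure(F construct)
{
  Tracked::copies = 0;
  Tracked::moves = 0;
  construct();
  return Counts{Tracked::copies, Tracked::moves};
}

Counts x_from_lvalue() { Tracked s("ab"); return measure([&] { X x(s); }); }
Counts y_from_lvalue() { Tracked s("ab"); return measure([&] { Y y(s); }); }
Counts x_from_rvalue() { return measure([] { X x(make()); }); }
Counts y_from_rvalue() { return measure([] { Y y(make()); }); }

// Checks the counts described in the two bullet points above.
void check_expected_counts()
{
  Counts c = x_from_lvalue();
  assert(c.copies == 1 && c.moves == 1);  // copy into name, move into mem_name
  c = y_from_lvalue();
  assert(c.copies == 1 && c.moves == 0);  // copy into mem_name
  c = x_from_rvalue();
  assert(c.copies == 0 && c.moves == 1);  // temporary elided into name, then moved
  c = y_from_rvalue();
  assert(c.copies == 1 && c.moves == 0);  // the copy cannot be avoided
}
```

Calling `check_expected_counts()` from `main` should pass silently under those assumptions. With elision disabled (for example GCC's `-fno-elide-constructors` before C++17), the rvalue case for X would show extra moves instead.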

Generally, a move is expected to be an operation whose time requirements are comparable to those of passing a pointer (which is what passing by reference does). So in effect, X is no worse than Y for lvalues, and better for rvalues.

Of course, it's not an absolute rule, and must be taken with a grain of salt. When in doubt, profile.


(1) The link is prone to being temporarily unavailable, and as of 11-12-2014, it seems broken (404). A copy of the contents (albeit with weird formatting) seems to be available on several blog sites. Alternatively, the original content might be accessible through the Wayback Machine.

Also note that the topic has stirred up quite a discussion in general. Googling the paper title brings up a lot of follow-ups and counterpoints. One example is "Want speed? Don't (always) pass by value" by SO member juanchopanza.

Solution 2 - C++

There are no general rules for optimization. Pass-by-value can give some big wins in C++11 thanks to move semantics, alongside copy elision.

If you really want speed, profile your code.
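As a starting point for such a measurement, here is a minimal timing sketch, not a rigorous benchmark. `make_name`, `kNameLength`, `Timing` and `time_rvalue_construction` are hypothetical names introduced for this illustration; X and Y mirror the classes from the question. The string length is chosen on the assumption that it exceeds the small-string buffer of common standard library implementations, so a copy has to allocate.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <utility>

struct X
{
  std::string mem_name;
  X(std::string name) : mem_name(std::move(name)) {}
};

struct Y
{
  std::string mem_name;
  Y(const std::string &name) : mem_name(name) {}
};

// Assumed to be longer than the small-string buffer, so copies allocate.
const std::size_t kNameLength = 256;

// Hypothetical helper that returns a temporary string.
std::string make_name() { return std::string(kNameLength, 'x'); }

struct Timing
{
  long long microseconds;  // elapsed wall-clock time for the whole loop
  std::size_t checksum;    // total characters stored, so the work is observable
};

// Constructs `count` objects of type T from temporaries and times the loop.
template <typename T>
Timing time_rvalue_construction(std::size_t count)
{
  std::size_t checksum = 0;
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < count; ++i)
  {
    T object(make_name());
    checksum += object.mem_name.size();
  }
  const auto stop = std::chrono::steady_clock::now();
  assert(checksum == count * kNameLength);
  const auto elapsed =
      std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
  return Timing{static_cast<long long>(elapsed.count()), checksum};
}
```

Comparing `time_rvalue_construction<X>(n).microseconds` with `time_rvalue_construction<Y>(n).microseconds` for a large `n` gives a rough first impression only. The numbers depend heavily on compiler, optimisation level and allocator, so a real profiler or benchmarking framework is the better tool for decisions that matter.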

Solution 3 - C++

If you really don't mind exposing non-references in your API (which SHOULD BE a sign that internally you will be copying/assigning the given object), then taking parameters by value is OK.

Copy elision is faster than moving, and if the copy can't be elided (for various reasons, such as too long a chain of dependent function calls), then C++ falls back to move semantics.
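A small sketch of such a call chain, under stated assumptions: `Counted`, `Inner`, `Outer`, `Counts` and the `chain_from_*` helpers are hypothetical names used only for illustration. Each layer takes its parameter by value and hands it on with std::move, so every hop that cannot be elided costs a move rather than a copy. The rvalue count assumes the temporary is elided into the outermost parameter (guaranteed from C++17).

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type that counts its own copy and move constructions.
struct Counted
{
  static int copies;
  static int moves;
  std::string value;

  explicit Counted(std::string v) : value(std::move(v)) {}
  Counted(const Counted &other) : value(other.value) { ++copies; }
  Counted(Counted &&other) noexcept : value(std::move(other.value)) { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// Two layers that each take the argument by value and pass it on.
struct Inner
{
  Counted name;
  Inner(Counted n) : name(std::move(n)) {}
};

struct Outer
{
  Inner inner;
  Outer(Counted n) : inner(std::move(n)) {}
};

struct Counts { int copies; int moves; };

// A temporary travels through both layers.
Counts chain_from_rvalue()
{
  Counted::copies = 0;
  Counted::moves = 0;
  Outer o(Counted("ab"));
  return Counts{Counted::copies, Counted::moves};
}

// A named object travels through both layers.
Counts chain_from_lvalue()
{
  Counted s("ab");
  Counted::copies = 0;
  Counted::moves = 0;
  Outer o(s);
  return Counts{Counted::copies, Counted::moves};
}

// Checks the counts this sketch is meant to demonstrate.
void check_chain_counts()
{
  Counts r = chain_from_rvalue();
  assert(r.copies == 0 && r.moves == 2);  // one move per hop, no copy
  Counts l = chain_from_lvalue();
  assert(l.copies == 1 && l.moves == 2);  // one copy at the API boundary
}
```

In this sketch the only copy happens where the caller passes a named object at the outermost boundary. Everything past that point is a move, which matches the claim above as long as each layer remembers to use std::move.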

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | iamOgunyinka | View Question on Stackoverflow
Solution 1 - C++ | Angew is no longer proud of SO | View Answer on Stackoverflow
Solution 2 - C++ | cdmh | View Answer on Stackoverflow
Solution 3 - C++ | Red XIII | View Answer on Stackoverflow