Why is value taking setter member functions not recommended in Herb Sutter's CppCon 2014 talk (Back to Basics: Modern C++ Style)?

C++C++11Stl

C++ Problem Overview


In Herb Sutter's CppCon 2014 talk Back to Basics: Modern C++ Style he refers on slide 28 (a web copy of the slides are here) to this pattern:

class employee {
  std::string name_;
public:
  void set_name(std::string name) noexcept { name_ = std::move(name); }
};

He says that this is problematic because when calling set_name() with a temporary, noexcept-ness isn't strong (he uses the phrase "noexcept-ish").

Now, I have been using the above pattern pretty heavily in my own recent C++ code mainly because it saves me typing two copies of set_name() every time - yes, I know that can be a bit inefficient by forcing a copy construction every time, but hey I am a lazy typer. However Herb's phrase "This noexcept is problematic" worries me as I don't get the problem here: std::string's move assignment operator is noexcept, as is its destructor, so set_name() above seems to me to be guaranteed noexcept. I do see a potential exception throw by the compiler before set_name() as it prepares the parameter, but I am struggling to see that as problematic.

Later on on slide 32 Herb clearly states the above is an anti-pattern. Can someone explain to me why lest I have been writing bad code by being lazy?

C++ Solutions


Solution 1 - C++

Others have covered the noexcept reasoning above.

Herb spent much more time in the talk on the efficiency aspects. The problem isn't with allocations, its with unnecessary deallocations. When you copy one std::string into another the copy routine will reuse the allocated storage of the destination string if there's enough space to hold the data being copied. When doing a move assignment the destination string's existing storage must be deallocated as it takes over the storage from the source string. The "copy and move" idiom forces the deallocation to always occur, even when you don't pass a temporary. This is the source of the horrible performance that is demonstrated later in the talk. His advice was to instead take a const ref and if you determine that you need it have an overload for r-value references. This will give you the best of both worlds: copy into existing storage for non-temporaries avoiding the deallocation and move for temporaries where you're going to pay for a deallocation one way or the other (either the destination deallocates prior to the move or the source deallocates after the copy).

The above doesn't apply to constructors since there's no storage in the member variable to deallocate. This is nice since constructors often take more than one argument and if you need to do const ref/r-value ref overloads for each argument you end up with a combinatorial explosion of constructor overloads.

The question now becomes: how many classes are there that reuse storage like std::string when copying? I'm guessing that std::vector does, but outside of that I'm not sure. I do know that I've never written a class that reuses storage like this, but I have written a lot of classes that contain strings and vectors. Following Herb's advice won't hurt you for classes that don't reuse storage, you'll be copying at first with the copying version of the sink function and if you determine that the copying is too much of a performance hit you'll then make an r-value reference overload to avoid the copy (just as you would for std::string). On the other hand, using "copy-and-move" does have a demonstrated performance hit for std::string and other types that reuse storage, and those types probably see a lot of use in most peoples code. I'm following Herb's advice for now, but need to think through some of this a bit more before I consider the issue totally settled (there's probably a blog post that I don't have time to write lurking in all this).

Solution 2 - C++

There were two reasons considered for why passing by value might be better than passing by const reference.

  1. more efficient
  2. noexcept

In the case of setters for members of type std::string, he debunked the claim that passing by value was more efficient by showing that passing by const reference typically produced fewer allocations (at least for std::string).

He also debunked the claim that it allows the setter be noexcept by showing that the noexcept declaration is misleading, since an exception can still occur in the process of copying the parameter.

He thus concluded that passing by const reference was to be preferred over passing by value, at least in this case. However, he did mention that passing by value was a potentially good approach for constructors.

I do think that the example for std::string alone is not sufficient to generalize to all types, but it does call into question the practice of passing expensive-to-copy but cheap-to-move parameters by value, at least for efficiency and exception reasons.

Solution 3 - C++

Herb has a point, in that taking by-value when you already have storage allocated within can be inefficient and cause a needless allocation. But taking by const& is almost as bad, as if you take a raw C string and pass it to the function, a needless allocation occurs.

What you should take is the abstraction of reading from a string, not a string itself, because that is what you need.

Now, you can do this as a template:

class employee {
  std::string name_;
public:
  template<class T>
  void set_name(T&& name) noexcept { name_ = std::forward<T>(name); }
};

which is reasonably efficient. Then add some SFINAE maybe:

class employee {
  std::string name_;
public:
  template<class T>
  std::enable_if_t<std::is_convertible<T,std::string>::value>
  set_name(T&& name) noexcept { name_ = std::forward<T>(name); }
};

so we get errors at the interface and not the implementation.

This isn't always practical, as it requires exposing the implementation publically.

This is where a string_view type class can come in:

template<class C>
struct string_view {
  // could be private:
  C const* b=nullptr;
  C const* e=nullptr;

  // key component:
  C const* begin() const { return b; }
  C const* end() const { return e; }

  // extra bonus utility:
  C const& front() const { return *b; }
  C const& back() const { return *std::prev(e); }

  std::size_t size() const { return e-b; }
  bool empty() const { return b==e; }

  C const& operator[](std::size_t i){return b[i];}

  // these just work:
  string_view() = default;
  string_view(string_view const&)=default;
  string_view&operator=(string_view const&)=default;

  // myriad of constructors:
  string_view(C const* s, C const* f):b(s),e(f) {}

  // known continuous memory containers:
  template<std::size_t N>
  string_view(const C(&arr)[N]):string_view(arr, arr+N){}
  template<std::size_t N>
  string_view(std::array<C, N> const& arr):string_view(arr.data(), arr.data()+N){}
  template<std::size_t N>
  string_view(std::array<C const, N> const& arr):string_view(arr.data(), arr.data()+N){}
  template<class... Ts>
  string_view(std::basic_string<C, Ts...> const& str):string_view(str.data(), str.data()+str.size()){}
  template<class... Ts>
  string_view(std::vector<C, Ts...> const& vec):string_view(vec.data(), vec.data()+vec.size()){}
  string_view(C const* str):string_view(str, str+len(str)) {}
private:
  // helper method:
  static std::size_t len(C const* str) {
    std::size_t r = 0;
    if (!str) return r;
    while (*str++) {
      ++r;
    }
    return r;
  }
};

such an object can be constructed directly from a std::string or a "raw C string" and nearly costlessly store what you need to know in order to produce a new std::string from it.

class employee {
  std::string name_;
public:
  void set_name(string_view<char> name) noexcept { name_.assign(name.begin(),name.end()); }
};

and as now our set_name has a fixed interface (not a perfect forward one), it can have its implementation not visible.

The only inefficiency is that if you pass in a C-style string pointer, you somewhat needlessly go over its size twice (first time looking for the '\0', second time copying them). On the other hand, this gives your target information about how big it has to be, so it can pre-allocate rather than re-allocate.

Solution 4 - C++

You have two ways of calling that methods.

  • With rvalue parameter, as long as the move constructor of the parameter type is noexcept there is no problem (in case of std::string most probably is noexcept), in any case would be better use a conditional noexcept (to be sure the parameters is noexcept)
  • With lvalue parameter, in this case the copy constructor of the parameter type would be called and almost sure it will need some allocation (that could throw).

In cases like this that the use could be miss used, it's better to avoid. The client of the class assume that no exception is throw as specified, but in valid, compilable, not suspicious C++11 could throw.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionNiall DouglasView Question on Stackoverflow
Solution 1 - C++Brett HallView Answer on Stackoverflow
Solution 2 - C++Vaughn CatoView Answer on Stackoverflow
Solution 3 - C++Yakk - Adam NevraumontView Answer on Stackoverflow
Solution 4 - C++NetVipeCView Answer on Stackoverflow