Should I return std::strings?

C++ Problem Overview

I'm trying to use std::string instead of char* whenever possible, but I worry I may be degrading performance too much. Is this a good way of returning strings (no error checking for brevity)?

std::string linux_settings_provider::get_home_folder() {
    return std::string(getenv("HOME"));
}

Also, a related question: when accepting strings as parameters, should I receive them as const std::string& or const char*?

Thanks.

C++ Solutions

Solution 1 - C++

Return the string.

I think the better abstraction is worth it. Until you can measure a meaningful performance difference, I'd argue that it's a micro-optimization that only exists in your imagination.

It took many years to get a good string abstraction into C++. I don't believe that Bjarne Stroustrup, so famous for his conservative "only pay for what you use" dictum, would have permitted an obvious performance killer into the language. Higher abstraction is good.

Solution 2 - C++

Return the string, like everyone says.

> when accepting strings as parameters, should I receive them as const std::string& or const char*?

I'd say take any const parameters by reference, unless either they're lightweight enough to take by value, or in those rare cases where you need a null pointer to be a valid input meaning "none of the above". This policy isn't specific to strings.

Non-const reference parameters are debatable, because from the calling code (without a good IDE), you can't immediately see whether they're passed by value or by reference, and the difference is important. So the code may be unclear. For const params, that doesn't apply. People reading the calling code can usually just assume that it's not their problem, so they'll only occasionally need to check the signature.

In the case where you're going to take a copy of the argument in the function, your general policy should be to take the argument by value. Then you already have a copy you can use, and if you would have copied it into some specific location (like a data member) then you can move it (in C++11) or swap it (in C++03) to get it there. This gives the compiler the best opportunity to optimize cases where the caller passes a temporary object.

For string in particular, this covers the case where your function takes a std::string by value, and the caller specifies as the argument expression a string literal or a char* pointing to a nul-terminated string. If you took a const std::string& and copied it in the function, that would result in the construction of two strings.
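
As a rough sketch of the "take by value, then move" policy described above (assuming C++11 or later; user_profile and set_name are hypothetical names, not from the question), a setter that stores its argument could look like this:

```cpp
#include <string>
#include <utility>

// Hypothetical class, used only to illustrate the "take by value, then move" policy.
class user_profile {
public:
    // The parameter is a copy that the function owns, so it can be moved into
    // the member. A string literal argument constructs one std::string, which
    // is then moved rather than copied.
    void set_name(std::string name) { name_ = std::move(name); }

    // Read-only access: returning a const reference avoids a copy.
    const std::string& name() const { return name_; }

private:
    std::string name_;
};
```

Calling p.set_name("alice") builds a single std::string from the literal and moves it into place, while p.set_name(s) with an existing std::string s copies it once and leaves s unchanged.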

Solution 3 - C++

The cost of copying strings by value varies based on the standard library implementation you're working with:

std::string under MSVC uses the short string optimisation, so that short strings (< 16 characters iirc) don't require any memory allocation (they're stored within the std::string itself), while longer ones require a heap allocation every time the string is copied.
std::string under GCC (libstdc++ before the ABI change in GCC 5; newer versions also use the short string optimisation) uses a reference-counted implementation: when constructing a std::string from a char*, a heap allocation is done every time, but when passing by value to a function, a reference count is simply incremented, avoiding the memory allocation.
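
If you are curious what your own standard library does, one way to probe it is to check whether a string's buffer sits inside the std::string object. This is only a diagnostic sketch: stored_inline is a hypothetical helper, and its result for short strings depends entirely on the implementation.

```cpp
#include <functional>
#include <string>

// Returns true when the string's character buffer lies inside the std::string
// object itself, which is what a short string optimisation would produce.
// The answer for short strings depends on the standard library in use.
bool stored_inline(const std::string& s) {
    const char* object_begin = reinterpret_cast<const char*>(&s);
    const char* object_end = object_begin + sizeof(std::string);
    std::less<const char*> before;  // gives a total order, even for unrelated pointers
    return !before(s.data(), object_begin) && before(s.data(), object_end);
}
```

A string much longer than sizeof(std::string) can never be stored inline, so stored_inline(std::string(1000, 'x')) is false everywhere; what it reports for a string like "hi" is the implementation-specific part.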

In general, you're better off just forgetting about the above and returning std::strings by value, unless you're doing it thousands of times a second.

re: parameter passing, keep in mind that there's a cost in going from char* to std::string, but not in going from std::string to char*. In general, this means you're better off accepting a const reference to a std::string. However, the best justification for accepting a const std::string& as an argument is that the callee then doesn't need extra code to check for null.
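
To make the null-check point concrete, here is a small sketch comparing the two parameter styles. The function names are hypothetical, and treating a null pointer as an empty name is an assumption made for the example, not something the question specifies.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical functions for comparing the two parameter styles.

// A reference always refers to a valid string, so no null check is needed.
std::size_t name_length(const std::string& name) {
    return name.size();
}

// A pointer may be null, so the callee has to decide what that means.
std::size_t name_length_cstr(const char* name) {
    if (name == nullptr) {
        return 0;  // assumption: treat a null pointer as an empty name
    }
    return std::strlen(name);
}
```

Passing a std::string to name_length_cstr only needs s.c_str(), which is free, whereas passing a char* to name_length constructs a temporary std::string first.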

Solution 4 - C++

Seems like a good idea.

If this is not part of real-time software (like a game) but a regular application, you should be more than fine.

Remember, "Premature optimization is the root of all evil"

Solution 5 - C++

It's human nature to worry about performance, especially when the programming language supports low-level optimization. What we shouldn't forget as programmers, though, is that program performance is just one thing among many that we can optimize and admire. In addition to program speed, we can find beauty in our own performance: we can minimize our effort while trying to achieve maximum visual output and user-interface interactivity. Don't you think that could be more motivating in the long run than worrying about bits and cycles? So yes, return strings. They minimize your code size and your effort, and make the amount of work you put in less depressing.

Solution 6 - C++

In your case Return Value Optimization will take place so std::string will not be copied.
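
One way to check this on your own compiler is to return a type that counts its copies. The sketch below assumes C++11 or later; tracked_string and make_folder are hypothetical names, and the path is a placeholder rather than a real getenv result.

```cpp
#include <string>
#include <utility>

// Hypothetical type that counts how often it is copied or moved.
struct tracked_string {
    static int copies;
    static int moves;
    std::string value;

    explicit tracked_string(const char* text) : value(text) {}
    tracked_string(const tracked_string& other) : value(other.value) { ++copies; }
    tracked_string(tracked_string&& other) noexcept
        : value(std::move(other.value)) { ++moves; }
};

int tracked_string::copies = 0;
int tracked_string::moves = 0;

// Returns a temporary, the same shape as get_home_folder() in the question.
tracked_string make_folder() {
    return tracked_string("/home/example");  // placeholder path
}
```

After tracked_string folder = make_folder(); the copy counter stays at zero. From C++17 onwards the elision is required for this form of return, so the move counter stays at zero too; before that it is an optimization the compiler is allowed, but not obliged, to perform.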

Solution 7 - C++

Beware when you cross module boundaries.

Then it's best to return primitive types since C++ types are not necessarily binary compatible across even different versions of the same compiler.
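
A common way to do that is to keep std::string inside the module and expose a C-style function that fills a caller-supplied buffer. The following is only a sketch: settings_get_home_folder and its "return the full length" convention are assumptions for illustration, not an existing API.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Internal helper: std::string stays on this side of the module boundary.
static std::string home_folder_internal() {
    const char* home = std::getenv("HOME");
    return home ? std::string(home) : std::string();  // empty if HOME is unset
}

// Hypothetical exported function with a C interface. It copies at most
// buffer_size - 1 characters plus a terminator into the caller's buffer and
// returns the full length, so the caller can detect truncation and retry.
extern "C" std::size_t settings_get_home_folder(char* buffer, std::size_t buffer_size) {
    const std::string home = home_folder_internal();
    if (buffer != nullptr && buffer_size > 0) {
        const std::size_t count =
            home.size() < buffer_size - 1 ? home.size() : buffer_size - 1;
        std::memcpy(buffer, home.data(), count);
        buffer[count] = '\0';
    }
    return home.size();
}
```

The caller can first pass a null buffer to learn the required length, then allocate length + 1 bytes and call again. Only char* and size_t cross the boundary, so the two sides don't need to agree on the layout of std::string.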

Solution 8 - C++

I agree with the other posters that you should use string.

But know that, depending on how aggressively your compiler optimizes temporaries, you will probably have some extra overhead (over using a dynamic array of chars). (Note: The good news is that in C++0x, the judicious use of rvalue references will not require compiler optimizations to buy efficiency here, and programmers will be able to make some additional performance guarantees about their code without relying on the quality of the compiler.)

In your situation, is the extra overhead worth introducing manual memory management? Most reasonable programmers would say no. If your application does end up having performance issues, the next step would be to profile it, so that if you do introduce complexity, you only do it once you have good evidence that it is needed to improve overall efficiency.

Someone mentioned that Return Value optimization (RVO) is irrelevant here - I disagree.

The standard text (C++03) on this reads (12.2):

[Begin Standard Quote]

> Temporaries of class type are created in various contexts: binding an rvalue to a reference (8.5.3), returning an rvalue (6.6.3), a conversion that creates an rvalue (4.1, 5.2.9, 5.2.11, 5.4), throwing an exception (15.1), entering a handler (15.3), and in some initializations (8.5). [Note: the lifetime of exception objects is described in 15.1. ] Even when the creation of the temporary object is avoided (12.8), all the semantic restrictions must be respected as if the temporary object was created. [Example: even if the copy constructor is not called, all the semantic restrictions, such as accessibility (clause 11), shall be satisfied. ]

[Example:

struct X {
    X(int);
    X(const X&);
    ~X();
};
X f(X);
void g()
{
    X a(1);
    X b = f(X(2));
    a = f(a);
}

> Here, an implementation might use a temporary in which to construct X(2) before passing it to f() using X’s copy-constructor; alternatively, X(2) might be constructed in the space used to hold the argument. Also, a temporary might be used to hold the result of f(X(2)) before copying it to b using X’s copy-constructor; alternatively, f()’s result might be constructed in b. On the other hand, the expression a=f(a) requires a temporary for either the argument a or the result of f(a) to avoid undesired aliasing of a. ]

[End Standard Quote]

Essentially, the text above says that you can possibly rely on RVO in initialization situations, but not in assignment situations. The reason is, when you are initializing an object, there is no way that what you are initializing it with could ever be aliased to the object itself (which is why you never do a self check in a copy constructor), but when you do an assignment, it could.
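
The difference between the two situations can be observed with a type that counts its own operations. This is a sketch with hypothetical names (counted, make); the number of constructions it reports depends on the compiler and language version, but the assignment count does not.

```cpp
// Hypothetical type that counts constructions and assignments.
struct counted {
    static int constructions;
    static int assignments;
    int value;

    explicit counted(int v) : value(v) { ++constructions; }
    counted(const counted& other) : value(other.value) { ++constructions; }
    counted& operator=(const counted& other) {
        value = other.value;
        ++assignments;
        return *this;
    }
};

int counted::constructions = 0;
int counted::assignments = 0;

// Returns a temporary by value.
counted make(int v) { return counted(v); }
```

With counted b = make(2); the result can be constructed directly in b and no assignment takes place. With a = make(3); on an existing object a, the result has to exist as a separate object first and is then handed to operator=, so the assignment counter goes up by one.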

There is nothing about your code that inherently prohibits RVO, but read your compiler documentation to ensure that you can truly rely on it, if you do indeed need it.

Solution 9 - C++

I agree with duffymo. You should make an understandable working application first and then, if there is a need, attack optimization. It is at this point that you will have an idea where the major bottlenecks are and will be able to more efficiently manage your time in making a faster app.

Solution 10 - C++

I agree with @duffymo. Don't optimize until you have measured; this holds doubly true for micro-optimizations. And always measure before and after you've optimized, to see whether you actually changed things for the better.

Solution 11 - C++

Return the string. It's not that big of a loss in terms of performance, and it will surely make your job easier afterward.

Plus, you could always inline the function, but most optimizers will take care of that anyway.

Solution 12 - C++

If you pass the string by reference and work on it directly, you don't need to return anything. ;)

Content Type	Original Author	Original Content on Stackoverflow
Question	Pedro d'Aquino	View Question on Stackoverflow
Solution 1 - C++	duffymo	View Answer on Stackoverflow
Solution 2 - C++	Steve Jessop	View Answer on Stackoverflow
Solution 3 - C++	jskinner	View Answer on Stackoverflow
Solution 4 - C++	kostia	View Answer on Stackoverflow
Solution 5 - C++	AareP	View Answer on Stackoverflow
Solution 6 - C++	Kirill V. Lyadvinsky	View Answer on Stackoverflow
Solution 7 - C++	Hans Malherbe	View Answer on Stackoverflow
Solution 8 - C++	Faisal Vali	View Answer on Stackoverflow
Solution 9 - C++	Brian	View Answer on Stackoverflow
Solution 10 - C++	JesperE	View Answer on Stackoverflow
Solution 11 - C++	Gab Royer	View Answer on Stackoverflow
Solution 12 - C++	Partial	View Answer on Stackoverflow