stringstream, string, and char* conversion confusion

C++StringMemoryStringstream

C++ Problem Overview


My question can be boiled down to, where does the string returned from stringstream.str().c_str() live in memory, and why can't it be assigned to a const char*?

This code example will explain it better than I can

#include <string>
#include <sstream>
#include <iostream>

using namespace std;

int main()
{
	stringstream ss("this is a string\n");

	string str(ss.str());

	const char* cstr1 = str.c_str();

	const char* cstr2 = ss.str().c_str();

	cout << cstr1	// Prints correctly
		<< cstr2;	// ERROR, prints out garbage

	system("PAUSE");
	
	return 0;
}

The assumption that stringstream.str().c_str() could be assigned to a const char* led to a bug that took me a while to track down.

For bonus points, can anyone explain why replacing the cout statement with

cout << cstr            // Prints correctly
    << ss.str().c_str() // Prints correctly
    << cstr2;           // Prints correctly (???)

prints the strings correctly?

I'm compiling in Visual Studio 2008.

C++ Solutions


Solution 1 - C++

stringstream.str() returns a temporary string object that's destroyed at the end of the full expression. If you get a pointer to a C string from that (stringstream.str().c_str()), it will point to a string which is deleted where the statement ends. That's why your code prints garbage.

You could copy that temporary string object to some other string object and take the C string from that one:

const std::string tmp = stringstream.str();
const char* cstr = tmp.c_str();

Note that I made the temporary string const, because any changes to it might cause it to re-allocate and thus render cstr invalid. It is therefor safer to not to store the result of the call to str() at all and use cstr only until the end of the full expression:

use_c_str( stringstream.str().c_str() );

Of course, the latter might not be easy and copying might be too expensive. What you can do instead is to bind the temporary to a const reference. This will extend its lifetime to the lifetime of the reference:

{
  const std::string& tmp = stringstream.str();   
  const char* cstr = tmp.c_str();
}

IMO that's the best solution. Unfortunately it's not very well known.

Solution 2 - C++

What you're doing is creating a temporary. That temporary exists in a scope determined by the compiler, such that it's long enough to satisfy the requirements of where it's going.

As soon as the statement const char* cstr2 = ss.str().c_str(); is complete, the compiler sees no reason to keep the temporary string around, and it's destroyed, and thus your const char * is pointing to free'd memory.

Your statement string str(ss.str()); means that the temporary is used in the constructor for the string variable str that you've put on the local stack, and that stays around as long as you'd expect: until the end of the block, or function you've written. Therefore the const char * within is still good memory when you try the cout.

Solution 3 - C++

In this line:

const char* cstr2 = ss.str().c_str();

ss.str() will make a copy of the contents of the stringstream. When you call c_str() on the same line, you'll be referencing legitimate data, but after that line the string will be destroyed, leaving your char* to point to unowned memory.

Solution 4 - C++

The std::string object returned by ss.str() is a temporary object that will have a life time limited to the expression. So you cannot assign a pointer to a temporary object without getting trash.

Now, there is one exception: if you use a const reference to get the temporary object, it is legal to use it for a wider life time. For example you should do:

#include <string>
#include <sstream>
#include <iostream>

using namespace std;

int main()
{
    stringstream ss("this is a string\n");

    string str(ss.str());

    const char* cstr1 = str.c_str();

    const std::string& resultstr = ss.str();
    const char* cstr2 = resultstr.c_str();

    cout << cstr1       // Prints correctly
        << cstr2;       // No more error : cstr2 points to resultstr memory that is still alive as we used the const reference to keep it for a time.

    system("PAUSE");

    return 0;
}

That way you get the string for a longer time.

Now, you have to know that there is a kind of optimisation called RVO that say that if the compiler see an initialization via a function call and that function return a temporary, it will not do the copy but just make the assigned value be the temporary. That way you don't need to actually use a reference, it's only if you want to be sure that it will not copy that it's necessary. So doing:

 std::string resultstr = ss.str();
 const char* cstr2 = resultstr.c_str();

would be better and simpler.

Solution 5 - C++

The ss.str() temporary is destroyed after initialization of cstr2 is complete. So when you print it with cout, the c-string that was associated with that std::string temporary has long been destoryed, and thus you will be lucky if it crashes and asserts, and not lucky if it prints garbage or does appear to work.

const char* cstr2 = ss.str().c_str();

The C-string where cstr1 points to, however, is associated with a string that still exists at the time you do the cout - so it correctly prints the result.

In the following code, the first cstr is correct (i assume it is cstr1 in the real code?). The second prints the c-string associated with the temporary string object ss.str(). The object is destroyed at the end of evaluating the full-expression in which it appears. The full-expression is the entire cout << ... expression - so while the c-string is output, the associated string object still exists. For cstr2 - it is pure badness that it succeeds. It most possibly internally chooses the same storage location for the new temporary which it already chose for the temporary used to initialize cstr2. It could aswell crash.

cout << cstr            // Prints correctly
    << ss.str().c_str() // Prints correctly
    << cstr2;           // Prints correctly (???)

The return of c_str() will usually just point to the internal string buffer - but that's not a requirement. The string could make up a buffer if its internal implementation is not contiguous for example (that's well possible - but in the next C++ Standard, strings need to be contiguously stored).

In GCC, strings use reference counting and copy-on-write. Thus, you will find that the following holds true (it does, at least on my GCC version)

string a = "hello";
string b(a);
assert(a.c_str() == b.c_str());

The two strings share the same buffer here. At the time you change one of them, the buffer will be copied and each will hold its separate copy. Other string implementations do things different, though.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionGraphics NoobView Question on Stackoverflow
Solution 1 - C++sbiView Answer on Stackoverflow
Solution 2 - C++Jared OberhausView Answer on Stackoverflow
Solution 3 - C++fbreretoView Answer on Stackoverflow
Solution 4 - C++KlaimView Answer on Stackoverflow
Solution 5 - C++Johannes Schaub - litbView Answer on Stackoverflow