How do you construct a std::string with an embedded null?


C++ Problem Overview


If I want to construct a std::string with a line like:

std::string my_string("a\0b");

I want three characters in the resulting string (a, null, b), but I only get one. What is the proper syntax?

C++ Solutions


Solution 1 - C++

Since C++14

we have been able to write std::string literals:

#include <iostream>
#include <string>

int main()
{
    using namespace std::string_literals;

    std::string s = "pl-\0-op"s;    // <- Notice the "s" at the end
                                    // This is a std::string literal not
                                    // a C-String literal.
    std::cout << s << "\n";
}

Before C++14

The problem is the std::string constructor that takes a const char* assumes the input is a C-string. C-strings are \0 terminated and thus parsing stops when it reaches the \0 character.

To compensate for this, you need to use the constructor that builds the string from a char array (not a C-String). This takes two parameters - a pointer to the array and a length:

std::string   x("pq\0rs");    // Two characters because input assumed to be C-String
std::string   y("pq\0rs", 5); // 5 characters as the input is now a char array with 5 characters.

Note: a C++ std::string is NOT delimited by \0 (as suggested in other posts): its length is stored separately, so it can hold embedded \0 characters. However, you can extract a pointer to an internal buffer that contains a \0-terminated C-String with the method c_str().
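To make the difference between the two constructors concrete, here is a minimal sketch. The function names are made up for illustration; only the std::string constructors, size(), c_str() and std::strlen are real library calls.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// const char* constructor: scanning stops at the first '\0'.
std::size_t c_string_ctor_size() {
    std::string s("pq\0rs");
    return s.size();
}

// (pointer, count) constructor: all 5 characters are copied.
std::size_t counted_ctor_size() {
    std::string s("pq\0rs", 5);
    return s.size();
}

// c_str() is still usable with C APIs, but they only see the part
// before the embedded '\0'.
std::size_t strlen_of_counted() {
    std::string s("pq\0rs", 5);
    return std::strlen(s.c_str());
}
```

Under these assumptions the first function reports 2, the second 5, and the third 2, which shows that the full length lives in the string object rather than in a terminator.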

Also check out Doug T's answer below about using a vector<char>.

Also check out RiaD's answer for a C++14 solution.

Solution 2 - C++

If you are doing manipulation like you would with a c-style string (array of chars) consider using

std::vector<char>

You have more freedom to treat it like an array in the same manner you would treat a c-string. You can copy it into a string with the iterator-range constructor:

std::vector<char> vec(100);
strncpy(&vec[0], "blah blah blah", 100);       // needs <cstring>; pads the rest with '\0'
std::string vecAsStr(vec.begin(), vec.end());  // copies all 100 chars, nulls included

and you can use it in many of the same places you can use c-strings

printf("%s", &vec[0]);
vec[10] = '\0';
vec[11] = 'b';

Naturally, however, you suffer from the same problems as c-strings. You may forget your null terminator or write past the allocated space.
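As a small sketch of the conversion described above (the function name is hypothetical), the iterator-range constructor copies every element of the vector, so an embedded null survives the trip into a std::string:

```cpp
#include <string>
#include <vector>

// Builds a 3-byte buffer {a, \0, b} and converts it to a std::string.
std::string vector_to_string() {
    const char raw[] = {'a', '\0', 'b'};           // char array, no implicit terminator
    std::vector<char> buf(raw, raw + sizeof raw);  // 3 elements
    return std::string(buf.begin(), buf.end());    // copies all 3, including '\0'
}
```

The resulting string has size 3, because the length comes from the iterator range and not from a terminator.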

Solution 3 - C++

I have no idea why you'd want to do such a thing, but try this:

std::string my_string("a\0b", 3);

Solution 4 - C++

https://stackoverflow.com/questions/237804/user-defined-literals-in-c11-a-much-needed-addition-or-making-c-even-more-b presents an elegant answer: Define

std::string operator "" _s(const char* str, size_t n) 
{ 
    return std::string(str, n); 
}

then you can create your string this way:

std::string my_string("a\0b"_s);

or even so:

auto my_string = "a\0b"_s;

There's an "old style" way:

#define S(s) s, sizeof s - 1 // trailing NUL does not belong to the string

then you can define

std::string my_string(S("a\0b"));
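For reference, here is a self-contained sketch of the user-defined literal approach (C++11 or later). The namespace name embedded_null_literals and the function make_with_literal are assumptions made for this example; wrapping the operator in a namespace is a design choice, not something the original answer requires.

```cpp
#include <cstddef>
#include <string>

namespace embedded_null_literals {  // hypothetical namespace name
// The compiler passes the literal's length, so embedded '\0' bytes survive.
inline std::string operator""_s(const char* str, std::size_t n) {
    return std::string(str, n);
}
}  // namespace embedded_null_literals

std::string make_with_literal() {
    using namespace embedded_null_literals;
    return "a\0b"_s;  // three characters: a, \0, b
}
```

Keeping the operator in its own namespace lets callers opt in with a using-directive, much like std::string_literals does for the standard s suffix.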

Solution 5 - C++

The following will work...

std::string s;
s.push_back('a');
s.push_back('\0');
s.push_back('b');
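A related sketch, with hypothetical function names: operator+= behaves the same way as push_back when given a single char, but not when given a string literal, because the literal is treated as a C-string and "\0" looks empty.

```cpp
#include <string>

// Appending single chars keeps the embedded null.
std::string append_chars() {
    std::string s = "a";
    s += '\0';  // a char: appended as-is
    s += 'b';
    return s;
}

// Appending C-string literals silently drops it.
std::string append_literals() {
    std::string s = "a";
    s += "\0";  // a C-string of length 0: nothing is appended
    s += "b";
    return s;
}
```

The first function yields a string of size 3, the second a string of size 2 ("ab").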

Solution 6 - C++

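You'll have to be careful with this. If you replace 'b' with a digit from 0 to 7, you will silently create the wrong string using most methods, because \0 followed by octal digits is parsed as a single octal escape sequence (for example, "\01" is one character with value 1, not a null followed by '1'). See: https://stackoverflow.com/questions/10220401/c-string-literals-escape-character.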

For example, I dropped this innocent looking snippet in the middle of a program

// Create '\0' followed by '0' 40 times ;)
std::string str("\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00", 80);
std::cerr << "Entering loop.\n";
for (char & c : str) {
	std::cerr << c;
	// 'Q' is way cooler than '\0' or '0'
	c = 'Q';
}
std::cerr << "\n";
for (char & c : str) {
	std::cerr << c;
}
std::cerr << "\n";

Here is what this program output for me:

Entering loop.
Entering loop.

vector::_M_emplace_ba
QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ

That was my first print statement twice, several non-printing characters, followed by a newline, followed by something in internal memory, which I just overwrote (and then printed, showing that it has been overwritten). Worst of all, even compiling this with thorough and verbose gcc warnings gave me no indication of something being wrong, and running the program through valgrind didn't complain about any improper memory access patterns. In other words, none of the tools I tried detected it.

You can get this same problem with the much simpler std::string("0", 100);, but the example above is a little trickier, and thus harder to see what's wrong.

Fortunately, C++11 gives us a good solution to the problem using initializer list syntax. This saves you from having to specify the number of characters (which, as I showed above, you can do incorrectly), and avoids combining escaped numbers. std::string str({'a', '\0', 'b'}) is safe for any string content, unlike versions that take an array of char and a size.
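The octal-escape pitfall and two ways around it can be sketched as follows (C++11 or later; the constant and function names are made up for this example). Besides the initializer list, splitting the literal also works, because adjacent string literals are concatenated only after escape sequences have been processed.

```cpp
#include <cstddef>
#include <string>

// "\01" is ONE octal escape (value 1) plus the implicit terminator: 2 chars.
constexpr std::size_t merged_size = sizeof("\01");

// Split literals keep the digit separate: '\0', '1', terminator: 3 chars.
constexpr std::size_t split_size = sizeof("\0" "1");

static_assert(merged_size == 2, "escape and digit merged into one character");
static_assert(split_size == 3, "split literal keeps null and digit apart");

// Initializer-list construction: one element per character, no count to get wrong.
std::string null_then_digit() {
    return std::string{'a', '\0', '1'};
}
```

The split-literal form still needs an explicit length (or a literal suffix) when passed to std::string, so the initializer list remains the variant with the fewest ways to go wrong.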

Solution 7 - C++

Since C++14 you can use std::string literals:

using namespace std::literals::string_literals;
std::string s = "a\0b"s;
std::cout << s.size(); // 3
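As a further sketch (the function name streamed_size is hypothetical), streaming such a string writes all size() characters, embedded null included. Capturing the output in a std::ostringstream makes that visible even when the terminal does not show the null byte.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Returns how many characters operator<< actually wrote for "a\0b"s.
std::size_t streamed_size() {
    using namespace std::string_literals;  // C++14
    const std::string s = "a\0b"s;
    std::ostringstream out;
    out << s;                 // writes s.size() characters, not a C-string
    return out.str().size();
}
```

The function returns 3, matching s.size() from the snippet above.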

Solution 8 - C++

Better to use std::vector<char> if this question isn't just for educational purposes.

Solution 9 - C++

anonym's answer is excellent, but there's a non-macro solution in C++98 as well:

template <size_t N>
std::string RawString(const char (&ch)[N])
{
  return std::string(ch, N-1);  // Again, exclude trailing `null`
}

With this function, RawString(/* literal */) will produce the same string as S(/* literal */):

std::string my_string_t(RawString("a\0b"));
std::string my_string_m(S("a\0b"));
std::cout << "Using template: " << my_string_t << std::endl;
std::cout << "Using macro: " << my_string_m << std::endl;

Additionally, there's an issue with the macro: the expression is not actually a std::string as written, and therefore can't be used e.g. for simple assignment-initialization:

std::string s = S("a\0b"); // ERROR!

...so it might be preferable to use:

#define S(s) std::string(s, sizeof s - 1)

Obviously you should only use one or the other solution in your project and call it whatever you think is appropriate.

Solution 10 - C++

I know this question was asked a long time ago, but anyone having a similar problem might be interested in the following code.

CComBSTR(20,"mystring1\0mystring2\0")

Solution 11 - C++

Almost all implementations of std::strings are null-terminated, so you probably shouldn't do this. Note that "a\0b" is actually four characters long because of the automatic null terminator (a, null, b, null). If you really want to do this and break std::string's contract, you can do:

std::string s("aab");
s.at(1) = '\0';

but if you do, all your friends will laugh at you and you will never find true happiness.

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | Bill | View Question on Stack Overflow
Solution 1 - C++ | Martin York | View Answer on Stack Overflow
Solution 2 - C++ | Doug T. | View Answer on Stack Overflow
Solution 3 - C++ | 17 of 26 | View Answer on Stack Overflow
Solution 4 - C++ | anonym | View Answer on Stack Overflow
Solution 5 - C++ | Andrew Stein | View Answer on Stack Overflow
Solution 6 - C++ | David Stone | View Answer on Stack Overflow
Solution 7 - C++ | RiaD | View Answer on Stack Overflow
Solution 8 - C++ | Harold Ekstrom | View Answer on Stack Overflow
Solution 9 - C++ | Kyle Strand | View Answer on Stack Overflow
Solution 10 - C++ | Dil09 | View Answer on Stack Overflow
Solution 11 - C++ | Jurney | View Answer on Stack Overflow