How do I iterate over the words of a string?

C++, String, Split

C++ Problem Overview


I'm trying to iterate over the words of a string.

The string can be assumed to be composed of words separated by whitespace.

Note that I'm not interested in C string functions or that kind of character manipulation/access. Also, please give precedence to elegance over efficiency in your answer.

The best solution I have right now is:

#include <iostream>
#include <sstream>
#include <string>

using namespace std;

int main()
{
    string s = "Somewhere down the road";
    istringstream iss(s);

    do
    {
        string subs;
        iss >> subs;
        cout << "Substring: " << subs << endl;
    } while (iss);
}

Is there a more elegant way to do this?

C++ Solutions


Solution 1 - C++

I use this to split a string by a delimiter. The first overload writes the results to an output iterator (for example, one that appends to a pre-constructed vector); the second returns a new vector.

#include <string>
#include <sstream>
#include <vector>
#include <iterator>

template <typename Out>
void split(const std::string &s, char delim, Out result) {
    std::istringstream iss(s);
    std::string item;
    while (std::getline(iss, item, delim)) {
        *result++ = item;
    }
}

std::vector<std::string> split(const std::string &s, char delim) {
    std::vector<std::string> elems;
    split(s, delim, std::back_inserter(elems));
    return elems;
}

Note that this solution does not skip empty tokens, so the following will find 4 items, one of which is empty:

std::vector<std::string> x = split("one:two::three", ':');
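To make that concrete: the call above yields "one", "two", "" and "three". If the empty token is unwanted, one option is to filter while reading. This is only a sketch; split_nonempty is a made-up name and not part of the code above.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant of split() that drops empty tokens.
std::vector<std::string> split_nonempty(const std::string &s, char delim) {
    std::vector<std::string> elems;
    std::istringstream iss(s);
    std::string item;
    while (std::getline(iss, item, delim)) {
        if (!item.empty()) {  // skip the "" produced by adjacent delimiters
            elems.push_back(item);
        }
    }
    return elems;
}
```

With this variant, splitting "one:two::three" on ':' gives three items instead of four.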

Solution 2 - C++

For what it's worth, here's another way to extract tokens from an input string, relying only on standard library facilities. It's an example of the power and elegance behind the design of the STL.

#include <iostream>
#include <string>
#include <sstream>
#include <algorithm>
#include <iterator>

int main() {
    using namespace std;
    string sentence = "And I feel fine...";
    istringstream iss(sentence);
    copy(istream_iterator<string>(iss),
         istream_iterator<string>(),
         ostream_iterator<string>(cout, "\n"));
}

Instead of copying the extracted tokens to an output stream, one could insert them into a container, using the same generic copy algorithm.

vector<string> tokens;
copy(istream_iterator<string>(iss),
     istream_iterator<string>(),
     back_inserter(tokens));

... or create the vector directly:

vector<string> tokens{istream_iterator<string>{iss},
                      istream_iterator<string>{}};

Solution 3 - C++

A possible solution using Boost might be:

#include <boost/algorithm/string.hpp>
std::vector<std::string> strs;
boost::split(strs, "string to split", boost::is_any_of("\t "));

This approach might be even faster than the stringstream approach. And since this is a generic template function it can be used to split other types of strings (wchar, etc. or UTF-8) using all kinds of delimiters.

See the documentation for details.

Solution 4 - C++

#include <vector>
#include <string>
#include <sstream>

int main()
{
    std::string str("Split me by whitespaces");
    std::string buf;                 // Have a buffer string
    std::stringstream ss(str);       // Insert the string into a stream

    std::vector<std::string> tokens; // Create vector to hold our words

    while (ss >> buf)
        tokens.push_back(buf);

    return 0;
}

Solution 5 - C++

For those who are not comfortable sacrificing all efficiency for code size, and who see efficiency as a kind of elegance, the following should hit a sweet spot (I also think the templated container parameter is an awesomely elegant addition):

template < class ContainerT >
void tokenize(const std::string& str, ContainerT& tokens,
              const std::string& delimiters = " ", bool trimEmpty = false)
{
   std::string::size_type pos, lastPos = 0, length = str.length();
   
   using value_type = typename ContainerT::value_type;
   using size_type  = typename ContainerT::size_type;
   
   while(lastPos < length + 1)
   {
      pos = str.find_first_of(delimiters, lastPos);
      if(pos == std::string::npos)
      {
         pos = length;
      }

      if(pos != lastPos || !trimEmpty)
         tokens.push_back(value_type(str.data()+lastPos,
               (size_type)pos-lastPos ));

      lastPos = pos + 1;
   }
}

I usually choose to use std::vector<std::string> types as my second parameter (ContainerT)... but list<> is way faster than vector<> for when direct access is not needed, and you can even create your own string class and use something like std::list<subString> where subString does not do any copies for incredible speed increases.

It's more than twice as fast as the fastest tokenize on this page and almost 5 times faster than some others. Also, with well-chosen parameter types you can eliminate all string and list copies for additional speed increases.

Additionally it does not do the (extremely inefficient) return of result, but rather it passes the tokens as a reference, thus also allowing you to build up tokens using multiple calls if you so wished.

Lastly it allows you to specify whether to trim empty tokens from the results via a last optional parameter.

All it needs is std::string... the rest are optional. It does not use streams or the boost library, but is flexible enough to be able to accept some of these foreign types naturally.
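Here is a small usage sketch for the two features mentioned above: trimming empty tokens and building up one container over several calls. tokenize() is repeated only so the snippet compiles on its own; words_of is a hypothetical helper, and the " \t" delimiter set is just an assumption for the example.

```cpp
#include <string>
#include <vector>

// tokenize() as given above, repeated so that the snippet is self-contained.
template <class ContainerT>
void tokenize(const std::string& str, ContainerT& tokens,
              const std::string& delimiters = " ", bool trimEmpty = false)
{
    std::string::size_type pos, lastPos = 0, length = str.length();

    using value_type = typename ContainerT::value_type;
    using size_type  = typename ContainerT::size_type;

    while (lastPos < length + 1) {
        pos = str.find_first_of(delimiters, lastPos);
        if (pos == std::string::npos)
            pos = length;

        if (pos != lastPos || !trimEmpty)
            tokens.push_back(value_type(str.data() + lastPos,
                                        (size_type)pos - lastPos));
        lastPos = pos + 1;
    }
}

// Hypothetical helper: collects the words of two inputs in one vector.
std::vector<std::string> words_of(const std::string& a, const std::string& b) {
    std::vector<std::string> tokens;
    tokenize(a, tokens, " \t", true);  // trimEmpty = true drops empty tokens
    tokenize(b, tokens, " \t", true);  // the second call appends to the same vector
    return tokens;
}
```

Calling words_of("one  two", "three ") should give "one", "two" and "three"; with trimEmpty left at false, the doubled space and the trailing space would each add an empty string.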

Solution 6 - C++

Here's another solution. It's compact and reasonably efficient:

std::vector<std::string> split(const std::string &text, char sep) {
  std::vector<std::string> tokens;
  std::size_t start = 0, end = 0;
  while ((end = text.find(sep, start)) != std::string::npos) {
    tokens.push_back(text.substr(start, end - start));
    start = end + 1;
  }
  tokens.push_back(text.substr(start));
  return tokens;
}

It can easily be templatised to handle string separators, wide strings, etc.

Note that splitting "" results in a single empty string and splitting "," (i.e. just the separator) results in two empty strings.

It can also be easily expanded to skip empty tokens:

std::vector<std::string> split(const std::string &text, char sep) {
    std::vector<std::string> tokens;
    std::size_t start = 0, end = 0;
    while ((end = text.find(sep, start)) != std::string::npos) {
        if (end != start) {
            tokens.push_back(text.substr(start, end - start));
        }
        start = end + 1;
    }
    if (start != text.size()) { // skip the empty token after a trailing separator
        tokens.push_back(text.substr(start));
    }
    return tokens;
}


If splitting a string at multiple delimiters while skipping empty tokens is desired, this version may be used:

std::vector<std::string> split(const std::string& text, const std::string& delims)
{
	std::vector<std::string> tokens;
	std::size_t start = text.find_first_not_of(delims), end = 0;
	
	while((end = text.find_first_of(delims, start)) != std::string::npos)
	{
		tokens.push_back(text.substr(start, end - start));
		start = text.find_first_not_of(delims, end);
	}
	if(start != std::string::npos)
		tokens.push_back(text.substr(start));
	
	return tokens;
}

Solution 7 - C++

This is my favorite way to iterate through a string. You can do whatever you want per word.

string line = "a line of text to iterate through";
string word;

istringstream iss(line, istringstream::in);

while( iss >> word )     
{
    // Do something on `word` here...
}

Solution 8 - C++

This is similar to the Stack Overflow question "How do I tokenize a string in C++?". It requires the external Boost library.

#include <iostream>
#include <string>
#include <boost/tokenizer.hpp>

using namespace std;
using namespace boost;

int main(int argc, char** argv)
{
    string text = "token  test\tstring";

    char_separator<char> sep(" \t");
    tokenizer<char_separator<char>> tokens(text, sep);
    for (const string& t : tokens)
    {
        cout << t << "." << endl;
    }
}

Solution 9 - C++

I like the following because it puts the results into a vector, supports a string as a delimiter and gives control over keeping empty values. It doesn't look as pretty, though.

#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <iterator>
using namespace std;

vector<string> split(const string& s, const string& delim, const bool keep_empty = true) {
    vector<string> result;
    if (delim.empty()) {
        result.push_back(s);
        return result;
    }
    string::const_iterator substart = s.begin(), subend;
    while (true) {
        subend = search(substart, s.end(), delim.begin(), delim.end());
        string temp(substart, subend);
        if (keep_empty || !temp.empty()) {
            result.push_back(temp);
        }
        if (subend == s.end()) {
            break;
        }
        substart = subend + delim.size();
    }
    return result;
}

int main() {
    const vector<string> words = split("So close no matter how far", " ");
    copy(words.begin(), words.end(), ostream_iterator<string>(cout, "\n"));
}

Of course, Boost has a split() that works partially like that. And, if by 'white-space', you really do mean any type of white-space, using Boost's split with is_any_of() works great.
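If "white-space" really means any whitespace character and Boost is not an option, something similar can be sketched with the standard library alone. This is an illustration and not how Boost implements it; split_whitespace is a made-up name, and this version always drops empty tokens.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: splits on any run of whitespace characters.
std::vector<std::string> split_whitespace(const std::string& s) {
    // Take unsigned char so that std::isspace is never called with a negative value.
    auto is_space     = [](unsigned char c) { return std::isspace(c) != 0; };
    auto is_not_space = [](unsigned char c) { return std::isspace(c) == 0; };

    std::vector<std::string> result;
    auto it = s.begin();
    while (true) {
        // Skip leading whitespace, then find where the word ends.
        auto word_begin = std::find_if(it, s.end(), is_not_space);
        if (word_begin == s.end())
            break;
        auto word_end = std::find_if(word_begin, s.end(), is_space);
        result.emplace_back(word_begin, word_end);
        it = word_end;
    }
    return result;
}
```

Tabs, newlines and repeated spaces are all treated as separators here, so "So close\tno  matter" comes out as four words.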

Solution 10 - C++

The STL does not have such a method available already.

However, you can either use C's strtok() function (on a writable copy of the characters, since strtok() modifies its input and std::string::c_str() returns a const pointer), or you can write your own. Here is a code sample I found after a quick Google search ("STL string split"):

void Tokenize(const string& str,
              vector<string>& tokens,
              const string& delimiters = " ")
{
    // Skip delimiters at beginning.
    string::size_type lastPos = str.find_first_not_of(delimiters, 0);
    // Find the first delimiter after the start of the token.
    string::size_type pos     = str.find_first_of(delimiters, lastPos);

    while (string::npos != pos || string::npos != lastPos)
    {
        // Found a token, add it to the vector.
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        // Skip delimiters.  Note the "not_of"
        lastPos = str.find_first_not_of(delimiters, pos);
        // Find the next delimiter.
        pos = str.find_first_of(delimiters, lastPos);
    }
}

Taken from: <http://oopweb.com/CPP/Documents/CPPHOWTO/Volume/C++Programming-HOWTO-7.html>

If you have questions about the code sample, leave a comment and I will explain.

And just because it does not implement a typedef called iterator or overload the << operator does not mean it is bad code. I use C functions quite frequently. For example, printf and scanf both are faster than std::cin and std::cout (significantly), the fopen syntax is a lot more friendly for binary types, and they also tend to produce smaller EXEs.

Don't get sold on this "Elegance over performance" deal.

Solution 11 - C++

Here is a split function that:

  • is generic

  • uses standard C++ (no boost)

  • accepts multiple delimiters

  • ignores empty tokens (can easily be changed)

    template<typename T>
    vector<T>
    split(const T & str, const T & delimiters) {
        vector<T> v;
        typename T::size_type start = 0;
        auto pos = str.find_first_of(delimiters, start);
        while(pos != T::npos) {
            if(pos != start) // ignore empty tokens
                v.emplace_back(str, start, pos - start);
            start = pos + 1;
            pos = str.find_first_of(delimiters, start);
        }
        if(start < str.length()) // ignore trailing delimiter
            v.emplace_back(str, start, str.length() - start); // add what's left of the string
        return v;
    }
     
    

Example usage:

    vector<string> v = split<string>("Hello, there; World", ";,");
    vector<wstring> wv = split<wstring>(L"Hello, there; World", L";,");

Solution 12 - C++

I have a two-line solution to this problem:

char sep = ' ';
std::string s="1 This is an example";

for(size_t p=0, q=0; p!=s.npos; p=q)
  std::cout << s.substr(p+(p!=0), (q=s.find(sep, p+1))-p-(p!=0)) << std::endl;

Then instead of printing you can put it in a vector.
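For example, the same loop can be wrapped in a function that collects the pieces. This is only a sketch; split_on is a hypothetical name. Like the loop above, it does not treat a separator at position 0 as a split point, so it assumes the string does not start with the separator.

```cpp
#include <string>
#include <vector>

// Hypothetical wrapper around the loop above that stores the pieces.
std::vector<std::string> split_on(const std::string& s, char sep) {
    std::vector<std::string> out;
    for (std::size_t p = 0, q = 0; p != std::string::npos; p = q) {
        q = s.find(sep, p + 1);              // next separator, or npos at the end
        std::size_t begin = p + (p != 0);    // step over the separator we stand on
        out.push_back(s.substr(begin, q - begin));
    }
    return out;
}
```

The arithmetic is the same as in the one-liner, only split across named steps to make it easier to follow.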

Solution 13 - C++

Yet another flexible and fast way

template<typename Operator>
void tokenize(Operator& op, const char* input, const char* delimiters) {
  const char* s = input;
  const char* e = s;
  while (*e != 0) {
    e = s;
    while (*e != 0 && strchr(delimiters, *e) == 0) ++e;
    if (e - s > 0) {
      op(s, e - s);
    }
    s = e + 1;
  }
}

To use it with a vector of strings (Edit: Since someone pointed out not to inherit STL classes... hrmf ;) ) :

template<class ContainerType>
class Appender {
public:
  Appender(ContainerType& container) : container_(container) {;}
  void operator() (const char* s, unsigned length) { 
    container_.push_back(std::string(s,length));
  }
private:
  ContainerType& container_;
};

std::vector<std::string> strVector;
Appender<std::vector<std::string> > v(strVector);
tokenize(v, "A number of words to be tokenized", " \t");

That's it! And that's just one way to use the tokenizer. Here is another, which just counts words:

class WordCounter {
public:
  WordCounter() : noOfWords(0) {}
  void operator() (const char*, unsigned) {
    ++noOfWords;
  }
  unsigned noOfWords;
};

WordCounter wc;
tokenize(wc, "A number of words to be counted", " \t"); 
ASSERT( wc.noOfWords == 7 );

Limited by imagination ;)

Solution 14 - C++

Here's a simple solution that uses only the standard regex library

#include <regex>
#include <string>
#include <vector>

std::vector<std::string> Tokenize( const std::string str, const std::regex regex )
{
	using namespace std;

	std::vector<string> result;

	sregex_token_iterator it( str.begin(), str.end(), regex, -1 );
	sregex_token_iterator reg_end;

	for ( ; it != reg_end; ++it ) {
		if ( !it->str().empty() ) //token could be empty:check
			result.emplace_back( it->str() );
	}

	return result;
}

The regex argument allows splitting on multiple delimiters (spaces, commas, etc.)

I usually only check to split on spaces and commas, so I also have this default function:

std::vector<std::string> TokenizeDefault( const std::string str )
{
	using namespace std;

	regex re( "[\\s,]+" );

	return Tokenize( str, re );
}

The "[\\s,]+" matches one or more whitespace characters (\\s) or commas (,).

Note, if you want to split wstring instead of string,

  • change all std::regex to std::wregex
  • change all sregex_token_iterator to wsregex_token_iterator

Note, you might also want to take the string argument by reference, depending on your compiler.
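Following the two substitutions listed above, a wide-string version might look like the sketch below. The name TokenizeW is invented for this example, and taking the arguments by const reference is my assumption, not something the original code does.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical wide-character counterpart of Tokenize().
std::vector<std::wstring> TokenizeW( const std::wstring& str, const std::wregex& re )
{
    std::vector<std::wstring> result;

    // -1 selects the text between matches, i.e. the tokens.
    std::wsregex_token_iterator it( str.begin(), str.end(), re, -1 );
    std::wsregex_token_iterator reg_end;

    for ( ; it != reg_end; ++it ) {
        if ( !it->str().empty() ) // a leading delimiter yields an empty token
            result.emplace_back( it->str() );
    }

    return result;
}
```

Note that the regex must outlive the iterator, which is why it is passed in as a named reference here instead of being built inside the iterator's constructor call.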

Solution 15 - C++

Using std::stringstream as you have works perfectly fine, and does exactly what you wanted. If you're just looking for a different way of doing things though, you can use std::string::find()/std::string::find_first_of() and std::string::substr().

Here's an example:

#include <iostream>
#include <string>

int main()
{
    std::string s("Somewhere down the road");
    std::string::size_type prev_pos = 0, pos = 0;

    while( (pos = s.find(' ', pos)) != std::string::npos )
    {
        std::string substring( s.substr(prev_pos, pos-prev_pos) );

        std::cout << substring << '\n';

        prev_pos = ++pos;
    }

    std::string substring( s.substr(prev_pos, pos-prev_pos) ); // Last word
    std::cout << substring << '\n';

    return 0;
}

Solution 16 - C++

If you like to use Boost, but want to use a whole string as the delimiter (instead of single characters as in most of the previously proposed solutions), you can use boost::split_iterator.

Example code including convenient template:

#include <iostream>
#include <iterator>
#include <string>
#include <vector>
#include <boost/algorithm/string.hpp>

template<typename OutputIterator>
inline void split(
    const std::string& str, 
    const std::string& delim, 
    OutputIterator result)
{
    using namespace boost::algorithm;
    typedef split_iterator<std::string::const_iterator> It;

    for(It iter=make_split_iterator(str, first_finder(delim, is_equal()));
            iter!=It();
            ++iter)
    {
        *(result++) = boost::copy_range<std::string>(*iter);
    }
}

int main(int argc, char* argv[])
{
    using namespace std;

    vector<string> splitted;
    split("HelloFOOworldFOO!", "FOO", back_inserter(splitted));

    // or directly to console, for example
    split("HelloFOOworldFOO!", "FOO", ostream_iterator<string>(cout, "\n"));
    return 0;
}

Solution 17 - C++

Here's a regex solution that only uses the standard regex library. It collects every run of word characters (\w+), so whitespace and punctuation are dropped.

#include <regex>
#include <string>
#include <vector>

using namespace std;

vector<string> split(const string& s){
    regex r ("\\w+"); //regex matches whole words, (greedy, so no fragment words)
    sregex_iterator rit ( s.begin(), s.end(), r );
    sregex_iterator rend; //default-constructed iterator marks the end
    vector<string> result;
    for ( ; rit != rend; ++rit )
        result.push_back( rit->str() ); //each match is one word
    return result;
}

Solution 18 - C++

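There is a function named strtok. The code below uses strtok_r, its reentrant POSIX variant; note that it modifies the buffer passed in, so str must be writable.

#include <string.h> // strtok_r (POSIX)
#include <string>
#include <vector>
using namespace std;

vector<string> split(char* str, const char* delim)
{
    char* saveptr;
    char* token = strtok_r(str, delim, &saveptr);

    vector<string> result;

    while (token != NULL)
    {
        result.push_back(token);
        token = strtok_r(NULL, delim, &saveptr);
    }
    return result;
}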

Solution 19 - C++

The stringstream can be convenient if you need to parse the string by non-space symbols:

string s = "Name:JAck; Spouse:Susan; ...";
string dummy, name, spouse;

istringstream iss(s);
getline(iss, dummy, ':');
getline(iss, name, ';');
getline(iss, dummy, ':');
getline(iss, spouse, ';');
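Wrapped up as a function, the same sequence of getline calls might look like this. It is a sketch only; parse_name_and_spouse is a hypothetical name, and it assumes the input always has the "Name:...; Spouse:...;" layout shown above.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: extracts the two values from "Name:<x>; Spouse:<y>; ...".
std::pair<std::string, std::string> parse_name_and_spouse(const std::string& s) {
    std::istringstream iss(s);
    std::string dummy, name, spouse;

    std::getline(iss, dummy, ':');   // discard the label "Name"
    std::getline(iss, name, ';');    // read the value up to ';'
    std::getline(iss, dummy, ':');   // discard the label " Spouse"
    std::getline(iss, spouse, ';');  // read the value up to ';'

    return std::make_pair(name, spouse);
}
```

Each getline call consumes its delimiter, so the next call starts right after it.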

Solution 20 - C++

Using std::string_view and Eric Niebler's range-v3 library:

https://wandbox.org/permlink/kW5lwRCL1pxjp2pW

#include <iostream>
#include <string>
#include <string_view>
#include "range/v3/view.hpp"
#include "range/v3/algorithm.hpp"

int main() {
    std::string s = "Somewhere down the range v3 library";
    ranges::for_each(s
        |   ranges::view::split(' ')
        |   ranges::view::transform([](auto &&sub) {
                return std::string_view(&*sub.begin(), ranges::distance(sub));
            }),
        [](auto s) {std::cout << "Substring: " << s << "\n";}
    );
}

By using a range for loop instead of ranges::for_each algorithm:

#include <iostream>
#include <string>
#include <string_view>
#include "range/v3/view.hpp"

int main()
{
    std::string str = "Somewhere down the range v3 library";
    for (auto s : str | ranges::view::split(' ')
                      | ranges::view::transform([](auto&& sub) { return std::string_view(&*sub.begin(), ranges::distance(sub)); }
                      ))
    {
        std::cout << "Substring: " << s << "\n";
    }
}

Solution 21 - C++

C++20 finally blesses us with a split function. Or rather, a range adapter.

#include <iostream>
#include <ranges>
#include <string_view>

namespace ranges = std::ranges;
namespace views = std::views;

using str = std::string_view;

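// Not const: the current std::views::split has no const begin(), so the view must be mutable to iterate.
auto view =
    str{"Multiple words"} // a string_view, so the literal's trailing '\0' is not part of the input
    | views::split(' ')
    | views::transform([](auto &&r) -> str {
        return {
            &*r.begin(),
            static_cast<str::size_type>(ranges::distance(r))
        };
    });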

auto main() -> int {
    for (str &&sv : view) {
        std::cout << sv << '\n';
    }
}

Solution 22 - C++

So far I have used the one in Boost, but I needed something that doesn't depend on it, so I came up with this:

static void Split(std::vector<std::string>& lst, const std::string& input, const std::string& separators, bool remove_empty = true)
{
    std::ostringstream word;
    for (size_t n = 0; n < input.size(); ++n)
    {
        if (std::string::npos == separators.find(input[n]))
            word << input[n];
        else
        {
            if (!word.str().empty() || !remove_empty)
                lst.push_back(word.str());
            word.str("");
        }
    }
    if (!word.str().empty() || !remove_empty)
        lst.push_back(word.str());
}

A nice property is that separators can contain more than one character; each of them is treated as a delimiter.

Solution 23 - C++

Short and elegant

#include <vector>
#include <string>
using namespace std;

vector<string> split(string data, string token)
{
	vector<string> output;
	size_t pos = string::npos; // size_t to avoid improbable overflow
	do
	{
		pos = data.find(token);
		output.push_back(data.substr(0, pos));
		if (string::npos != pos)
			data = data.substr(pos + token.size());
	} while (string::npos != pos);
	return output;
}

It can use any string as the delimiter, and it can also be used with binary data (std::string supports binary data, including nulls).

Usage:

auto a = split("this!!is!!!example!string", "!!");

Output:

this
is
!example!string

Solution 24 - C++

I've rolled my own using strtok and used boost to split a string. The best method I have found is the C++ String Toolkit Library. It is incredibly flexible and fast.

#include <iostream>
#include <vector>
#include <string>
#include <strtk.hpp>

const char *whitespace  = " \t\r\n\f";
const char *whitespace_and_punctuation  = " \t\r\n\f;,=";

int main()
{
    {   // normal parsing of a string into a vector of strings
        std::string s("Somewhere down the road");
        std::vector<std::string> result;
        if( strtk::parse( s, whitespace, result ) )
        {
            for(size_t i = 0; i < result.size(); ++i )
                std::cout << result[i] << std::endl;
        }
    }

    {  // parsing a string into a vector of floats with other separators
        // besides spaces

        std::string s("3.0, 3.14; 4.0");
        std::vector<float> values;
        if( strtk::parse( s, whitespace_and_punctuation, values ) )
        {
            for(size_t i = 0; i < values.size(); ++i )
                std::cout << values[i] << std::endl;
        }
    }

    {  // parsing a string into specific variables

        std::string s("angle = 45; radius = 9.9");
        std::string w1, w2;
        float v1, v2;
        if( strtk::parse( s, whitespace_and_punctuation, w1, v1, w2, v2) )
        {
            std::cout << "word " << w1 << ", value " << v1 << std::endl;
            std::cout << "word " << w2 << ", value " << v2 << std::endl;
        }
    }

    return 0;
}

The toolkit has much more flexibility than this simple example shows but its utility in parsing a string into useful elements is incredible.

Solution 25 - C++

I made this because I needed an easy way to split strings and C-style strings... Hopefully someone else can find it useful as well. It also doesn't rely on single-character tokens: you can use whole fields as delimiters, which is another key feature I needed.

I'm sure there are improvements that could make it more elegant, so by all means please make them.

StringSplitter.hpp:

#include <vector>
#include <iostream>
#include <string.h>

using namespace std;

class StringSplit
{
private:
    void copy_fragment(char*, char*, char*);
    void copy_fragment(char*, char*, char);
    bool match_fragment(char*, char*, int);
    int untilnextdelim(char*, char);
    int untilnextdelim(char*, char*);
    void assimilate(char*, char);
    void assimilate(char*, char*);
    bool string_contains(char*, char*);
    long calc_string_size(char*);
    void copy_string(char*, char*);

public:
    vector<char*> split_cstr(char);
    vector<char*> split_cstr(char*);
    vector<string> split_string(char);
    vector<string> split_string(char*);
    char* String;
    bool do_string;
    bool keep_empty;
    vector<char*> Container;
    vector<string> ContainerS;

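    StringSplit(char * in)
    {
        String = in;
        do_string = false;   // the buffer is not owned by this object
        keep_empty = false;  // empties are excluded by default
    }

    StringSplit(string in)
    {
        size_t len = calc_string_size((char*)in.c_str());
        String = new char[len + 1];
        memset(String, 0, len + 1);
        copy_string(String, (char*)in.c_str());
        do_string = true;
        keep_empty = false;  // empties are excluded by default
    }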

    ~StringSplit()
    {
        for (int i = 0; i < Container.size(); i++)
        {
            if (Container[i] != NULL)
            {
                delete[] Container[i];
            }
        }
        if (do_string)
        {
            delete[] String;
        }
    }
};

StringSplitter.cpp:

#include <string.h>
#include <iostream>
#include <vector>
#include "StringSplitter.hpp"

using namespace std;

void StringSplit::assimilate(char*src, char delim)
{
    int until = untilnextdelim(src, delim);
    if (until > 0)
    {
        char * temp = new char[until + 1];
        memset(temp, 0, until + 1);
        copy_fragment(temp, src, delim);
        if (keep_empty || *temp != 0)
        {
            if (!do_string)
            {
                Container.push_back(temp);
            }
            else
            {
                string x = temp;
                ContainerS.push_back(x);
            }

        }
        else
        {
            delete[] temp;
        }
    }
}

void StringSplit::assimilate(char*src, char* delim)
{
    int until = untilnextdelim(src, delim);
    if (until > 0)
    {
        char * temp = new char[until + 1];
        memset(temp, 0, until + 1);
        copy_fragment(temp, src, delim);
        if (keep_empty || *temp != 0)
        {
            if (!do_string)
            {
                Container.push_back(temp);
            }
            else
            {
                string x = temp;
                ContainerS.push_back(x);
            }
        }
        else
        {
            delete[] temp;
        }
    }
}

long StringSplit::calc_string_size(char* _in)
{
    long i = 0;
    while (*_in++)
    {
	    i++;
    }
    return i;
}

bool StringSplit::string_contains(char* haystack, char* needle)
{
    size_t len = calc_string_size(needle);
    size_t lenh = calc_string_size(haystack);
    while (lenh--)
    {
        if (match_fragment(haystack + lenh, needle, len))
        {
            return true;
        }
    }
    return false;
}

bool StringSplit::match_fragment(char* _src, char* cmp, int len)
{
	while (len--)
    {
	    if (*(_src + len) != *(cmp + len))
        {
		    return false;
        }
    }
    return true;
}

int StringSplit::untilnextdelim(char* _in, char delim)
{
    size_t len = calc_string_size(_in);
    if (*_in == delim)
    {
	    _in += 1;
	    return len - 1;
    }

    int c = 0;
    while (*(_in + c) != delim && c < len)
    {
		c++;
    }

    return c;
}

int StringSplit::untilnextdelim(char* _in, char* delim)
{
    int s = calc_string_size(delim);
    int c = 1 + s;

    if (!string_contains(_in, delim))
    {
	    return calc_string_size(_in);
    }
    else if (match_fragment(_in, delim, s))
    {
	    _in += s;
	    return calc_string_size(_in);
    }

    while (!match_fragment(_in + c, delim, s))
    {
		c++;
    }

    return c;
}

void StringSplit::copy_fragment(char* dest, char* src, char delim)
{
	if (*src == delim)
    {
		src++;
    }
		
    int c = 0;
    while (*(src + c) != delim && *(src + c))
    {
        *(dest + c) = *(src + c);
		c++;
    }
	*(dest + c) = 0;
}

void StringSplit::copy_string(char* dest, char* src)
{
    int i = 0;
	while (*(src + i))
	{
		*(dest + i) = *(src + i);
		i++;
	}
}

void StringSplit::copy_fragment(char* dest, char* src, char* delim)
{
    size_t len = calc_string_size(delim);
	size_t lens = calc_string_size(src);
	
    if (match_fragment(src, delim, len))
	{
		src += len;
		lens -= len;
	}
	
    int c = 0;
    while (!match_fragment(src + c, delim, len) && (c < lens))
    {
        *(dest + c) = *(src + c);
		c++;
    }
	*(dest + c) = 0;
}

vector<char*> StringSplit::split_cstr(char Delimiter)
{
    int i = 0;
    while (*String)
    {
        if (*String != Delimiter && i == 0)
        {
            assimilate(String, Delimiter);
        }
        if (*String == Delimiter)
        {
		    assimilate(String, Delimiter);
        }
	    i++;
	    String++;
    }

    String -= i;
    delete[] String;

    return Container;
}

vector<string> StringSplit::split_string(char Delimiter)
{
    do_string = true;
    
    int i = 0;
    while (*String)
    {
        if (*String != Delimiter && i == 0)
        {
            assimilate(String, Delimiter);
        }
        if (*String == Delimiter)
        {
		    assimilate(String, Delimiter);
        }
	    i++;
	    String++;
    }

    String -= i;
    delete[] String;

    return ContainerS;
}

vector<char*> StringSplit::split_cstr(char* Delimiter)
{
    int i = 0;
    size_t LenDelim = calc_string_size(Delimiter);

    while(*String)
    {
		if (!match_fragment(String, Delimiter, LenDelim) && i == 0)
        {
            assimilate(String, Delimiter);
        }
        if (match_fragment(String, Delimiter, LenDelim))
        {
			assimilate(String,Delimiter);
        }
	    i++;
	    String++;
    }

    String -= i;
    delete[] String;

    return Container;
}

vector<string> StringSplit::split_string(char* Delimiter)
{
    do_string = true;
    int i = 0;
    size_t LenDelim = calc_string_size(Delimiter);

    while (*String)
    {
		if (!match_fragment(String, Delimiter, LenDelim) && i == 0)
        {
            assimilate(String, Delimiter);
        }
        if (match_fragment(String, Delimiter, LenDelim))
        {
			assimilate(String, Delimiter);
        }
	    i++;
	    String++;
    }

    String -= i;
    delete[] String;

    return ContainerS;
}

Examples:

int main(int argc, char*argv[])
{
    StringSplit ss = "This:CUT:is:CUT:an:CUT:example:CUT:cstring";
    vector<char*> Split = ss.split_cstr(":CUT:");

    for (int i = 0; i < Split.size(); i++)
    {
        cout << Split[i] << endl;
    }

    return 0;
}

Will output:

This
is
an
example
cstring

int main(int argc, char*argv[])
{
    StringSplit ss = "This:is:an:example:cstring";
    vector<char*> Split = ss.split_cstr(':');

    for (int i = 0; i < Split.size(); i++)
    {
        cout << Split[i] << endl;
    }

    return 0;
}

int main(int argc, char*argv[])
{
    string mystring = "This[SPLIT]is[SPLIT]an[SPLIT]example[SPLIT]string";
    StringSplit ss = mystring;
    vector<string> Split = ss.split_string("[SPLIT]");

    for (int i = 0; i < Split.size(); i++)
    {
        cout << Split[i] << endl;
    }

    return 0;
}

int main(int argc, char*argv[])
{
    string mystring = "This|is|an|example|string";
    StringSplit ss = mystring;
    vector<string> Split = ss.split_string('|');

    for (int i = 0; i < Split.size(); i++)
    {
        cout << Split[i] << endl;
    }

    return 0;
}

To keep empty entries (by default empties will be excluded):

StringSplit ss = mystring;
ss.keep_empty = true;
vector<string> Split = ss.split_string(":DELIM:");

The goal was to make it similar to C#'s Split() method where splitting a string is as easy as:

String[] Split = 
    "Hey:cut:what's:cut:your:cut:name?".Split(new[]{":cut:"}, StringSplitOptions.None);

foreach(String X in Split)
{
    Console.Write(X);
}

I hope someone else can find this as useful as I do.

Solution 26 - C++

This answer takes the string and puts it into a vector of strings. It uses the boost library.

#include <boost/algorithm/string.hpp>
std::vector<std::string> strs;
boost::split(strs, "string to split", boost::is_any_of("\t "));

Solution 27 - C++

What about this:

#include <string>
#include <vector>

using namespace std;

vector<string> split(string str, const char delim) {
    vector<string> v;
    string tmp;

    for(string::const_iterator i = str.begin(); i != str.end(); ++i) {
        if(*i != delim) {
            tmp += *i; 
        } else {
            v.push_back(tmp);
            tmp = ""; 
        }   
    }   
    v.push_back(tmp); // the last token ends at the end of the string

    return v;
}

Solution 28 - C++

Here's another way of doing it..

void split_string(string text,vector<string>& words)
{
  int i=0;
  char ch;
  string word;

  while(ch=text[i++])
  {
    if (isspace(static_cast<unsigned char>(ch)))
    {
      if (!word.empty())
      {
        words.push_back(word);
      }
      word = "";
    }
    else
    {
      word += ch;
    }
  }
  if (!word.empty())
  {
    words.push_back(word);
  }
}

Solution 29 - C++

I like to use the boost/regex methods for this task since they provide maximum flexibility for specifying the splitting criteria.

#include <iostream>
#include <string>
#include <boost/regex.hpp>

int main() {
    std::string line("A:::line::to:split");
    const boost::regex re(":+"); // one or more colons

    // -1 means find inverse matches aka split
    boost::sregex_token_iterator tokens(line.begin(),line.end(),re,-1);
    boost::sregex_token_iterator end;

    for (; tokens != end; ++tokens)
        std::cout << *tokens << std::endl;
}

Solution 30 - C++

Recently I had to split a camel-cased word into subwords. There are no delimiters, just uppercase letters.

#include <string>
#include <list>
#include <locale> // std::isupper(c, loc)

template<class String>
std::list<String> split_camel_case_string(const String &s)
{
	const std::locale loc; // copy of the current global locale
	std::list<String> R;
	String w;

	for (typename String::const_iterator i = s.begin(); i != s.end(); ++i) {
		if (std::isupper(*i, loc)) {
			if (w.length()) {
				R.push_back(w);
				w.clear();
			}
		}
		w += *i;
	}

	if (w.length())
		R.push_back(w);
	return R;
}

For example, this splits "AQueryTrades" into "A", "Query" and "Trades". The function works with narrow and wide strings. Because it respects the current locale, it splits "RaumfahrtÜberwachungsVerordnung" into "Raumfahrt", "Überwachungs" and "Verordnung", provided the locale in effect classifies "Ü" as an uppercase letter in the string's encoding.

Note that std::isupper should really be passed as a function template argument. The more generalized form of this function can then split at delimiters like ",", ";" or " " too.
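The generalization mentioned in the note could look roughly like the sketch below. The names split_at, is_boundary, keep_boundary, is_upper and is_comma_or_space are made up for illustration and are not part of the original answer. The predicate decides where a new token starts; the flag decides whether the boundary character is kept (camel case) or dropped (delimiters).

```cpp
#include <cctype>
#include <list>
#include <string>

// Hypothetical generalization: a new token starts at every character for
// which is_boundary(c) is true. With keep_boundary == true that character
// opens the new token (camel case); with false it is dropped (delimiters).
template <class String, class Pred>
std::list<String> split_at(const String &s, Pred is_boundary, bool keep_boundary)
{
    std::list<String> result;
    String w;

    for (typename String::const_iterator i = s.begin(); i != s.end(); ++i) {
        if (is_boundary(*i)) {
            if (!w.empty()) {
                result.push_back(w);
                w.clear();
            }
            if (!keep_boundary)
                continue;
        }
        w += *i;
    }

    if (!w.empty())
        result.push_back(w);
    return result;
}

// Narrow-character predicates for the two use cases (illustrative only).
inline bool is_upper(char c) { return std::isupper(static_cast<unsigned char>(c)) != 0; }
inline bool is_comma_or_space(char c) { return c == ',' || c == ' '; }
```

Under these assumptions, split_at(std::string("AQueryTrades"), is_upper, true) yields "A", "Query", "Trades", and split_at(std::string("x, y,z"), is_comma_or_space, false) yields "x", "y", "z".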

Solution 31 - C++

#include<iostream>
#include<string>
#include<sstream>
#include<vector>
using namespace std;

vector<string> split(const string &s, char delim) {
    vector<string> elems;
    stringstream ss(s);
    string item;
    while (getline(ss, item, delim)) {
        elems.push_back(item);
    }
    return elems;
}

int main() {
    vector<string> x = split("thi is an sample test", ' ');
    unsigned int i;
    for (i = 0; i < x.size(); i++)
        cout << i << ":" << x[i] << endl;
    return 0;
}

Solution 32 - C++

The code below uses strtok() to split a string into tokens and stores the tokens in a vector.

#include <iostream>
#include <cstring>   // strtok
#include <vector>
#include <string>
 
using namespace std;
 

char one_line_string[] = "hello hi how are you nice weather we are having ok then bye";
char seps[]   = " ,\t\n";
char *token;
 

 
int main()
{
   vector<string> vec_String_Lines;
   token = strtok( one_line_string, seps );
 
   cout << "Extracting and storing data in a vector..\n\n\n";
 
   while( token != NULL )
   {
      vec_String_Lines.push_back(token);
      token = strtok( NULL, seps );
   }
   cout << "Displaying end result in vector line storage..\n\n";

   for ( size_t i = 0; i < vec_String_Lines.size(); ++i)
      cout << vec_String_Lines[i] << "\n";
   cout << "\n\n\n";

   return 0;
}

Solution 33 - C++

Get Boost! :-)

#include <boost/algorithm/string/split.hpp>
#include <boost/algorithm/string.hpp>
#include <iostream>
#include <vector>

using namespace std;
using namespace boost;

int main(int argc, char**argv) {
	typedef vector < string > list_type;

	list_type list;
    string line;

    line = "Somewhere down the road";
    split(list, line, is_any_of(" "));

    for(int i = 0; i < list.size(); i++)
    {
    	cout << list[i] << endl;
    }

    return 0;
}

This example gives the output -

Somewhere
down
the
road

Solution 34 - C++

#include <iostream>
#include <regex>

using namespace std;

int main() {
   string s = "foo bar  baz";
   regex e("\\s+");
   regex_token_iterator<string::iterator> i(s.begin(), s.end(), e, -1);
   regex_token_iterator<string::iterator> end;
   while (i != end)
      cout << " [" << *i++ << "]";
}

IMO, this is the closest thing to Python's re.split(). See cplusplus.com for more information about regex_token_iterator. The -1 (the fourth argument to the regex_token_iterator constructor) selects the parts of the sequence that are not matched, so the matches act as separators.
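To make the role of that fourth argument concrete, here is a small sketch. The helper name regex_tokens is invented for this example; it only wraps the standard std::sregex_token_iterator. Passing -1 collects the text between matches, while 0 collects the matches themselves.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collects the tokens selected by `submatch`:
//   -1 -> the text between matches (a split)
//    0 -> the matches themselves
std::vector<std::string> regex_tokens(const std::string &s,
                                      const std::string &pattern,
                                      int submatch)
{
    const std::regex re(pattern);
    std::sregex_token_iterator first(s.begin(), s.end(), re, submatch), last;
    // Each sub_match converts to std::string when the vector is filled.
    return std::vector<std::string>(first, last);
}
```

With the input "foo bar  baz" and the pattern "\\s+", submatch -1 gives "foo", "bar", "baz", and submatch 0 gives the two whitespace runs " " and "  ".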

Solution 35 - C++

I use this simple function because our String class is "special" (i.e. not standard):

void splitString(const String &s, const String &delim, std::vector<String> &result) {
    const int l = delim.length();
    int f = 0;
    int i = s.indexOf(delim,f);
    while (i>=0) {
        String token( i-f > 0 ? s.substring(f,i-f) : "");
        result.push_back(token);
        f=i+l;
        i = s.indexOf(delim,f);
    }
    String token = s.substring(f);
    result.push_back(token);
}

Solution 36 - C++

Everyone answered for predefined string input. I think this answer will help someone who needs to split scanned input.

I used the tokens vector to hold the string tokens. It's optional.

#include <iostream>
#include <sstream>
#include <string>
#include <vector>

using namespace std ;
int main()
{
    string str, token ;
    getline(cin, str) ; // get the string as input
    istringstream ss(str); // insert the string into tokenizer

    vector<string> tokens; // vector tokens holds the tokens

    while (ss >> token) tokens.push_back(token); // splits the tokens
    for(auto x : tokens) cout << x << endl ; // prints the tokens

    return 0;
}


sample input:

port city international university

sample output:

port
city
international
university

Note that by default this splits on whitespace only. To use a custom delimiter, change the extraction loop. For example, with ',' as the delimiter, use

char delimiter = ',' ;
while(getline(ss, token, delimiter)) tokens.push_back(token) ;

instead of

while (ss >> token) tokens.push_back(token);
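The two loops are not interchangeable in how they treat repeated delimiters. The sketch below wraps each in a function (split_ws and split_on are names chosen for this example only) so the difference can be seen side by side: operator>> skips runs of whitespace, whereas getline stops at every single delimiter and therefore produces empty tokens between adjacent ones.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits on runs of whitespace; never yields empty tokens.
std::vector<std::string> split_ws(const std::string &s)
{
    std::istringstream ss(s);
    std::vector<std::string> tokens;
    std::string token;
    while (ss >> token) tokens.push_back(token);
    return tokens;
}

// Splits at every single occurrence of delimiter; adjacent delimiters
// yield an empty token between them.
std::vector<std::string> split_on(const std::string &s, char delimiter)
{
    std::istringstream ss(s);
    std::vector<std::string> tokens;
    std::string token;
    while (std::getline(ss, token, delimiter)) tokens.push_back(token);
    return tokens;
}
```

For example, split_on("a,,b", ',') returns "a", "", "b", while a trailing delimiter as in "a,b," does not add a final empty token.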

Solution 37 - C++

The following is a much better way to do this. It can take any character, and doesn't split lines unless you want. No special libraries needed (well, besides std, but who really considers that an extra library), no pointers, no references, and it's static. Just simple plain C++.

#pragma once
#include <string>
#include <vector>
#include <sstream>
using namespace std;
class Helpers
{
    public:
        static vector<string> split(string s, char delim)
        {
            stringstream temp (stringstream::in | stringstream::out);
            vector<string> elems(0);
            if (s.size() == 0 || delim == 0)
                return elems;
            for(char c : s)
            {
                if(c == delim)
                {
                    elems.push_back(temp.str());
                    temp = stringstream(stringstream::in | stringstream::out);
                }
                else
                    temp << c;
            }
            if (temp.str().size() > 0)
                elems.push_back(temp.str());
            return elems;
        }

        //Splits string s with a list of delimiters in delims (it's just a list, like if we wanted to
        //split at the following letters, a, b, c we would make delims="abc".
        static vector<string> split(string s, string delims)
        {
            stringstream temp (stringstream::in | stringstream::out);
            vector<string> elems(0);
            bool found;
            if(s.size() == 0 || delims.size() == 0)
                return elems;
            for(char c : s)
            {
                found = false;
                for(char d : delims)
                {
                    if (c == d)
                    {
                        elems.push_back(temp.str());
                        temp = stringstream(stringstream::in | stringstream::out);
                        found = true;
                        break;
                    }
                }
                if(!found)
                    temp << c;
            }
            if(temp.str().size() > 0)
                elems.push_back(temp.str());
            return elems;
        }
};

Solution 38 - C++

I wrote the following piece of code. You can specify delimiter, which can be a string. The result is similar to Java's String.split, with empty string in the result.

For example, if we call split("ABCPICKABCANYABCTWO:ABC", "ABC"), the result is as follows:

0  <len:0>
1 PICK <len:4>
2 ANY <len:3>
3 TWO: <len:4>
4  <len:0>

Code:

vector <string> split(const string& str, const string& delimiter = " ") {
    vector <string> tokens;
    
    string::size_type lastPos = 0;
    string::size_type pos = str.find(delimiter, lastPos);
    
    while (string::npos != pos) {
        // Found a token, add it to the vector.
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        lastPos = pos + delimiter.size();
        pos = str.find(delimiter, lastPos);
    }
    
    tokens.push_back(str.substr(lastPos, str.size() - lastPos));
    return tokens;
}

Solution 39 - C++

Here is my solution using C++11 and the STL. It should be reasonably efficient:

#include <vector>
#include <string>
#include <cctype>
#include <iostream>
#include <algorithm>

std::vector<std::string> split(const std::string& s)
{
    std::vector<std::string> v;

    const auto end = s.end();
    auto to = s.begin();
    decltype(to) from;

    // unsigned char keeps std::isspace well-defined for non-ASCII bytes
    while((from = std::find_if(to, end,
    	[](unsigned char c){ return !std::isspace(c); })) != end)
    {
        to = std::find_if(from, end, [](unsigned char c){ return std::isspace(c); });
        v.emplace_back(from, to);
    }

    return v;
}

int main()
{
    std::string s = "this is the string  to  split";

    auto v = split(s);

    for(auto&& s: v)
        std::cout << s << '\n';
}

Output:

this
is
the
string
to
split

Solution 40 - C++

When dealing with whitespace as a separator, the obvious answer of using std::istream_iterator<T> has already been given and voted up a lot. Of course, elements may be separated by some other separator instead of whitespace. I didn't spot any answer that just redefines the meaning of whitespace to be said separator and then uses the conventional approach.

To change what streams consider whitespace, you simply change the stream's std::locale (using std::istream::imbue()) to one with a std::ctype<char> facet that has its own definition of whitespace. (This can be done for std::ctype<wchar_t>, too, but it is slightly different because std::ctype<char> is table-driven while std::ctype<wchar_t> is driven by virtual functions.)

#include <iostream>
#include <algorithm>
#include <iterator>
#include <sstream>
#include <locale>
#include <string>

struct whitespace_mask {
    std::ctype_base::mask mask_table[std::ctype<char>::table_size];
    whitespace_mask(std::string const& spaces) {
        std::ctype_base::mask* table = this->mask_table;
        std::ctype_base::mask const* tab
            = std::use_facet<std::ctype<char>>(std::locale()).table();
        for (std::size_t i(0); i != std::ctype<char>::table_size; ++i) {
            table[i] = tab[i] & ~std::ctype_base::space;
        }
        std::for_each(spaces.begin(), spaces.end(), [=](unsigned char c) {
            table[c] |= std::ctype_base::space;
        });
    }
};
class whitespace_facet
    : private whitespace_mask
    , public std::ctype<char> {
public:
    whitespace_facet(std::string const& spaces)
        : whitespace_mask(spaces)
        , std::ctype<char>(this->mask_table) {
    }
};

struct whitespace {
    std::string spaces;
    whitespace(std::string const& spaces): spaces(spaces) {}
};
std::istream& operator>>(std::istream& in, whitespace const& ws) {
    std::locale loc(in.getloc(), new whitespace_facet(ws.spaces));
    in.imbue(loc);
    return in;
}
// everything above would probably go into a utility library...

int main() {
    std::istringstream in("a, b, c, d, e");
    std::copy(std::istream_iterator<std::string>(in >> whitespace(", ")),
              std::istream_iterator<std::string>(),
              std::ostream_iterator<std::string>(std::cout, "\n"));

    std::istringstream pipes("a b c|  d |e     e");
    std::copy(std::istream_iterator<std::string>(pipes >> whitespace("|")),
              std::istream_iterator<std::string>(),
              std::ostream_iterator<std::string>(std::cout, "\n"));   
}

Most of the code is for packaging up a general purpose tool providing soft delimiters: multiple delimiters in a row are merged. There is no way to produce an empty sequence. When different delimiters are needed within a stream, you'd probably use differently set up streams using a shared stream buffer:

void f(std::istream& in) {
    std::istream pipes(in.rdbuf());
    pipes >> whitespace("|");
    std::istream comma(in.rdbuf());
    comma >> whitespace(",");

    std::string s0, s1;
    if (pipes >> s0 >> std::ws   // read up to first pipe and ignore sequence of pipes
        && comma >> s1 >> std::ws) { // read up to first comma and ignore commas
        // ...
    }
}

Solution 41 - C++

As a hobbyist, this is the first solution that came to my mind. I'm kind of curious why I haven't seen a similar solution here yet, is there something fundamentally wrong with how I did it?

#include <iostream>
#include <string>
#include <vector>

std::vector<std::string> split(const std::string &s, const std::string &delims)
{
    std::vector<std::string> result;
    std::string::size_type pos = 0;
    while (std::string::npos != (pos = s.find_first_not_of(delims, pos))) {
        auto pos2 = s.find_first_of(delims, pos);
        result.emplace_back(s.substr(pos, std::string::npos == pos2 ? pos2 : pos2 - pos));
        pos = pos2;
    }
    return result;
}

int main()
{
    std::string text{"And then I said: \"I don't get it, why would you even do that!?\""};
    std::string delims{" :;\".,?!"};
    auto words = split(text, delims);
    std::cout << "\nSentence:\n  " << text << "\n\nWords:";
    for (const auto &w : words) {
        std::cout << "\n  " << w;
    }
    return 0;
}

http://cpp.sh/7wmzy

Solution 42 - C++

I cannot believe how overly complicated most of these answers were. Why didn't someone suggest something as simple as this?

#include <iostream>
#include <sstream>
#include <string>

int main() {
    std::string input = "This is a sentence to read";
    std::istringstream ss(input);
    std::string token;

    while(std::getline(ss, token, ' ')) {
        std::cout << token << std::endl;
    }
}

Solution 43 - C++

This is my version, based on Kev's code:

#include <string>
#include <vector>
using namespace std;

void split(vector<string> &result, const string &str, char delim ) {
  string tmp;
  result.clear();

  for(string::const_iterator i = str.begin(); i != str.end(); ++i) {
    if(*i != delim) {
      tmp += *i;
    } else {
      result.push_back(tmp);
      tmp = "";
    }
  }
  result.push_back(tmp); // the token after the last delimiter
}

Then call the function and do something with the result:

vector<string> hosts;
split(hosts, "192.168.1.2,192.168.1.3", ',');
for( size_t i = 0; i < hosts.size(); i++){
  cout <<  "Connecting host : " << hosts.at(i) << "..." << endl;
}

Solution 44 - C++

Although an earlier answer already provides a C++20 solution, several changes have since been applied to C++20 as Defect Reports. Because of that, the solution is a little bit shorter and nicer:

#include <iostream>
#include <ranges>
#include <string_view>

namespace views = std::views;
using str = std::string_view;

constexpr str text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit.";

auto splitByWords(str input) {
    return input
    | views::split(' ')
    | views::transform([](auto &&r) -> str {
        return {r.begin(), r.end()};
    });
}

auto main() -> int {
    for (str &&word : splitByWords(text)) {
        std::cout << word << '\n';
    }
}

As of today it is still available only on the trunk branch of GCC (Godbolt link). It is based on two changes: P1391 iterator constructor for std::string_view and P2210 DR fixing std::views::split to preserve range type.

In C++23 there won't be any transform boilerplate needed, since P1989 adds a range constructor to std::string_view:

#include <iostream>
#include <ranges>
#include <string_view>

namespace views = std::views;

constexpr std::string_view text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit.";

auto main() -> int {
    for (std::string_view&& word : text | views::split(' ')) {
        std::cout << word << '\n';
    }
}

(Godbolt link)

Solution 45 - C++

I use the following code:

#include <list>
#include <stdexcept>
#include <string>

namespace Core
{
    typedef std::wstring String;

    void SplitString(const Core::String& input, const Core::String& splitter, std::list<Core::String>& output)
    {
        if (splitter.empty())
        {
            throw std::invalid_argument("splitter must not be empty"); // for example
        }

        std::list<Core::String> lines;

        Core::String::size_type offset = 0;

        for (;;)
        {
            Core::String::size_type splitterPos = input.find(splitter, offset);

            if (splitterPos != Core::String::npos)
            {
                lines.push_back(input.substr(offset, splitterPos - offset));
                offset = splitterPos + splitter.size();
            }
            else
            {
                lines.push_back(input.substr(offset));
                break;
            }
        }

        lines.swap(output);
    }
}

// gtest:

class SplitStringTest: public testing::Test
{
};

TEST_F(SplitStringTest, EmptyStringAndSplitter)
{
    std::list<Core::String> result;
    ASSERT_ANY_THROW(Core::SplitString(Core::String(), Core::String(), result));
}

TEST_F(SplitStringTest, NonEmptyStringAndEmptySplitter)
{
    std::list<Core::String> result;
    ASSERT_ANY_THROW(Core::SplitString(L"xy", Core::String(), result));
}

TEST_F(SplitStringTest, EmptyStringAndNonEmptySplitter)
{
    std::list<Core::String> result;
    Core::SplitString(Core::String(), Core::String(L","), result);
    ASSERT_EQ(1, result.size());
    ASSERT_EQ(Core::String(), *result.begin());
}

TEST_F(SplitStringTest, OneCharSplitter)
{
    std::list<Core::String> result;

    Core::SplitString(L"x,y", L",", result);
    ASSERT_EQ(2, result.size());
    ASSERT_EQ(L"x", *result.begin());
    ASSERT_EQ(L"y", *result.rbegin());

    Core::SplitString(L",xy", L",", result);
    ASSERT_EQ(2, result.size());
    ASSERT_EQ(Core::String(), *result.begin());
    ASSERT_EQ(L"xy", *result.rbegin());

    Core::SplitString(L"xy,", L",", result);
    ASSERT_EQ(2, result.size());
    ASSERT_EQ(L"xy", *result.begin());
    ASSERT_EQ(Core::String(), *result.rbegin());
}

TEST_F(SplitStringTest, TwoCharsSplitter)
{
    std::list<Core::String> result;

    Core::SplitString(L"x,.y,z", L",.", result);
    ASSERT_EQ(2, result.size());
    ASSERT_EQ(L"x", *result.begin());
    ASSERT_EQ(L"y,z", *result.rbegin());

    Core::SplitString(L"x,,y,z", L",,", result);
    ASSERT_EQ(2, result.size());
    ASSERT_EQ(L"x", *result.begin());
    ASSERT_EQ(L"y,z", *result.rbegin());
}

TEST_F(SplitStringTest, RecursiveSplitter)
{
    std::list<Core::String> result;

    Core::SplitString(L",,,", L",,", result);
    ASSERT_EQ(2, result.size());
    ASSERT_EQ(Core::String(), *result.begin());
    ASSERT_EQ(L",", *result.rbegin());

    Core::SplitString(L",.,.,", L",.,", result);
    ASSERT_EQ(2, result.size());
    ASSERT_EQ(Core::String(), *result.begin());
    ASSERT_EQ(L".,", *result.rbegin());

    Core::SplitString(L"x,.,.,y", L",.,", result);
    ASSERT_EQ(2, result.size());
    ASSERT_EQ(L"x", *result.begin());
    ASSERT_EQ(L".,y", *result.rbegin());

    Core::SplitString(L",.,,.,", L",.,", result);
    ASSERT_EQ(3, result.size());
    ASSERT_EQ(Core::String(), *result.begin());
    ASSERT_EQ(Core::String(), *(++result.begin()));
    ASSERT_EQ(Core::String(), *result.rbegin());
}

TEST_F(SplitStringTest, NullTerminators)
{
    std::list<Core::String> result;

    Core::SplitString(L"xy", Core::String(L"\0", 1), result);
    ASSERT_EQ(1, result.size());
    ASSERT_EQ(L"xy", *result.begin());

    Core::SplitString(Core::String(L"x\0y", 3), Core::String(L"\0", 1), result);
    ASSERT_EQ(2, result.size());
    ASSERT_EQ(L"x", *result.begin());
    ASSERT_EQ(L"y", *result.rbegin());
}

Solution 46 - C++

We can use strtok in C++:

#include <iostream>
#include <cstring>
using namespace std;

int main()
{
    char str[]="Mickey M;12034;911416313;M;01a;9001;NULL;0;13;12;0;CPP,C;MSC,3D;FEND,BEND,SEC;";
    char *pch = strtok (str,";,");
    while (pch != NULL)
    {
        cout<<pch<<"\n";
        pch = strtok (NULL, ";,");
    }
    return 0;
}
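Because strtok overwrites the delimiters in its input, it cannot be applied to a std::string directly. One possible way around this, sketched here under the assumption that copying the input is acceptable, is to tokenize a private NUL-terminated copy. The name strtok_split is hypothetical. Note also that strtok keeps internal state between calls, so this sketch is not safe to use from several threads at once.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Tokenizes a std::string with std::strtok by working on a private,
// NUL-terminated copy, because strtok overwrites delimiters in its input.
std::vector<std::string> strtok_split(const std::string &s, const char *delims)
{
    std::vector<char> buffer(s.begin(), s.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    for (char *tok = std::strtok(buffer.data(), delims);
         tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        tokens.push_back(tok);   // copies the token out of the buffer
    }
    return tokens;
}
```

The original string is left untouched, and empty fields between adjacent delimiters are skipped, as is usual for strtok.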

Solution 47 - C++

This is my solution to this problem:

vector<string> get_tokens(string str) {
    vector<string> dt;
    stringstream ss;
    string tmp; 
    ss << str;
    // Test the extraction itself; looping on !ss.eof() can push a stale token.
    while (ss >> tmp) {
        dt.push_back(tmp);
    }
    return dt;
}

This function returns a vector of strings.
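A common pitfall with stream-based tokenizers is checking eof() before extracting instead of checking the extraction itself. The following sketch contrasts the two loop shapes; both function names are invented for this comparison. When the input ends in whitespace, the eof-driven loop runs once more, the extraction fails, and the previous token is pushed a second time.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Checks eof() before extracting: after the last word the stream may still
// hold trailing whitespace, so eof() is false, the next extraction fails,
// and the unchanged value of tmp is pushed again.
std::vector<std::string> tokens_eof_loop(const std::string &str)
{
    std::vector<std::string> out;
    std::istringstream ss(str);
    std::string tmp;
    while (!ss.eof()) {
        ss >> tmp;
        out.push_back(tmp);
    }
    return out;
}

// Uses the extraction itself as the loop condition, so only successful
// reads are stored.
std::vector<std::string> tokens_checked(const std::string &str)
{
    std::vector<std::string> out;
    std::istringstream ss(str);
    std::string tmp;
    while (ss >> tmp) {
        out.push_back(tmp);
    }
    return out;
}
```

For the input "a b " (with a trailing space) the first function returns "a", "b", "b" and the second returns "a", "b". For an empty input the first returns one empty string and the second returns nothing.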

Solution 48 - C++

Based on Galik's answer I made this. This is mostly here so I don't have to keep writing it again and again. It's crazy that C++ still doesn't have a native split function. Features:

  • Should be very fast.
  • Easy to understand (I think).
  • Merges empty sections.
  • Trivial to use several delimiters (e.g. "\r\n")

#include <string>
#include <vector>
#include <algorithm>

std::vector<std::string> split(const std::string& s, const std::string& delims)
{
	using namespace std;

	vector<string> v;

	// Start of an element.
	size_t elemStart = 0;

	// We start searching from the end of the previous element, which
	// initially is the start of the string.
	size_t elemEnd = 0;

	// Find the first non-delim, i.e. the start of an element, after the end of the previous element.
	while((elemStart = s.find_first_not_of(delims, elemEnd)) != string::npos)
	{
		// Find the first delim, i.e. the end of the element (or if this fails it is the end of the string).
		elemEnd = s.find_first_of(delims, elemStart);
		// Add it.
		v.emplace_back(s, elemStart, elemEnd == string::npos ? string::npos : elemEnd - elemStart);
	}
	// When there are no more non-delimiters, we are done.

	return v;
}

Solution 49 - C++

Quick version which uses vector as the base class, giving full access to all of its operators:

    // Split string into parts.
    class Split : public std::vector<std::string>
    {
        public:
            Split(const std::string& str, const char* delimList)
            {
               size_t lastPos = 0;
               size_t pos = str.find_first_of(delimList);

               while (pos != std::string::npos)
               {
                    if (pos != lastPos)
                        push_back(str.substr(lastPos, pos-lastPos));
                    lastPos = pos + 1;
                    pos = str.find_first_of(delimList, lastPos);
               }
               if (lastPos < str.length())
                   push_back(str.substr(lastPos, pos-lastPos));
            }
    };

Example used to populate an STL set:

std::set<std::string> words;
Split split("Hello,World", ",");
words.insert(split.begin(), split.end());

Solution 50 - C++

I use the following:

void split(string in, vector<string>& parts, char separator) {
    string::iterator ts, curr;
    ts = curr = in.begin();
    for(; curr <= in.end(); curr++ ) {
        if( (curr == in.end() || *curr == separator) && curr > ts )
            parts.push_back( string( ts, curr ));
        if( curr == in.end() )
            break;
        if( *curr == separator ) ts = curr + 1;
    }
}

PlasmaHH, I forgot to include the extra check( curr > ts) for removing tokens with whitespace.

Solution 51 - C++

This is a function I wrote that helps me a lot. It helped me when implementing a protocol for WebSockets.

#include <iostream>
#include <vector>
#include <sstream>
#include <string>
using namespace std;

vector<string> split ( string input , string split_id ) {
  vector<string> result;
  int i = 0;
  bool add;
  string temp;
  stringstream ss;
  size_t found;
  string real;
  int r = 0;
	while ( i != input.length() ) {
		add = false;
		ss << input.at(i);
		temp = ss.str();
		found = temp.find(split_id);
		if ( found != string::npos ) {
			add = true;
			real.append ( temp , 0 , found );
		} else if ( r > 0 &&  ( i+1 ) == input.length() ) {
			add = true;
			real.append ( temp , 0 , found );
		}
		if ( add ) {
			result.push_back(real);
			ss.str(string());
			ss.clear();
			temp.clear();
			real.clear();
			r = 0;
		}
		i++;
		r++;
	}
  return result;
}

int main() {
	string s = "S,o,m,e,w,h,e,r,e, down the road \n In a really big C++ house.  \n  Lives a little old lady.   \n   That no one ever knew.    \n    She comes outside.     \n     In the very hot sun.      \n\n\n\n\n\n\n\n   And throws C++ at us.    \n    The End.  FIN.";
	vector < string > Token;
	Token = split ( s , "," );
	for ( int i = 0 ; i < Token.size(); i++)	cout << Token.at(i) << endl;
	cout << endl << Token.size();
	int a;
	cin >> a;
	return a;
}

Solution 52 - C++

LazyStringSplitter:

#include <string>
#include <algorithm>
#include <unordered_set>

using namespace std;

class LazyStringSplitter
{
    string text;                 // owned copy of the input, so nothing dangles
    string::size_type start = 0; // index of the first unread character
    unordered_set<char> chop;

public:

    // Empty Constructor
    LazyStringSplitter()
    {}

    LazyStringSplitter (const string& cstr, const string& delims)
	    : text(cstr)
	    , chop(delims.begin(), delims.end())
    {}

    void operator () (const string& cstr, const string& delims)
    {
	    chop.insert(delims.begin(), delims.end());
	    text = cstr;
	    start = 0;
    }

    bool empty() const { return (start >= text.size()); }

    string next()
    {
	    // return empty string
	    // if ran out of characters
	    if (empty())
	        return string("");

	    auto first = text.cbegin() + start;
	    auto runner = find_if(first, text.cend(), [&](char c) {
	        return chop.count(c) == 1;
	    });

	    // construct next string
	    string ret(first, runner);
	    start = static_cast<string::size_type>(runner - text.cbegin());
	    if (runner != text.cend())
	        ++start; // step over the delimiter, but never past the end

	    // Skip empty tokens produced by consecutive delimiters
	    return !ret.empty() ? ret : next();
    }
};
  • I call this method the LazyStringSplitter because of one reason - It does not split the string in one go.
  • In essence it behaves like a python generator
  • It exposes a method called next which returns the next string that is split from the original
  • I made use of the unordered_set from c++11 STL, so that look up of delimiters is that much faster
  • And here is how it works

TEST PROGRAM

#include <iostream>
using namespace std;

int main()
{
    LazyStringSplitter splitter;

    // split at the characters ' ', '!', '.', ','
    splitter("This, is a string. And here is another string! Let's test and see how well this does.", " !.,");

    while (!splitter.empty())
	    cout << splitter.next() << endl;
    return 0;
}

OUTPUT

This
is
a
string
And
here
is
another
string
Let's
test
and
see
how
well
this
does

Next plan to improve this is to implement begin and end methods so that one can do something like:

vector<string> split_string(splitter.begin(), splitter.end());
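One way such begin and end support could be approached is a small input iterator that produces one token per increment. The sketch below is independent of the class above; TokenIterator and its members are hypothetical names, and it assumes the source string outlives every iterator that refers to it.

```cpp
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical input iterator that yields one non-empty token per increment.
// It stores a pointer to the source string, so that string must outlive it.
class TokenIterator
{
    const std::string *text;      // nullptr marks the end iterator
    std::string delims;
    std::string::size_type pos;   // where the next search starts
    std::string current;          // the token returned by operator*

    // Loads the next token, or turns *this into the end iterator.
    void advance()
    {
        const std::string::size_type first = text->find_first_not_of(delims, pos);
        if (first == std::string::npos) {
            text = nullptr;
            return;
        }
        const std::string::size_type last = text->find_first_of(delims, first);
        current = text->substr(first, last == std::string::npos ? last : last - first);
        pos = (last == std::string::npos) ? text->size() : last;
    }

public:
    typedef std::input_iterator_tag iterator_category;
    typedef std::string value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const std::string *pointer;
    typedef const std::string &reference;

    TokenIterator() : text(nullptr), pos(0) {}                 // end iterator
    TokenIterator(const std::string &s, const std::string &d)
        : text(&s), delims(d), pos(0) { advance(); }

    reference operator*() const { return current; }
    pointer operator->() const { return &current; }
    TokenIterator &operator++() { advance(); return *this; }
    TokenIterator operator++(int) { TokenIterator old(*this); advance(); return old; }

    // Equal if both are exhausted, or both read the same place in the same string.
    bool operator==(const TokenIterator &o) const
    {
        return text == o.text && (text == nullptr || pos == o.pos);
    }
    bool operator!=(const TokenIterator &o) const { return !(*this == o); }
};
```

Given a std::string text, a pair such as TokenIterator first(text, " ,."), last; can then be handed to the range constructor of std::vector<std::string>, and tokens are still produced lazily, one per increment.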

Solution 53 - C++

I've been searching for a way to split a string by a separator of any length, so I started writing it from scratch, as existing solutions didn't suit me.

Here is my little algorithm, using only STL:

//use like this
//std::vector<std::wstring> vec = Split<std::wstring> (L"Hello##world##!", L"##");

template <typename valueType>
static std::vector <valueType> Split (valueType text, const valueType& delimiter)
{
	std::vector <valueType> tokens;
	size_t pos = 0;
	valueType token;

	while ((pos = text.find(delimiter)) != valueType::npos) 
	{
		token = text.substr(0, pos);
		tokens.push_back (token);
		text.erase(0, pos + delimiter.length());
	}
	tokens.push_back (text);

	return tokens;
}

It can be used with separator of any length and form, as far as I've tested. Instantiate with either string or wstring type.

All the algorithm does is it searches for the delimiter, gets the part of the string that is up to the delimiter, deletes the delimiter and searches again until it finds it no more.

Of course, you can use any number of whitespaces for the delimiter.

I hope it helps.
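The find-and-erase loop described above shifts the remaining characters on every erase, and with an empty delimiter find always succeeds at position 0, so the loop would not terminate. A possible variant, sketched here with the made-up name split_by, walks forward with a start offset instead and treats an empty delimiter as "no split" (that choice is an assumption, not something the original answer specifies).

```cpp
#include <string>
#include <vector>

// Same tokens as repeated find + erase, but walks forward with an offset
// instead of erasing the front of the string on every iteration.
template <typename StringT>
std::vector<StringT> split_by(const StringT &text, const StringT &delimiter)
{
    std::vector<StringT> tokens;
    if (delimiter.empty()) {        // find("") would match at every offset,
        tokens.push_back(text);     // so return the input unsplit (assumption)
        return tokens;
    }

    typename StringT::size_type start = 0, pos;
    while ((pos = text.find(delimiter, start)) != StringT::npos) {
        tokens.push_back(text.substr(start, pos - start));
        start = pos + delimiter.length();
    }
    tokens.push_back(text.substr(start));   // the part after the last delimiter
    return tokens;
}
```

As in the original, empty tokens are preserved: split_by<std::string>("a,,b,", ",") gives "a", "", "b", "".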

Solution 54 - C++

No Boost, no string streams, just the standard C library cooperating together with std::string and std::list: C library functions for easy analysis, C++ data types for easy memory management.

Whitespace is considered to be any combination of newlines, tabs and spaces. The set of whitespace characters is established by the wschars variable.

#include <string>
#include <list>
#include <iostream>
#include <cstring>

using namespace std;

const char *wschars = "\t\n ";

list<string> split(const string &str)
{
  const char *cstr = str.c_str();
  list<string> out;

  while (*cstr) {                     // while remaining string not empty
    size_t toklen;
    cstr += strspn(cstr, wschars);    // skip leading whitespace
    toklen = strcspn(cstr, wschars);  // figure out token length
    if (toklen)                       // if we have a token, add to list
      out.push_back(string(cstr, toklen));
    cstr += toklen;                   // skip over token
  }

  // ran out of string; return list

  return out;
}

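int main(int argc, char **argv)
{
  if (argc < 2) {                     // argv[1] is only valid if an argument was given
    cerr << "usage: split <string>" << endl;
    return 1;
  }
  list<string> li = split(argv[1]);
  for (list<string>::iterator i = li.begin(); i != li.end(); i++)
    cout << "{" << *i << "}" << endl;
  return 0;
}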

Run:

$ ./split ""
$ ./split "a"
{a}
$ ./split " a "
{a}
$ ./split " a b"
{a}
{b}
$ ./split " a b c"
{a}
{b}
{c}
$ ./split " a b c d  "
{a}
{b}
{c}
{d}

Tail-recursive version of split (itself split into two functions). All destructive manipulation of variables is gone, except for the pushing of strings into the list!

void split_rec(const char *cstr, list<string> &li)
{
  if (*cstr) {
    const size_t leadsp = strspn(cstr, wschars);
    const size_t toklen = strcspn(cstr + leadsp, wschars);

    if (toklen)
      li.push_back(string(cstr + leadsp, toklen));

    split_rec(cstr + leadsp + toklen, li);
  }
}

list<string> split(const string &str)
{
  list<string> out;
  split_rec(str.c_str(), out);
  return out;
}

Solution 55 - C++

Here is my version

#include <string>
#include <vector>

inline std::vector<std::string> Split(const std::string &str, const std::string &delim = " ")
{
    std::vector<std::string> tokens;
    if (str.size() > 0)
    {
        if (delim.size() > 0)
        {
            std::string::size_type currPos = 0, prevPos = 0;
            while ((currPos = str.find(delim, prevPos)) != std::string::npos)
            {
                std::string item = str.substr(prevPos, currPos - prevPos);
                if (item.size() > 0)
                {
                    tokens.push_back(item);
                }
                // skip the whole delimiter, not just its first character
                prevPos = currPos + delim.size();
            }
            // the tail is empty when the string ends with a delimiter
            if (prevPos < str.size())
            {
                tokens.push_back(str.substr(prevPos));
            }
        }
        else
        {
            tokens.push_back(str);
        }
    }
    return tokens;
}

It works with multi-character delimiters. It prevents empty tokens from getting into your results. It needs only standard headers. It returns the string as one single token when you provide no delimiter. It also returns an empty result if the string is empty. It is unfortunately inefficient because of the huge std::vector copy UNLESS you are compiling using C++11, which should use move semantics. In C++11, this code should be fast.

Solution 56 - C++

Here's my entry:

template <typename Container, typename InputIter, typename ForwardIter>
Container
split(InputIter first, InputIter last,
      ForwardIter s_first, ForwardIter s_last)
{
    Container output;

    while (true) {
        auto pos = std::find_first_of(first, last, s_first, s_last);
        output.emplace_back(first, pos);
        if (pos == last) {
            break;
        }

        first = ++pos;
    }

    return output;
}

template <typename Output = std::vector<std::string>,
          typename Input = std::string,
          typename Delims = std::string>
Output
split(const Input& input, const Delims& delims = " ")
{
    using std::cbegin;
    using std::cend;
    return split<Output>(cbegin(input), cend(input),
                         cbegin(delims), cend(delims));
}

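// Pass a std::string: a raw literal would be deduced as a char array,
// and its terminating '\0' would end up in the last token.
auto vec = split(std::string("Mary had a little lamb"));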

The first definition is an STL-style generic function taking two pair of iterators. The second is a convenience function to save you having to do all the begin()s and end()s yourself. You can also specify the output container type as a template parameter if you wanted to use a list, for example.

What makes it elegant (IMO) is that unlike most of the other answers, it's not restricted to strings but will work with any STL-compatible container. Without any change to the code above, you can say:

using vec_of_vecs_t = std::vector<std::vector<int>>;

std::vector<int> v{1, 2, 0, 3, 4, 5, 0, 7, 8, 0, 9};
auto r = split<vec_of_vecs_t>(v, std::initializer_list<int>{0, 2});

which will split the vector v into separate vectors every time a 0 or a 2 is encountered.

(There's also the added bonus that with strings, this implementation is faster than both strtok()- and getline()-based versions, at least on my system.)

Solution 57 - C++

For those who need an alternative for splitting a string with a string delimiter, perhaps you can try the following solution.

std::vector<size_t> str_pos(const std::string &search, const std::string &target)
{
    std::vector<size_t> founds;

    if(!search.empty())
    {
        size_t start_pos = 0;
    
        while (true)
        {
            size_t found_pos = target.find(search, start_pos);
        
            if(found_pos != std::string::npos)
            {
                size_t found = found_pos;
            
                founds.push_back(found);
            
                start_pos = (found_pos + 1);
            }
            else
            {
                break;
            }
        }
    }

    return founds;
}

std::string str_sub_index(size_t begin_index, size_t end_index, const std::string &target)
{
    std::string sub;

    size_t size = target.length();

    const char* copy = target.c_str();

    for(size_t i = begin_index; i <= end_index; i++)
    {
        if(i >= size)
        {
            break;
        }
        else
        {
            char c = copy[i];
        
            sub += c;
        }
    }

    return sub;
}

std::vector<std::string> str_split(const std::string &delimiter, const std::string &target)
{
    std::vector<std::string> splits;

    if(!delimiter.empty())
    {
        std::vector<size_t> founds = str_pos(delimiter, target);
    
        size_t founds_size = founds.size();
    
        if(founds_size > 0)
        {
            size_t search_len = delimiter.length();
        
            size_t begin_index = 0;
        
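            for(size_t i = 0; i <= founds_size; i++)
            {
                std::string sub;
            
                if(i != founds_size)
                {
                    size_t pos  = founds.at(i);
                
                    // Guard against a delimiter at index 0: pos - 1 would wrap around
                    // (size_t is unsigned) and the whole string would be returned.
                    if(pos > begin_index)
                    {
                        sub = str_sub_index(begin_index, pos - 1, target);
                    }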
                
                    begin_index = (pos + search_len);
                }
                else
                {
                    sub = str_sub_index(begin_index, (target.length() - 1), target);
                }
            
                splits.push_back(sub);
            }
        }
    }

    return splits;
}

Those snippets consist of three functions. The bad news is that to use the str_split function you need the other two. Yes, it is a big chunk of code. The good news is that the two helper functions also work independently and can sometimes be useful on their own. :)

I tested the function in main() like this:

int main()
{
	std::string s = "Hello, world! We need to make the world a better place. Because your world is also my world, and our children's world.";

    std::vector<std::string> split = str_split("world", s);

    for(int i = 0; i < split.size(); i++)
    {
        std::cout << split[i] << std::endl;
    }
}

And it would produce:

Hello, 
! We need to make the 
 a better place. Because your 
 is also my 
, and our children's 
.

I believe that's not the most efficient code, but at least it works. Hope it helps.

Solution 58 - C++

Here's my take on this. I had to process the input string word by word. That could have been done by counting words at each space, but I felt that would be tedious, so I split the words into a vector instead.

#include<iostream>
#include<vector>
#include<string>
#include<stdio.h>
using namespace std;
int main()
{
    char x = '\0';
    string s = "";
    vector<string> q;
    x = getchar();
    while(x != '\n')
    {
        if(x == ' ')
        {
            q.push_back(s);
            s = "";
            x = getchar();
            continue;
        }
        s = s + x;
        x = getchar();
    }
    q.push_back(s);
    for(int i = 0; i<q.size(); i++)
        cout<<q[i]<<" ";
    return 0;
}
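
Known limitations:

  1. It doesn't take care of multiple spaces: every extra space produces an empty word.
  2. If the last word is not immediately followed by the newline character, the space between the last word and the newline produces an extra empty word at the end (other whitespace, such as a tab, stays attached to the word).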
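
One way around both limitations is to read the whole line first and let stream extraction do the splitting, because operator>> skips any run of whitespace. This is a different technique from the getchar() loop above, shown only as a minimal sketch; words_of_line and check_words_of_line are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Split one line into words. operator>> skips any run of whitespace, so
// repeated or trailing spaces never produce empty or padded words.
inline std::vector<std::string> words_of_line(const std::string& line)
{
    std::istringstream iss(line);
    std::vector<std::string> words;
    std::string word;
    while (iss >> word) {
        words.push_back(word);
    }
    return words;
}

// Small self-check with leading, repeated and trailing spaces.
inline void check_words_of_line()
{
    const std::vector<std::string> w = words_of_line("  one   two three  ");
    assert(w.size() == 3);
    assert(w[0] == "one");
    assert(w[2] == "three");
}
```

To use it on console input, read a line with std::getline(std::cin, line) and pass it to words_of_line(line).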

Solution 59 - C++

Yes, I looked through all 30 examples.

I couldn't find a version of split that works for multi-char delimiters, so here's mine:

#include <string>
#include <vector>

using namespace std;

vector<string> split(const string &str, const string &delim)
{   
    const auto delim_pos = str.find(delim);
    
    if (delim_pos == string::npos)
        return {str};
    
    vector<string> ret{str.substr(0, delim_pos)};
    auto tail = split(str.substr(delim_pos + delim.size(), string::npos), delim);
        
    ret.insert(ret.end(), tail.begin(), tail.end());
    
    return ret;
}

Probably not the most efficient of implementations, but it's a very straightforward recursive solution, using only <string> and <vector>.

Ah, it's written in C++11, but there's nothing special about this code, so you could easily adapt it to C++98.
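
For comparison, an iterative variant that sticks to C++98 features could look roughly like the sketch below. The names split_iter and check_split_iter are hypothetical, and treating an empty delimiter as "never matches" is an assumption of this sketch (the recursive version would not terminate in that case).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Iterative counterpart of the recursive split above (hypothetical name).
// Uses only C++98 features. An empty delimiter is treated as "no match",
// which is an assumption of this sketch.
inline std::vector<std::string> split_iter(const std::string& str,
                                           const std::string& delim)
{
    std::vector<std::string> ret;
    std::string::size_type start = 0;
    if (!delim.empty()) {
        std::string::size_type pos;
        while ((pos = str.find(delim, start)) != std::string::npos) {
            ret.push_back(str.substr(start, pos - start));
            start = pos + delim.size();
        }
    }
    ret.push_back(str.substr(start));  // tail after the last delimiter
    return ret;
}

// Small self-check: empty tokens between adjacent delimiters are kept.
inline void check_split_iter()
{
    const std::vector<std::string> r = split_iter("a--b----c", "--");
    assert(r.size() == 4);
    assert(r[0] == "a");
    assert(r[2].empty());
    assert(r[3] == "c");
}
```

Besides avoiding recursion, the loop does not copy the remaining tail of the string at every step.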

Solution 60 - C++

Loop on getline with ' ' as the token.
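
A minimal sketch of that idea follows. Skipping the empty pieces that consecutive spaces produce is a choice made here, not something the answer specifies, and split_on_space and check_split_on_space are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Read space-delimited pieces with std::getline. Consecutive spaces yield
// empty pieces, which this sketch chooses to skip.
inline std::vector<std::string> split_on_space(const std::string& s)
{
    std::istringstream iss(s);
    std::vector<std::string> words;
    std::string word;
    while (std::getline(iss, word, ' ')) {
        if (!word.empty()) {
            words.push_back(word);
        }
    }
    return words;
}

// Small self-check with a doubled space in the input.
inline void check_split_on_space()
{
    const std::vector<std::string> w = split_on_space("Somewhere down  the road");
    assert(w.size() == 4);
    assert(w[0] == "Somewhere");
    assert(w[3] == "road");
}
```

Note that only the space character is a delimiter here, so tabs and newlines stay inside the words.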

Solution 61 - C++

I believe no one has posted this solution yet. Instead of using delimiters directly, it basically does the same as boost::split(), i.e., it allows you to pass a predicate that returns true if a char is a delimiter, and false otherwise. I think this gives the programmer a lot more control, and the great thing is you don't need boost.

template <class Container, class String, class Predicate>
void split(Container& output, const String& input,
           const Predicate& pred, bool trimEmpty = false) {
    auto it = begin(input);
    auto itLast = it;
    while (it = find_if(it, end(input), pred), it != end(input)) {
        if (not (trimEmpty and it == itLast)) {
            output.emplace_back(itLast, it);
        }
        ++it;
        itLast = it;
    }
    // Emit the final token, i.e. whatever follows the last delimiter.
    if (not (trimEmpty and itLast == end(input))) {
        output.emplace_back(itLast, end(input));
    }
}

Then you can use it like this:

struct Delim {
    bool operator()(char c) {
        return not isalpha(c);
    }
};    

int main() {
    string s("#include<iostream>\n"
             "int main() { std::cout << \"Hello world!\" << std::endl; }");

    vector<string> v;

    split(v, s, Delim(), true);
    /* Which is also the same as */
    split(v, s, [](char c) { return not isalpha(c); }, true);

    for (const auto& i : v) {
        cout << i << endl;
    }
}

Solution 62 - C++

I have just written an example of how to split a char array by a symbol, placing each run of chars (the words separated by your symbol) into a vector. For simplicity, I made the vector's element type std::string.

I hope this helps and is readable to you.

#include <cstring>   // std::strlen
#include <vector>
#include <string>
#include <iostream>

void push(std::vector<std::string> &WORDS, std::string &TMP){
    WORDS.push_back(TMP);
    TMP = "";
}
std::vector<std::string> mySplit(const char STRING[]){
    std::vector<std::string> words;
    std::string s;
    // Compute the length once; an unsigned short index would overflow on long input.
    const std::size_t len = std::strlen(STRING);
    for(std::size_t i = 0; i < len; i++){
        if(STRING[i] != ' '){
            s += STRING[i];
        }else{
            push(words, s);
        }
    }
    push(words, s);//Used to get last split
    return words;
}

int main(){
	char string[] = "My awesome string.";
	std::cout << mySplit(string)[2];
    std::cin.get();
    return 0;
}

Solution 63 - C++

// adapted from a "regular" csv parse
std::string stringIn = "my csv  is 10233478 NOTseparated by commas";
std::vector<std::string> commaSeparated(1);
int commaCounter = 0;
for (std::size_t i=0; i<stringIn.size(); i++) {
	if (stringIn[i] == ' ') { // compare against a char literal, not the string " "
		commaSeparated.push_back("");
		commaCounter++;
	} else {
		commaSeparated.at(commaCounter) += stringIn[i];
	}
}

In the end you will have a vector of strings, one for each space-separated element of the sentence. The only container needed is std::vector (since a std::string is already involved, I figured that would be acceptable).

Empty strings are saved as separate items.

Solution 64 - C++

#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main() {
  string str = "ABC AABCD CDDD RABC GHTTYU FR";
  str += " "; //dirty hack: adding extra space to the end
  vector<string> v;

  for (int i=0; i<(int)str.size(); i++) {
    int a, b;
    a = i;
    
    for (int j=i; j<(int)str.size(); j++) {
      if (str[j] == ' ') {
        b = j;
        i = j;
        break;
      }
    }
    v.push_back(str.substr(a, b-a));
  }

  for (int i=0; i<v.size(); i++) {
    cout<<v[i].size()<<" "<<v[i]<<endl;
  }
  return 0;
}

Solution 65 - C++

Just for convenience:

template<class V, typename T>
bool in(const V &v, const T &el) {
    return std::find(v.begin(), v.end(), el) != v.end();
}

The actual splitting based on multiple delimiters:

std::vector<std::string> split(const std::string &s,
                               const std::vector<char> &delims) {
    std::vector<std::string> res;
    auto stuff = [&delims](char c) { return !in(delims, c); };
    auto space = [&delims](char c) { return in(delims, c); };
    auto first = std::find_if(s.begin(), s.end(), stuff);
    while (first != s.end()) {
        auto last = std::find_if(first, s.end(), space);
        res.push_back(std::string(first, last));
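        // Resume at last itself: last + 1 would step past s.end() when the
        // string does not end with a delimiter.
        first = std::find_if(last, s.end(), stuff);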
    }
    return res;
}

The usage:

int main() {
    std::string s = "   aaa,  bb  cc ";
    for (auto el: split(s, {' ', ','}))
        std::cout << el << std::endl;
    return 0;
}

Solution 66 - C++

My approach is very different from the other solutions. It offers a lot of value in areas where the other solutions are variously lacking, but of course it also has its own downsides. Here is the working implementation; the motivating example is putting <tag></tag> around words.

For a start, this problem can be solved with one loop, no additional memory, and by considering merely four logical cases. Conceptually, we're interested in boundaries. Our code should reflect that: let's iterate through the string and look at two characters at a time, bearing in mind that we have special cases at the start and end of the string.

The downside is that we have to write the implementation, which is somewhat verbose, but mostly convenient boilerplate.

The upside is that we wrote the implementation, so it is very easy to customize it to specific needs, such as distinguishing left and right word boundaries, using any set of delimiters, or handling other cases such as non-boundary or erroneous positions.

#include <iostream>
#include <string>

#include <cctype>

using namespace std;

typedef enum boundary_type_e {
    E_BOUNDARY_TYPE_ERROR = -1,
    E_BOUNDARY_TYPE_NONE,
    E_BOUNDARY_TYPE_LEFT,
    E_BOUNDARY_TYPE_RIGHT,
} boundary_type_t;

typedef struct boundary_s {
    boundary_type_t type;
    int pos;
} boundary_t;

bool is_delim_char(int c) {
    return isspace(c); // also compare against any other chars you want to use as delimiters
}

bool is_word_char(int c) {
    return ' ' <= c && c <= '~' && !is_delim_char(c);
}

// Note: C99-style compound literals with designated initializers are a compiler
// extension in C++, so plain aggregate initialization {type, pos} is used here.
boundary_t maybe_word_boundary(const string &str, int pos) {
    int len = str.length();
    if (pos < 0 || pos >= len) {
        return boundary_t{E_BOUNDARY_TYPE_ERROR, 0};
    } else {
        if (pos == 0 && is_word_char(str[pos])) {
            // if the first character is word-y, we have a left boundary at the beginning
            return boundary_t{E_BOUNDARY_TYPE_LEFT, pos};
        } else if (pos == len - 1 && is_word_char(str[pos])) {
            // if the last character is word-y, we have a right boundary left of the null terminator
            return boundary_t{E_BOUNDARY_TYPE_RIGHT, pos + 1};
        } else if (!is_word_char(str[pos]) && is_word_char(str[pos + 1])) {
            // if we have a delimiter followed by a word char, we have a left boundary left of the word char
            return boundary_t{E_BOUNDARY_TYPE_LEFT, pos + 1};
        } else if (is_word_char(str[pos]) && !is_word_char(str[pos + 1])) {
            // if we have a word char followed by a delimiter, we have a right boundary right of the word char
            return boundary_t{E_BOUNDARY_TYPE_RIGHT, pos + 1};
        }
        return boundary_t{E_BOUNDARY_TYPE_NONE, 0};
    }
}

int main() {
    string str;
    getline(cin, str);

    int len = str.length();
    for (int i = 0; i < len; i++) {
        boundary_t boundary = maybe_word_boundary(str, i);
        if (boundary.type == E_BOUNDARY_TYPE_LEFT) {
            // whatever
        } else if (boundary.type == E_BOUNDARY_TYPE_RIGHT) {
            // whatever
        }
    }
}

As you can see, the code is very simple to understand and fine tune, and the actual usage of the code is very short and simple. Using C++ should not stop us from writing the simplest and most readily customized code possible, even if that means not using the STL. I would think this is an instance of what Linus Torvalds might call "taste", since we have eliminated all the logic we don't need while writing in a style that naturally allows more cases to be handled when and if the need to handle them arises.

What could improve this code might be the use of enum class, accepting a function pointer to is_word_char in maybe_word_boundary instead of invoking is_word_char directly, and passing a lambda.
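
A rough sketch of those improvements follows, under a few assumptions. It is not the code above but a restructured variant that reports all boundaries in one pass, and BoundaryType, Boundary, word_boundaries and check_word_boundaries are hypothetical names. The word test is passed in as a callable, so a lambda or a function pointer both work.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class BoundaryType { Left, Right };

struct Boundary {
    BoundaryType type;
    std::size_t pos;   // index of the first character to the right of the boundary
};

// Report every word boundary in str in a single pass. is_word is any callable
// (function pointer, functor or lambda) that takes a char and returns bool.
template <typename IsWord>
std::vector<Boundary> word_boundaries(const std::string& str, IsWord is_word)
{
    std::vector<Boundary> out;
    bool prev = false;  // treat the position before the string as a non-word
    for (std::size_t i = 0; i <= str.size(); ++i) {
        // i == str.size() stands for the position after the last character
        const bool cur = i < str.size() && is_word(str[i]);
        if (!prev && cur) {
            out.push_back(Boundary{BoundaryType::Left, i});
        } else if (prev && !cur) {
            out.push_back(Boundary{BoundaryType::Right, i});
        }
        prev = cur;
    }
    return out;
}

// Small self-check: "a bc" has two words, hence four boundaries.
inline void check_word_boundaries()
{
    const auto not_space = [](char c) {
        return std::isspace(static_cast<unsigned char>(c)) == 0;
    };
    const std::vector<Boundary> b = word_boundaries("a bc", not_space);
    assert(b.size() == 4);
    assert(b[0].type == BoundaryType::Left && b[0].pos == 0);
    assert(b[3].type == BoundaryType::Right && b[3].pos == 4);
}
```

Each Left/Right pair delimits one word, so substr(left.pos, right.pos - left.pos) recovers it if needed.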

Solution 67 - C++

C++17 version without any memory allocation (except maybe for std::function):

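#include <cctype>      // isspace
#include <functional>  // std::function
#include <iostream>
#include <string_view>

void iter_words(const std::string_view& input, const std::function<void(std::string_view)>& process_word) {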

    auto itr = input.begin();

    auto consume_whitespace = [&]() {
        for(; itr != input.end(); ++itr) {
            if(!isspace(*itr))
                return;
        }
    };

    auto consume_letters = [&]() {
        for(; itr != input.end(); ++itr) {
            if(isspace(*itr))
                return;
        }
    };

    while(true) {
        consume_whitespace();
        if(itr == input.end())
            return;
        auto word_start = itr - input.begin();
        consume_letters();
        auto word_end = itr - input.begin();
        process_word(input.substr(word_start, word_end - word_start));
    }
}

int main() {
    iter_words("foo bar", [](std::string_view sv) {
        std::cout << "Got word: " <<  sv << '\n';
    });
    return 0;
}
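
If the possible allocation inside std::function is a concern, one option is to take the callback as a template parameter instead. The following is a minimal sketch of that variation, not a drop-in replacement; for_each_word and check_for_each_word are hypothetical names.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Call process_word(std::string_view) for each whitespace-separated word.
// The callback is a template parameter, so no std::function is involved.
template <typename Callback>
void for_each_word(std::string_view input, Callback&& process_word)
{
    std::size_t i = 0;
    while (i < input.size()) {
        // Skip leading whitespace.
        while (i < input.size() && std::isspace(static_cast<unsigned char>(input[i]))) {
            ++i;
        }
        if (i == input.size()) {
            return;
        }
        const std::size_t start = i;
        // Advance to the end of the word.
        while (i < input.size() && !std::isspace(static_cast<unsigned char>(input[i]))) {
            ++i;
        }
        process_word(input.substr(start, i - start));
    }
}

// Small self-check; the vector is only used to record what was seen.
inline void check_for_each_word()
{
    std::vector<std::string> seen;
    for_each_word(" foo  bar ", [&seen](std::string_view w) {
        seen.emplace_back(w);
    });
    assert(seen.size() == 2);
    assert(seen[0] == "foo" && seen[1] == "bar");
}
```

The views passed to the callback point into the original input, so they are only valid while that input is alive.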

Solution 68 - C++

A minimal solution is a function which takes as input a std::string and a set of delimiter characters (as a std::string), and returns a std::vector of std::strings.

#include <string>
#include <vector>

std::vector<std::string>
tokenize(const std::string& str, const std::string& delimiters)
{
  using ssize_t = std::string::size_type;
  const ssize_t str_ln = str.length();
  ssize_t last_pos = 0;

  // container for the extracted tokens
  std::vector<std::string> tokens;

  while (last_pos < str_ln) {
      // find the position of the next delimiter
      ssize_t pos = str.find_first_of(delimiters, last_pos);

      // if no delimiters found, set the position to the length of string
      if (pos == std::string::npos)
	     pos = str_ln;

      // if the substring is nonempty, store it in the container
      if (pos != last_pos)
	     tokens.emplace_back(str.substr(last_pos, pos - last_pos));

      // scan past the previous substring
      last_pos = pos + 1;
  }

  return tokens;
}

A usage example:

#include <iostream>

int main()
{
    std::string input_str = "one + two * (three - four)!!---! ";
    const char* delimiters = "! +- (*)";
    std::vector<std::string> tokens = tokenize(input_str, delimiters);

    std::cout << "input = '" << input_str << "'\n"
              << "delimiters = '" << delimiters << "'\n"
              << "nr of tokens found = " << tokens.size() << std::endl;
    for (const std::string& tk : tokens) {
        std::cout << "token = '" << tk << "'\n";
    }

  return 0;
}

Solution 69 - C++

My code is:

#include <list>
#include <string>
template<class StringType = std::string, class ContainerType = std::list<StringType> >
class DSplitString:public ContainerType
{
public:
    explicit DSplitString(const StringType& strString, char cChar, bool bSkipEmptyParts = true)
    {
        size_t iPos = 0;
        size_t iPos_char = 0;
        while(StringType::npos != (iPos_char = strString.find(cChar, iPos)))
        {
            AddPart(strString.substr(iPos, iPos_char - iPos), bSkipEmptyParts);
            iPos = iPos_char + 1;
        }
        AddPart(strString.substr(iPos), bSkipEmptyParts); // part after the last delimiter
    }
    explicit DSplitString(const StringType& strString, const StringType& strSub, bool bSkipEmptyParts = true)
    {
        size_t iPos = 0;
        size_t iPos_char = 0;
        while(StringType::npos != (iPos_char = strString.find(strSub, iPos)))
        {
            AddPart(strString.substr(iPos, iPos_char - iPos), bSkipEmptyParts);
            iPos = iPos_char + strSub.length();
        }
        AddPart(strString.substr(iPos), bSkipEmptyParts); // part after the last delimiter
    }
private:
    void AddPart(const StringType& strPart, bool bSkipEmptyParts)
    {
        // this-> is needed because push_back lives in a dependent base class.
        if(!bSkipEmptyParts || !strPart.empty())
            this->push_back(strPart);
    }
};

Example:

#include <iostream>
#include <string>
int main()
{
    DSplitString<> aa("doicanhden1;doicanhden2;doicanhden3;", ';');
    // Standard range-based for instead of the MSVC-only "for each ... in".
    for (const std::string& var : aa)
    {
        std::cout << var << std::endl;
    }
    std::cin.get();
    return 0;
}

Solution 70 - C++

My implementation can be an alternative solution:

std::vector<std::wstring> SplitString(const std::wstring & String, const std::wstring & Seperator)
{
	std::vector<std::wstring> Lines;
	size_t stSearchPos = 0;
	size_t stFoundPos;
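	// Compare against size(), not size() - 1: the latter drops a one-character
	// last part and wraps around for an empty string.
	while (stSearchPos < String.size())
	{
		stFoundPos = String.find(Seperator, stSearchPos);
		stFoundPos = (stFoundPos == std::wstring::npos) ? String.size() : stFoundPos;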
		Lines.push_back(String.substr(stSearchPos, stFoundPos - stSearchPos));
		stSearchPos = stFoundPos + Seperator.size();
	}
	return Lines;
}

Test code:

std::wstring MyString(L"Part 1SEPsecond partSEPlast partSEPend");
std::vector<std::wstring> Parts = IniFile::SplitString(MyString, L"SEP");
std::wcout << L"The string: " << MyString << std::endl;
for (std::vector<std::wstring>::const_iterator it=Parts.begin(); it<Parts.end(); ++it)
{
	std::wcout << *it << L"<---" << std::endl;
}
std::wcout << std::endl;
MyString = L"this,time,a,comma separated,string";
std::wcout << L"The string: " << MyString << std::endl;
Parts = IniFile::SplitString(MyString, L",");
for (std::vector<std::wstring>::const_iterator it=Parts.begin(); it<Parts.end(); ++it)
{
	std::wcout << *it << L"<---" << std::endl;
}

Output of the test code:

The string: Part 1SEPsecond partSEPlast partSEPend
Part 1<---
second part<---
last part<---
end<---

The string: this,time,a,comma separated,string
this<---
time<---
a<---
comma separated<---
string<---

Solution 71 - C++

Very late to the party here, I know, but I was thinking about the most elegant way of doing this if you were given a range of delimiters rather than whitespace, using nothing more than the standard library.

Here are my thoughts:

To split words into a string vector by a sequence of delimiters:

template<class Container>
std::vector<std::string> split_by_delimiters(const std::string& input, const Container& delimiters)
{
    std::vector<std::string> result;

    for (auto current = begin(input) ; current != end(input) ; )
    {
        auto first = find_if(current, end(input), not_in(delimiters));
        if (first == end(input)) break;
        auto last = find_if(first, end(input), is_in(delimiters));
        result.emplace_back(first, last);
        current = last;
    }
    return result;
}

To split the other way, by providing a sequence of valid characters:

template<class Container>
std::vector<std::string> split_by_valid_chars(const std::string& input, const Container& valid_chars)
{
    std::vector<std::string> result;

    for (auto current = begin(input) ; current != end(input) ; )
    {
        auto first = find_if(current, end(input), is_in(valid_chars));
        if (first == end(input)) break;
        auto last = find_if(first, end(input), not_in(valid_chars));
        result.emplace_back(first, last);
        current = last;
    }
    return result;
}

is_in and not_in are defined as follows (when compiling, these definitions have to come before the two functions above):

namespace detail {
    template<class Container>
    struct is_in {
        is_in(const Container& charset)
        : _charset(charset)
        {}

        bool operator()(char c) const
        {
            return find(begin(_charset), end(_charset), c) != end(_charset);
        }

        const Container& _charset;
    };

    template<class Container>
    struct not_in {
        not_in(const Container& charset)
        : _charset(charset)
        {}

        bool operator()(char c) const
        {
            return find(begin(_charset), end(_charset), c) == end(_charset);
        }

        const Container& _charset;
    };

}

template<class Container>
detail::not_in<Container> not_in(const Container& c)
{
    return detail::not_in<Container>(c);
}

template<class Container>
detail::is_in<Container> is_in(const Container& c)
{
    return detail::is_in<Container>(c);
}

Solution 72 - C++

Thank you, @Jairo Abdiel Toribio Cisneros. It works for me, but your function returns some empty elements. To get a result without them, I edited it as follows:

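std::vector<std::string> split(std::string str, const char* delim) {
    std::vector<std::string> v;
    std::string tmp;

    // Only the first character of delim is used as the delimiter.
    for(std::string::const_iterator i = str.begin(); i != str.end(); ++i) {
        if(*i != *delim) {
            tmp += *i;
        } else {
            if (tmp.length() > 0) {
                v.push_back(tmp);
            }
            tmp = "";
        }
    }
    // Flush the last token; the loop no longer dereferences str.end().
    if (tmp.length() > 0) {
        v.push_back(tmp);
    }

    return v;
}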

Using:

std::string s = "one:two::three";
std::string delim = ":";
std::vector<std::string> vv = split(s, delim.c_str());

Solution 73 - C++

If you want to split a string by several characters, you can use:

#include<iostream>
#include<string>
#include<vector>
#include<iterator>
#include<sstream>

using namespace std;
void replaceOtherChars(string &input, vector<char> &dividers)
{
	const char divider = dividers.at(0);
	int replaceIndex = 0;
	vector<char>::iterator it_begin = dividers.begin()+1,
		it_end= dividers.end();
	for(;it_begin!=it_end;++it_begin)
	{
		replaceIndex = 0;
		while(true)
		{
			replaceIndex=input.find_first_of(*it_begin,replaceIndex);
			if(replaceIndex==-1)
				break;
			input.at(replaceIndex)=divider;
		}
	}
}
vector<string> split(string str, vector<char> chars, bool missEmptySpace =true )
{
	vector<string> result;
	const char divider = chars.at(0);
	replaceOtherChars(str,chars);
	stringstream stream;
	stream<<str;	
    string temp;
	while(getline(stream,temp,divider))
	{
		if(missEmptySpace && temp.empty())
			continue;
		result.push_back(temp);
	}
	return result;
}
int main()
{
	string str ="milk, pigs.... hot-dogs ";
	vector<char> arr;
	arr.push_back(' ');	arr.push_back(',');	arr.push_back('.');
	vector<string> result = split(str,arr);
	vector<string>::iterator it_begin= result.begin(),
		it_end= result.end();
	for(;it_begin!=it_end;++it_begin)
	{
		cout<<*it_begin<<endl;
	}
return 0;
}

Solution 74 - C++

This is an extension of one of the top answers. It now supports setting a maximum number of returned elements, N. The last bit of the string will end up in the Nth element. The MAXELEMENTS parameter is optional; if left at its default of 0, an unlimited number of elements is returned. :-)

.h:

class Myneatclass {
public:
	static std::vector<std::string>& split(const std::string &s, char delim, std::vector<std::string> &elems, const size_t MAXELEMENTS = 0);
	static std::vector<std::string> split(const std::string &s, char delim, const size_t MAXELEMENTS = 0);
};

.cpp:

std::vector<std::string>& Myneatclass::split(const std::string &s, char delim, std::vector<std::string> &elems, const size_t MAXELEMENTS) {
	std::stringstream ss(s);
	std::string item;
	while (std::getline(ss, item, delim)) {
		elems.push_back(item);
		if (MAXELEMENTS > 0 && !ss.eof() && elems.size() + 1 >= MAXELEMENTS) {
			std::getline(ss, item);
			elems.push_back(item);
			break;
		}
	}
	return elems;
}
std::vector<std::string> Myneatclass::split(const std::string &s, char delim, const size_t MAXELEMENTS) {
	std::vector<std::string> elems;
	split(s, delim, elems, MAXELEMENTS);
	return elems;
}
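
To illustrate the idea of a capped split in isolation, here is a minimal, self-contained sketch built on std::string::find rather than on the class above. The names split_limit and check_split_limit are hypothetical. Two behaviours are assumptions of this sketch and differ from the getline-based code: a trailing empty field is kept, and a limit of 1 returns the whole input as a single element.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split s on delim into at most max_parts pieces; 0 means "no limit".
// Once the limit is reached, the rest of the string (delimiters included)
// becomes the last piece.
inline std::vector<std::string> split_limit(const std::string& s, char delim,
                                            std::size_t max_parts = 0)
{
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (max_parts == 0 || parts.size() + 1 < max_parts) {
        const std::string::size_type pos = s.find(delim, start);
        if (pos == std::string::npos) {
            break;
        }
        parts.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
    parts.push_back(s.substr(start));  // remainder of the input
    return parts;
}

// Small self-check: with a limit of 2 the tail stays unsplit.
inline void check_split_limit()
{
    const std::vector<std::string> parts = split_limit("a:b:c:d", ':', 2);
    assert(parts.size() == 2);
    assert(parts[0] == "a");
    assert(parts[1] == "b:c:d");
}
```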

Solution 75 - C++

My general implementation for string and u32string, using the boost::algorithm::split signature:

template<typename CharT, typename UnaryPredicate>
void split(std::vector<std::basic_string<CharT>>& split_result,
           const std::basic_string<CharT>& s,
           UnaryPredicate predicate)
{
    using ST = std::basic_string<CharT>;
    using std::swap;
    std::vector<ST> tmp_result;
    auto iter = s.cbegin(),
         end_iter = s.cend();
    while (true)
    {
        /**
         * edge case: empty str -> push an empty str and exit.
         */
        auto find_iter = find_if(iter, end_iter, predicate);
        tmp_result.emplace_back(iter, find_iter);
        if (find_iter == end_iter) { break; }
        iter = ++find_iter; 
    }
    swap(tmp_result, split_result);
}


template<typename CharT>
void split(std::vector<std::basic_string<CharT>>& split_result,
           const std::basic_string<CharT>& s,
           const std::basic_string<CharT>& char_candidate)
{
    std::unordered_set<CharT> candidate_set(char_candidate.cbegin(),
                                            char_candidate.cend());
    auto predicate = [&candidate_set](const CharT& c) {
        return candidate_set.count(c) > 0U;
    };
    return split(split_result, s, predicate);
}

template<typename CharT>
void split(std::vector<std::basic_string<CharT>>& split_result,
           const std::basic_string<CharT>& s,
           const CharT* literals)
{
    return split(split_result, s, std::basic_string<CharT>(literals));
}

Solution 76 - C++

#include <iostream>
#include <string>
#include <deque>

std::deque<std::string> split(
	const std::string& line, 
	std::string::value_type delimiter,
	bool skipEmpty = false
) {
	std::deque<std::string> parts{};

	if (!skipEmpty && !line.empty() && delimiter == line.at(0)) {
		parts.push_back({});
	}

	for (const std::string::value_type& c : line) {
		if (
			(
				c == delimiter 
				&&
				(skipEmpty ? (!parts.empty() && !parts.back().empty()) : true)
			)
			||
			(c != delimiter && parts.empty())
		) {
			parts.push_back({});
		}

		if (c != delimiter) {
			parts.back().push_back(c);
		}
	}

	if (skipEmpty && !parts.empty() && parts.back().empty()) {
		parts.pop_back();
	}

	return parts;
}

void test(const std::string& line) {
	std::cout << line << std::endl;
	
	std::cout << "skipEmpty=0 |";
	for (const std::string& part : split(line, ':')) {
		std::cout << part << '|';
	}
	std::cout << std::endl;

	std::cout << "skipEmpty=1 |";
	for (const std::string& part : split(line, ':', true)) {
		std::cout << part << '|';
	}
	std::cout << std::endl;

	std::cout << std::endl;
}

int main() {
	test("foo:bar:::baz");
	test("");
	test("foo");
	test(":");
	test("::");
	test(":foo");
	test("::foo");
	test(":foo:");
	test(":foo::");

	return 0;
}

Output:

foo:bar:::baz
skipEmpty=0 |foo|bar|||baz|
skipEmpty=1 |foo|bar|baz|


skipEmpty=0 |
skipEmpty=1 |

foo
skipEmpty=0 |foo|
skipEmpty=1 |foo|

:
skipEmpty=0 |||
skipEmpty=1 |

::
skipEmpty=0 ||||
skipEmpty=1 |

:foo
skipEmpty=0 ||foo|
skipEmpty=1 |foo|

::foo
skipEmpty=0 |||foo|
skipEmpty=1 |foo|

:foo:
skipEmpty=0 ||foo||
skipEmpty=1 |foo|

:foo::
skipEmpty=0 ||foo|||
skipEmpty=1 |foo|

Solution 77 - C++

There's a way easier method to do this!!

#include <iostream>
#include <vector>
#include <string>
std::vector<std::string> splitby(std::string string, char splitter) {
	std::vector<std::string> result = {};
	std::string locresult = "";
	for (unsigned int i = 0; i < string.size(); i++) {
		if ((char)string.at(i) != splitter) {
			locresult += string.at(i);
		}
		else {
			result.push_back(locresult);
			locresult = "";
		}
	}
	// Whatever follows the last splitter (the whole input if there is none).
	result.push_back(locresult);
	return result;
}

void printvector(std::vector<std::string> v) {
	std::cout << '{';
	for (unsigned int i = 0; i < v.size(); i++) {
		if (i < v.size() - 1) {
			std::cout << '"' << v.at(i) << "\",";
		}
		else {
			std::cout << '"' << v.at(i) << "\"";
		}
	}
	std::cout << "}\n";
}

Solution 78 - C++

Here's my approach, cut and split:

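string cut (string& str, const string& del)
{
    string f = str;
    const string::size_type pos = str.find(del);

    if (pos != string::npos)
    {
        f = str.substr(0, pos);
        str = str.substr(pos + del.length());
    }
    else
    {
        str.clear(); // no delimiter left: consume the rest so the caller's loop ends
    }

    return f;
}

vector<string> split (const string& in, const string& del=" ")
{
    vector<string> out; // "vector<string> out();" would declare a function
    string t = in;

    if (del.empty()) // an empty delimiter would never consume any input
    {
        out.push_back(in);
        return out;
    }

    while (!t.empty())
        out.push_back(cut(t,del));

    return out;
}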

BTW, let me know if there's something I can do to optimize this.
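
One possible optimization, sketched here under the assumption that C++17 is available, is to avoid copying altogether by returning std::string_view tokens that point into the original string. The names split_views and check_split_views are hypothetical, and returning the whole input for an empty delimiter is a choice made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Split `in` on every occurrence of the whole delimiter string `del`.
// Tokens are views into `in`, so no characters are copied; the caller must
// keep the underlying buffer alive while using the result.
inline std::vector<std::string_view> split_views(std::string_view in,
                                                 std::string_view del = " ")
{
    std::vector<std::string_view> out;
    if (del.empty()) {        // nothing to search for: return the input whole
        out.push_back(in);
        return out;
    }
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = in.find(del, start);
        if (pos == std::string_view::npos) {
            break;
        }
        out.push_back(in.substr(start, pos - start));
        start = pos + del.size();
    }
    out.push_back(in.substr(start));  // tail after the last delimiter
    return out;
}

// Small self-check; `text` outlives the views that refer to it.
inline void check_split_views()
{
    const std::string text = "a b c";
    const std::vector<std::string_view> v = split_views(text);
    assert(v.size() == 3);
    assert(v[0] == "a");
    assert(v[2] == "c");
}
```

The trade-off is lifetime management: the views dangle as soon as the source string is destroyed or modified.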

Solution 79 - C++

Not that we need more answers, but this is what I came up with after being inspired by Evan Teran.

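// A template parameter replaces "auto delimiter", which is only valid from C++20 on.
template <typename Delimiter>
std::vector <std::string> split(const std::string &input, Delimiter delimiter, bool skipEmpty=true) {
  /*
  Splits a string at each delimiter and returns these strings as a string vector.
  If the delimiter is not found then nothing is returned.
  If skipEmpty is true then strings between delimiters that are 0 in length will be skipped.
  The delimiter is expected to be a single character, because the search resumes one position after each match.
  */
  bool delimiterFound = false;
  std::string::size_type pos=0, pPos=0; // same type as find() and npos, so no narrowing to int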
  std::vector <std::string> result;
  while (true) {
    pos = input.find(delimiter,pPos);
    if (pos != std::string::npos) {
      if (skipEmpty==false or pos-pPos > 0) // if empty values are to be kept or not
        result.push_back(input.substr(pPos,pos-pPos));
      delimiterFound = true;
    } else {
      if (pPos < input.length() and delimiterFound) {
        if (skipEmpty==false or input.length()-pPos > 0) // if empty values are to be kept or not
          result.push_back(input.substr(pPos,input.length()-pPos));
      }
      break;
    }
    pPos = pos+1;
  }
  return result;
}

Solution 80 - C++

#include <iostream>
#include <string>
#include <sstream>
#include <algorithm>
#include <iterator>
#include <vector>

int main() {
    using namespace std;
    string sentence = "10 20 30 40 5 6 7 8";
    istringstream iss(sentence);

    vector<string> tokens;
    copy(istream_iterator<string>(iss),
         istream_iterator<string>(),
         back_inserter(tokens));

    // Iterate over however many tokens were read instead of a hard-coded count.
    for (size_t i = 0; i < tokens.size(); i++) {
        cout << tokens.at(i);
    }
}

Solution 81 - C++

void splitString(string str, char delim, string array[], const int arraySize)
{
    string::size_type subStrStart = 0;

    // Stop when the output array is full or the input is exhausted.
    for (int index = 0; index < arraySize; index++)
    {
        string::size_type delimPosition = str.find(delim, subStrStart);
        if (delimPosition == string::npos)
        {
            array[index] = str.substr(subStrStart); // last piece
            break;
        }
        array[index] = str.substr(subStrStart, delimPosition - subStrStart);
        subStrStart = delimPosition + 1;
    }
}

Solution 82 - C++

For a ridiculously large and probably redundant version, try a lot of for loops.

string stringlist[10];
int count = 0;

for (int i = 0; i < sequence.length(); i++)
{
	if (sequence[i] == ' ')
	{
		stringlist[count] = sequence.substr(0, i);
		sequence.erase(0, i+1);
		i = -1;	// the loop's i++ brings this back to 0, so the first character is checked too
		count++;
	}
	else if (i == sequence.length()-1)	// Last word
	{
		stringlist[count] = sequence.substr(0, i+1);
	}
}

It isn't pretty, but by and large (barring punctuation and a slew of other bugs) it works!

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Ashwin Nanjappa | View Question on Stackoverflow
Solution 1 - C++ | Evan Teran | View Answer on Stackoverflow
Solution 2 - C++ | Zunino | View Answer on Stackoverflow
Solution 3 - C++ | ididak | View Answer on Stackoverflow
Solution 4 - C++ | kev | View Answer on Stackoverflow
Solution 5 - C++ | Marius | View Answer on Stackoverflow
Solution 6 - C++ | Alec Thomas | View Answer on Stackoverflow
Solution 7 - C++ | gnomed | View Answer on Stackoverflow
Solution 8 - C++ | Ferruccio | View Answer on Stackoverflow
Solution 9 - C++ | Shadow2531 | View Answer on Stackoverflow
Solution 10 - C++ | user19302 | View Answer on Stackoverflow
Solution 11 - C++ | Marco M. | View Answer on Stackoverflow
Solution 12 - C++ | rhomu | View Answer on Stackoverflow
Solution 13 - C++ | Robert | View Answer on Stackoverflow
Solution 14 - C++ | dk123 | View Answer on Stackoverflow
Solution 15 - C++ | KTC | View Answer on Stackoverflow
Solution 16 - C++ | zerm | View Answer on Stackoverflow
Solution 17 - C++ | AJMansfield | View Answer on Stackoverflow
Solution 18 - C++ | Pratik Deoghare | View Answer on Stackoverflow
Solution 19 - C++ | lukmac | View Answer on Stackoverflow
Solution 20 - C++ | Porsche9II | View Answer on Stackoverflow
Solution 21 - C++ | J. Willus | View Answer on Stackoverflow
Solution 22 - C++ | Goran | View Answer on Stackoverflow
Solution 23 - C++ | user1438233 | View Answer on Stackoverflow
Solution 24 - C++ | DannyK | View Answer on Stackoverflow
Solution 25 - C++ | Steve Dell | View Answer on Stackoverflow
Solution 26 - C++ | NL628 | View Answer on Stackoverflow
Solution 27 - C++ | gibbz | View Answer on Stackoverflow
Solution 28 - C++ | user246110 | View Answer on Stackoverflow
Solution 29 - C++ | Marty B | View Answer on Stackoverflow
Solution 30 - C++ | Andreas Spindler | View Answer on Stackoverflow
Solution 31 - C++ | san45 | View Answer on Stackoverflow
Solution 32 - C++ | Software_Designer | View Answer on Stackoverflow
Solution 33 - C++ | user1134181 | View Answer on Stackoverflow
Solution 34 - C++ | solstice333 | View Answer on Stackoverflow
Solution 35 - C++ | Abe | View Answer on Stackoverflow
Solution 36 - C++ | Nur Bijoy | View Answer on Stackoverflow
Solution 37 - C++ | Kelly Elton | View Answer on Stackoverflow
Solution 38 - C++ | Jim Huang | View Answer on Stackoverflow
Solution 39 - C++ | Galik | View Answer on Stackoverflow
Solution 40 - C++ | Dietmar Kühl | View Answer on Stackoverflow
Solution 41 - C++ | Jehjoa | View Answer on Stackoverflow
Solution 42 - C++ | Sam B | View Answer on Stackoverflow
Solution 43 - C++ | Jairo Abdiel Toribio Cisneros | View Answer on Stackoverflow
Solution 44 - C++KaznovView Answer on Stackoverflow
Solution 45 - C++DmitryView Answer on Stackoverflow
Solution 46 - C++Venkata Naidu MView Answer on Stackoverflow
Solution 47 - C++pz64_View Answer on Stackoverflow
Solution 48 - C++TimmmmView Answer on Stackoverflow
Solution 49 - C++landenView Answer on Stackoverflow
Solution 50 - C++ManiPView Answer on Stackoverflow
Solution 51 - C++UserView Answer on Stackoverflow
Solution 52 - C++smac89View Answer on Stackoverflow
Solution 53 - C++robcsiView Answer on Stackoverflow
Solution 54 - C++KazView Answer on Stackoverflow
Solution 55 - C++mchiassonView Answer on Stackoverflow
Solution 56 - C++Tristan BrindleView Answer on Stackoverflow
Solution 57 - C++yunhasnawaView Answer on Stackoverflow
Solution 58 - C++Saksham SharmaView Answer on Stackoverflow
Solution 59 - C++RomárioView Answer on Stackoverflow
Solution 60 - C++lemicView Answer on Stackoverflow
Solution 61 - C++LLLLView Answer on Stackoverflow
Solution 62 - C++user2588062View Answer on Stackoverflow
Solution 63 - C++tony gilView Answer on Stackoverflow
Solution 64 - C++torayeffView Answer on Stackoverflow
Solution 65 - C++AlwaysLearningView Answer on Stackoverflow
Solution 66 - C++okovkoView Answer on Stackoverflow
Solution 67 - C++balkiView Answer on Stackoverflow
Solution 68 - C++AlQuemistView Answer on Stackoverflow
Solution 69 - C++doicanhdenView Answer on Stackoverflow
Solution 70 - C++hkBattousaiView Answer on Stackoverflow
Solution 71 - C++Richard HodgesView Answer on Stackoverflow
Solution 72 - C++KakashiView Answer on Stackoverflow
Solution 73 - C++BushuevView Answer on Stackoverflow
Solution 74 - C++JonnyView Answer on Stackoverflow
Solution 75 - C++小文件View Answer on Stackoverflow
Solution 76 - C++OlegView Answer on Stackoverflow
Solution 77 - C++user10873315View Answer on Stackoverflow
Solution 78 - C++Khaled.KView Answer on Stackoverflow
Solution 79 - C++Joakim L. ChristiansenView Answer on Stackoverflow
Solution 80 - C++abe312View Answer on Stackoverflow
Solution 81 - C++user1877322View Answer on Stackoverflow
Solution 82 - C++Peter C.View Answer on Stackoverflow