Split string in Lua?

StringLua

String Problem Overview


I need to do a simple split of a string, but there doesn't seem to be a function for this, and the manual way I tested didn't seem to work. How would I do it?

String Solutions


Solution 1 - String

Here is my really simple solution. Use the gmatch function to capture strings which contain at least one character of anything other than the desired separator. The separator is *any whitespace (%s in Lua) by default:

function mysplit (inputstr, sep)
        if sep == nil then
                sep = "%s"
        end
        local t={}
        for str in string.gmatch(inputstr, "([^"..sep.."]+)") do
                table.insert(t, str)
        end
        return t
end

.

Solution 2 - String

If you are splitting a string in Lua, you should try the string.gmatch() or string.sub() methods. Use the string.sub() method if you know the index you wish to split the string at, or use the string.gmatch() if you will parse the string to find the location to split the string at.

Example using string.gmatch() from Lua 5.1 Reference Manual:

 t = {}
 s = "from=world, to=Lua"
 for k, v in string.gmatch(s, "(%w+)=(%w+)") do
   t[k] = v
 end

Solution 3 - String

If you just want to iterate over the tokens, this is pretty neat:

line = "one, two and 3!"

for token in string.gmatch(line, "[^%s]+") do
   print(token)
end

Output:

>one, > >two > >and > > 3!

Short explanation: the "[^%s]+" pattern matches to every non-empty string in between space characters.

Solution 4 - String

If you program in Lua, you are out of luck here. Lua is THE one programming language that just happens to be notoriously infamous because its authors never implemented "the" split function in the standard library, and instead wrote 16 screenfulls of explanations and lame excuses as to why they didn't and wouldn't, interspersed with numerous half-working examples that are virtually guaranteed to work for almost everyone but break in your corner case. This is just Lua state of the art, and everyone who programs in Lua simply ends up clenching their teeth and iterating over characters. There are lots of solutions in existence that are sometimes better, but exactly zero solutions that are reliably better.

Solution 5 - String

Just as string.gmatch will find patterns in a string, this function will find the things between patterns:

function string:split(pat)
  pat = pat or '%s+'
  local st, g = 1, self:gmatch("()("..pat..")")
  local function getter(segs, seps, sep, cap1, ...)
    st = sep and seps + #sep
    return self:sub(segs, (seps or 0) - 1), cap1 or sep, ...
  end
  return function() if st then return getter(st, g()) end end
end

By default it returns whatever is separated by whitespace.

Solution 6 - String

Here is the function:

function split(pString, pPattern)
   local Table = {}  -- NOTE: use {n = 0} in Lua-5.0
   local fpat = "(.-)" .. pPattern
   local last_end = 1
   local s, e, cap = pString:find(fpat, 1)
   while s do
      if s ~= 1 or cap ~= "" then
	 table.insert(Table,cap)
      end
      last_end = e+1
      s, e, cap = pString:find(fpat, last_end)
   end
   if last_end <= #pString then
      cap = pString:sub(last_end)
      table.insert(Table, cap)
   end
   return Table
end

Call it like:

list=split(string_to_split,pattern_to_match)

e.g.:

list=split("1:2:3:4","\:")


For more go here:
http://lua-users.org/wiki/SplitJoin

Solution 7 - String

Because there are more than one way to skin a cat, here's my approach:

Code:

#!/usr/bin/env lua

local content = [=[
Lorem ipsum dolor sit amet, consectetur adipisicing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna 
aliqua. Ut enim ad minim veniam, quis nostrud exercitation 
ullamco laboris nisi ut aliquip ex ea commodo consequat.
]=]

local function split(str, sep)
   local result = {}
   local regex = ("([^%s]+)"):format(sep)
   for each in str:gmatch(regex) do
      table.insert(result, each)
   end
   return result
end

local lines = split(content, "\n")
for _,line in ipairs(lines) do
   print(line)
end

Output:

Lorem ipsum dolor sit amet, consectetur adipisicing elit, 
sed do eiusmod tempor incididunt ut labore et dolore magna
aliqua. Ut enim ad minim veniam, quis nostrud exercitation
ullamco laboris nisi ut aliquip ex ea commodo consequat.

Explanation:

The gmatch function works as an iterator, it fetches all the strings that match regex. The regex takes all characters until it finds a separator.

Solution 8 - String

I like this short solution

function split(s, delimiter)
    result = {};
    for match in (s..delimiter):gmatch("(.-)"..delimiter) do
        table.insert(result, match);
    end
    return result;
end

Solution 9 - String

A lot of these answers only accept single-character separators, or don't deal with edge cases well (e.g. empty separators), so I thought I would provide a more definitive solution.

Here are two functions, gsplit and split, adapted from the code in the Scribunto MediaWiki extension, which is used on wikis like Wikipedia. The code is licenced under the GPL v2. I have changed the variable names and added comments to make the code a bit easier to understand, and I have also changed the code to use regular Lua string patterns instead of Scribunto's patterns for Unicode strings. The original code has test cases here.

-- gsplit: iterate over substrings in a string separated by a pattern
-- 
-- Parameters:
-- text (string)    - the string to iterate over
-- pattern (string) - the separator pattern
-- plain (boolean)  - if true (or truthy), pattern is interpreted as a plain
--                    string, not a Lua pattern
-- 
-- Returns: iterator
--
-- Usage:
-- for substr in gsplit(text, pattern, plain) do
--   doSomething(substr)
-- end
local function gsplit(text, pattern, plain)
  local splitStart, length = 1, #text
  return function ()
    if splitStart then
      local sepStart, sepEnd = string.find(text, pattern, splitStart, plain)
      local ret
      if not sepStart then
        ret = string.sub(text, splitStart)
        splitStart = nil
      elseif sepEnd < sepStart then
        -- Empty separator!
        ret = string.sub(text, splitStart, sepStart)
        if sepStart < length then
          splitStart = sepStart + 1
        else
          splitStart = nil
        end
      else
        ret = sepStart > splitStart and string.sub(text, splitStart, sepStart - 1) or ''
        splitStart = sepEnd + 1
      end
      return ret
    end
  end
end

-- split: split a string into substrings separated by a pattern.
-- 
-- Parameters:
-- text (string)    - the string to iterate over
-- pattern (string) - the separator pattern
-- plain (boolean)  - if true (or truthy), pattern is interpreted as a plain
--                    string, not a Lua pattern
-- 
-- Returns: table (a sequence table containing the substrings)
local function split(text, pattern, plain)
  local ret = {}
  for match in gsplit(text, pattern, plain) do
    table.insert(ret, match)
  end
  return ret
end

Some examples of the split function in use:

local function printSequence(t)
  print(unpack(t))
end

printSequence(split('foo, bar,baz', ',%s*'))       -- foo     bar     baz
printSequence(split('foo, bar,baz', ',%s*', true)) -- foo, bar,baz
printSequence(split('foo', ''))                    -- f       o       o

Solution 10 - String

You can use this method:

function string:split(delimiter)
  local result = { }
  local from  = 1
  local delim_from, delim_to = string.find( self, delimiter, from  )
  while delim_from do
    table.insert( result, string.sub( self, from , delim_from-1 ) )
    from  = delim_to + 1
    delim_from, delim_to = string.find( self, delimiter, from  )
  end
  table.insert( result, string.sub( self, from  ) )
  return result
end

delimiter = string.split(stringtodelimite,pattern) 

Solution 11 - String

a way not seen in others

function str_split(str, sep)
    if sep == nil then
        sep = '%s'
    end 

    local res = {}
    local func = function(w)
        table.insert(res, w)
    end 

    string.gsub(str, '[^'..sep..']+', func)
    return res 
end

Solution 12 - String

Simply sitting on a delimiter

local str = 'one,two'
local regxEverythingExceptComma = '([^,]+)'
for x in string.gmatch(str, regxEverythingExceptComma) do
    print(x)
end

Solution 13 - String

You could use penlight library. This has a function for splitting string using delimiter which outputs list.

It has implemented many of the function that we may need while programming and missing in Lua.

Here is the sample for using it.

> 
> stringx = require "pl.stringx"
> 
> str = "welcome to the world of lua"
> 
> arr = stringx.split(str, " ")
> 
> arr
{welcome,to,the,world,of,lua}
> 

Solution 14 - String

I used the above examples to craft my own function. But the missing piece for me was automatically escaping magic characters.

Here is my contribution:

function split(text, delim)
    -- returns an array of fields based on text and delimiter (one character only)
    local result = {}
    local magic = "().%+-*?[]^$"

    if delim == nil then
        delim = "%s"
    elseif string.find(delim, magic, 1, true) then
        -- escape magic
        delim = "%"..delim
    end

    local pattern = "[^"..delim.."]+"
    for w in string.gmatch(text, pattern) do
        table.insert(result, w)
    end
    return result
end

Solution 15 - String

Super late to this question, but in case anyone wants a version that handles the amount of splits you want to get.....

-- Split a string into a table using a delimiter and a limit
string.split = function(str, pat, limit)
  local t = {}
  local fpat = "(.-)" .. pat
  local last_end = 1
  local s, e, cap = str:find(fpat, 1)
  while s do
    if s ~= 1 or cap ~= "" then
      table.insert(t, cap)
    end
    
    last_end = e+1
    s, e, cap = str:find(fpat, last_end)

    if limit ~= nil and limit <= #t then
      break
    end
  end

  if last_end <= #str then
    cap = str:sub(last_end)
    table.insert(t, cap)
  end

  return t
end

Solution 16 - String

Depending on the use case, this could be useful. It cuts all text either side of the flags:

b = "This is a string used for testing"

--Removes unwanted text
c = (b:match("a([^/]+)used"))

print (c)

Output:

string

Solution 17 - String

Here is a routine that works in Lua 4.0, returning a table t of the substrings in inputstr delimited by sep:

function string_split(inputstr, sep)
	local inputstr = inputstr .. sep
	local idx, inc, t = 0, 1, {}
	local idx_prev, substr
	repeat 
		idx_prev = idx
		inputstr = strsub(inputstr, idx + 1, -1)	-- chop off the beginning of the string containing the match last found by strfind (or initially, nothing); keep the rest (or initially, all)
		idx = strfind(inputstr, sep)				-- find the 0-based r_index of the first occurrence of separator 
		if idx == nil then break end 				-- quit if nothing's found
		substr = strsub(inputstr, 0, idx)			-- extract the substring occurring before the separator (i.e., data field before the next delimiter)
		substr = gsub(substr, "[%c" .. sep .. " ]", "")	-- eliminate control characters, separator and spaces
		t[inc] = substr				-- store the substring (i.e., data field)
		inc = inc + 1				-- iterate to next
	until idx == nil
	return t
end

This simple test

inputstr = "the brown lazy fox jumped over the fat grey hen ... or something."
sep = " " 
t = {}
t = string_split(inputstr,sep)
for i=1,15 do
	print(i, t[i])
end

Yields:

--> t[1]=the
--> t[2]=brown
--> t[3]=lazy
--> t[4]=fox
--> t[5]=jumped
--> t[6]=over
--> t[7]=the
--> t[8]=fat
--> t[9]=grey
--> t[10]=hen
--> t[11]=...
--> t[12]=or
--> t[13]=something.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionRCIXView Question on Stackoverflow
Solution 1 - Stringuser973713View Answer on Stackoverflow
Solution 2 - StringgwellView Answer on Stackoverflow
Solution 3 - StringHugoView Answer on Stackoverflow
Solution 4 - StringSzczepan HołyszewskiView Answer on Stackoverflow
Solution 5 - StringNorman RamseyView Answer on Stackoverflow
Solution 6 - StringFaisal HanifView Answer on Stackoverflow
Solution 7 - StringDiego PinoView Answer on Stackoverflow
Solution 8 - StringIvo BeckersView Answer on Stackoverflow
Solution 9 - StringJack TaylorView Answer on Stackoverflow
Solution 10 - Stringkrsk9999View Answer on Stackoverflow
Solution 11 - StringHohenheimView Answer on Stackoverflow
Solution 12 - StringJerome AnthonyView Answer on Stackoverflow
Solution 13 - Stringuser11464249View Answer on Stackoverflow
Solution 14 - StringintrepidheroView Answer on Stackoverflow
Solution 15 - StringBenjamin VisonView Answer on Stackoverflow
Solution 16 - StringgreenageView Answer on Stackoverflow
Solution 17 - StringGraham GundersonView Answer on Stackoverflow