Bash: Split string into character array


String Problem Overview


I have a string in a Bash shell script that I want to split into an array of characters, not based on a delimiter but just one character per array index. How can I do this? Ideally it would not use any external programs. Let me rephrase that. My goal is portability, so things like sed that are likely to be on any POSIX compatible system are fine.

String Solutions


Solution 1 - String

Try

echo "abcdefg" | fold -w1

Edit: Added a more elegant solution suggested in comments.

echo "abcdefg" | grep -o .
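Both pipelines print one character per line rather than filling an array. A minimal sketch of capturing that output, assuming bash 4 or later for mapfile (the array name chars is only illustrative):

```shell
#!/usr/bin/env bash
# Read the one-character-per-line output of fold into an array.
# -t strips the trailing newline from each element.
mapfile -t chars < <(printf '%s\n' "abcdefg" | fold -w1)

printf '%s\n' "${#chars[@]}"   # number of elements: 7
printf '%s\n' "${chars[2]}"    # third character: c
```

Because the output is line-based, this cannot represent newline characters in the input, and fold may count bytes rather than characters on some systems, so multibyte input deserves a test.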

Solution 2 - String

You can access each letter individually already without an array conversion:

$ foo="bar"
$ echo ${foo:0:1}
b
$ echo ${foo:1:1}
a
$ echo ${foo:2:1}
r

If that's not enough, you could use something like this:

$ bar=($(echo $foo|sed  's/\(.\)/\1 /g'))
$ echo ${bar[1]}
a

If you can't even use sed or something like that, you can use the first technique above combined with a while loop using the original string's length (${#foo}) to build the array.

Warning: the code below does not work if the string contains whitespace. I think Vaughn Cato's answer has a better chance at surviving with special chars.

thing=($(i=0; while [ $i -lt ${#foo} ] ; do echo ${foo:$i:1} ; i=$((i+1)) ; done))
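To make the whitespace warning concrete, here is a small sketch comparing the unquoted command substitution with a quoted, element-by-element assignment. The names unsafe and safe are only illustrative.

```shell
#!/usr/bin/env bash
foo='a b'

# Unquoted command substitution: the space is lost to word splitting.
unsafe=($(i=0; while [ "$i" -lt "${#foo}" ]; do echo ${foo:$i:1}; i=$((i+1)); done))

# Quoted element-by-element assignment keeps every character.
safe=()
for ((i = 0; i < ${#foo}; i++)); do
    safe+=("${foo:i:1}")
done

printf 'unsafe: %d elements\n' "${#unsafe[@]}"   # unsafe: 2 elements
printf 'safe:   %d elements\n' "${#safe[@]}"     # safe:   3 elements
```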

Solution 3 - String

As an alternative to iterating over 0 .. ${#string}-1 with a for/while loop, there are two other ways I can think of to do this with only bash: using =~ and using printf. (There's a third possibility using eval and a {..} sequence expression, but this lacks clarity.)

With the correct environment and NLS enabled in bash these will work with non-ASCII as hoped, removing potential sources of failure with older system tools such as sed, if that's a concern. These will work from bash-3.0 (released 2004).

Using =~ and regular expressions, converting a string to an array in a single expression:

string="wonkabars"
[[ "$string" =~ ${string//?/(.)} ]]       # splits into array
printf "%s\n" "${BASH_REMATCH[@]:1}"      # loop free: reuse fmtstr
declare -a arr=( "${BASH_REMATCH[@]:1}" ) # copy array for later

The way this works is to perform an expansion of string that replaces each single character with (.), then match this generated regular expression, with grouping, to capture each individual character into BASH_REMATCH[]. Index 0 is set to the entire string. Since that special array is read-only, you cannot remove it; note the :1 when the array is expanded to skip over index 0, if needed. Some quick testing for non-trivial strings (>64 chars) shows this method is substantially faster than one using bash string and array operations.

The above will work with strings containing newlines: =~ supports POSIX ERE, where . matches anything except NUL by default, i.e. the regex is compiled without REG_NEWLINE. (The behaviour of POSIX text processing utilities is allowed to be different by default in this respect, and usually is.)
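As an illustration of the generated pattern and the newline behaviour, here is a short sketch. The intermediate variable pattern is an addition for readability and is not part of the original one-liner.

```shell
#!/usr/bin/env bash
string=$'a\nb'

# Each character of the string becomes one capture group.
pattern="${string//?/(.)}"
printf '%s\n' "$pattern"          # (.)(.)(.)

# The pattern must stay unquoted so it is treated as a regex.
if [[ $string =~ $pattern ]]; then
    arr=("${BASH_REMATCH[@]:1}")  # skip index 0, the whole match
fi

declare -p arr
```

The array should hold three elements, with the newline preserved as the middle one.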

Second option, using printf:

string="wonkabars"
ii=0
while printf "%s%n" "${string:ii++:1}" xx; do 
  ((xx)) && printf "\n" || break
done 

This loop increments index ii to print one character at a time, and breaks out when there are no characters left. This would be even simpler if the bash printf returned the number of characters printed (as in C) rather than an error status; instead, the number of characters printed is captured in xx using %n. (This works at least as far back as bash-2.05b.)

With bash-3.1 and printf -v var you have slightly more flexibility, and can avoid falling off the end of the string should you be doing something other than printing the characters, e.g. to create an array:

declare -a arr
ii=0
while printf -v cc "%s%n" "${string:(ii++):1}" xx; do 
    ((xx)) && arr+=("$cc") || break
done

Solution 4 - String

If your string is stored in variable x, this produces an array y with the individual characters:

i=0
while [ $i -lt ${#x} ]; do y[$i]=${x:$i:1};  i=$((i+1));done

Solution 5 - String

The most simple, complete and elegant solution:

$ read -a ARRAY <<< $(echo "abcdefg" | sed 's/./& /g')  

and test

$ echo ${ARRAY[0]}
a

$ echo ${ARRAY[1]}
b

Explanation: read -a reads the stdin as an array and assigns it to the variable ARRAY treating spaces as delimiter for each array item.

Piping the string through sed just adds the needed space after each character.

We are using Here String (<<<) to feed the stdin of the read command.
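The two steps described above can be looked at separately. This is only a sketch: the variable spaced is introduced for illustration, and -r is added so that backslashes stay literal.

```shell
#!/usr/bin/env bash
# Step 1: sed appends a space to every character.
spaced=$(printf '%s\n' "abcdefg" | sed 's/./& /g')
printf '[%s]\n' "$spaced"        # [a b c d e f g ]

# Step 2: read -a splits that line on IFS (whitespace by default).
read -r -a ARRAY <<< "$spaced"

printf '%s\n' "${#ARRAY[@]}"     # 7
```

Since the split relies on whitespace, characters that are themselves spaces or tabs do not survive this approach.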

Solution 6 - String

I have found that the following works the best:

array=( `echo string | grep -o . ` )

(note the backticks)

then if you do: echo ${array[@]} , you get: s t r i n g

or: echo ${array[2]} , you get: r

Solution 7 - String

Pure Bash solution with no loop:

#!/usr/bin/env bash

str='The quick brown fox jumps over a lazy dog.'

# Need extglob for the replacement pattern
shopt -s extglob

# Split string characters into array (skip first record)
# Octal 037 (0x1F) is the ASCII Unit Separator. It is unlikely to occur in
# text, so every other character in the string, including spaces, is captured.
IFS= mapfile -s1 -t -d $'\37' array <<<"${str//?()/$'\37'}"

# Strip out captured trailing newline of here-string in last record
array[-1]="${array[-1]%?}"

# Debug print array
declare -p array 

Solution 8 - String

string=hello123

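# Indices run from 0 to length-1; "seq 0 ${#string}" would go one too far
# and add an empty trailing element.
for i in $(seq 0 $((${#string} - 1)))
    do array[$i]=${string:$i:1}
done

echo "zero element of array is [${array[0]}]"
echo "entire array is [${array[@]}]"

The zero element of array is [h]. The entire array is [h e l l o 1 2 3].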

Solution 9 - String

If the text can contain spaces:

eval a=( $(echo "this is a test" | sed "s/\(.\)/'\1' /g") )

Solution 10 - String

$ echo hello | awk NF=NF FS=
h e l l o

Or

$ echo hello | awk '$0=RT' RS=[[:alnum:]]
h
e
l
l
o

Solution 11 - String

Yet another one :). The stated question simply says 'Split string into character array' and doesn't say much about the state of the receiving array, or about special chars and control chars.

My assumption is that if I want to split a string into an array of chars, I want the receiving array to contain just that string, with no leftovers from previous runs, while preserving any special chars.

For instance, the proposed family of solutions like

for (( i=0 ; i < ${#x} ; i++ )); do y[i]=${x:i:1}; done

leaves leftovers in the target array:

$ y=(1 2 3 4 5 6 7 8)
$ x=abc
$ for (( i=0 ; i < ${#x} ; i++ )); do y[i]=${x:i:1}; done
$ printf '%s ' "${y[@]}"
a b c 4 5 6 7 8 

Besides, writing that long line each time we want to split a string is tedious, so why not hide all this in a function we can keep in a package source file, with an API like

s2a "Long string" ArrayName

I got this one that seems to do the job.

$ s2a()
> { [ "$2" ] && typeset -n __=$2 && unset $2;
>   [ "$1" ] && __+=("${1:0:1}") && s2a "${1:1}"
> }

$ a=(1 2 3 4 5 6 7 8 9 0) ; printf '%s ' "${a[@]}"
1 2 3 4 5 6 7 8 9 0 

$ s2a "Split It" a        ; printf '%s ' "${a[@]}"
S p l i t   I t 
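For comparison, here is a non-recursive sketch with the same calling convention. The name split_chars and its internals are hypothetical, and it assumes bash 4.3 or later for namerefs.

```shell
#!/usr/bin/env bash
# split_chars STRING ARRAYNAME  (hypothetical helper, needs bash 4.3+)
split_chars() {
    local -n _out=$2        # nameref to the caller's array
    local str=$1 i
    _out=()                 # discard anything left from earlier runs
    for ((i = 0; i < ${#str}; i++)); do
        _out+=("${str:i:1}")
    done
}

a=(1 2 3 4 5 6 7 8 9 0)
split_chars "Split It" a
printf '%s\n' "${#a[@]}"    # 8
```

A loop avoids one function call per character. The caller's array must not itself be named _out, since that would make the nameref circular.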

Solution 12 - String

If you want to store this in an array, you can do this:

string=foo
unset chars
declare -a chars
while read -N 1
do
    chars[${#chars[@]}]="$REPLY"
done <<<"$string"x
unset chars[$((${#chars[@]} - 1))]
unset chars[$((${#chars[@]} - 1))]

echo "Array: ${chars[@]}"
Array: f o o
echo "Array length: ${#chars[@]}"
Array length: 3

The final x is necessary to handle the fact that a newline is appended after $string if it doesn't contain one.
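One way to sidestep the appended newline altogether is to feed the loop from printf through process substitution instead of a here-string. A sketch, assuming bash 4.1 or later for read -N (the variable ch is illustrative):

```shell
#!/usr/bin/env bash
string=$'foo\nbar'
chars=()

# printf '%s' writes the string without a trailing newline,
# so there is nothing extra to strip afterwards.
while IFS= read -r -N 1 ch; do
    chars+=("$ch")
done < <(printf '%s' "$string")

printf '%s\n' "${#chars[@]}"   # 7
```

Because -N reads exactly one character without treating the delimiter specially, the embedded newline ends up as its own array element.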

If you want to use NUL-separated characters, you can try this:

echo -n "$string" | while read -N 1
do
    printf %s "$REPLY"
    printf '\0'
done

Solution 13 - String

AWK is quite convenient:

a='123'; echo $a | awk 'BEGIN{FS="";OFS=" "} {print $1,$2,$3}'

where FS and OFS are the field separators for input and output, respectively.
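The one-liner above hard-codes three fields. A sketch that handles any length and stores the result in a bash array follows. It assumes an awk in which an empty FS splits the record into single characters (gawk and mawk do this; POSIX leaves it unspecified), and the array name digits is illustrative.

```shell
#!/usr/bin/env bash
a='123'

# Print every field on its own line, then collect the lines with mapfile.
mapfile -t digits < <(printf '%s\n' "$a" |
    awk 'BEGIN { FS = "" } { for (i = 1; i <= NF; i++) print $i }')

printf '%s\n' "${#digits[@]}"
```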

Solution 14 - String

For those who landed here searching how to do this in fish:

We can use the builtin string command (since v2.3.0) for string manipulation.

↪ string split '' abc
a
b
c

The output is a list, so array operations will work.

↪ for c in (string split '' abc)
      echo char is $c
  end
char is a
char is b
char is c

Here's a more complex example iterating over the string with an index.

↪ set --local chars (string split '' abc)
  for i in (seq (count $chars))
      echo $i: $chars[$i]
  end
1: a
2: b
3: c

Solution 15 - String

If you also need support for strings with newlines, you can do:

str2arr(){ local string="$1"; mapfile -d $'\0' Chars < <(for i in $(seq 0 $((${#string}-1))); do printf '%s\u0000' "${string:$i:1}"; done); printf '%s' "(${Chars[*]@Q})" ;}
string=$(printf '%b' "apa\nbepa")
declare -a MyString=$(str2arr "$string")
declare -p MyString
# prints declare -a MyString=([0]="a" [1]="p" [2]="a" [3]=$'\n' [4]="b" [5]="e" [6]="p" [7]="a")

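As a response to Alexandro de Oliveira, I think the following is more elegant or at least more intuitive. IFS= keeps space characters, and printf '%s' avoids the newline that a here-string appends, which would otherwise add an empty final element:

while IFS= read -r -n1 c ; do arr+=("$c") ; done < <(printf '%s' "hejsan")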

Solution 16 - String

zsh solution: To put the scalar string variable into arr, which will be an array:

arr=(${(ps::)string})

Solution 17 - String

declare -r some_string='abcdefghijklmnopqrstuvwxyz'
declare -a some_array
declare -i idx

for ((idx = 0; idx < ${#some_string}; ++idx)); do
  some_array+=("${some_string:idx:1}")
done

for idx in "${!some_array[@]}"; do
  echo "$((idx)): ${some_array[idx]}"
done

Solution 18 - String

I know this is a "bash" question, but please let me show you the perfect solution in zsh, a shell very popular these days:

string='this is a string'
string_array=(${(s::)string})  #Parameter expansion. And that's it!

print ${(t)string_array}   # -> array (the type)
print $#string_array       # -> 16 (items)

Solution 19 - String

Pure bash, no loop.

Another solution, similar to/adapted from Léa Gris' solution, but using read -a instead of readarray/mapfile :

#!/usr/bin/env bash

str='azerty'

# Need extglob for the replacement pattern
shopt -s extglob

# Split string characters into array
# ${str//?()/$'\x1F'} replace each character "c" with "^_c".
# ^_ (Control-_, 0x1f) is Unit Separator (US), you can choose another
# character.
IFS=$'\x1F' read -ra array <<< "${str//?()/$'\x1F'}"

# now, array[0] contains an empty string and the rest of array (starting
# from index 1) contains the original string characters :
declare -p array

# Or, if you prefer to keep the array "clean", you can delete
# the first element and pack the array :
unset array[0]
array=("${array[@]}")
declare -p array

However, I prefer the shorter version below (and find it easier to understand), where we remove the initial 0x1f before assigning the array:

#!/usr/bin/env bash

str='azerty'
shopt -s extglob

tmp="${str//?()/$'\x1F'}"              # same as code above
tmp=${tmp#$'\x1F'}                     # remove initial 0x1f
IFS=$'\x1F' read -ra array <<< "$tmp"  # assign array

declare -p array                       # verification

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | n s | View Question on Stackoverflow
Solution 1 - String | xdazz | View Answer on Stackoverflow
Solution 2 - String | Mat | View Answer on Stackoverflow
Solution 3 - String | mr.spuratic | View Answer on Stackoverflow
Solution 4 - String | Vaughn Cato | View Answer on Stackoverflow
Solution 5 - String | Alexandro de Oliveira | View Answer on Stackoverflow
Solution 6 - String | AZAhmed | View Answer on Stackoverflow
Solution 7 - String | Léa Gris | View Answer on Stackoverflow
Solution 8 - String | 0x00 | View Answer on Stackoverflow
Solution 9 - String | Karoly Horvath | View Answer on Stackoverflow
Solution 10 - String | Zombo | View Answer on Stackoverflow
Solution 11 - String | Phi | View Answer on Stackoverflow
Solution 12 - String | l0b0 | View Answer on Stackoverflow
Solution 13 - String | Tony Xu | View Answer on Stackoverflow
Solution 14 - String | Dennis | View Answer on Stackoverflow
Solution 15 - String | methuselah-0 | View Answer on Stackoverflow
Solution 16 - String | Ed Grimm | View Answer on Stackoverflow
Solution 17 - String | Andrej Podzimek | View Answer on Stackoverflow
Solution 18 - String | Frat Quintero | View Answer on Stackoverflow
Solution 19 - String | Bruno | View Answer on Stackoverflow