How do I split a string into an array of characters?
Javascript Problem Overview
var s = "overpopulation";
var ar = [];
ar = s.split();
alert(ar);
I want to split a word into an array of characters.
The above code doesn't seem to work: it returns "overpopulation" as an Object.
How do I split it into an array of characters when the original string contains no commas or whitespace?
Javascript Solutions
Solution 1 - Javascript
You can split on an empty string:
var chars = "overpopulation".split('');
If you just want to access a string in an array-like fashion, you can do that without split:
var s = "overpopulation";
for (var i = 0; i < s.length; i++) {
  console.log(s.charAt(i));
}
You can also access each character with its index using normal array syntax. Note, however, that strings are immutable, which means you can't set the value of a character using this method, and that it isn't supported by IE7 (if that still matters to you).
var s = "overpopulation";
console.log(s[3]); // logs 'r'
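Because strings are immutable, assigning to an index has no effect (it is silently ignored in sloppy mode and throws a TypeError in strict mode). A minimal sketch of one workaround, assuming you want a changed copy: split into a real array, modify it, and join it back. The helper name replaceCharAt is hypothetical and not part of the original answer.

```javascript
// Hypothetical helper: returns a NEW string with the character at `index` replaced.
function replaceCharAt(str, index, ch) {
  var chars = str.split('');   // copy the characters into a real, mutable array
  chars[index] = ch;           // arrays can be modified in place, strings cannot
  return chars.join('');       // build a new string from the modified array
}

var s = "overpopulation";
var updated = replaceCharAt(s, 0, "O");
console.log(s);        // "overpopulation" (the original is unchanged)
console.log(updated);  // "Overpopulation"
```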
Solution 2 - Javascript
Old question, but I should warn: do NOT use .split(''). You'll get weird results with non-BMP (non-Basic-Multilingual-Plane) characters.
The reason is that methods like .split() and .charCodeAt() work on 16-bit code units, so they only respect characters with a code point below 65536; higher code points are represented by a pair of (lower-valued) "surrogate" pseudo-characters.
'๐๐๐'.length // โ> 6
'๐๐๐'.split('') // โ> ["๏ฟฝ", "๏ฟฝ", "๏ฟฝ", "๏ฟฝ", "๏ฟฝ", "๏ฟฝ"]
'๐'.length // โ> 2
'๐'.split('') // โ> ["๏ฟฝ", "๏ฟฝ"]
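To make the surrogate-pair explanation concrete, here is a small sketch that inspects the two code units behind a single non-BMP character and recombines them by hand. U+1F60E is used only as a sample character, and the variable names are illustrative.

```javascript
// U+1F60E is used here only as a sample non-BMP character.
var face = "\uD83D\uDE0E";   // the surrogate pair that encodes U+1F60E

console.log(face.length);                       // 2 (code units, not characters)
console.log(face.charCodeAt(0).toString(16));   // "d83d" (high surrogate)
console.log(face.charCodeAt(1).toString(16));   // "de0e" (low surrogate)
console.log(face.codePointAt(0).toString(16));  // "1f60e" (full code point, ES2015)

// Recombine the pair by hand using the standard UTF-16 formula.
var hi = face.charCodeAt(0);
var lo = face.charCodeAt(1);
var codePoint = (hi - 0xD800) * 0x400 + (lo - 0xDC00) + 0x10000;
console.log(codePoint === face.codePointAt(0)); // true
```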
Use ES2015 (ES6) features where possible:
Using the spread operator:
let arr = [...str];
Or Array.from:
let arr = Array.from(str);
Or split with the new u RegExp flag:
let arr = str.split(/(?!$)/u);
Examples:
[...'𝟘𝟙𝟚'] // -> ["𝟘", "𝟙", "𝟚"]
[...'😎😜🙃'] // -> ["😎", "😜", "🙃"]
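As a rough side-by-side check, the sketch below runs all three ES2015 approaches and plain .split('') on the same input. The sample string is an assumption chosen for illustration: two non-BMP characters around an ASCII letter, written with escapes so the source stays ASCII-safe.

```javascript
// Sample input: two non-BMP characters around a plain ASCII letter.
const str = "\u{1F60E}a\u{1F409}";

const viaSplit  = str.split("");        // splits by UTF-16 code unit
const viaSpread = [...str];             // iterates by code point
const viaFrom   = Array.from(str);      // also iterates by code point
const viaRegExp = str.split(/(?!$)/u);  // the u flag makes split advance by code point

console.log(str.length);        // 5 code units
console.log(viaSplit.length);   // 5 (each surrogate pair is torn apart)
console.log(viaSpread.length);  // 3
console.log(viaFrom.length);    // 3
console.log(viaRegExp.length);  // 3
```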
For ES5, options are limited.
I came up with this function, which internally uses the MDN example to get the correct code point of each character.
function stringToArray(str) {
  var i = 0,
      arr = [],
      codePoint;
  // knownCharCodeAt() returns NaN once i runs past the last character
  while (!isNaN(codePoint = knownCharCodeAt(str, i))) {
    arr.push(String.fromCodePoint(codePoint));
    i++;
  }
  return arr;
}
This requires the knownCharCodeAt() function and, for some browsers, a String.fromCodePoint() polyfill.
if (!String.fromCodePoint) {
  // ES6 Unicode Shims 0.1, © 2012 Steven Levithan, MIT License
  String.fromCodePoint = function fromCodePoint() {
    var chars = [], point, offset, units, i;
    for (i = 0; i < arguments.length; ++i) {
      point = arguments[i];
      offset = point - 0x10000;
      units = point > 0xFFFF ? [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)] : [point];
      chars.push(String.fromCharCode.apply(null, units));
    }
    return chars.join("");
  };
}
Examples:
stringToArray('𝟘𝟙𝟚') // -> ["𝟘", "𝟙", "𝟚"]
stringToArray('😎😜🙃') // -> ["😎", "😜", "🙃"]
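If pulling in knownCharCodeAt() and a polyfill is not convenient, the same idea can be sketched in plain ES5 by walking the code units and keeping each high/low surrogate pair together. This is a swapped-in alternative, not the function from the answer above, and the name splitByCodePoint is hypothetical.

```javascript
// Hypothetical ES5-only helper: splits by code point without any ES2015 features.
function splitByCodePoint(str) {
  var result = [];
  var i = 0;
  while (i < str.length) {
    var unit = str.charCodeAt(i);
    // NaN when there is no following unit; every comparison with NaN is false.
    var next = i + 1 < str.length ? str.charCodeAt(i + 1) : NaN;
    var isPair = unit >= 0xD800 && unit <= 0xDBFF &&  // high surrogate...
                 next >= 0xDC00 && next <= 0xDFFF;    // ...followed by a low surrogate
    var width = isPair ? 2 : 1;
    result.push(str.slice(i, i + width));
    i += width;
  }
  return result;
}

console.log(splitByCodePoint("a\uD83D\uDE0Eb").length); // 3
```

A lone (unpaired) surrogate is passed through as its own one-unit element rather than being dropped.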
Note: str[index]
(ES5) and str.charAt(index)
will also return weird results with non-BMP charsets. e.g. '๐'.charAt(0)
returns "๏ฟฝ"
.
UPDATE: Read this nice article about JS and unicode.
Solution 3 - Javascript
.split('') splits emojis in half.
Onur's solutions work for some emojis, but can't handle more complex languages or combined emojis.
Consider this emoji being ruined:
[..."🏳️‍🌈"] // returns ["🏳", "️", "‍", "🌈"] instead of ["🏳️‍🌈"]
Also consider this Hindi text, अनुच्छेद, which is split like this:
[..."अनुच्छेद"] // returns ["अ", "न", "ु", "च", "्", "छ", "े", "द"]
but should in fact be split like this:
["अ","नु","च्","छे","द"]
This happens because some of the characters are combining marks (think diacritics/accents in European languages).
You can use the grapheme-splitter library for this.
It does a proper standards-based letter split in all the hundreds of exotic edge cases (yes, there are that many).
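For comparison, newer JavaScript runtimes ship a built-in Intl.Segmenter that can also split by grapheme cluster. The sketch below is a swapped-in alternative and not the grapheme-splitter library; it assumes the runtime supports Intl.Segmenter, and the helper name splitGraphemes is hypothetical. Exact cluster boundaries for some scripts depend on the Unicode version the runtime implements, so only the emoji case is shown.

```javascript
// Sketch using the built-in Intl.Segmenter (assumption: the runtime supports it).
// This is NOT the grapheme-splitter library, just a built-in alternative.
function splitGraphemes(str) {
  if (typeof Intl === "undefined" || typeof Intl.Segmenter !== "function") {
    throw new Error("Intl.Segmenter is not available in this environment");
  }
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  // segment() returns an iterable of objects; the text is in the `segment` field.
  return Array.from(segmenter.segment(str), function (part) {
    return part.segment;
  });
}

// Rainbow flag: U+1F3F3, U+FE0F, U+200D (zero-width joiner), U+1F308
const flag = "\u{1F3F3}\uFE0F\u200D\u{1F308}";
console.log([...flag].length);             // 4 code points
console.log(splitGraphemes(flag).length);  // 1 grapheme cluster
```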
Solution 4 - Javascript
It's as simple as:
s.split("");
The delimiter is an empty string, hence it will break up between each single character.
Solution 5 - Javascript
The split() method in JavaScript accepts two parameters: a separator and a limit. The separator specifies the character (or characters) to split on. If you don't specify a separator, the entire string is returned unsplit, as the only element of a one-element array. But if you specify the empty string as the separator, the string is split between each character.
Therefore:
s.split('')
will have the effect you seek.
More information here
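The three behaviours described above can be checked with a short sketch. It also shows why the code in the question appeared not to work: split() with no separator returns an array with a single element, which alert displays just like the original string. The variable names are illustrative.

```javascript
var s = "overpopulation";

var whole     = s.split();        // no separator: ["overpopulation"]
var chars     = s.split("");      // empty separator: one element per character
var firstFour = s.split("", 4);   // limit caps the number of elements returned

console.log(Array.isArray(whole), whole.length);  // true 1
console.log(chars.length);                        // 14
console.log(firstFour.join(""));                  // "over"
```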
Solution 6 - Javascript
A string in JavaScript is already array-like.
You can read any character by index as you would with an array, although a string is not a true array and its characters cannot be changed this way.
var s = "overpopulation";
alert(s[0]) // alerts o.
UPDATE
As pointed out in the comments below, the above method for accessing a character in a string is part of ECMAScript 5, which certain older browsers may not support.
An alternative method you can use is charAt(index).
var s = "overpopulation";
alert(s.charAt(0)) // alerts o.
Solution 7 - Javascript
To support emojis, use this:
('Dragon 🐉').split(/(?!$)/u);
=> ['D', 'r', 'a', 'g', 'o', 'n', ' ', '🐉']
Solution 8 - Javascript
You can use the regular expression /(?!$)/:
"overpopulation".split(/(?!$)/)
The negative look-ahead assertion (?!$) will match right in front of every character.
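To see where the look-ahead actually matches, the sketch below collects every index at which /(?!$)/ succeeds. It is only an illustration: the helper name matchPositions is hypothetical, and the manual lastIndex bump is needed because the match is zero-width.

```javascript
// Collect every index at which the zero-width pattern (?!$) matches.
function matchPositions(str) {
  var positions = [];
  var re = /(?!$)/g;
  var m;
  while ((m = re.exec(str)) !== null) {
    positions.push(m.index);
    re.lastIndex = m.index + 1;  // step past the empty match to avoid an infinite loop
  }
  return positions;
}

console.log(matchPositions("abc"));  // [0, 1, 2] (no match at the end, index 3)
console.log("abc".split(/(?!$)/));   // ["a", "b", "c"]
```

Without the u flag this pattern still steps one code unit at a time, so it splits surrogate pairs just as .split('') does.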