Uint8Array to string in Javascript


Javascript Problem Overview


I have some UTF-8 encoded data living in a range of Uint8Array elements in JavaScript. Is there an efficient way to decode these out to a regular JavaScript string (I believe JavaScript uses 16-bit Unicode)? I don't want to add one character at a time, as the string concatenation would become too CPU intensive.

Javascript Solutions


Solution 1 - Javascript

TextEncoder and TextDecoder from the Encoding standard, which is polyfilled by the stringencoding library, convert between strings and typed arrays such as Uint8Array:

var uint8array = new TextEncoder().encode("¢");
var string = new TextDecoder().decode(uint8array);
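
Since the question mentions data living in a range of a larger Uint8Array, here is a minimal sketch of decoding just that range. It assumes a runtime with a global TextDecoder (modern browsers or a recent Node.js); `decodeUtf8Range`, `start` and `end` are illustrative names, not part of any API.

```javascript
// Hypothetical helper: decode bytes[start, end) as UTF-8.
// subarray() creates a view on the same memory, so no bytes are copied.
const utf8Decoder = new TextDecoder("utf-8");

function decodeUtf8Range(bytes, start, end) {
  return utf8Decoder.decode(bytes.subarray(start, end));
}

// "a¢b" encodes to 4 bytes because "¢" takes two bytes in UTF-8.
const sample = new TextEncoder().encode("a¢b");
console.log(decodeUtf8Range(sample, 1, 3)); // "¢"
```

Reusing one decoder instance avoids constructing a new TextDecoder for every call.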

Solution 2 - Javascript

This should work:

// http://www.onicos.com/staff/iz/amuse/javascript/expert/utf.txt

/* utf.js - UTF-8 <=> UTF-16 convertion
 *
 * Copyright (C) 1999 Masanao Izumo <iz@onicos.co.jp>
 * Version: 1.0
 * LastModified: Dec 25 1999
 * This library is free.  You can redistribute it and/or modify it.
 */

function Utf8ArrayToStr(array) {
    var out, i, len, c;
    var char2, char3;

    out = "";
    len = array.length;
    i = 0;
    while (i < len) {
        c = array[i++];
        switch (c >> 4) {
            case 0: case 1: case 2: case 3: case 4: case 5: case 6: case 7:
                // 0xxxxxxx
                out += String.fromCharCode(c);
                break;
            case 12: case 13:
                // 110x xxxx   10xx xxxx
                char2 = array[i++];
                out += String.fromCharCode(((c & 0x1F) << 6) | (char2 & 0x3F));
                break;
            case 14:
                // 1110 xxxx  10xx xxxx  10xx xxxx
                char2 = array[i++];
                char3 = array[i++];
                out += String.fromCharCode(((c & 0x0F) << 12) |
                                           ((char2 & 0x3F) << 6) |
                                           ((char3 & 0x3F) << 0));
                break;
        }
    }

    return out;
}

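It's somewhat cleaner than the other solutions because it doesn't use any hacks or depend on browser JS functions, so it also works in other JS environments. Note that it does not handle 4-byte sequences (code points above U+FFFF); those bytes are silently skipped.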


Solution 3 - Javascript

Here's what I use:

var str = String.fromCharCode.apply(null, uint8Arr);
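
Two caveats with this one-liner: it maps each byte to one character code, so it only reproduces ASCII/Latin-1 text rather than decoding multi-byte UTF-8, and `apply` can throw a RangeError on very large arrays because every element becomes a function argument. A minimal sketch of a chunked variant follows; the helper name and the chunk size are arbitrary assumptions.

```javascript
// Hypothetical helper: converts bytes to a "binary string" (one char per byte).
// This does NOT decode multi-byte UTF-8 sequences.
function bytesToBinaryString(bytes, chunkSize = 0x8000) {
  const parts = [];
  for (let i = 0; i < bytes.length; i += chunkSize) {
    // Each chunk stays well below typical engine argument limits.
    parts.push(String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize)));
  }
  return parts.join("");
}

console.log(bytesToBinaryString(new Uint8Array([72, 105]))); // "Hi"
```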

Solution 4 - Javascript

Solution 5 - Javascript

Found in one of the Chrome sample applications, although this is meant for larger blocks of data where you're okay with an asynchronous conversion.

/**
 * Converts an array buffer to a string
 *
 * @private
 * @param {ArrayBuffer} buf The buffer to convert
 * @param {Function} callback The function to call when conversion is complete
 */
function _arrayBufferToString(buf, callback) {
  var bb = new Blob([new Uint8Array(buf)]);
  var f = new FileReader();
  f.onload = function(e) {
    callback(e.target.result);
  };
  f.readAsText(bb);
}

Solution 6 - Javascript

The solution given by Albert works well as long as the provided function is invoked infrequently and is only used for arrays of modest size, otherwise it is egregiously inefficient. Here is an enhanced vanilla JavaScript solution that works for both Node and browsers and has the following advantages:

• Works efficiently for all octet array sizes

• Generates no intermediate throw-away strings

• Supports 4-byte characters on modern JS engines (otherwise "?" is substituted)

var utf8ArrayToStr = (function () {
    var charCache = new Array(128);  // Preallocate the cache for the common single byte chars
    var charFromCodePt = String.fromCodePoint || String.fromCharCode;
    var result = [];

    return function (array) {
        var codePt, byte1;
        var buffLen = array.length;

        result.length = 0;

        for (var i = 0; i < buffLen;) {
            byte1 = array[i++];

            if (byte1 <= 0x7F) {
                codePt = byte1;
            } else if (byte1 <= 0xDF) {
                codePt = ((byte1 & 0x1F) << 6) | (array[i++] & 0x3F);
            } else if (byte1 <= 0xEF) {
                codePt = ((byte1 & 0x0F) << 12) | ((array[i++] & 0x3F) << 6) | (array[i++] & 0x3F);
            } else if (String.fromCodePoint) {
                codePt = ((byte1 & 0x07) << 18) | ((array[i++] & 0x3F) << 12) | ((array[i++] & 0x3F) << 6) | (array[i++] & 0x3F);
            } else {
                codePt = 63;    // Cannot convert four byte code points, so use "?" instead
                i += 3;
            }

            result.push(charCache[codePt] || (charCache[codePt] = charFromCodePt(codePt)));
        }

        return result.join('');
    };
})();

Solution 7 - Javascript

In NodeJS, we have Buffers available, and string conversion with them is really easy. Better, it's easy to convert a Uint8Array to a Buffer. Try this code, it's worked for me in Node for basically any conversion involving Uint8Arrays:

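let str = Buffer.from(uint8arr.buffer, uint8arr.byteOffset, uint8arr.byteLength).toString();

We're wrapping the Uint8Array's underlying ArrayBuffer in a proper NodeJS Buffer, passing byteOffset and byteLength so that only the bytes the Uint8Array actually covers are used when it is a view onto a larger buffer. Then we convert the Buffer to a string (you can throw in a hex or base64 encoding if you want).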

If we want to convert back to a Uint8Array from a string, then we'd do this:

let uint8arr = new Uint8Array(Buffer.from(str));

Be aware that if you declared an encoding such as base64 when converting to a string, you have to pass the same encoding when converting back, e.g. Buffer.from(str, "base64").

This will not work in the browser without a module! NodeJS Buffers just don't exist in the browser, so this method won't work unless you add Buffer functionality to the browser. That's actually pretty easy to do though, just use a module like this, which is both small and fast!
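
One thing worth checking when going through the underlying ArrayBuffer: a Uint8Array can be a view onto only part of a larger buffer. The sketch below, which assumes Node.js (where Buffer is a global), shows the difference between wrapping the whole ArrayBuffer and wrapping only the view's range. The variable names are illustrative.

```javascript
// "hello world" as bytes; `view` covers only the last five bytes ("world").
const backing = new Uint8Array([104, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100]);
const view = backing.subarray(6);

// Passing only the ArrayBuffer converts everything it contains.
const wholeBuffer = Buffer.from(view.buffer).toString("utf8");

// Passing offset and length restricts the Buffer to the view's bytes.
const viewOnly = Buffer.from(view.buffer, view.byteOffset, view.byteLength).toString("utf8");

console.log(wholeBuffer); // "hello world"
console.log(viewOnly);    // "world"
```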

Solution 8 - Javascript

I was frustrated to see that people were not showing how to go both ways or showing that things work on non-trivial UTF8 strings. I found a post on codereview.stackexchange.com that has some code that works well. I used it to turn ancient runes into bytes, to test some crypto on the bytes, then convert things back into a string. The working code is on github here. I renamed the methods for clarity:

// https://codereview.stackexchange.com/a/3589/75693
function bytesToString(bytes) {
    var chars = [];
    for(var i = 0, n = bytes.length; i < n;) {
        chars.push(((bytes[i++] & 0xff) << 8) | (bytes[i++] & 0xff));
    }
    return String.fromCharCode.apply(null, chars);
}

// https://codereview.stackexchange.com/a/3589/75693
function stringToBytes(str) {
    var bytes = [];
    for(var i = 0, n = str.length; i < n; i++) {
        var char = str.charCodeAt(i);
        bytes.push(char >>> 8, char & 0xFF);
    }
    return bytes;
}

The unit test uses this UTF-8 string:

    // http://kermitproject.org/utf8.html
    // From the Anglo-Saxon Rune Poem (Rune version) 
    const secretUtf8 = `ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ
ᛋᚳᛖᚪᛚ᛫ᚦᛖᚪᚻ᛫ᛗᚪᚾᚾᚪ᛫ᚷᛖᚻᚹᛦᛚᚳ᛫ᛗᛁᚳᛚᚢᚾ᛫ᚻᛦᛏ᛫ᛞᚫᛚᚪᚾ
ᚷᛁᚠ᛫ᚻᛖ᛫ᚹᛁᛚᛖ᛫ᚠᚩᚱ᛫ᛞᚱᛁᚻᛏᚾᛖ᛫ᛞᚩᛗᛖᛋ᛫ᚻᛚᛇᛏᚪᚾ᛬`;

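Note that the string length is only 117 characters but the byte length, when encoded, is 234, because these functions write two bytes per UTF-16 code unit (the byte format is UTF-16, not UTF-8).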
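
To make the byte counts concrete, here is a small sketch comparing the two-bytes-per-code-unit scheme used by stringToBytes with a real UTF-8 encoding of the same text. It assumes a global TextEncoder; both helper names are made up for illustration.

```javascript
// Two bytes per UTF-16 code unit, as the stringToBytes approach above produces.
function utf16ByteLength(str) {
  return str.length * 2;
}

// Byte count of the same text encoded as UTF-8.
function utf8ByteLength(str) {
  return new TextEncoder().encode(str).length;
}

const rune = "\u16A0"; // the rune ᚠ, first character of the poem
console.log(utf16ByteLength(rune)); // 2
console.log(utf8ByteLength(rune));  // 3
```

Runes sit in the U+0800 to U+FFFF range, so each one needs three bytes in UTF-8 but only two in UTF-16.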

If I uncomment the console.log lines in the unit test, I can see that the string that is decoded is the same string that was encoded (with the bytes passed through Shamir's secret sharing algorithm!).

Solution 9 - Javascript

Do what @Sudhir said, and then to get a string out of the comma-separated list of numbers use:

var myString = "";
for (var i = 0; i < unitArr.byteLength; i++) {
    myString += String.fromCharCode(unitArr[i]);
}

This will give you the string you want, if it's still relevant.

Solution 10 - Javascript

Uint8Array to String

let str = Buffer.from(key.secretKey).toString('base64');

String to Uint8Array

let uint8arr = new Uint8Array(Buffer.from(data,'base64')); 
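
A minimal round-trip sketch of the two lines above, assuming Node.js (Buffer is a global there). The sample bytes stand in for `key.secretKey` and `data`, which are not defined in the snippet.

```javascript
// Illustrative bytes standing in for binary data such as a key.
const original = new Uint8Array([1, 2, 3, 255]);

// Uint8Array -> base64 string
const encoded = Buffer.from(original).toString("base64");

// base64 string -> Uint8Array
const decoded = new Uint8Array(Buffer.from(encoded, "base64"));

console.log(encoded);             // "AQID/w=="
console.log(Array.from(decoded)); // [1, 2, 3, 255]
```

Base64 is a good fit when the bytes are arbitrary binary rather than text, since every byte value survives the round trip.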

Solution 11 - Javascript

If you can't use the TextDecoder API because it is not supported on IE:

  1. You can use the FastestSmallestTextEncoderDecoder polyfill recommended by the Mozilla Developer Network website;
  2. You can use this function also provided at the MDN website:

function utf8ArrayToString(aBytes) {
    var sView = "";
    
    for (var nPart, nLen = aBytes.length, nIdx = 0; nIdx < nLen; nIdx++) {
        nPart = aBytes[nIdx];
        
        sView += String.fromCharCode(
            nPart > 251 && nPart < 254 && nIdx + 5 < nLen ? /* six bytes */
                /* (nPart - 252 << 30) may be not so safe in ECMAScript! So...: */
                (nPart - 252) * 1073741824 + (aBytes[++nIdx] - 128 << 24) + (aBytes[++nIdx] - 128 << 18) + (aBytes[++nIdx] - 128 << 12) + (aBytes[++nIdx] - 128 << 6) + aBytes[++nIdx] - 128
            : nPart > 247 && nPart < 252 && nIdx + 4 < nLen ? /* five bytes */
                (nPart - 248 << 24) + (aBytes[++nIdx] - 128 << 18) + (aBytes[++nIdx] - 128 << 12) + (aBytes[++nIdx] - 128 << 6) + aBytes[++nIdx] - 128
            : nPart > 239 && nPart < 248 && nIdx + 3 < nLen ? /* four bytes */
                (nPart - 240 << 18) + (aBytes[++nIdx] - 128 << 12) + (aBytes[++nIdx] - 128 << 6) + aBytes[++nIdx] - 128
            : nPart > 223 && nPart < 240 && nIdx + 2 < nLen ? /* three bytes */
                (nPart - 224 << 12) + (aBytes[++nIdx] - 128 << 6) + aBytes[++nIdx] - 128
            : nPart > 191 && nPart < 224 && nIdx + 1 < nLen ? /* two bytes */
                (nPart - 192 << 6) + aBytes[++nIdx] - 128
            : /* nPart < 127 ? */ /* one byte */
                nPart
        );
    }
    
    return sView;
}

let str = utf8ArrayToString([50,72,226,130,130,32,43,32,79,226,130,130,32,226,135,140,32,50,72,226,130,130,79]);

// Must show 2H₂ + O₂ ⇌ 2H₂O
console.log(str);

Solution 12 - Javascript

Try these functions:

var JsonToArray = function(json)
{
	var str = JSON.stringify(json, null, 0);
	var ret = new Uint8Array(str.length);
	for (var i = 0; i < str.length; i++) {
		ret[i] = str.charCodeAt(i);
	}
	return ret
};

var binArrayToJson = function(binArray)
{
	var str = "";
	for (var i = 0; i < binArray.length; i++) {
		str += String.fromCharCode(parseInt(binArray[i]));
	}
	return JSON.parse(str)
}

source: https://gist.github.com/tomfa/706d10fed78c497731ac, kudos to Tomfa

Solution 13 - Javascript

For ES6 and UTF8 string

decodeURIComponent(escape(String.fromCharCode(...uint8arrData)))

Solution 14 - Javascript

class UTF8{
static encode(str:string){return new UTF8().encode(str)}
static decode(data:Uint8Array){return new UTF8().decode(data)}

private EOF_byte:number = -1;
private EOF_code_point:number = -1;
private encoderError(code_point) {
	console.error("UTF8 encoderError",code_point)
}
private decoderError(fatal, opt_code_point?):number {
	if (fatal) console.error("UTF8 decoderError",opt_code_point)
	return opt_code_point || 0xFFFD;
}
private inRange(a:number, min:number, max:number) {
	return min <= a && a <= max;
}
private div(n:number, d:number) {
	return Math.floor(n / d);
}
private stringToCodePoints(string:string) {
	/** @type {Array.<number>} */
	let cps = [];
	// Based on http://www.w3.org/TR/WebIDL/#idl-DOMString
	let i = 0, n = string.length;
	while (i < string.length) {
		let c = string.charCodeAt(i);
		if (!this.inRange(c, 0xD800, 0xDFFF)) {
			cps.push(c);
		} else if (this.inRange(c, 0xDC00, 0xDFFF)) {
			cps.push(0xFFFD);
		} else { // (inRange(c, 0xD800, 0xDBFF))
			if (i == n - 1) {
				cps.push(0xFFFD);
			} else {
				let d = string.charCodeAt(i + 1);
				if (this.inRange(d, 0xDC00, 0xDFFF)) {
					let a = c & 0x3FF;
					let b = d & 0x3FF;
					i += 1;
					cps.push(0x10000 + (a << 10) + b);
				} else {
					cps.push(0xFFFD);
				}
			}
		}
		i += 1;
	}
	return cps;
}

private encode(str:string):Uint8Array {
	let pos:number = 0;
	let codePoints = this.stringToCodePoints(str);
	let outputBytes = [];

	while (codePoints.length > pos) {
		let code_point:number = codePoints[pos++];

		if (this.inRange(code_point, 0xD800, 0xDFFF)) {
			this.encoderError(code_point);
		}
		else if (this.inRange(code_point, 0x0000, 0x007f)) {
			outputBytes.push(code_point);
		} else {
			let count = 0, offset = 0;
			if (this.inRange(code_point, 0x0080, 0x07FF)) {
				count = 1;
				offset = 0xC0;
			} else if (this.inRange(code_point, 0x0800, 0xFFFF)) {
				count = 2;
				offset = 0xE0;
			} else if (this.inRange(code_point, 0x10000, 0x10FFFF)) {
				count = 3;
				offset = 0xF0;
			}

			outputBytes.push(this.div(code_point, Math.pow(64, count)) + offset);

			while (count > 0) {
				let temp = this.div(code_point, Math.pow(64, count - 1));
				outputBytes.push(0x80 + (temp % 64));
				count -= 1;
			}
		}
	}
	return new Uint8Array(outputBytes);
}

private decode(data:Uint8Array):string {
	let fatal:boolean = false;
	let pos:number = 0;
	let result:string = "";
	let code_point:number;
	let utf8_code_point = 0;
	let utf8_bytes_needed = 0;
	let utf8_bytes_seen = 0;
	let utf8_lower_boundary = 0;

	while (data.length > pos) {
		let _byte = data[pos++];

		if (_byte == this.EOF_byte) {
			if (utf8_bytes_needed != 0) {
				code_point = this.decoderError(fatal);
			} else {
				code_point = this.EOF_code_point;
			}
		} else {
			if (utf8_bytes_needed == 0) {
				if (this.inRange(_byte, 0x00, 0x7F)) {
					code_point = _byte;
				} else {
					if (this.inRange(_byte, 0xC2, 0xDF)) {
						utf8_bytes_needed = 1;
						utf8_lower_boundary = 0x80;
						utf8_code_point = _byte - 0xC0;
					} else if (this.inRange(_byte, 0xE0, 0xEF)) {
						utf8_bytes_needed = 2;
						utf8_lower_boundary = 0x800;
						utf8_code_point = _byte - 0xE0;
					} else if (this.inRange(_byte, 0xF0, 0xF4)) {
						utf8_bytes_needed = 3;
						utf8_lower_boundary = 0x10000;
						utf8_code_point = _byte - 0xF0;
					} else {
						this.decoderError(fatal);
					}
					utf8_code_point = utf8_code_point * Math.pow(64, utf8_bytes_needed);
					code_point = null;
				}
			} else if (!this.inRange(_byte, 0x80, 0xBF)) {
				utf8_code_point = 0;
				utf8_bytes_needed = 0;
				utf8_bytes_seen = 0;
				utf8_lower_boundary = 0;
				pos--;
				code_point = this.decoderError(fatal, _byte);
			} else {
				utf8_bytes_seen += 1;
				utf8_code_point = utf8_code_point + (_byte - 0x80) * Math.pow(64, utf8_bytes_needed - utf8_bytes_seen);

				if (utf8_bytes_seen !== utf8_bytes_needed) {
					code_point = null;
				} else {
					let cp = utf8_code_point;
					let lower_boundary = utf8_lower_boundary;
					utf8_code_point = 0;
					utf8_bytes_needed = 0;
					utf8_bytes_seen = 0;
					utf8_lower_boundary = 0;
					if (this.inRange(cp, lower_boundary, 0x10FFFF) && !this.inRange(cp, 0xD800, 0xDFFF)) {
						code_point = cp;
					} else {
						code_point = this.decoderError(fatal, _byte);
					}
				}

			}
		}
		//Decode string
		if (code_point !== null && code_point !== this.EOF_code_point) {
			if (code_point <= 0xFFFF) {
				if (code_point > 0)result += String.fromCharCode(code_point);
			} else {
				code_point -= 0x10000;
				result += String.fromCharCode(0xD800 + ((code_point >> 10) & 0x3ff));
				result += String.fromCharCode(0xDC00 + (code_point & 0x3ff));
			}
		}
	}
	return result;
}

}

Solution 15 - Javascript

By far the easiest way that has worked for me is:


//1. Create or fetch the Uint8Array to use in the example
const bufferArray = new Uint8Array([10, 10, 10])

//2. Turn the Uint8Array into a regular array
const array = Array.from(bufferArray);

//3. Stringify it (option A)
JSON.stringify(array);


//3. Stringify it (option B: uses @serdarsenay code snippet to decode each item in array)
let binArrayToString = function(binArray) {
    let str = "";
    for (let i = 0; i < binArray.length; i++) {        
        str += String.fromCharCode(parseInt(binArray[i]));
    }
    return str;
}

binArrayToString(array);

Solution 16 - Javascript

Using base64 as the encoding format works quite well. This is how it was implemented for passing secrets via urls in Firefox Send. You will need the base64-js package. These are the functions from the Send source code:

const b64 = require("base64-js")

function arrayToB64(array) {
  return b64.fromByteArray(array).replace(/\+/g, "-").replace(/\//g, "_").replace(/=/g, "")
}

function b64ToArray(str) {
  return b64.toByteArray(str + "===".slice((str.length + 3) % 4))
}
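
If pulling in base64-js is not an option, roughly the same URL-safe encoding can be sketched with the built-in btoa and atob (available in browsers and as globals in recent Node.js versions). This is not the Firefox Send code; the function names below are hypothetical.

```javascript
// Hypothetical dependency-free variant: bytes -> URL-safe base64 without padding.
function bytesToBase64Url(bytes) {
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary).replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// URL-safe base64 -> bytes; restores the standard alphabet and the padding first.
function base64UrlToBytes(str) {
  const padded =
    str.replace(/-/g, "+").replace(/_/g, "/") + "===".slice((str.length + 3) % 4);
  const binary = atob(padded);
  return Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
}

const secret = new Uint8Array([251, 255, 190]);
console.log(bytesToBase64Url(secret)); // "-_--"
```

The per-byte string concatenation is fine for short secrets; for large inputs a chunked conversion would be a better fit.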

Solution 17 - Javascript

In vanilla browser-side JavaScript, recording from the microphone, the base64 functions worked for me (I had to implement an audio-sending function for a chat).

      const ui8a =  new Uint8Array(e.target.result);
      const string = btoa(ui8a);
      const ui8a_2 = atob(string).split(',');

Full code follows. Thanks to Bryan Jennings and the author of the second post linked below for the code.

https://medium.com/@bryanjenningz/how-to-record-and-play-audio-in-javascript-faa1b2b3e49b

https://www.py4u.net/discuss/282499

index.html

<html>
  <head>
    <title>Record Audio Test</title>
    <meta name="encoding" charset="utf-8" />
  </head>
  <body>
    <h1>Audio Recording Test</h1>
    <script src="index.js"></script>
    <button id="action" onclick="start()">Start</button>
    <button id="stop" onclick="stop()">Stop</button>
    <button id="play" onclick="play()">Listen</button>
  </body>
</html>

index.js:

const recordAudio = () =>
  new Promise(async resolve => {
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    const mediaRecorder = new MediaRecorder(stream);
    const audioChunks = [];

    mediaRecorder.addEventListener("dataavailable", event => {
      audioChunks.push(event.data);
    });

    const start = () => mediaRecorder.start();

    const stop = () =>
      new Promise(resolve => {
        mediaRecorder.addEventListener("stop", () => {
          const audioBlob = new Blob(audioChunks);
          const audioUrl = URL.createObjectURL(audioBlob);
          const audio = new Audio(audioUrl);
          const play = () => audio.play();
          resolve({ audioBlob, audioUrl, play });
        });

        mediaRecorder.stop();
      });

    resolve({ start, stop });
  });

let recorder = null;
let audio = null;
const sleep = time => new Promise(resolve => setTimeout(resolve, time));

const start = async () => {
  recorder = await recordAudio();
  recorder.start();
}

const stop = async () => {
  audio = await recorder.stop();
  read(audio.audioUrl);
}

const play = ()=> {
  audio.play();
}

const read = (blobUrl)=> {

  var xhr = new XMLHttpRequest;
  xhr.responseType = 'blob';
  
  xhr.onload = function() {
      var recoveredBlob = xhr.response;
      const reader = new FileReader();
      // This fires after the blob has been read/loaded.
      reader.addEventListener('loadend', (e) => {

          const ui8a =  new Uint8Array(e.target.result);
          const string = btoa(ui8a);
          const ui8a_2 = atob(string).split(',');
          
          playByteArray(ui8a_2);
      });
      // Start reading the blob as an ArrayBuffer.
      reader.readAsArrayBuffer(recoveredBlob);
  };
  // get the blob through blob url 
  xhr.open('GET', blobUrl);
  xhr.send();
}

window.onload = init;
var context;    // Audio context
var buf;        // Audio buffer

function init() {
  if (!window.AudioContext) {
      if (!window.webkitAudioContext) {
          alert("Your browser does not support any AudioContext and cannot play back this audio.");
          return;
      }
        window.AudioContext = window.webkitAudioContext;
    }

    context = new AudioContext();
}

function playByteArray(byteArray) {

    var arrayBuffer = new ArrayBuffer(byteArray.length);
    var bufferView = new Uint8Array(arrayBuffer);
    for (var i = 0; i < byteArray.length; i++) {
      bufferView[i] = byteArray[i];
    }

    context.decodeAudioData(arrayBuffer, function(buffer) {
        buf = buffer;
        play2();
    });
}

// Play the loaded file
function play2() {
    // Create a source node from the buffer
    var source = context.createBufferSource();
    source.buffer = buf;
    // Connect to the final output node (the speakers)
    source.connect(context.destination);
    // Play immediately
    source.start(0);
}

Solution 18 - Javascript

I'm using this function, which works for me:

function uint8ArrayToBase64(data) {
	return btoa(Array.from(data).map((c) => String.fromCharCode(c)).join(''));
}

Solution 19 - Javascript

I am using this TypeScript snippet:

function UInt8ArrayToString(uInt8Array: Uint8Array): string
{
    var s: string = "[";
    for(var i: number = 0; i < uInt8Array.byteLength; i++)
    {
        if( i > 0 )
            s += ", ";
        s += uInt8Array[i];
    }
    s += "]";
    return s;
}

Remove the type annotations if you need the JavaScript version. Hope this helps!
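
For reference, a shorter plain JavaScript sketch that should produce the same bracketed list, using Array.from and join. The function name is hypothetical.

```javascript
// Formats the byte values as a readable list, e.g. "[1, 2, 255]".
// Note: this prints the numbers; it does not decode the bytes as text.
function uint8ArrayToListString(bytes) {
  return "[" + Array.from(bytes).join(", ") + "]";
}

console.log(uint8ArrayToListString(new Uint8Array([1, 2, 255]))); // "[1, 2, 255]"
```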

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Jack Wester | View Question on Stackoverflow
Solution 1 - Javascript | Vincent Scheib | View Answer on Stackoverflow
Solution 2 - Javascript | Albert | View Answer on Stackoverflow
Solution 3 - Javascript | dlchambers | View Answer on Stackoverflow
Solution 4 - Javascript | kpowz | View Answer on Stackoverflow
Solution 5 - Javascript | Will Scott | View Answer on Stackoverflow
Solution 6 - Javascript | Bob Arlof | View Answer on Stackoverflow
Solution 7 - Javascript | arctic_hen7 | View Answer on Stackoverflow
Solution 8 - Javascript | simbo1905 | View Answer on Stackoverflow
Solution 9 - Javascript | shuki | View Answer on Stackoverflow
Solution 10 - Javascript | Maddu Swaroop | View Answer on Stackoverflow
Solution 11 - Javascript | Rosberg Linhares | View Answer on Stackoverflow
Solution 12 - Javascript | sed | View Answer on Stackoverflow
Solution 13 - Javascript | Helgi Kropp | View Answer on Stackoverflow
Solution 14 - Javascript | terran | View Answer on Stackoverflow
Solution 15 - Javascript | Franco Petra | View Answer on Stackoverflow
Solution 16 - Javascript | Carter | View Answer on Stackoverflow
Solution 17 - Javascript | estornes | View Answer on Stackoverflow
Solution 18 - Javascript | Artisan72 | View Answer on Stackoverflow
Solution 19 - Javascript | Bernd Paradies | View Answer on Stackoverflow