Nodejs convert string into UTF-8
node.js Problem Overview
From my DB I'm getting the following string:
Johan Ã–bert
What it should say is:
Johan Öbert
I've tried to convert it into utf-8 like so:
nameString.toString("utf8");
But I still get the same problem.
Any ideas?
node.js Solutions
Solution 1 - node.js
I'd recommend using the Buffer object:
var someEncodedString = Buffer.from('someString', 'utf-8').toString();
This avoids any unnecessary dependencies that other answers require, since Buffer is included with node.js, and is already defined in the global scope.
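If the string has already been mis-decoded (UTF-8 bytes read as Latin-1), one possible repair with Buffer is to turn the string back into its original bytes and decode those as UTF-8. This is only a minimal sketch under that assumption; the variable names are illustrative, and text that was mis-decoded as Windows-1252 will not round-trip through 'latin1' for bytes in the 0x80 to 0x9F range.

```javascript
// Assumption: the UTF-8 bytes C3 96 (Ö) were decoded as Latin-1,
// which yields the two characters U+00C3 and U+0096.
const garbled = 'Johan \u00C3\u0096bert';

// Step 1: 'latin1' maps each character back to the single byte it came from.
// Step 2: decode those bytes as UTF-8, which is what they really were.
const repaired = Buffer.from(garbled, 'latin1').toString('utf8');

console.log(repaired); // Johan Öbert
```

Note that Buffer.from(str, 'utf-8').toString() on its own encodes and decodes with the same encoding, so it returns the string unchanged.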
Solution 2 - node.js
Use the utf8 module from npm to encode/decode the string.
Installation:
npm install utf8
In a browser:
<script src="utf8.js"></script>
In Node.js:
const utf8 = require('utf8');
API:
Encode:
utf8.encode(string)
Encodes any given JavaScript string (string) as UTF-8, and returns the UTF-8-encoded version of the string. It throws an error if the input string contains a non-scalar value, i.e. a lone surrogate. (If you need to be able to encode non-scalar values as well, use WTF-8 instead.)
// U+00A9 COPYRIGHT SIGN; see http://codepoints.net/U+00A9
utf8.encode('\xA9');
// → '\xC2\xA9'
// U+10001 LINEAR B SYLLABLE B038 E; see http://codepoints.net/U+10001
utf8.encode('\uD800\uDC01');
// → '\xF0\x90\x80\x81'
Decode:
utf8.decode(byteString)
Decodes any given UTF-8-encoded string (byteString) as UTF-8, and returns the UTF-8-decoded version of the string. It throws an error when malformed UTF-8 is detected. (If you need to be able to decode encoded non-scalar values as well, use WTF-8 instead.)
utf8.decode('\xC2\xA9');
// → '\xA9'
utf8.decode('\xF0\x90\x80\x81');
// → '\uD800\uDC01'
// → U+10001 LINEAR B SYLLABLE B038 E
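For well-formed input, the same byte-string round trip can probably be sketched with the built-in Buffer, without installing anything. This is an approximation and not the utf8 module's implementation: the helper names encodeUtf8 and decodeUtf8 are hypothetical, and unlike utf8.encode and utf8.decode they do not throw, because Node substitutes U+FFFD for lone surrogates and malformed bytes.

```javascript
// Hypothetical helpers, not part of the utf8 package.

// String -> "byte string": one character (U+0000..U+00FF) per UTF-8 byte.
function encodeUtf8(str) {
  return Buffer.from(str, 'utf8').toString('latin1');
}

// "Byte string" -> string: treat each character as one byte, decode as UTF-8.
function decodeUtf8(byteString) {
  return Buffer.from(byteString, 'latin1').toString('utf8');
}

console.log(encodeUtf8('\xA9') === '\xC2\xA9');               // true
console.log(decodeUtf8('\xF0\x90\x80\x81') === '\uD800\uDC01'); // true
```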
Solution 3 - node.js
I had the same problem when I loaded a text file via fs.readFile(). I tried to set the encoding to UTF8, but it stayed the same. My solution now is this:
myString = JSON.parse( JSON.stringify( myString ) )
After this, an Ö is really interpreted as an Ö.
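When a file is not actually stored as UTF-8, asking fs to decode it as 'utf8' cannot repair it; the read has to name the encoding the file was written in. The sketch below assumes a Latin-1 file and uses a throwaway temp file (a hypothetical path) so it can run on its own.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical sample file: "Johan Öbert" stored as Latin-1 (Ö is the single byte 0xD6).
const file = path.join(os.tmpdir(), `latin1-sample-${process.pid}.txt`);
fs.writeFileSync(
  file,
  Buffer.from([0x4a, 0x6f, 0x68, 0x61, 0x6e, 0x20, 0xd6, 0x62, 0x65, 0x72, 0x74])
);

// Decoding as UTF-8: 0xD6 is not a valid sequence here, so it becomes U+FFFD.
const wrong = fs.readFileSync(file, 'utf8');

// Decoding with the file's real encoding gives the expected text.
const right = fs.readFileSync(file, 'latin1');

fs.unlinkSync(file);

console.log(wrong); // Johan \uFFFDbert (replacement character)
console.log(right); // Johan Öbert
```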
Solution 4 - node.js
When you want to change the encoding you always go from one into another. So you might go from Mac Roman to UTF-8 or from ASCII to UTF-8.
It's as important to know the desired output encoding as the current source encoding. For example, if you have Mac Roman and you decode it from UTF-16 to UTF-8, you'll just make it garbled.
If you want to know more about encoding this article goes into a lot of details:
The npm package encoding, which uses node-iconv or iconv-lite, should allow you to easily specify which source and output encoding you want:
// Signature: encoding.convert(text, toCharset, fromCharset)
var resultBuffer = encoding.convert(nameString, 'UTF-8', 'ASCII');
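For a handful of common encodings, the same "from one encoding into another" step can also be sketched with Node's built-in buffer.transcode, which takes the source encoding first and the target encoding second. This assumes a Node build with ICU (the default for official builds) and Latin-1 input; Mac Roman and most other legacy encodings are not supported by transcode and would still need a package such as the one above.

```javascript
const { transcode } = require('buffer');

// Assumption: the source bytes are Latin-1, where Ö is the single byte 0xD6.
const latin1Bytes = Buffer.from('Johan \u00D6bert', 'latin1');

// transcode(source, fromEncoding, toEncoding) returns a new Buffer.
const utf8Bytes = transcode(latin1Bytes, 'latin1', 'utf8');

console.log(latin1Bytes.toString('hex')); // 4a6f68616e20d662657274
console.log(utf8Bytes.toString('hex'));   // 4a6f68616e20c39662657274
console.log(utf8Bytes.toString('utf8'));  // Johan Öbert
```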
Solution 5 - node.js
You should set the database connection's charset instead of fighting it inside Node.js:
SET NAMES 'utf8';
(works at least in MySQL and PostgreSQL)
Keep in mind you need to run that for every connection. If you're using a connection pool, do it with an event handler, e.g.:
mysqlPool.on('connection', function (connection) {
  connection.query("SET NAMES 'utf8'");
});
https://dev.mysql.com/doc/refman/8.0/en/charset-connection.html#charset-connection-client-configuration
https://www.postgresql.org/docs/current/multibyte.html#id-1.6.10.5.7
https://www.npmjs.com/package/mysql#connection
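The pool handler above could be wrapped in a small helper so the charset is chosen in one place. This is a sketch only: setNamesOnConnect is a hypothetical name, and the pool here is a stand-in EventEmitter rather than a real mysql pool, so the snippet runs without a database. In MySQL, 'utf8' is an alias for the 3-byte utf8mb3 charset, so 'utf8mb4' is used in the demo to cover all of Unicode.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: issue SET NAMES on every new pooled connection.
function setNamesOnConnect(pool, charset) {
  // The charset is interpolated into SQL, so only accept a plain identifier.
  if (!/^[A-Za-z0-9_]+$/.test(charset)) {
    throw new Error('Invalid charset name: ' + charset);
  }
  pool.on('connection', function (connection) {
    connection.query("SET NAMES '" + charset + "'");
  });
}

// Stand-in for a real pool, for demonstration only.
const fakePool = new EventEmitter();
const issued = [];
setNamesOnConnect(fakePool, 'utf8mb4');

// Simulate the pool handing out a new connection.
fakePool.emit('connection', { query: function (sql) { issued.push(sql); } });

console.log(issued); // [ "SET NAMES 'utf8mb4'" ]
```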
Solution 6 - node.js
Just add <?xml version="1.0" encoding="UTF-8"?> at the top of the document to declare it as UTF-8. For instance, an RSS feed can contain any character after adding this:
<?xml version="1.0" encoding="UTF-8"?><rss version="2.0">....
Also add <meta charset="utf-8" /> to your parent layout or main app.html:
<!DOCTYPE html>
<html lang="en" class="overflowhere">
<head>
<meta charset="utf-8" />
</head>
</html>