Encode String to UTF-8

Tags: Java, UTF-8

Java Problem Overview


I have a String containing the character "ñ", and it is causing problems. I need to encode this String as UTF-8. I tried the following, but it doesn't work:

byte ptext[] = myString.getBytes();
String value = new String(ptext, "UTF-8");

How do I encode that String as UTF-8?

Java Solutions


Solution 1 - Java

How about using:

ByteBuffer byteBuffer = StandardCharsets.UTF_8.encode(myString);
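
If a byte[] is needed rather than a ByteBuffer, the encoded bytes can be copied out of the buffer. The following is a minimal sketch; the class name EncodeWithByteBuffer and the helper toUtf8Bytes are made up for illustration and are not part of the original answer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class EncodeWithByteBuffer {

    // Hypothetical helper: encodes the String and copies exactly the encoded bytes.
    // buffer.array() is avoided because the backing array may be larger than the content.
    static byte[] toUtf8Bytes(String s) {
        ByteBuffer buffer = StandardCharsets.UTF_8.encode(s);
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        // "ñ" written as a Unicode escape so the source file encoding does not matter
        byte[] bytes = toUtf8Bytes("\u00F1");
        System.out.println(Arrays.toString(bytes)); // [-61, -79], i.e. 0xC3 0xB1
    }
}
```

For most uses myString.getBytes(StandardCharsets.UTF_8) gives the same bytes with less code; the ByteBuffer form is mainly convenient when the result goes straight into an NIO channel.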

Solution 2 - Java

String objects in Java use the UTF-16 encoding that can't be modified*.

The only thing that can have a different encoding is a byte[]. So if you need UTF-8 data, then you need a byte[]. If you have a String that contains unexpected data, then the problem is at some earlier place that incorrectly converted some binary data to a String (i.e. it was using the wrong encoding).

* As a matter of implementation, String can internally use an ISO-8859-1 encoded byte[] when the range of characters fits it, but that is an implementation-specific optimization that isn't visible to users of String (i.e. you'll never notice unless you dig into the source code or use reflection to dig into a String object).
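
To make the point concrete, here is a small sketch showing that one String yields different byte[] contents depending on the charset passed to getBytes. The class and method names are illustrative only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class SameStringDifferentBytes {

    // Hypothetical helper: number of bytes the String occupies in the given encoding.
    static int encodedLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String s = "a\u00F1o"; // "año", three characters

        System.out.println(s.length());                                    // 3
        System.out.println(encodedLength(s, StandardCharsets.UTF_8));      // 4 ("ñ" takes two bytes)
        System.out.println(encodedLength(s, StandardCharsets.ISO_8859_1)); // 3
        System.out.println(encodedLength(s, StandardCharsets.UTF_16BE));   // 6 (two bytes per character, no BOM)
    }
}
```

The String itself is identical in all four lines; only the byte[] produced from it has an encoding.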

Solution 3 - Java

In Java 7 and later you can use:

import static java.nio.charset.StandardCharsets.*;

byte[] ptext = myString.getBytes(ISO_8859_1); 
String value = new String(ptext, UTF_8); 

This has the advantage over getBytes(String) that it does not declare throws UnsupportedEncodingException.

If you're using an older Java version you can declare the charset constants yourself:

import java.nio.charset.Charset;

public class StandardCharsets {
	public static final Charset ISO_8859_1 = Charset.forName("ISO-8859-1");
	public static final Charset UTF_8 = Charset.forName("UTF-8");
	//....
}

Solution 4 - Java

Use byte[] ptext = myString.getBytes("UTF-8"); instead of getBytes(). The no-argument getBytes() uses the platform's default encoding, which may not be UTF-8.

Solution 5 - Java

A Java String is internally always encoded in UTF-16 - but you really should think about it like this: an encoding is a way to translate between Strings and bytes.

So if you have an encoding problem, by the time you have a String it's too late to fix it. You need to fix the place where you create that String from a file, DB or network connection.
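
As an illustration of fixing the problem where the String is created, the sketch below decodes the same bytes twice, once with the right charset and once with the wrong one. The byte array stands in for data arriving from a file, DB or socket; the class and helper names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DecodeAtTheSource {

    // Hypothetical helper: reads the first line of the data, decoding with the given charset.
    static String readFirstLine(byte[] data, Charset charset) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), charset))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "ñ" as it would arrive from a UTF-8 source
        byte[] wire = {(byte) 0xC3, (byte) 0xB1};

        String right = readFirstLine(wire, StandardCharsets.UTF_8);
        String wrong = readFirstLine(wire, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals("\u00F1"));       // true: one character, "ñ"
        System.out.println(wrong.equals("\u00C3\u00B1")); // true: two characters, "Ã±"
    }
}
```

Passing the charset explicitly to the reader is what keeps the String correct; nothing needs to be done to the String afterwards.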

Solution 6 - Java

You can try it this way:

byte ptext[] = myString.getBytes("ISO-8859-1"); 
String value = new String(ptext, "UTF-8"); 
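
It may help to see what this round-trip actually does. It does not encode a String as UTF-8; it undoes an earlier mistake in which UTF-8 bytes were decoded as ISO-8859-1. The sketch below, with hypothetical class and method names, shows the case where it helps and the case where it does harm.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeRepair {

    // Hypothetical helper: recovers the original bytes (ISO-8859-1 maps every byte
    // value to one character, so this step is lossless) and decodes them as UTF-8.
    static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Ã±" is what "ñ" looks like after its UTF-8 bytes were read as ISO-8859-1
        String garbled = "\u00C3\u00B1";
        System.out.println(repair(garbled).equals("\u00F1")); // true: repaired to "ñ"

        // Applied to a String that was already correct, the same call corrupts it
        String damaged = repair("\u00F1");
        System.out.println(damaged.equals("\u00F1")); // false
    }
}
```

So this approach is only appropriate when the String is known to have been mis-decoded in exactly this way; otherwise the decoding step that produced the String should be corrected instead.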

Solution 7 - Java

I ran into this problem at one point and solved it as follows.

First, I needed this import:

import java.nio.charset.Charset;

Then I declared constants for UTF-8 and ISO-8859-1:

private static final Charset UTF_8 = Charset.forName("UTF-8");
private static final Charset ISO = Charset.forName("ISO-8859-1");

Then I could use it in the following way:

String textwithaccent="Thís ís a text with accent";
String textwithletter="Ñandú";

String text1 = new String(textwithaccent.getBytes(ISO), UTF_8);
String text2 = new String(textwithletter.getBytes(ISO), UTF_8);

Solution 8 - Java

String value = new String(myString.getBytes("UTF-8"));

And if you want to read from a text file encoded in ISO-8859-1:

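import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;

String f = "C:\\MyPath\\MyFile.txt";
// try-with-resources closes the reader even if reading fails
try (BufferedReader br = Files.newBufferedReader(Paths.get(f), Charset.forName("ISO-8859-1"))) {
    String line;
    while ((line = br.readLine()) != null) {
        // new String(byte[]) decodes with the platform default charset
        System.out.println(new String(line.getBytes("UTF-8")));
    }
} catch (IOException ex) {
    //...
}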

Solution 9 - Java

I used the code below to encode the special character by specifying the encoding explicitly.

String text = "This is an example é";
byte[] byteText = text.getBytes(Charset.forName("UTF-8"));
// To get the original string back from the bytes.
String originalString = new String(byteText, "UTF-8");

Solution 10 - Java

A quick step-by-step guide on how to configure NetBeans to use UTF-8 as its default encoding. As a result, NetBeans will create all new files in UTF-8.

NetBeans default encoding UTF-8 step-by-step guide

  • Go to the etc folder in the NetBeans installation directory

  • Edit the netbeans.conf file

  • Find the netbeans_default_options line

  • Add -J-Dfile.encoding=UTF-8 inside the quotation marks on that line

    (example: netbeans_default_options="-J-Dfile.encoding=UTF-8")

  • Restart NetBeans

NetBeans now uses UTF-8 as its default encoding.

Your netbeans_default_options may contain additional parameters inside the quotation marks. In that case, add -J-Dfile.encoding=UTF-8 at the end of the string, separated from the other parameters by a space.

Example:

> netbeans_default_options="-J-client -J-Xss128m -J-Xms256m -J-XX:PermSize=32m -J-Dapple.laf.useScreenMenuBar=true -J-Dapple.awt.graphics.UseQuartz=true -J-Dsun.java2d.noddraw=true -J-Dsun.java2d.dpiaware=true -J-Dsun.zip.disableMemoryMapping=true -J-Dfile.encoding=UTF-8"


Solution 11 - Java

This solved my problem:

    String inputText = "some text with escaped chars";
    InputStream is = new ByteArrayInputStream(inputText.getBytes("UTF-8"));

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Alex | View Question on Stackoverflow
Solution 1 - Java | Amir Rachum | View Answer on Stackoverflow
Solution 2 - Java | Joachim Sauer | View Answer on Stackoverflow
Solution 3 - Java | rzymek | View Answer on Stackoverflow
Solution 4 - Java | Peter Štibraný | View Answer on Stackoverflow
Solution 5 - Java | Michael Borgwardt | View Answer on Stackoverflow
Solution 6 - Java | user716840 | View Answer on Stackoverflow
Solution 7 - Java | Quimbo | View Answer on Stackoverflow
Solution 8 - Java | fedesanp | View Answer on Stackoverflow
Solution 9 - Java | laxman954 | View Answer on Stackoverflow
Solution 10 - Java | Mr. Laeeq Khan | View Answer on Stackoverflow
Solution 11 - Java | Prasanth RJ | View Answer on Stackoverflow