Byte and char conversion in Java

Java Problem Overview

If I convert a character to byte and then back to char, that character mysteriously disappears and becomes something else. How is this possible?

This is the code:

char a = 'È';       // line 1
byte b = (byte) a;  // line 2
char c = (char) b;  // line 3
System.out.println((char) c + " " + (int) c);

Until line 2 everything is fine:

In line 1 I could print "a" in the console and it would show "È".
In line 2 I could print "b" in the console and it would show -56, which is the signed reading of 200 (0xC8) because byte is signed. And 200 is the code for "È", so it's still fine.

But what goes wrong in line 3? "c" becomes something else and the program prints "? 65480". That's something completely different.

What should I write in line 3 in order to get the correct result?

Java Solutions

Solution 1 - Java

A char in Java is a UTF-16 code unit, which is treated as an unsigned 16-bit number. So if you perform c = (char)b with b equal to -56, the value you get is 2^16 - 56, that is 65536 - 56 = 65480.

Or more precisely, the byte is first converted to a signed integer with the value 0xFFFFFFC8 using sign extension in a widening conversion. This in turn is then narrowed down to 0xFFC8 when casting to a char, which translates to the positive number 65480.

From the language specification:

5.1.4. Widening and Narrowing Primitive Conversion

> First, the byte is converted to an int via widening primitive conversion (§5.1.2), and then the resulting int is converted to a char by narrowing primitive conversion (§5.1.3).
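The two steps quoted above can be made visible by splitting the cast into its widening and narrowing halves. This is only an illustrative sketch; the class name SignExtensionDemo and the helper castLikeJava are made up for this example and are not part of any API.

```java
class SignExtensionDemo {
    // Hypothetical helper: spells out what (char) b does implicitly.
    static int castLikeJava(byte b) {
        int widened = b;               // widening: sign-extended to 32 bits
        char narrowed = (char) widened; // narrowing: keeps only the low 16 bits
        return narrowed;               // char widens to int as an unsigned value
    }

    public static void main(String[] args) {
        byte b = (byte) '\u00C8';      // 'È', stored as -56
        int widened = b;

        System.out.println(b);                             // -56
        System.out.println(Integer.toHexString(widened));  // ffffffc8
        System.out.println(castLikeJava(b));               // 65480
    }
}
```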

To get the right code point, use char c = (char) (b & 0xFF). The mask is applied after b has been widened to an int, so it zeroes the top 24 bits: 0xFFFFFFC8 becomes 0x000000C8, the positive number 200 in decimal. The narrowing cast to char then leaves that value unchanged.
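Applied to the code from the question, the masking fix might look like the sketch below. The class name ByteMaskDemo and the helper toCharUnsigned are hypothetical; Byte.toUnsignedInt is a standard method (Java 8 and later) that performs the same masking. Note that the round trip only works for characters up to U+00FF, because the cast to byte in line 2 already discards the upper eight bits of the char.

```java
class ByteMaskDemo {
    // Hypothetical helper: treats the byte as an unsigned value 0..255.
    static char toCharUnsigned(byte b) {
        return (char) (b & 0xFF);
    }

    public static void main(String[] args) {
        char a = '\u00C8';              // 'È'
        byte b = (byte) a;              // -56
        char c = toCharUnsigned(b);     // masked before narrowing

        System.out.println((int) c);                 // 200
        System.out.println(c == a);                  // true
        System.out.println(Byte.toUnsignedInt(b));   // 200, same result via the JDK method
    }
}
```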

Above is a direct explanation of what happens during conversion between the byte, int and char primitive types.

If you want to encode/decode characters from bytes, use Charset, CharsetEncoder, CharsetDecoder or one of the convenience methods such as new String(byte[] bytes, Charset charset) or String#getBytes(Charset charset). StandardCharsets provides constants for the charsets that every Java platform must support (such as UTF-8 and ISO-8859-1); for other character sets, such as Windows-1252, use Charset.forName.
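A minimal sketch of that charset-based approach follows, using only String#getBytes(Charset) and new String(byte[], Charset). The class name CharsetRoundTrip and the helper roundTrip are invented for illustration. It also shows why a single byte cannot stand in for a character in general: the same "È" takes one byte in ISO-8859-1 but two in UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetRoundTrip {
    // Hypothetical helper: encode to bytes and decode with the same charset.
    static String roundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return new String(encoded, charset);
    }

    public static void main(String[] args) {
        String text = "\u00C8"; // È

        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        System.out.println(latin1.length); // 1 (0xC8)
        System.out.println(utf8.length);   // 2 (0xC3 0x88)
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text)); // true
    }
}
```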

Solution 2 - Java

This worked for me. Add the import statement:

import java.nio.charset.Charset;

Then replace the call to the internal sun.io class with the public API:

sun.io.ByteToCharConverter.getDefault().getCharacterEncoding() -> Charset.defaultCharset()

Solution 3 - Java

new String(byteArray, Charset.defaultCharset())

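This decodes a byte array into a String using the JVM's default charset. It throws a NullPointerException if byteArray is null. Malformed or unmappable byte sequences do not cause an exception; they are replaced with the charset's default replacement string (U+FFFD for UTF-8).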
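To see how the String constructor treats invalid input, and how to get an exception instead, compare it with a CharsetDecoder configured to report errors. This is a sketch under assumptions: the class LenientVsStrictDecoding and its helper methods are hypothetical names, and UTF-8 is used explicitly rather than the default charset so that the result does not depend on the platform.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class LenientVsStrictDecoding {
    // Lenient: invalid sequences are silently replaced.
    static String lenientDecode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // Strict: invalid sequences raise CharacterCodingException.
    static String strictDecode(byte[] bytes, Charset charset) throws CharacterCodingException {
        return charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    // Hypothetical convenience wrapper around strictDecode.
    static boolean decodesCleanly(byte[] bytes, Charset charset) {
        try {
            strictDecode(bytes, charset);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] truncated = { (byte) 0xC3 };            // first half of UTF-8 "È"
        byte[] complete = { (byte) 0xC3, (byte) 0x88 }; // full UTF-8 "È"

        String lenient = lenientDecode(truncated, StandardCharsets.UTF_8);
        System.out.println(lenient.contains("\uFFFD"));                        // true
        System.out.println(decodesCleanly(truncated, StandardCharsets.UTF_8)); // false
        System.out.println(decodesCleanly(complete, StandardCharsets.UTF_8));  // true
    }
}
```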

Content Type	Original Author	Original Content on Stackoverflow
Question	user1883212	View Question on Stackoverflow
Solution 1 - Java	Maarten Bodewes	View Answer on Stackoverflow
Solution 2 - Java	Vivek Kumar	View Answer on Stackoverflow
Solution 3 - Java	Joe	View Answer on Stackoverflow
