What's the difference between UTF8/UTF16 and Base64 in terms of encoding

C#, Encoding, UTF-8, Base64

C# Problem Overview


In C#, we can use the classes below to do encoding:

  • System.Text.Encoding.UTF8
  • System.Text.Encoding.Unicode (UTF-16)
  • System.Text.Encoding.ASCII

Why is there no System.Text.Encoding.Base64?

We can only use the Convert.ToBase64String and Convert.FromBase64String methods. What's special about Base64?

Can I say Base64 is the same kind of encoding method as UTF-8? Or is UTF-8 a kind of Base64?

C# Solutions


Solution 1 - C#

UTF-8 and UTF-16 are methods to encode Unicode strings to byte sequences.

See: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

Base64 is a method to encode a byte sequence to a string.

So, these are widely different concepts and should not be confused.
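The two directions can be sketched in C# as follows. This is only a minimal illustration, assuming a .NET version with top-level statements; the variable names are made up for the example. Note that .NET exposes UTF-16 (little-endian) as Encoding.Unicode.

```csharp
using System;
using System.Text;

string text = "abc";

// Text encodings go from a string to bytes.
byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);      // 61 62 63
byte[] utf16Bytes = Encoding.Unicode.GetBytes(text);  // 61 00 62 00 63 00 (little-endian)

// Base64 goes the other way: from bytes to a string.
string base64 = Convert.ToBase64String(utf8Bytes);    // "YWJj"

Console.WriteLine(BitConverter.ToString(utf8Bytes));
Console.WriteLine(BitConverter.ToString(utf16Bytes));
Console.WriteLine(base64);
```

The input and output types are swapped between the two: GetBytes takes a string and returns a byte[], while ToBase64String takes a byte[] and returns a string.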

Things to keep in mind:

  • Not every byte sequence represents a Unicode string encoded in UTF-8 or UTF-16.

  • Not every Unicode string represents a byte sequence encoded in Base64.
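Both caveats can be observed directly. The sketch below assumes a strict UTF8Encoding instance (the default Encoding.UTF8 silently substitutes U+FFFD for invalid bytes instead of throwing); the sample inputs are arbitrary.

```csharp
using System;
using System.Text;

// Strict UTF-8: throw on malformed input instead of substituting U+FFFD.
var strictUtf8 = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false,
                                  throwOnInvalidBytes: true);

bool invalidUtf8 = false;
try
{
    // The byte 0xFF never occurs in well-formed UTF-8.
    strictUtf8.GetString(new byte[] { 0xFF });
}
catch (DecoderFallbackException)
{
    invalidUtf8 = true;
}

bool invalidBase64 = false;
try
{
    // '!' is not part of the Base64 alphabet.
    Convert.FromBase64String("hello!");
}
catch (FormatException)
{
    invalidBase64 = true;
}

Console.WriteLine($"Invalid UTF-8 rejected: {invalidUtf8}");
Console.WriteLine($"Invalid Base64 rejected: {invalidBase64}");
```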

Solution 2 - C#

Base64 is a way to encode binary data, while UTF8 and UTF16 are ways to encode Unicode text. Note that in a language like Python 2.x, where binary data and strings are mixed, you can encode strings into base64 or utf8 the same way:

u'abc'.encode('utf16')
u'abc'.encode('base64')

But in languages with a more clearly defined separation between the two types of data, the two ways of representing data are generally exposed through quite different utilities, to keep the concerns separate.

Solution 3 - C#

UTF-8 is, like the other UTF encodings, a character encoding for the characters of the Unicode character set (UCS).

Base64 is an encoding to represent any byte sequence by a sequence of printable characters (i.e. A-Z, a-z, 0-9, +, and /).

There is no System.Text.Encoding.Base64 because Base64 is not a text encoding but rather a base conversion, like hexadecimal, which uses 0-9 and A-F (or a-f) to represent numbers.
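The hexadecimal analogy can be made concrete with a small sketch; the byte values are arbitrary sample data. Both representations turn the same bytes into printable characters, only with a different base.

```csharp
using System;

byte[] data = { 0xDE, 0xAD, 0xBE, 0xEF };

// Base 16: two characters per byte.
string hex = BitConverter.ToString(data).Replace("-", "");  // "DEADBEEF"

// Base 64: four characters per three bytes, padded with '='.
string base64 = Convert.ToBase64String(data);               // "3q2+7w=="

// The conversion is reversible and yields the original bytes.
byte[] roundTrip = Convert.FromBase64String(base64);

Console.WriteLine(hex);
Console.WriteLine(base64);
Console.WriteLine(BitConverter.ToString(roundTrip));
```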

Solution 4 - C#

Simply put, a character encoding such as UTF-8 or UTF-16 maps numbers (bytes) to characters and vice versa; in ASCII, for example, 65 maps to "A". A base encoding is mainly used to translate bytes into other bytes, so that the resulting bytes are printable and fall within a subset of the ASCII character set. For that reason you can also view Base64 as a bytes-to-text encoding mechanism.

The main reason to use Base64 is to transmit data over a channel that doesn't allow binary data. It should now be clear that you can have a Base64-encoded stream that represents a UTF-8-encoded stream.
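Layering the two encodings might look like the sketch below, which assumes the sender and receiver have agreed on UTF-8 as the text encoding. The names wire and received are illustrative only.

```csharp
using System;
using System.Text;

string original = "caf\u00E9";  // "café", containing one non-ASCII character

// Sender: text -> UTF-8 bytes -> Base64 text that uses only ASCII characters.
string wire = Convert.ToBase64String(Encoding.UTF8.GetBytes(original));

// Receiver: Base64 text -> bytes -> text decoded as UTF-8.
string received = Encoding.UTF8.GetString(Convert.FromBase64String(wire));

Console.WriteLine(wire);
Console.WriteLine(received == original);
```

Base64 itself carries no information about which text encoding produced the bytes, so the receiver has to know that separately.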

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type       Original Author    Original Content on Stack Overflow
Question           Zhongmin           View Question on Stack Overflow
Solution 1 - C#    dtb                View Answer on Stack Overflow
Solution 2 - C#    Mike Axiak         View Answer on Stack Overflow
Solution 3 - C#    Gumbo              View Answer on Stack Overflow
Solution 4 - C#    S.Bozzoni          View Answer on Stack Overflow