Why are hexadecimal numbers prefixed with 0x?

CSyntaxHex

C Problem Overview


Why are hexadecimal numbers prefixed as 0x? I understand the usage of the prefix but I don't understand the significance of why 0x was chosen.

C Solutions


Solution 1 - C

Short story: The 0 tells the parser it's dealing with a constant (and not an identifier/reserved word). Something is still needed to specify the number base: the x is an arbitrary choice.

Long story: In the 60's, the prevalent programming number systems were decimal and octal — mainframes had 12, 24 or 36 bits per byte, which is nicely divisible by 3 = log2(8).

The BCPL language used the syntax 8 1234 for octal numbers. When Ken Thompson created B from BCPL, he used the 0 prefix instead. This is great because

  1. an integer constant now always consists of a single token,
  2. the parser can still tell right away it's got a constant,
  3. the parser can immediately tell the base (0 is the same in both bases),
  4. it's mathematically sane (00005 == 05), and
  5. no precious special characters are needed (as in #123).

When C was created from B, the need for hexadecimal numbers arose (the PDP-11 had 16-bit words) and all of the points above were still valid. Since octals were still needed for other machines, 0x was arbitrarily chosen (00 was probably ruled out as awkward).

C# is a descendant of C, so it inherits the syntax.

Solution 2 - C

Note: I don't know the correct answer, but the below is just my personal speculation!

As has been mentioned a 0 before a number means it's octal:

04524 // octal, leading 0

Imagine needing to come up with a system to denote hexadecimal numbers, and note we're working in a C style environment. How about ending with h like assembly? Unfortunately you can't - it would allow you to make tokens which are valid identifiers (eg. you could name a variable the same thing) which would make for some nasty ambiguities.

8000h // hex
FF00h // oops - valid identifier!  Hex or a variable or type named FF00h?

You can't lead with a character for the same reason:

xFF00 // also valid identifier

Using a hash was probably thrown out because it conflicts with the preprocessor:

#define ...
#FF00 // invalid preprocessor token?

In the end, for whatever reason, they decided to put an x after a leading 0 to denote hexadecimal. It is unambiguous since it still starts with a number character so can't be a valid identifier, and is probably based off the octal convention of a leading 0.

0xFF00 // definitely not an identifier!

Solution 3 - C

It's a prefix to indicate the number is in hexadecimal rather than in some other base. The C programming language uses it to tell compiler.

Example:

0x6400 translates to 6*16^3 + 4*16^2 + 0*16^1 +0*16^0 = 25600. When compiler reads 0x6400, It understands the number is hexadecimal with the help of 0x term. Usually we can understand by (6400)16 or (6400)8 or whatever ..

For binary it would be:

0b00000001

Hope I have helped in some way.

Good day!

Solution 4 - C

The preceding 0 is used to indicate a number in base 2, 8, or 16.

In my opinion, 0x was chosen to indicate hex because 'x' sounds like hex.

Just my opinion, but I think it makes sense.

Good Day!

Solution 5 - C

I don't know the historical reasons behind 0x as a prefix to denote hexadecimal numbers - as it certainly could have taken many forms. This particular prefix style is from the early days of computer science.

As we are used to decimal numbers there is usually no need to indicate the base/radix. However, for programming purposes we often need to distinguish the bases from binary (base-2), octal (base-8), decimal (base-10) and hexadecimal (base-16) - as the most commonly used number bases.

At this point in time it is a convention used to denote the base of a number. I've written the number 29 in all of the above bases with their prefixes:

  • 0b11101: Binary
  • 0o35: Octal, denoted by an o
  • 0d29: Decimal, this is unusual because we assume numbers without a prefix are decimal
  • 0x1D: Hexadecimal

Basically, an alphabet we most commonly associate with a base (e.g. b for binary) is combined with 0 to easily distinguish a number's base.

This is especially helpful because smaller numbers can confusingly appear the same in all the bases: 0b1, 0o1, 0d1, 0x1.

If you were using a rich text editor though, you could alternatively use subscript to denote bases: 12, 18, 110, 116

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questionunj2View Question on Stackoverflow
Solution 1 - CŘrřolaView Answer on Stackoverflow
Solution 2 - CAshleysBrainView Answer on Stackoverflow
Solution 3 - CloyolaView Answer on Stackoverflow
Solution 4 - CJohnny LowView Answer on Stackoverflow
Solution 5 - CAdvait JunnarkarView Answer on Stackoverflow