James Gosling's explanation of why Java's byte is signed
Problem Overview
I was initially surprised that Java specifies that byte is signed, with a range of -128..127 (inclusive). I was under the impression that most 8-bit number representations are unsigned, with a range of 0..255 instead (e.g. IPv4 in dot-decimal notation).
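
To make the range difference concrete, here is a minimal sketch of how a value in 0..255 wraps when stored in a Java byte, and how masking with 0xFF recovers the unsigned value, for example when printing an IPv4 address held in a byte[]. The class name SignedByteDemo and the helper toDotDecimal are illustrative names, not part of any library.

```java
class SignedByteDemo {
    // Formats raw address bytes in dot-decimal notation (hypothetical helper).
    // Each byte is masked with 0xFF so that values above 127 print as 128..255
    // instead of as negative numbers.
    static String toDotDecimal(byte[] address) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < address.length; i++) {
            if (i > 0) {
                sb.append('.');
            }
            sb.append(address[i] & 0xFF);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(Byte.MIN_VALUE + ".." + Byte.MAX_VALUE); // -128..127

        byte b = (byte) 200;          // the narrowing cast keeps only the low 8 bits
        System.out.println(b);        // -56
        System.out.println(b & 0xFF); // 200

        byte[] addr = {(byte) 192, (byte) 168, 0, 1};
        System.out.println(toDotDecimal(addr)); // 192.168.0.1
    }
}
```

The bit pattern stored in the byte is the same either way; only the interpretation when it is widened to int differs.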
So has James Gosling ever been asked to explain why he decided that byte is signed? Have there been notable discussions or debates about this issue in the past between authoritative programming language designers and/or critics?
Java Solutions
Solution 1 - Java
It appears that simplicity was the main reason. From this interview (http://www.gotw.ca/publications/c_family_interview.htm):
>Gosling: For me as a language designer, which I don't really count myself as these days, what "simple" really ended up meaning was could I expect J. Random Developer to hold the spec in his head. That definition says that, for instance, Java isn't -- and in fact a lot of these languages end up with a lot of corner cases, things that nobody really understands. Quiz any C developer about unsigned, and pretty soon you discover that almost no C developers actually understand what goes on with unsigned, what unsigned arithmetic is. Things like that made C complex. The language part of Java is, I think, pretty simple. The libraries you have to look up.
My initial assumption was that it's because Java doesn't have unsigned numeric types at all. Why should byte be an exception? char is a special case because it has to represent UTF-16 code units (thanks to Jon Skeet for the quote).
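
As a small illustration of that special case, the sketch below contrasts widening a char with widening a byte. The class and method names (CharVersusByteDemo, widenChar, widenByte) are made up for this example; only the language's widening rules are being demonstrated.

```java
class CharVersusByteDemo {
    // Widens a char to int. char is an unsigned 16-bit type,
    // so the result is always in 0..65535.
    static int widenChar(char c) {
        return c;
    }

    // Widens a byte to int. byte is signed, so the sign bit is extended.
    static int widenByte(byte b) {
        return b;
    }

    public static void main(String[] args) {
        System.out.println(widenChar('\uFFFF'));       // 65535
        System.out.println(widenByte((byte) 0xFF));    // -1
        System.out.println((int) Character.MAX_VALUE); // 65535
        System.out.println(Short.MAX_VALUE);           // 32767
    }
}
```

Both char and short are 16 bits wide, but only char uses the full range for non-negative values.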
Solution 2 - Java
According to the 'Oak Language Specification 0.2' (Oak being the original name of the Java language):
"The Oak byte type is what C programmers are used to thinking of as the char type. But in the Oak language, characters are 16 bits wide. Having a separate byte type removes the confusion in C between the interpretation of char as an 8 bit integer and as a character."
You can grab a PostScript copy here:
http://cretesoft.com/archive/files/OakSpec0.2.ps (partial copy on scribd)
There is also part of an interview posted on this site, in which he defends the absence of an unsigned byte in Java:
http://www.darksleep.com/player/JavaAndUnsignedTypes.html
Here is the interview excerpt quoted on the above-mentioned page:
http://www.gotw.ca/publications/c_family_interview.htm

> Q: Programmers often talk about the advantages and disadvantages of programming in a "simple language." What does that phrase mean to you, and is [C/C++/Java] a simple language in your view?
>
> Ritchie: [deleted for brevity]
>
> Stroustrup: [deleted for brevity]
>
> Gosling: For me as a language designer, which I don't really count myself as these days, what "simple" really ended up meaning was could I expect J. Random Developer to hold the spec in his head. That definition says that, for instance, Java isn't -- and in fact a lot of these languages end up with a lot of corner cases, things that nobody really understands. Quiz any C developer about unsigned, and pretty soon you discover that almost no C developers actually understand what goes on with unsigned, what unsigned arithmetic is. Things like that made C complex. The language part of Java is, I think, pretty simple. The libraries you have to look up.

On the other hand... According to http://www.artima.com/weblogs/viewpost.jsp?thread=7555:

> Once Upon an Oak ...
> by Heinz Kabutz
> July 15, 2003
> ...
> Trying to fill my gaps of Java's history, I started digging around on Sun's website, and eventually stumbled across the Oak Language Specification for Oak version 0.2. Oak was the original name of what is now commonly known as Java, and this manual is the oldest manual available for Oak (i.e. Java). ...
>
> Unsigned integer values (Section 3.1)
>
> The specification says: "The four integer types of widths of 8, 16, 32 and 64 bits, and are signed unless prefixed by the unsigned modifier."
>
> In the sidebar it says: "unsigned isn't implemented yet; it might never be." How right you were.
Solution 3 - Java
I'm not aware of any direct quotes from James Gosling, but there's an official RFE for unsigned byte:
> ### Bug ID: 4186775: request unsigned integer types, esp. unsigned byte
> State: 11-Closed, Will Not Fix, request for enhancement
> Please extend the Java design to allow unsigned types, particularly unsigned byte.
> I have been wondering why there are no unsigned integer types in Java. It seems to me that for byte-length values it is extremely awkward not to have them [...]
> I recognize that this was a design decision made by the Java developers. What I don't understand is why. Did they consider unsigned integer types evil or harmful, and chose to protect me from myself?
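
The awkwardness the request describes can be sketched briefly. Byte.toUnsignedInt is a standard library method (available since Java 8); the class UnsignedByteWorkarounds and the helpers sumUnsigned and fromUnsigned are hypothetical names used only for this illustration, not an official recommendation.

```java
class UnsignedByteWorkarounds {
    // Sums the bytes of data, treating each one as a value in 0..255.
    static int sumUnsigned(byte[] data) {
        int sum = 0;
        for (byte b : data) {
            sum += Byte.toUnsignedInt(b); // same result as (b & 0xFF)
        }
        return sum;
    }

    // Stores a value in 0..255 in a byte, rejecting anything out of range.
    static byte fromUnsigned(int value) {
        if (value < 0 || value > 255) {
            throw new IllegalArgumentException("out of range: " + value);
        }
        return (byte) value;
    }

    public static void main(String[] args) {
        byte[] data = {fromUnsigned(250), fromUnsigned(10)};

        // Plain arithmetic sees the signed values -6 and 10.
        System.out.println(data[0] + data[1]); // 4

        // Converting each byte first gives the intended 250 + 10.
        System.out.println(sumUnsigned(data)); // 260
    }
}
```

Every read of such a value needs an explicit conversion, which is the kind of repetition the request is complaining about.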
Solution 4 - Java
There's no reason for a byte to be unsigned. When you have the char type to represent characters, a byte would not normally need to do the job of a char.