Are there machines where sizeof(char) != 1, or at least CHAR_BIT > 8?
C Problem Overview
Are there machines (or compilers) where sizeof(char) != 1?
Does the C99 standard say that sizeof(char) on a standard-compliant implementation MUST be exactly 1? If it does, please give me the section number and a citation.
Update:
If I have a machine (CPU) that can't address individual bytes (the minimal read is 4 bytes, aligned) but only groups of 4 bytes (uint32_t), can a compiler for this machine define sizeof(char) to be 4?
sizeof(char) will be 1, but char will have 32 bits (see the CHAR_BIT macro).
Update2: But the sizeof result is NOT in BYTES! It is the size of CHAR. And char can be 2 bytes, or (maybe) 7 bits?
Update3:
OK. All machines have sizeof(char) == 1. But which machines have CHAR_BIT > 8?
C Solutions
Solution 1 - C
It is always one in C99, section 6.5.3.4:
> When applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1.
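A minimal sketch of how the quoted rule can be checked in code. The helper names char_sizes_are_one and check_char_sizes are invented for this illustration and are not standard functions.

```c
#include <assert.h>

/* Hypothetical helper (not a standard function): returns 1 if sizeof
   yields 1 for every character type, as C99 6.5.3.4 requires. */
int char_sizes_are_one(void)
{
    return sizeof(char) == 1
        && sizeof(signed char) == 1
        && sizeof(unsigned char) == 1
        && sizeof(const volatile char) == 1; /* a qualified version */
}

/* Aborts at run time if the implementation violates the rule. */
void check_char_sizes(void)
{
    assert(char_sizes_are_one());
}
```

On a conforming implementation check_char_sizes() should return normally, whatever the value of CHAR_BIT is.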
Edit: not part of your question, but for interest, from Harbison and Steele's C: A Reference Manual, Third Edition, Prentice Hall, 1991 (pre-C99), p. 148:
> A storage unit is taken to be the amount of storage occupied by one character; the size of an object of type char is therefore 1.
Edit: In answer to your updated question, the following question and answer from Harbison and Steele is relevant (ibid, Ex. 4 of Ch. 6):
> Is it allowable to have a C implementation in which type char can represent values ranging from -2,147,483,648 through 2,147,483,647? If so, what would be sizeof(char) under that implementation? What would be the smallest and largest ranges of type int?
Answer (ibid, p. 382):
> It is permitted (if wasteful) for an implementation to use 32 bits to represent type char. Regardless of the implementation, the value of sizeof(char) is always 1.
While this does not specifically address a case where, say, bytes are 8 bits and a char is 4 of those bytes (actually impossible under the C99 definition, see below), the fact that sizeof(char) == 1 always holds is clear from both the C99 standard and Harbison and Steele.
Edit: In fact (in response to your Update2 question), as far as C99 is concerned, the result of sizeof(char) is in bytes. From section 6.5.3.4 again:
> The sizeof operator yields the size (in bytes) of its operand
So, combined with the quotation above, having bytes of 8 bits and a char made of 4 of those bytes is impossible: in C99 a byte is the same thing as a char.
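Because sizeof counts in C bytes and CHAR_BIT gives the bits per C byte, the storage width of any type follows from multiplying the two. The sketch below illustrates this; the BITS_IN macro and both function names are hypothetical, introduced only for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical macro: storage width of a type in bits, i.e. its size
   in C bytes times the number of bits per C byte. */
#define BITS_IN(type) (sizeof(type) * (size_t)CHAR_BIT)

/* Bits occupied by one char: always exactly CHAR_BIT, because
   sizeof(char) is 1 on every conforming implementation. */
size_t bits_in_char(void)
{
    return BITS_IN(char);
}

/* Bits occupied by one int. The standard only guarantees that int
   can hold at least 16 bits' worth of values; the actual width is
   implementation-defined. */
size_t bits_in_int(void)
{
    return BITS_IN(int);
}
```

On a machine with 32-bit chars, bits_in_char() would report 32 even though sizeof(char) is still 1.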
In answer to your mention of the possibility of a 7-bit char: this is not possible in C99. According to section 5.2.4.2.1 of the standard, the minimum is 8:
> Their implementation-defined values shall be equal or greater [my emphasis] in magnitude to those shown, with the same sign.
>
> — number of bits for smallest object that is not a bit-field (byte)
>
> CHAR_BIT 8
>
> — minimum value for an object of type signed char
>
> SCHAR_MIN -127
>
> — maximum value for an object of type signed char
>
> SCHAR_MAX +127
>
> — maximum value for an object of type unsigned char
>
> UCHAR_MAX 255
>
> — minimum value for an object of type char
>
> CHAR_MIN see below
>
> — maximum value for an object of type char
>
> CHAR_MAX see below
>
> [...]
>
> If the value of an object of type char is treated as a signed integer when used in an expression, the value of CHAR_MIN shall be the same as that of SCHAR_MIN and the value of CHAR_MAX shall be the same as that of SCHAR_MAX. Otherwise, the value of CHAR_MIN shall be 0 and the value of CHAR_MAX shall be the same as that of UCHAR_MAX. The value UCHAR_MAX shall equal 2^CHAR_BIT − 1.
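The relationships in this quotation can be spot-checked against the macros from limits.h. The following is only a sketch; uchar_value_bits and char_limits_consistent are hypothetical names, not part of the standard library.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: counts the bits needed to represent UCHAR_MAX
   by shifting it right until it reaches zero. If UCHAR_MAX equals
   2^CHAR_BIT - 1, the count is exactly CHAR_BIT. */
int uchar_value_bits(void)
{
    unsigned long v = UCHAR_MAX;
    int bits = 0;
    while (v != 0) {
        v >>= 1;
        ++bits;
    }
    return bits;
}

/* Returns 1 if the limits.h values follow the quoted rules. */
int char_limits_consistent(void)
{
    int range_ok;
    if (CHAR_MIN < 0)   /* plain char behaves like signed char */
        range_ok = (CHAR_MIN == SCHAR_MIN) && (CHAR_MAX == SCHAR_MAX);
    else                /* plain char behaves like unsigned char */
        range_ok = (CHAR_MIN == 0) && (CHAR_MAX == UCHAR_MAX);
    return range_ok && CHAR_BIT >= 8 && uchar_value_bits() == CHAR_BIT;
}
```

The bit count is used instead of computing 2^CHAR_BIT directly, so the check does not overflow on platforms where a char is as wide as the widest integer type used here.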
Solution 2 - C
There are no machines where sizeof(char) is 4. It's always 1 byte. That byte might contain 32 bits, but as far as the C compiler is concerned, it's one byte. For more details, I'm actually going to point you at the C++ FAQ 26.6. That link covers it pretty well and I'm fairly certain C++ got all of those rules from C. You can also look at comp.lang.c FAQ 8.10 for characters larger than 8 bits.
> Upd2: But the sizeof result is NOT in BYTES! It is the size of CHAR. And char can be 2 bytes, or (maybe) 7 bits?
Yes, it is in bytes. Let me say it again: sizeof(char) is 1 byte according to the C compiler. What people colloquially call a byte (8 bits) is not necessarily the same as what the C compiler calls a byte. The number of bits in a C byte varies depending on your machine architecture, but it is guaranteed to be at least 8.
Solution 3 - C
The PDP-10 was one such machine (the PDP-11, by contrast, uses 8-bit bytes).
Update: there seem to be no C99 compilers for the PDP-10.
Some models of the Analog Devices 32-bit SHARC DSP have CHAR_BIT=32, and Texas Instruments DSPs from the TMS320F28xx family reportedly have CHAR_BIT=16.
Update: There is GCC 3.2 for PDP-10 with CHAR_BIT=9 (check include/limits.h in that archive).
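Code that has to build on such platforms can query CHAR_BIT at compile time instead of assuming 8. A minimal sketch follows; byte_is_octet is a hypothetical helper name chosen for this example.

```c
#include <assert.h>
#include <limits.h>

/* CHAR_BIT below 8 cannot happen on a conforming implementation, so
   stop the build if limits.h ever claims otherwise. */
#if CHAR_BIT < 8
#error "CHAR_BIT below 8 is not a conforming C implementation"
#endif

/* Hypothetical helper: returns 1 when a C byte is exactly one octet,
   and 0 on platforms such as those listed above where it is wider. */
int byte_is_octet(void)
{
#if CHAR_BIT == 8
    return 1;
#else
    return 0;
#endif
}
```

Code that genuinely depends on 8-bit chars could use the same preprocessor test to refuse to compile elsewhere.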
Solution 4 - C
What you call "bytes" are better referred to as "octets". In C, "bytes" and "chars" mean the exact same thing - the smallest addressable unit of memory.
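The distinction matters when exchanging data in octets, for example over a network. One way to stay independent of CHAR_BIT is to keep one octet per unsigned char and mask each to 8 bits, as in the sketch below. The function names put_u32_be and get_u32_be are hypothetical, and uint_least32_t is used because uint32_t is optional and may be missing when CHAR_BIT is, say, 9.

```c
#include <assert.h>
#include <stdint.h>   /* uint_least32_t (C99) */

/* Hypothetical helper: stores the low 32 bits of `value` as four
   octets, most significant first. Each octet lives in its own
   unsigned char and is masked to 8 bits, so the stored values are
   the same whether CHAR_BIT is 8, 9, 16 or 32. */
void put_u32_be(unsigned char out[4], uint_least32_t value)
{
    out[0] = (unsigned char)((value >> 24) & 0xFFu);
    out[1] = (unsigned char)((value >> 16) & 0xFFu);
    out[2] = (unsigned char)((value >> 8) & 0xFFu);
    out[3] = (unsigned char)(value & 0xFFu);
}

/* Inverse of put_u32_be: rebuilds the value from four octets,
   ignoring any bits above the low 8 in each element. */
uint_least32_t get_u32_be(const unsigned char in[4])
{
    return ((uint_least32_t)(in[0] & 0xFFu) << 24)
         | ((uint_least32_t)(in[1] & 0xFFu) << 16)
         | ((uint_least32_t)(in[2] & 0xFFu) << 8)
         |  (uint_least32_t)(in[3] & 0xFFu);
}
```

On a platform with wider chars this wastes the upper bits of each element, but the octet values stay portable.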