byte + byte = int... why?

C#, Type Conversion

C# Problem Overview


Looking at this C# code:

byte x = 1;
byte y = 2;
byte z = x + y; // ERROR: Cannot implicitly convert type 'int' to 'byte'

The result of any arithmetic performed on byte (or short) operands is an int, because the operands are implicitly promoted. The solution is to explicitly cast the result back to a byte:

byte z = (byte)(x + y); // this works

What I am wondering is why? Is it architectural? Philosophical?

We have:

  • int + int = int
  • long + long = long
  • float + float = float
  • double + double = double

So why not:

  • byte + byte = byte
  • short + short = short?

A bit of background: I am performing a long list of calculations on "small numbers" (i.e. < 8) and storing the intermediate results in a large array. Using a byte array (instead of an int array) is faster (because of cache hits). But the extensive byte-casts spread through the code make it that much more unreadable.
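To illustrate, here is a small sketch of the kind of cast clutter meant above (the array name buf and the specific operations are made up):

byte[] buf = new byte[1024];
buf[2] = (byte)(buf[0] + buf[1]);      // every arithmetic step needs its own cast
buf[3] = (byte)(buf[2] * 2);           // ...even trivial ones
buf[4] = (byte)((buf[3] - buf[1]) / 2);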

C# Solutions


Solution 1 - C#

The third line of your code snippet:

byte z = x + y;

actually means

byte z = (int) x + (int) y;

So there is no + operation on bytes; the bytes are first cast to int, and the result of adding two ints is a (32-bit) int.
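A minimal sketch of what that implies in practice (variable names reused from the question):

byte x = 1;
byte y = 2;
int  sum = x + y;          // fine: the addition already produces an int
byte z   = (byte)(x + y);  // assigning to a byte requires an explicit narrowing cast
var  w   = x + y;          // w is inferred as int, not byte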

Solution 2 - C#

In terms of "why it happens at all" it's because there aren't any operators defined by C# for arithmetic with byte, sbyte, short or ushort, just as others have said. This answer is about why those operators aren't defined.

I believe it's basically for the sake of performance. Processors have native operations to do arithmetic with 32 bits very quickly. Doing the conversion back from the result to a byte automatically could be done, but would result in performance penalties in the case where you don't actually want that behaviour.
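In other words, the truncation is opt-in: you only pay for the narrowing conversion where you actually write it, while intermediate arithmetic stays in the fast 32-bit form. A small sketch, with the array names data and results assumed:

byte[] data = { 1, 2, 3 };                 // assumed input
byte[] results = new byte[data.Length];
for (int i = 0; i < data.Length; i++)
{
    int acc = data[i] + 1;                 // arithmetic runs as a native 32-bit operation
    results[i] = (byte)acc;                // a single, explicit narrowing at the store
}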

I think this is mentioned in one of the annotated C# standards. Looking...

EDIT: Annoyingly, I've now looked through the annotated ECMA C# 2 spec, the annotated MS C# 3 spec and the annotated CLI spec, and none of them mention this as far as I can see. I'm sure I've seen the reason given above, but I'm blowed if I know where. Apologies, reference fans :(

Solution 3 - C#

I thought I had seen this somewhere before. From this article, The Old New Thing:

> Suppose we lived in a fantasy world where operations on 'byte' resulted in 'byte'.

byte b = 32;
byte c = 240;
int i = b + c; // what is i?

> In this fantasy world, the value of i would be 16! Why? Because the two operands to the + operator are both bytes, so the sum "b+c" is computed as a byte, which results in 16 due to integer overflow. (And, as I noted earlier, integer overflow is the new security attack vector.)

EDIT: Raymond is defending, essentially, the approach C and C++ took originally. In the comments, he defends the fact that C# takes the same approach, on the grounds of language backward compatibility.
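For contrast, a quick sketch of what today's C# actually does with the values from the quote:

byte b = 32;
byte c = 240;
int i = b + c;            // 272: the operands are promoted to int, nothing is lost
byte t = (byte)(b + c);   // 16: explicit truncation reproduces the "fantasy world" result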

Solution 4 - C#

C#

ECMA-334 states that addition is only defined as legal on int+int, uint+uint, long+long and ulong+ulong (ECMA-334 14.7.4). As such, these are the candidate operations to be considered with respect to 14.4.2. Because there are implicit casts from byte to int, uint, long and ulong, all the addition function members are applicable function members under 14.4.2.1. We have to find the best implicit cast by the rules in 14.4.2.3:

Casting(C1) to int(T1) is better than casting(C2) to uint(T2) or ulong(T2) because:

  • If T1 is int and T2 is uint, or ulong, C1 is the better conversion.

Casting(C1) to int(T1) is better than casting(C2) to long(T2) because there is an implicit cast from int to long:

  • If an implicit conversion from T1 to T2 exists, and no implicit conversion from T2 to T1 exists, C1 is the better conversion.

Hence the int+int function is used, which returns an int.

Which is all a very long way to say that it's buried very deep in the C# specification.
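You can observe the outcome of that overload resolution directly; a small sketch:

byte p = 1, q = 2;
var r = p + q;                        // resolves to operator +(int, int)
Console.WriteLine(r.GetType());       // prints System.Int32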

CLI

The CLI operates only on 6 types (int32, native int, int64, F, O, and &). (ECMA-335 partition 3 section 1.5)

Byte (int8) is not one of those types, and is automatically coerced to an int32 before the addition. (ECMA-335 partition 3 section 1.6)

Solution 5 - C#

The answers indicating some inefficiency in adding bytes and truncating the result back to a byte are incorrect. x86 processors have instructions specifically designed for integer operations on 8-bit quantities.

In fact, for x86/64 processors, performing 32-bit or 16-bit operations is less efficient than 64-bit or 8-bit operations due to the operand prefix byte that has to be decoded. On 32-bit machines, performing 16-bit operations entails the same penalty, but there are still dedicated opcodes for 8-bit operations.

Many RISC architectures have similar native word/byte efficient instructions. Those that don't generally have a store-and-convert-to-signed-value-of-some-bit-length.

In other words, this decision must have been based on perception of what the byte type is for, not due to underlying inefficiencies of hardware.

Solution 6 - C#

I remember once reading something from Jon Skeet (can't find it now, I'll keep looking) about how byte doesn't actually overload the + operator. In fact, when adding two bytes like in your sample, each byte is actually being implicitly converted to an int. The result of that is obviously an int. Now as to WHY this was designed this way, I'll wait for Jon Skeet himself to post :)

EDIT: Found it! Great info about this very topic here.

Solution 7 - C#

This is because of overflow and carries.

If you add two 8 bit numbers, they might overflow into the 9th bit.

Example:

  1111 1111
+ 0000 0001
-----------
1 0000 0000
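The same carry, written as C# (a small sketch):

byte m = 255;                    //   1111 1111
byte n = 1;                      // + 0000 0001
int sum = m + n;                 // 256: the ninth bit survives because the add is done as int
byte truncated = (byte)(m + n);  // 0: the ninth bit is discarded by the explicit cast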

I don't know for sure, but I assume that ints, longs, and doubles are given more space because they are pretty large as it is. Also, they are multiples of 4 bytes, which is more efficient for computers to handle, since the internal data bus is 4 bytes, or 32 bits, wide (64 bits is getting more prevalent now). Byte and short are a little less efficient, but they can save space.

Solution 8 - C#

From the C# language spec, 7.2.6.2 Binary numeric promotions: both operands are converted to int if they don't fall into one of several other categories. My guess is they didn't overload the + operator to take byte as a parameter, but wanted it to act somewhat normally, so they just use the int data type.

C# language Spec
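Binary numeric promotion also explains mixed operands; a small sketch (not from the spec itself):

byte b = 10;
short s = 20;
var r = b + s;                        // both operands are promoted to int
Console.WriteLine(r.GetType());       // prints System.Int32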

Solution 9 - C#

My suspicion is that C# is actually calling the operator+ defined on int (which always returns an int; a checked block only changes whether overflow throws), and implicitly casting both of your bytes/shorts to ints. That's why the behavior appears inconsistent.
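A sketch of how checked and unchecked interact with this: the int addition of two bytes can never overflow, so any overflow surfaces only at the narrowing cast (values are made up):

byte x = 200, y = 100;
int sum = x + y;                          // always 300; an int add of two bytes cannot overflow
byte wrap = unchecked((byte)(x + y));     // 44: silent wrap-around
byte guarded = checked((byte)(x + y));    // throws OverflowException at the cast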

Solution 10 - C#

This was probably a practical decision on the part of the language designers. After all, an int is an Int32, a 32-bit signed integer. Whenever you do an integer operation on a type smaller than int, it's going to be converted to a 32-bit signed int by most any 32-bit CPU anyway. That, combined with the likelihood of overflowing small integers, probably sealed the deal. It saves you from the chore of continuously checking for over/under-flow, and when the final result of an expression on bytes is in range even though an intermediate stage is out of range, you still get a correct result.

Another thought: The over/under-flow on these types would have to be simulated, since it wouldn't occur naturally on the most likely target CPUs. Why bother?

Solution 11 - C#

This is for the most part my answer that pertains to this topic, submitted first to a similar question here.

All operations with integral numbers smaller than Int32 are widened to 32 bits before calculation by default. The reason why the result is Int32 is simply to leave it as it is after calculation. If you check the MSIL arithmetic opcodes, the only integral numeric types they operate with are Int32 and Int64. It's "by design".

If you want the result back in Int16 format, it is irrelevant whether you perform the cast in code or the compiler (hypothetically) emits the conversion "under the hood".

For example, to do Int16 arithmetic:

short a = 2, b = 3;

short c = (short) (a + b);

The two numbers would expand to 32 bits, get added, then truncated back to 16 bits, which is how MS intended it to be.

The advantage of using short (or byte) is primarily storage in cases where you have massive amounts of data (graphical data, streaming, etc.)
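A rough sketch of that storage difference (element counts and sizes are illustrative only):

byte[] samples8  = new byte[10000000];  // ~10 MB of element data
int[]  samples32 = new int[10000000];   // ~40 MB of element data
// Arithmetic on either still happens at 32 bits; only the storage footprint differs.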

Solution 12 - C#

Addition is not defined for bytes, so they are cast to int for the addition. The same is true for most math operations on bytes. (Note: this is how it used to be in older languages, and I am assuming it still holds true today.)

Solution 13 - C#

I think it's a design decision about which operation was more common... If byte + byte = byte, many more people would probably be bothered by having to cast to int when an int is required as the result.

Solution 14 - C#

From .NET Framework code:

// bytes
private static object AddByte(byte Left, byte Right)
{
    short num = (short) (Left + Right);
    if (num > 0xff)
    {
        return num;
    }
    return (byte) num;
}

// shorts (int16)
private static object AddInt16(short Left, short Right)
{
    int num = Left + Right;
    if ((num <= 0x7fff) && (num >= -32768))
    {
        return (short) num;
    }
    return num;
}
 

To simplify, with .NET 3.5 and above you can use an extension method:

public static class Extensions 
{
    public static byte Add(this byte a, byte b)
    {
        return (byte)(a + b);
    }
}

now you can do:

byte a = 1, b = 2, c;
c = a.Add(b);
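If silent wrap-around past 255 is a concern, a checked variant could be added to the same Extensions class; this is only a sketch, and the name AddChecked is made up:

public static byte AddChecked(this byte a, byte b)
{
    // Throws OverflowException instead of silently wrapping past 255 (hypothetical helper).
    return checked((byte)(a + b));
}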

Solution 15 - C#

I've tested the performance of byte versus int.

With int values:

class Program
{
    private int a,b,c,d,e,f;

    public Program()
    {
        a = 1;
        b = 2;
        c = (a + b);
        d = (a - b);
        e = (b / a);
        f = (c * b);
    }

    static void Main(string[] args)
    {
        int max = 10000000;
        DateTime start = DateTime.Now;
        Program[] tab = new Program[max];

        for (int i = 0; i < max; i++)
        {
            tab[i] = new Program();
        }
        DateTime stop = DateTime.Now;

        Debug.WriteLine(stop.Subtract(start).TotalSeconds);
    }
}

With byte values:

class Program
{
    private byte a,b,c,d,e,f;

    public Program()
    {
        a = 1;
        b = 2;
        c = (byte)(a + b);
        d = (byte)(a - b);
        e = (byte)(b / a);
        f = (byte)(c * b);
    }

    static void Main(string[] args)
    {
        int max = 10000000;
        DateTime start = DateTime.Now;
        Program[] tab = new Program[max];

        for (int i = 0; i < max; i++)
        {
            tab[i] = new Program();
        }
        DateTime stop = DateTime.Now;

        Debug.WriteLine(stop.Subtract(start).TotalSeconds);
    }
}

Here are the results:

  • byte: 3.57 s / 157 MB, 3.71 s / 171 MB, 3.74 s / 168 MB, with CPU ~= 30%
  • int: 4.05 s / 298 MB, 3.92 s / 278 MB, 4.28 s / 294 MB, with CPU ~= 27%

Conclusion: byte uses more CPU, but it costs less memory and it's faster (maybe because there are fewer bytes to allocate).

Solution 16 - C#

In addition to all the other great comments, I thought I would add one little tidbit. A lot of commenters have wondered why int, long, and pretty much every other numeric type doesn't also follow this rule... returning a "bigger" type in response to arithmetic.

A lot of answers have had to do with performance (well, 32 bits is faster than 8 bits). In reality, an 8-bit number is still a 32-bit number to a 32-bit CPU... even if you add two bytes, the chunk of data the CPU operates on is going to be 32 bits regardless... so adding ints is not going to be any "faster" than adding two bytes; it's all the same to the CPU. Now, adding two ints WILL be faster than adding two longs on a 32-bit processor, because adding two longs requires more micro-ops, since you're working with numbers wider than the processor's word.

I think the fundamental reason for making byte arithmetic result in ints is pretty clear and straightforward: 8 bits just doesn't go very far! :D With 8 bits, you have an unsigned range of 0-255. That's not a whole lot of room to work with... the likelihood that you are going to run into a byte's limitations is VERY high when using them in arithmetic. However, the chance that you're going to run out of bits when working with ints, or longs, or doubles, etc. is significantly lower... low enough that we very rarely encounter the need for more.

Automatic conversion from byte to int is logical because the scale of a byte is so small. Automatic conversion from int to long, float to double, etc. is not logical because those numbers have significant scale.
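As a concrete illustration of how easily byte arithmetic would hit that ceiling, consider averaging two bytes (values are made up):

byte a = 200, b = 180;
byte avg = (byte)((a + b) / 2);   // 190: the intermediate 380 fits comfortably in an int
// If the sum were computed as a byte it would wrap to 124 first,
// and the "average" would come out as 62.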

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

  • Question: Robert Cartaino (View Question on Stackoverflow)
  • Solution 1 - C#: azheglov (View Answer on Stackoverflow)
  • Solution 2 - C#: Jon Skeet (View Answer on Stackoverflow)
  • Solution 3 - C#: Michael Petrotta (View Answer on Stackoverflow)
  • Solution 4 - C#: Alun Harford (View Answer on Stackoverflow)
  • Solution 5 - C#: Christopher (View Answer on Stackoverflow)
  • Solution 6 - C#: BFree (View Answer on Stackoverflow)
  • Solution 7 - C#: samoz (View Answer on Stackoverflow)
  • Solution 8 - C#: Ryan (View Answer on Stackoverflow)
  • Solution 9 - C#: mqp (View Answer on Stackoverflow)
  • Solution 10 - C#: PeterAllenWebb (View Answer on Stackoverflow)
  • Solution 11 - C#: Kenan E. K. (View Answer on Stackoverflow)
  • Solution 12 - C#: Jim C (View Answer on Stackoverflow)
  • Solution 13 - C#: fortran (View Answer on Stackoverflow)
  • Solution 14 - C#: serhio (View Answer on Stackoverflow)
  • Solution 15 - C#: puipuix (View Answer on Stackoverflow)
  • Solution 16 - C#: jrista (View Answer on Stackoverflow)