Why is this struct size 3 instead of 2?

C++, C, Struct

C++ Problem Overview


I have defined this struct:

typedef struct
{
    char A:3;
    char B:3;
    char C:3;
    char D:3;
    char E:3;
} col; 

sizeof(col) gives me 3, but shouldn't it be 2? If I comment out just one element, sizeof is 2. I don't understand why: five elements of 3 bits each add up to 15 bits, and that's less than 2 bytes.

Is there an "internal size" in defining a structure like this one? I just need a clarification, because from my understanding of the language so far, I expected a size of 2 bytes, not 3.

C++ Solutions


Solution 1 - C++

Because you are using char as the underlying type for your fields, the compiler tries to group bits by bytes, and since it cannot put more than eight bits in each byte, it can only store two fields per byte.

The total sum of bits your struct uses is 15, so the ideal size to fit that much data would be a short.

#include <stdio.h>

typedef struct
{
  char A:3;
  char B:3;
  char C:3;
  char D:3;
  char E:3;
} col; 


typedef struct {
  short A:3;
  short B:3;
  short C:3;
  short D:3;
  short E:3;
} col2; 


int main(void)
{
  /* sizeof yields a size_t, so print it with %zu rather than %lu */
  printf("size of col: %zu\n", sizeof(col));
  printf("size of col2: %zu\n", sizeof(col2));
  return 0;
}

On a 64-bit platform like mine, the above code does yield 2 for the second struct. For any type larger than a short, the struct will occupy no more than one element of that type, so on the same platform it ends up with size four for int, eight for long, and so on.

Solution 2 - C++

Because a bit-field can't span the minimum alignment boundary (which is 1 byte for char), the fields will probably get packed like this:

byte 1
  A : 3
  B : 3
  padding : 2
byte 2
  C : 3
  D : 3
  padding : 2
byte 3
  E : 3
  padding : 5

(The order of fields and padding within each byte is not meant literally; it is just to give you the idea, since the compiler may lay them out however it prefers.)

Solution 3 - C++

The first two bit fields fit into a single char. The third cannot fit into that char and needs a new one: 3 + 3 + 3 = 9, which doesn't fit into an 8-bit char.

So the first pair takes a char, the second pair takes a char, and the last bit field gets a third char.

Solution 4 - C++

Most compilers allow you to control the padding, e.g. using #pragmas. Here's an example with GCC 4.8.1:

#include <stdio.h>

typedef struct
{
    char A:3;
    char B:3;
    char C:3;
    char D:3;
    char E:3;
} col;

#pragma pack(push, 1)
typedef struct {
    char A:3;
    char B:3;
    char C:3;
    char D:3;
    char E:3;
} col2;
#pragma pack(pop)

int main(){
    /* sizeof yields a size_t, so the matching format is %zu */
    printf("size of col: %zu\n", sizeof(col));  // 3
    printf("size of col2: %zu\n", sizeof(col2));  // 2
}

Note that the default behaviour of the compiler is there for a reason and will probably give you better performance.

Solution 5 - C++

The ANSI C standard specifies so little about how bitfields are packed that it offers no significant advantage over "compilers are allowed to pack bitfields however they see fit", yet in many cases it still forbids compilers from packing things in the most efficient fashion.

In particular, if a structure contains bitfields, a compiler is required to store it as a structure which contains one or more anonymous fields of some "normal" storage type and then logically subdivide each such field into its constituent bitfield parts. Thus, given:

unsigned char foo1: 3;
unsigned char foo2: 3;
unsigned char foo3: 3;
unsigned char foo4: 3;
unsigned char foo5: 3;
unsigned char foo6: 3;
unsigned char foo7: 3;

If unsigned char is 8 bits, the compiler would be required to allocate four fields of that type, placing two bitfields in each of three of them and the remaining bitfield in a char field of its own. If all char declarations had been replaced with short, then there would be two fields of type short, one of which would hold five bitfields and the other of which would hold the remaining two.

On a processor without alignment restrictions, the data could be laid out more efficiently by using unsigned short for the first five fields and unsigned char for the last two, storing seven three-bit fields in three bytes. While it should be possible to store eight three-bit fields in three bytes, a compiler could only allow that if there existed a three-byte numeric type which could be used as the "outer field" type.

Personally, I consider bitfields as defined to be basically useless. If code needs to work with binary-packed data, it should explicitly define storage locations of actual types, and then use macros or some other such means to access the bits thereof. It would be helpful if C supported a syntax like:

unsigned short f1;
unsigned char f2;
union foo1 = f1:0.3;
union foo2 = f1:3.3;
union foo3 = f1:6.3;
union foo4 = f1:9.3;
union foo5 = f1:12.3;
union foo6 = f2:0.3;
union foo7 = f2:3.3;

Such a syntax, if allowed, would make it possible for code to use bitfields in a portable fashion, without regard for word sizes or byte orderings (foo1 would be in the three least-significant bits of f1, but those could be stored at the lower or higher address). Absent such a feature, however, macros are probably the only portable way to operate with such things.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Raffaello | View Question on Stackoverflow
Solution 1 - C++ | didierc | View Answer on Stackoverflow
Solution 2 - C++ | Jack | View Answer on Stackoverflow
Solution 3 - C++ | 2501 | View Answer on Stackoverflow
Solution 4 - C++ | Kos | View Answer on Stackoverflow
Solution 5 - C++ | supercat | View Answer on Stackoverflow