Extending a struct in C


C Problem Overview


I recently came across a colleague's code that looked like this:

typedef struct A {
  int x;
}A;

typedef struct B {
  A a;
  int d;
}B;

void fn(){
  B *b;
  ((A*)b)->x = 10;
}

His explanation was that, since struct A is the first member of struct B, ((A*)b)->x is the same as b->a.x and is more readable.
This makes sense, but is this considered good practice? And will this work across platforms? Currently this runs fine on GCC.

C Solutions


Solution 1 - C

Yes, it will work cross-platform (a), but that doesn't necessarily make it a good idea.

As per the ISO C standard (all citations below are from C11), 6.7.2.1 Structure and union specifiers /15, padding is not allowed before the first element of a structure.

In addition, 6.2.7 Compatible type and composite type states that:

>Two types have compatible type if their types are the same

and it is undisputed that the A and A-within-B types are identical.

This means that the memory accesses to the A fields will be the same in both A and B types, as would the more sensible b->a.x which is probably what you should be using if you have any concerns about maintainability in future.

And, though you would normally have to worry about strict type aliasing, I don't believe that applies here. Accessing an object through a pointer to an incompatible type is generally not allowed, but the standard has specific exceptions.

6.5 Expressions /7 states some of those exceptions, with the footnote:

>The intent of this list is to specify those circumstances in which an object may or may not be aliased.

The exceptions listed are:

  • a type compatible with the effective type of the object;
  • some other exceptions which need not concern us here; and
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union).

That, combined with the struct padding rules mentioned above, including the phrase:

>A pointer to a structure object, suitably converted, points to its initial member

seems to indicate this example is specifically allowed for. The core point we have to remember here is that the type of the expression ((A*)b) is A*, not B*. That makes the variables compatible for the purposes of unrestricted aliasing.
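As a minimal sketch of that reading (the helper name set_x_via_cast is hypothetical and not part of the question), the function below writes through the cast pointer and reads the value back through the named member. It assumes b points at a real B object, unlike the uninitialised pointer in the question.

```c
#include <assert.h>

typedef struct A {
  int x;
} A;

typedef struct B {
  A a;   /* must stay the first member for the cast below to be valid */
  int d;
} B;

/* Hypothetical helper: writes via the cast, reads back via b->a.x. */
int set_x_via_cast(B *b, int value) {
  /* A pointer to B, suitably converted, points at its initial member. */
  assert((void *)b == (void *)&b->a);
  ((A *)b)->x = value;
  return b->a.x;
}
```

Both spellings name the same int object, so the value written through ((A *)b)->x is the one read back through b->a.x.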

That's my reading of the relevant portions of the standard. I've been wrong before (b), but I doubt it in this case.

So, if you have a genuine need for this, it will work okay but I'd be documenting any constraints in the code very close to the structures so as to not get bitten in future.
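One way to document the constraint next to the structures, sketched here under the assumption that a C11 compiler is available, is a compile-time check on the member offset. The assertion message is illustrative only.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof */

typedef struct A {
  int x;
} A;

typedef struct B {
  A a;    /* base part: code elsewhere casts B* to A* */
  int d;
} B;

/* Fails the build if someone moves `a` away from the start of B. */
static_assert(offsetof(B, a) == 0, "A must be the first member of B");
```

With this in place, inserting a field in front of a becomes a compile error rather than a silent runtime bug.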


(a) In the general sense. Of course, the code snippet:

B *b;
((A*)b)->x = 10;

will be undefined behaviour because b is not initialised to something sensible. But I'm going to assume this is just example code meant to illustrate your question. If anyone's concerned about it, think of it instead as:

B b, *pb = &b;
((A*)pb)->x = 10;

(b) As my wife will tell you, frequently and with little prompting :-)

Solution 2 - C

I'll go out on a limb and oppose @paxdiablo on this one: I think it's a fine idea, and it's very common in large, production-quality code.

It's basically the most obvious and nice way to implement inheritance-based object oriented data structures in C. Starting the declaration of struct B with an instance of struct A means "B is a sub-class of A". The fact that the first structure member is guaranteed to be 0 bytes from the start of the structure is what makes it work safely, and it's borderline beautiful in my opinion.
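A rough sketch of the pattern follows. The Animal and Dog types and the function names are hypothetical, chosen only for illustration; they are not taken from GObject or any other library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "base class". */
typedef struct Animal {
  int legs;
} Animal;

/* Hypothetical "subclass": the base must be the first member. */
typedef struct Dog {
  Animal base;
  int tricks;
} Dog;

/* Code written against the base type only. */
int animal_legs(const Animal *a) {
  return a->legs;
}

/* Upcast: valid because `base` sits at offset 0 of Dog. */
const Animal *dog_as_animal(const Dog *d) {
  assert(d != NULL);
  return (const Animal *)d;
}
```

Any function that accepts an Animal pointer can then be handed a Dog, which is the "is a sub-class of" relationship described above.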

It's widely used and deployed in code based on the GObject library, such as the GTK+ user interface toolkit and the GNOME desktop environment.

Of course, it requires you to "know what you're doing", but that is generally always the case when implementing complicated type relationships in C. :)

In the case of GObject and GTK+, there's plenty of support infrastructure and documentation to help with this: it's quite hard to forget about it. It might mean that creating a new class isn't something you do just as quickly as in C++, but that's perhaps to be expected since there's no native support in C for classes.

Solution 3 - C

That's a horrible idea. As soon as someone comes along and inserts another field at the front of struct B your program blows up. And what is so wrong with b.a.x?

Solution 4 - C

Anything that circumvents type checking should generally be avoided. This hack relies on the order of the declarations, and neither the cast nor this order can be enforced by the compiler.

It should work cross-platform, but I don't think it is a good practice.

If you really have deeply nested structures (you might have to wonder why, however), then you should use a temporary local variable to access the fields:

A deep_a = e->d.c.b.a;
deep_a.x = 10;
deep_a.y = deep_a.x + 72;
e->d.c.b.a = deep_a;

Or, if you don't want to copy a along:

A* deep_a = &(e->d.c.b.a);
deep_a->x = 10;
deep_a->y = deep_a->x + 72;

This shows from where a comes and it doesn't require a cast.

Java and C# also regularly expose constructs like "c.b.a", so I don't see what the problem is. If what you want to simulate is object-oriented behaviour, then you should consider using an object-oriented language (like C++), since "extending structs" in the way you propose provides neither encapsulation nor runtime polymorphism (although one may argue that ((A*)b) is akin to a "dynamic cast").

Solution 5 - C

I am sorry to disagree with all the other answers here, but this system is not compliant with standard C. It is not acceptable to have two pointers with different types which point to the same location at the same time. This is called aliasing and is not allowed by the strict aliasing rules in C99 and many other standards. A less ugly way of doing this would be to use inline getter functions, which then do not have to look neat in that way. Or perhaps this is a job for a union, which is specifically allowed to hold one of several types? However, there are a myriad of other drawbacks there too.

In short, this kind of dirty casting to create polymorphism is not allowed by most C standards. Just because it seems to work on your compiler does not mean it is acceptable. See here for an explanation of why it is not allowed, and why compilers at high optimization levels can break code which does not follow these rules: http://en.wikipedia.org/wiki/Aliasing_%28computing%29#Conflicts_with_optimization
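The inline getter suggestion might look something like the sketch below. The name B_get_a is hypothetical. The function takes the address of the member directly, so no cast is involved.

```c
#include <assert.h>
#include <stddef.h>

typedef struct A {
  int x;
} A;

typedef struct B {
  A a;
  int d;
} B;

/* Hypothetical accessor: returns the embedded A without a cast,
   so it keeps working even if the members of B are reordered. */
static inline A *B_get_a(B *b) {
  assert(b != NULL);
  return &b->a;
}
```

A call such as B_get_a(pb)->x = 10 then reads much like the cast version while staying fully type-checked.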

Solution 6 - C

Yes, it will work. And it is one of the core principles of object-oriented programming in C. See this answer (https://stackoverflow.com/questions/415452/object-orientation-in-c/415536#415536) for more examples of extending (i.e. inheritance).

Solution 7 - C

This is perfectly legal, and, in my opinion, pretty elegant. For an example of this in production code, see the GObject docs:

> Thanks to these simple conditions, it is possible to detect the type of every object instance by doing:
>
> B *b;
> b->parent.parent.g_class->g_type
>
> or, more quickly:
>
> B *b;
> ((GTypeInstance*)b)->g_class->g_type

Personally, I think that unions are ugly and tend to lead towards huge switch statements, which is a big part of what you've worked to avoid by writing OO code. I write a significant amount of code myself in this style --- typically, the first member of the struct contains function pointers that can be made to work like a vtable for the type in question.
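A minimal sketch of that vtable style is shown below. The Shape, Rect and ShapeOps names are hypothetical and only illustrate the idea; they are not GObject's actual types.

```c
#include <assert.h>
#include <stddef.h>

struct Shape;  /* forward declaration for the vtable signature */

/* Hypothetical vtable: one function pointer per "virtual method". */
typedef struct ShapeOps {
  int (*area)(const struct Shape *self);
} ShapeOps;

/* Base type: its first member points at the vtable. */
typedef struct Shape {
  const ShapeOps *ops;
} Shape;

/* Derived type: embeds the base as its first member. */
typedef struct Rect {
  Shape base;
  int w, h;
} Rect;

static int rect_area(const Shape *self) {
  /* Valid only because self is the first member of a Rect. */
  const Rect *r = (const Rect *)self;
  return r->w * r->h;
}

static const ShapeOps rect_ops = { rect_area };

void rect_init(Rect *r, int w, int h) {
  r->base.ops = &rect_ops;
  r->w = w;
  r->h = h;
}

/* Dispatches through the vtable without knowing the concrete type. */
int shape_area(const Shape *s) {
  assert(s != NULL && s->ops != NULL);
  return s->ops->area(s);
}
```

Callers only ever see Shape pointers, so adding another derived type means adding another ops table rather than another case in a switch.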

Solution 8 - C

I can see how this works, but I would not call this good practice. It depends on how the bytes of each data structure are laid out in memory. Casting one complicated data structure to another (i.e. structs) is not a very good idea, especially when the two structures are not the same size.

Solution 9 - C

I think the OP and many commenters have latched onto the idea that the code is extending a struct.

It is not.

This is an example of composition. Very useful. (Getting rid of the typedefs, here is a more descriptive example):

struct person {
  char name[MAX_STRING + 1];
  char address[MAX_STRING + 1];
};

struct item {
  int x;
};

struct accessory {
  int y;
};

/* fixed size memory buffer.
   The Linux kernel is full of embedded structs like this
*/
struct order {
  struct person customer;
  struct item items[MAX_ITEMS];
  struct accessory accessories[MAX_ACCESSORIES];
};

void fn(struct order *the_order){
  memcpy(the_order->customer.name, DEFAULT_NAME, sizeof(DEFAULT_NAME));
}

You have a fixed size buffer that is nicely compartmentalized. It sure beats a giant single tier struct.

struct double_order {
  struct order order;
  struct item extra_items[MAX_ITEMS];
  struct accessory extra_accessories[MAX_ACCESSORIES];
  
};

So now you have a second struct that can be treated (a la inheritance) exactly like the first with an explicit cast.

struct double_order d;
fn((struct order *)&d);

This preserves compatibility with code that was written to work with the smaller struct. Both the Linux kernel (http://lxr.free-electrons.com/source/include/linux/spi/spi.h (look at struct spi_device)) and the BSD sockets library (http://beej.us/guide/bgnet/output/html/multipage/sockaddr_inman.html) use this approach. In the kernel and sockets cases you have a struct that is run through both generic and differentiated sections of code. That is not all that different from the use case for inheritance.
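The "generic and differentiated" flow can be sketched roughly as follows. All names here are hypothetical and deliberately simpler than the real kernel or sockets structures: generic code sees only the shared header, and the type-specific branch casts back to the full struct.

```c
#include <assert.h>

enum msg_kind { MSG_PING = 1, MSG_TEXT = 2 };

/* Shared header: the first member of every message struct. */
struct msg_header {
  enum msg_kind kind;
};

struct ping_msg {
  struct msg_header hdr;
  int seq;
};

struct text_msg {
  struct msg_header hdr;
  int length;
};

/* Generic entry point: inspects the header, then hands off to the
   differentiated branch, which casts back to the full struct. */
int msg_payload(const struct msg_header *h) {
  switch (h->kind) {
  case MSG_PING:
    return ((const struct ping_msg *)h)->seq;
  case MSG_TEXT:
    return ((const struct text_msg *)h)->length;
  }
  assert(!"unknown message kind");
  return -1;
}
```

The cast back is only valid when the header really is the first member of the struct named by its kind field, so the tag and the layout have to be kept in step.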

I would NOT suggest writing structs like that just for readability.

Solution 10 - C

I think Postgres does this in some of their code as well. Not that it makes it a good idea, but it does say something about how widely accepted it seems to be.

Solution 11 - C

Perhaps you could consider using macros to implement this feature, putting the functions or fields that need to be reused into a macro.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | rubndsouza | View Question on Stackoverflow
Solution 1 - C | paxdiablo | View Answer on Stackoverflow
Solution 2 - C | unwind | View Answer on Stackoverflow
Solution 3 - C | jaket | View Answer on Stackoverflow
Solution 4 - C | dureuill | View Answer on Stackoverflow
Solution 5 - C | Vality | View Answer on Stackoverflow
Solution 6 - C | ItsMe | View Answer on Stackoverflow
Solution 7 - C | Patrick Collins | View Answer on Stackoverflow
Solution 8 - C | stack smasher | View Answer on Stackoverflow
Solution 9 - C | Joshua Clayton | View Answer on Stackoverflow
Solution 10 - C | Christian Convey | View Answer on Stackoverflow
Solution 11 - C | jiajia jiang | View Answer on Stackoverflow