How to create type safe enums?

C, Enums, Type Safety

C Problem Overview


Achieving type safety with enums in C is problematic, since they are essentially just integers. In fact, the standard defines enumeration constants to be of type int.

To achieve a bit of type safety I do tricks with pointers like this:

typedef enum
{
  BLUE,
  RED
} color_t;

void color_assign (color_t* var, color_t val) 
{ 
  *var = val; 
}

Because pointers have stricter type rules than values, this prevents code such as:

int x; 
color_assign(&x, BLUE); // compiler error

But it doesn't prevent code like this:

color_t color;
color_assign(&color, 123); // garbage value

This is because any int, whether an enumeration constant or a plain integer constant such as 123, can be implicitly converted and assigned to an enumeration variable.

Is there a way to write such a function or macro color_assign, that can achieve complete type safety even for enumeration constants?

C Solutions


Solution 1 - C

It is possible to achieve this with a few tricks. Given

typedef enum
{
  BLUE,
  RED
} color_t;

Then define a dummy union which won't be used by the caller, but contains members with the same names as the enumeration constants:

typedef union
{
  color_t BLUE;
  color_t RED;
} typesafe_color_t;

This is possible because enumeration constants and member/variable names reside in different namespaces.

Then make some function-like macros:

#define c_assign(var, val) (var) = (typesafe_color_t){ .val = val }.val
#define color_assign(var, val) _Generic((var), color_t: c_assign(var, val))

These macros are then called like this:

color_t color;
color_assign(color, BLUE); 

Explanation:

  • The C11 _Generic keyword ensures that the enumeration variable is of the correct type. However, this can't be used on the enumeration constant BLUE because it is of type int.
  • Therefore the helper macro c_assign creates a temporary instance of the dummy union, where the designated initializer syntax is used to assign the value BLUE to a union member named BLUE. If no such member exists, the code won't compile.
  • The union member of the corresponding type is then copied into the enum variable.

We actually don't need the helper macro; I just split the expression for readability. It works just as well to write

#define color_assign(var, val) _Generic((var), \
color_t: (var) = (typesafe_color_t){ .val = val }.val )

Examples:

color_t color; 
color_assign(color, BLUE);// ok
color_assign(color, RED); // ok

color_assign(color, 0);   // compiler error 

int x;
color_assign(x, BLUE);    // compiler error

typedef enum { foo } bar;
color_assign(color, foo); // compiler error

bar b;
color_assign(b, BLUE);    // compiler error

EDIT

Obviously, the above doesn't prevent the caller from simply writing color = garbage; instead. If you wish to block such direct assignment entirely, you can put the enum in a struct and use the standard procedure of private encapsulation with an "opaque type":

color.h

#include <stdlib.h>

typedef enum
{
  BLUE,
  RED
} color_t;

typedef union
{
  color_t BLUE;
  color_t RED;
} typesafe_color_t;

typedef struct col_t col_t; // opaque type

col_t* col_alloc (void);
void   col_free (col_t* col);

void col_assign (col_t* col, color_t color);

#define color_assign(var, val)   \
  _Generic( (var),               \
    col_t*: col_assign((var), (typesafe_color_t){ .val = val }.val) \
  )

color.c

#include "color.h"

struct col_t
{
  color_t color;
};

col_t* col_alloc (void) 
{ 
  return malloc(sizeof(col_t)); // (needs proper error handling)
}

void col_free (col_t* col)
{
  free(col);
}

void col_assign (col_t* col, color_t color)
{
  col->color = color;
}

main.c

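#include "color.h"

int main (void)
{
  col_t* color = col_alloc();
  if (color == NULL) return EXIT_FAILURE; // allocation failed

  color_assign(color, BLUE);

  col_free(color);
  return 0;
}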

Solution 2 - C

The top answer's pretty good, but it has the downsides that it requires a lot of the C99 and C11 feature set in order to compile, and on top of that, it makes assignment pretty unnatural: You have to use a magic color_assign() function or macro in order to move data around instead of the standard = operator.

(Admittedly, the question explicitly asked about how to write color_assign(), but if you look at the question more broadly, it's really about how to change your code to get type-safety with some form of enumerated constants, and I'd consider not needing color_assign() in the first place to get type-safety to be fair game for the answer.)

Pointers are among the few shapes that C treats as type-safe, so they make a natural candidate for solving this problem. So I'd attack it this way: Rather than using an enum, I'd sacrifice a little memory to be able to have unique, predictable pointer values, and then use some really hokey funky #define statements to construct my "enum" (yes, I know macros pollute the macro namespace, but enum pollutes the compiler's global namespace, so I consider it close to an even trade):

color.h:

typedef struct color_struct_t *color_t;

struct color_struct_t { char dummy; };

extern struct color_struct_t color_dummy_array[];

#define UNIQUE_COLOR(value) \
    (&color_dummy_array[value])

#define RED    UNIQUE_COLOR(0)
#define GREEN  UNIQUE_COLOR(1)
#define BLUE   UNIQUE_COLOR(2)

enum { MAX_COLOR_VALUE = 2 };

This does, of course, require that you have just enough memory reserved somewhere to ensure nothing else can ever take on those pointer values:

color.c:

#include "color.h"

/* This never actually gets used, but we need to declare enough space in the
 * BSS so that the pointer values can be unique and not accidentally reused
 * by anything else. */
struct color_struct_t color_dummy_array[MAX_COLOR_VALUE + 1];

But from the consumer's perspective, this is all hidden: color_t is very nearly an opaque object. You can't assign anything to it other than valid color_t values and NULL:

user.c:

#include <stddef.h>
#include "color.h"

void foo(void)
{
    color_t color;

    color = RED;    /* OK */
    color = GREEN;  /* OK */
    color = NULL;   /* OK */
    color = 27;     /* Error/warning */
}

This works well in most cases, but it does have the problem of not working in switch statements; you can't switch on a pointer (which is a shame). But if you're willing to add one more macro to make switching possible, you can arrive at something that's "good enough":

color.h:

...

#define COLOR_NUMBER(c) \
    ((c) - color_dummy_array)

user.c:

...

void bar(color_t c)
{
    switch (COLOR_NUMBER(c)) {
        case COLOR_NUMBER(RED):
            break;
        case COLOR_NUMBER(GREEN):
            break;
        case COLOR_NUMBER(BLUE):
            break;
    }
}

Is this a good solution? I wouldn't call it great, since it both wastes some memory and pollutes the macro namespace, and it doesn't let you use enum to automatically assign your color values, but it is another way to solve the problem that results in somewhat more natural usages, and unlike the top answer, it works all the way back to C89.

Solution 3 - C

One could enforce type safety with a struct:

struct color { enum { THE_COLOR_BLUE, THE_COLOR_RED } value; };
const struct color BLUE = { THE_COLOR_BLUE };
const struct color RED  = { THE_COLOR_RED  };

Since color is just a wrapped integer, it can be passed by value or by pointer as one would do with an int. With this definition of color, color_assign(&val, 3); fails to compile with:

> error: incompatible type for argument 2 of 'color_assign'
>    color_assign(&val, 3);
>                       ^


Full (working) example:

#include <stdio.h>

struct color { enum { THE_COLOR_BLUE, THE_COLOR_RED } value; };
const struct color BLUE = { THE_COLOR_BLUE };
const struct color RED  = { THE_COLOR_RED  };

void color_assign (struct color* var, struct color val) 
{ 
  var->value = val.value; 
}

const char* color_name(struct color val)
{
  switch (val.value)
  {
    case THE_COLOR_BLUE: return "BLUE";
    case THE_COLOR_RED:  return "RED";
    default:             return "?";
  }
}

int main(void)
{
  struct color val;
  color_assign(&val, BLUE);
  printf("color name: %s\n", color_name(val)); // prints "color name: BLUE"
}


Solution 4 - C

Ultimately, what you want is a warning or error when you use an invalid enumeration value.

As you say, the C language cannot do this. However, you can easily use a static analysis tool to catch this problem. Clang is the obvious free one, but there are plenty of others. Regardless of whether the language is type-safe, static analysis can detect and report the problem. Typically a static analysis tool puts up warnings, not errors, but you can easily have the tool report an error instead of a warning, and change your makefile or build project to handle this.

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Lundin | View Question on Stackoverflow
Solution 1 - C | Lundin | View Answer on Stackoverflow
Solution 2 - C | Sean Werkema | View Answer on Stackoverflow
Solution 3 - C | YSC | View Answer on Stackoverflow
Solution 4 - C | Graham | View Answer on Stackoverflow