Detecting endianness programmatically in a C++ program

C++, Algorithm, Endianness

C++ Problem Overview


Is there a programmatic way to detect whether or not you are on a big-endian or little-endian architecture? I need to be able to write code that will execute on an Intel or PPC system and use exactly the same code (i.e. no conditional compilation).

C++ Solutions


Solution 1 - C++

I don't like the method based on type punning; compilers will often warn about it. That's exactly what unions are for!

#include <stdint.h>

bool is_big_endian(void)
{
    union {
        uint32_t i;
        char c[4];
    } bint = {0x01020304};

    return bint.c[0] == 1; 
}

The principle is equivalent to the type cast suggested by others, but this is clearer and, according to C99, guaranteed to be correct. gcc prefers this to the direct pointer cast.

This is also much better than fixing the endianness at compile time: for operating systems that support multiple architectures (fat binaries on Mac OS X, for example), this will work for both ppc and i386, whereas it is very easy to mess things up otherwise.
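For C++ code, where reading a union member other than the one last written is not guaranteed by the standard, the same byte inspection can be sketched with std::memcpy. This is only an illustration and not part of the original answer; the names first_byte_of and is_big_endian_memcpy are made up for this example.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the byte stored at the lowest address of `value`.
inline unsigned char first_byte_of(std::uint32_t value)
{
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value); // copying the object representation is well defined
    return bytes[0];
}

// A big-endian machine stores the most significant byte (0x01) first.
inline bool is_big_endian_memcpy()
{
    return first_byte_of(0x01020304u) == 0x01;
}
```

Compilers typically fold the memcpy away, so this should cost no more than the union version.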

Solution 2 - C++

You can use std::endian if you have access to a C++20 compiler such as GCC 8+ or Clang 7+.

Note: std::endian began in <type_traits> but was moved to <bit> at the 2019 Cologne meeting. GCC 8 and Clang 7, 8 and 9 have it in <type_traits>, while GCC 9+ and Clang 10+ have it in <bit>.

#include <bit>

if constexpr (std::endian::native == std::endian::big)
{
    // Big endian system
}
else if constexpr (std::endian::native == std::endian::little)
{
    // Little endian system
}
else
{
    // Something else
}
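Because the header moved between compiler releases, one way to wrap the check is sketched below. It uses std::endian when the standard feature-test macro __cpp_lib_endian is defined and otherwise falls back to inspecting bytes at run time. The names byte_order and native_byte_order are hypothetical, and the fallback is an assumption of this sketch, not something the answer prescribes.

```cpp
#if defined(__has_include)
#  if __has_include(<bit>)
#    include <bit>   // home of std::endian in the final C++20 standard
#  endif
#endif
#include <cstdint>
#include <cstring>

enum class byte_order { little, big, other };

inline byte_order native_byte_order()
{
#if defined(__cpp_lib_endian)
    // C++20 path: resolved entirely at compile time.
    if constexpr (std::endian::native == std::endian::little)
        return byte_order::little;
    else if constexpr (std::endian::native == std::endian::big)
        return byte_order::big;
    else
        return byte_order::other;
#else
    // Fallback path: look at the byte stored at the lowest address.
    const std::uint32_t probe = 0x01020304u;
    unsigned char bytes[sizeof probe];
    std::memcpy(bytes, &probe, sizeof probe);
    if (bytes[0] == 0x04) return byte_order::little;
    if (bytes[0] == 0x01) return byte_order::big;
    return byte_order::other;
#endif
}
```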

Solution 3 - C++

You can do it by setting an int and masking off bits, but probably the easiest way is just to use the built in network byte conversion ops (since network byte order is always big endian).

if ( htonl(47) == 47 ) {
  // Big endian
} else {
  // Little endian.
}

Bit fiddling could be faster, but this way is simple, straightforward and pretty impossible to mess up.

Solution 4 - C++

Please see this article:

> Here is some code to determine what is the type of your machine

int num = 1;
if(*(char *)&num == 1)
{
    printf("\nLittle-Endian\n");
}
else
{
    printf("Big-Endian\n");
}

Solution 5 - C++

This is normally done at compile time (especially for performance reasons) by using the header files available from the compiler, or by creating your own. On Linux you have the header file "/usr/include/endian.h".

Solution 6 - C++

I'm surprised no one has mentioned the macros which the preprocessor defines by default. While these will vary depending on your platform, they are much cleaner than having to write your own endian check.

For example, if we look at the built-in macros which GCC defines (on an x86-64 machine):

:| gcc -dM -E -x c - |grep -i endian
#define __LITTLE_ENDIAN__ 1

On a PPC machine I get:

:| gcc -dM -E -x c - |grep -i endian
#define __BIG_ENDIAN__ 1
#define _BIG_ENDIAN 1

(The :| gcc -dM -E -x c - magic prints out all built-in macros).

Solution 7 - C++

Ehm... It surprises me that no one has realized that the compiler will simply optimize the test out and put in a fixed result as the return value. This renders all the code examples above effectively useless. The only thing that would be returned is the endianness at compile time! And yes, I tested all of the above examples. Here's an example with MSVC 9.0 (Visual Studio 2008).

Pure C code

int32 DNA_GetEndianness(void)
{
    union 
    {
        uint8  c[4];
        uint32 i;
    } u;

    u.i = 0x01020304;

    if (0x04 == u.c[0])
        return DNA_ENDIAN_LITTLE;
    else if (0x01 == u.c[0])
        return DNA_ENDIAN_BIG;
    else
        return DNA_ENDIAN_UNKNOWN;
}

Disassembly

PUBLIC	_DNA_GetEndianness
; Function compile flags: /Ogtpy
; File c:\development\dna\source\libraries\dna\endian.c
;	COMDAT _DNA_GetEndianness
_TEXT	SEGMENT
_DNA_GetEndianness PROC	                ; COMDAT

; 11   :     union 
; 12   :     {
; 13   :         uint8  c[4];
; 14   :         uint32 i;
; 15   :     } u;
; 16   : 
; 17   :     u.i = 1;
; 18   : 
; 19   :     if (1 == u.c[0])
; 20   :         return DNA_ENDIAN_LITTLE;

    mov	eax, 1

; 21   :     else if (1 == u.c[3])
; 22   :         return DNA_ENDIAN_BIG;
; 23   :     else
; 24   :        return DNA_ENDIAN_UNKNOWN;
; 25   : }

    ret
_DNA_GetEndianness ENDP
END

Perhaps it is possible to turn off ANY compile-time optimization for just this function, but I don't know. Otherwise it's maybe possible to hardcode it in assembly, although that's not portable. And even then even that might get optimized out. It makes me think I need some really crappy assembler, implement the same code for all existing CPUs/instruction sets, and well.... never mind.

Also, someone here said that endianness does not change during run-time. WRONG. There are bi-endian machines out there. Their endianness can vary during execution. ALSO, there's not only Little Endian and Big Endian, but also other endiannesses (what a word).
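To illustrate the point about other byte orders, here is a small sketch that looks at all four bytes of a 32-bit probe value instead of only the first one, so it can also recognize the PDP-style "middle-endian" layout. The names byte_layout, classify_layout and native_layout are hypothetical, and the function only reports the layout in effect when it runs.

```cpp
#include <cstdint>
#include <cstring>

enum class byte_layout { little, big, pdp, unknown };

// Classifies how the bytes of 0x01020304 appear in memory, lowest address first.
inline byte_layout classify_layout(const unsigned char* b)
{
    if (b[0] == 0x04 && b[1] == 0x03 && b[2] == 0x02 && b[3] == 0x01)
        return byte_layout::little;
    if (b[0] == 0x01 && b[1] == 0x02 && b[2] == 0x03 && b[3] == 0x04)
        return byte_layout::big;
    if (b[0] == 0x02 && b[1] == 0x01 && b[2] == 0x04 && b[3] == 0x03)
        return byte_layout::pdp; // 16-bit words swapped, bytes little-endian within each word
    return byte_layout::unknown;
}

// Applies the classification to the machine the code is running on.
inline byte_layout native_layout()
{
    const std::uint32_t probe = 0x01020304u;
    unsigned char bytes[sizeof probe];
    std::memcpy(bytes, &probe, sizeof probe);
    return classify_layout(bytes);
}
```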

I hate and love coding at the same time...

Solution 8 - C++

Declare an int variable:

int variable = 0xFF;

Now use char* pointers to various parts of it and check what is in those parts.

char* startPart = reinterpret_cast<char*>( &variable );
char* endPart = reinterpret_cast<char*>( &variable ) + sizeof( int ) - 1;

Depending on which one points to 0xFF byte now you can detect endianness. This requires sizeof( int ) > sizeof( char ), but it's definitely true for the discussed platforms.

Solution 9 - C++

Do not use a union!

C++ does not permit type punning via unions!
Reading from a union field that was not the last field written to is undefined behaviour!
Many compilers support doing so as an extension, but the language makes no guarantee.

See this answer for more details:

https://stackoverflow.com/a/11996970


There are only two valid answers that are guaranteed to be portable.

The first answer, if you have access to a system that supports C++20,
is to use std::endian from the <bit> header.

C++20 Onwards

constexpr bool is_little_endian = (std::endian::native == std::endian::little);

Prior to C++20, the only valid answer is to store an integer and then inspect its first byte through type punning. Unlike the use of unions, this is expressly allowed by C++'s type system.

It's also important to remember that for optimum portability static_cast should be used, because reinterpret_cast is implementation defined.

> If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
> ...
> a char or unsigned char type.

C++11 Onwards

enum class endianness
{
	little = 0,
	big = 1,
};

inline endianness get_system_endianness()
{
	const int value { 0x01 };
	const void * address = static_cast<const void *>(&value);
	const unsigned char * least_significant_address = static_cast<const unsigned char *>(address);
	return (*least_significant_address == 0x01) ? endianness::little : endianness::big;
}

C++11 Onwards (without enum)

inline bool is_system_little_endian()
{
	const int value { 0x01 };
	const void * address = static_cast<const void *>(&value);
	const unsigned char * least_significant_address = static_cast<const unsigned char *>(address);
	return (*least_significant_address == 0x01);
}

C++98/C++03

inline bool is_system_little_endian()
{
	const int value = 0x01;
	const void * address = static_cast<const void *>(&value);
	const unsigned char * least_significant_address = static_cast<const unsigned char *>(address);
	return (*least_significant_address == 0x01);
}

Solution 10 - C++

For further details, you may want to check out this CodeProject article, Basic concepts on Endianness (http://www.codeproject.com/KB/cpp/endianness.aspx):

> How to dynamically test for the Endian type at run time?
>
> As explained in Computer Animation FAQ, you can use the following function to see if your code is running on a Little- or Big-Endian system:

#define BIG_ENDIAN 0
#define LITTLE_ENDIAN 1

int TestByteOrder()
{
   short int word = 0x0001;
   char *byte = (char *) &word;
   return(byte[0] ? LITTLE_ENDIAN : BIG_ENDIAN);
}

> This code assigns the value 0001h to a 16-bit integer. A char pointer is then assigned to point at the first (least-significant) byte of the integer value. If the first byte of the integer is 0x01h, then the system is Little-Endian (the 0x01h is in the lowest, or least-significant, address). If it is 0x00h then the system is Big-Endian.

Solution 11 - C++

The C++ way has been to use boost, where preprocessor checks and casts are compartmentalized away inside very thoroughly-tested libraries.

The Predef Library (boost/predef.h) recognizes four different kinds of endianness.

The Endian Library was planned to be submitted to the C++ standard, and supports a wide variety of operations on endian-sensitive data.

As stated in the answers above, endianness detection is part of C++20.

Solution 12 - C++

Unless you're using a framework that has been ported to PPC and Intel processors, you will have to do conditional compiles, since PPC and Intel platforms have completely different hardware architectures, pipelines, busses, etc. This renders the assembly code completely different between the two.

As for finding endianness, do the following:

short temp = 0x1234;
char* tempChar = (char*)&temp;

*tempChar will be either 0x12 or 0x34, from which you will know the endianness: 0x12 means big-endian and 0x34 means little-endian.

Solution 13 - C++

As stated above, use union tricks.

There are a few problems with the ones advised above though, most notably that unaligned memory access is notoriously slow on most architectures, and some compilers won't even recognize such constant predicates at all unless they are word-aligned.

Because a mere endian test is boring, here is a (template) function that will flip the input/output of an arbitrary integer according to your spec, regardless of host architecture.

#include <stdint.h>

#define BIG_ENDIAN 1
#define LITTLE_ENDIAN 0

template <typename T>
T endian(T w, uint32_t endian)
{
    // this gets optimized out into if (endian == host_endian) return w;
    union { uint64_t quad; uint32_t islittle; } t;
    t.quad = 1;
    if (t.islittle ^ endian) return w;
    T r = 0;

    // decent compilers will unroll this (gcc)
    // or even convert straight into single bswap (clang)
    for (unsigned i = 0; i < sizeof(r); i++) {
        r <<= 8;
        r |= w & 0xff;
        w >>= 8;
    }
    return r;
}

Usage:

To convert from given endian to host, use:

host = endian(source, endian_of_source)

To convert from host endian to given endian, use:

output = endian(hostsource, endian_you_want_to_output)

The resulting code is as fast as hand-written assembly on clang; on gcc it's a tad slower (unrolled &, <<, >>, | for every byte) but still decent.
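To make the byte-reversal step concrete, here is the loop on its own, separated from the host check, as a small self-contained sketch. The name swap_bytes and the restriction to unsigned types are choices made for this example, not part of the answer's code.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Reverses the byte order of an unsigned integer, one byte per iteration.
template <typename T>
T swap_bytes(T w)
{
    static_assert(std::is_unsigned<T>::value, "swap_bytes expects an unsigned integer type");
    T r = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        r = static_cast<T>((r << 8) | (w & 0xff)); // append the lowest byte of w to r
        w = static_cast<T>(w >> 8);                // move on to the next byte of w
    }
    return r;
}
```

For instance, swap_bytes<std::uint32_t>(0x01020304u) yields 0x04030201, and applying the function twice returns the original value.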

Solution 14 - C++

bool isBigEndian()
{
    static const uint16_t m_endianCheck(0x00ff);
    return ( *((const uint8_t*)&m_endianCheck) == 0x0); 
}

Solution 15 - C++

I would do something like this:

bool isBigEndian() {
    static unsigned long x(1);
    static bool result(reinterpret_cast<unsigned char*>(&x)[0] == 0);
    return result;
}

Along these lines, you would get a time efficient function that only does the calculation once.

Solution 16 - C++

Note: My initial post incorrectly described this as "compile time". It is not; that is even impossible in the current C++ standard. constexpr does NOT mean the function always does its computation at compile time. Thanks to Richard Hodges for the correction.

compile time, non-macro, C++11 constexpr solution:

union {
  uint16_t s;
  unsigned char c[2];
} constexpr static  d {1};

constexpr bool is_little_endian() {
  return d.c[0] == 1;
}

Solution 17 - C++

Untested, but to my mind this should work, because it'll be 0x01 on little-endian and 0x00 on big-endian:

bool runtimeIsLittleEndian(void)
{
 volatile uint16_t i=1;
 return  ((uint8_t*)&i)[0]==0x01;//0x01=little, 0x00=big
}

Solution 18 - C++

union {
	int i;
	char c[sizeof(int)];
} x;
x.i = 1;
if(x.c[0] == 1)
	printf("little-endian\n");
else
	printf("big-endian\n");

This is another solution. Similar to Andrew Hare's solution.

Solution 19 - C++

If you don't want conditional compilation you can just write endian independent code. Here is an example (taken from Rob Pike):

Reading an integer stored in little-endian on disk, in an endian independent manner:

i = (data[0]<<0) | (data[1]<<8) | (data[2]<<16) | (data[3]<<24);

The same code, trying to take into account the machine endianness:

i = *((int*)data);
#ifdef BIG_ENDIAN
/* swap the bytes */
i = ((i&0xFF)<<24) | (((i>>8)&0xFF)<<16) | (((i>>16)&0xFF)<<8) | (((i>>24)&0xFF)<<0);
#endif
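The endian-independent read can be wrapped in small helpers, sketched below together with the matching write. Using unsigned char and std::uint32_t avoids sign extension and shifting into the sign bit; the names load_le32 and store_le32 are made up for this example.

```cpp
#include <cstdint>

// Reads a 32-bit little-endian value from four bytes, whatever the host byte order.
inline std::uint32_t load_le32(const unsigned char* data)
{
    return  static_cast<std::uint32_t>(data[0])
         | (static_cast<std::uint32_t>(data[1]) << 8)
         | (static_cast<std::uint32_t>(data[2]) << 16)
         | (static_cast<std::uint32_t>(data[3]) << 24);
}

// Writes a 32-bit value as four little-endian bytes, whatever the host byte order.
inline void store_le32(unsigned char* out, std::uint32_t value)
{
    out[0] = static_cast<unsigned char>(value & 0xFF);
    out[1] = static_cast<unsigned char>((value >> 8) & 0xFF);
    out[2] = static_cast<unsigned char>((value >> 16) & 0xFF);
    out[3] = static_cast<unsigned char>((value >> 24) & 0xFF);
}
```

Because the shifts operate on values rather than on memory, the same source behaves identically on Intel and PPC without any conditional compilation.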

Solution 20 - C++

You can also do this via the preprocessor, using something like the Boost endian header file.

Solution 21 - C++

Unless the endian header is GCC-only, it provides macros you can use.

#include "endian.h"
...
if (__BYTE_ORDER == __LITTLE_ENDIAN) { ... }
else if (__BYTE_ORDER == __BIG_ENDIAN) { ... }
else { throw std::runtime_error("Sorry, this version does not support PDP Endian!"); }
...

Solution 22 - C++

See Endianness - C-Level Code illustration.

// assuming target architecture is 32-bit = 4-Bytes
enum ENDIANNESS{ LITTLEENDIAN , BIGENDIAN , UNHANDLE };


ENDIANNESS CheckArchEndianalityV1( void )
{
    int Endian = 0x00000001; // assuming target architecture is 32-bit    

    // as Endian = 0x00000001 so MSB (Most Significant Byte) = 0x00 and LSB (Least Significant Byte) = 0x01
    // casting down to a single byte value LSB discarding higher bytes    

    return (*(char *) &Endian == 0x01) ? LITTLEENDIAN : BIGENDIAN;
} 

Solution 23 - C++

int i=1;
char *c=(char*)&i;
bool littleendian=(*c==1); // test the first byte, not the pointer itself

Solution 24 - C++

The way C compilers (at least every one I know of) work, the endianness has to be decided at compile time. Even for bi-endian processors (like ARM and MIPS) you have to choose the endianness at compile time. Furthermore, the endianness is defined in all common file formats for executables (such as ELF). Although it is possible to craft a binary blob of bi-endian code (for some ARM server exploit, maybe?), it probably has to be done in assembly.

Solution 25 - C++

How about this?

#include <cstdio>

int main()
{
	unsigned int n = 1;
	char *p = 0;

	p = (char*)&n;
	if (*p == 1)
		std::printf("Little Endian\n");
	else 
		if (*(p + sizeof(int) - 1) == 1)
			std::printf("Big Endian\n");
		else
			std::printf("What the crap?\n");
	return 0;
}

Solution 26 - C++

Here's another C version. It defines a macro called wicked_cast() for inline type punning via C99 union literals and the non-standard __typeof__ operator.

#include <limits.h>

#if UCHAR_MAX == UINT_MAX
#error endianness irrelevant as sizeof(int) == 1
#endif

#define wicked_cast(TYPE, VALUE) \
	(((union { __typeof__(VALUE) src; TYPE dest; }){ .src = VALUE }).dest)

_Bool is_little_endian(void)
{
	return wicked_cast(unsigned char, 1u);
}

If integers are single-byte values, endianness makes no sense and a compile-time error will be generated.

Solution 27 - C++

While there is no quick and standard way to determine it, this will output it:

#include <stdio.h> 
int main()  
{ 
   unsigned int i = 1; 
   char *c = (char*)&i; 
   if (*c)     
       printf("Little endian"); 
   else
       printf("Big endian"); 
   getchar(); 
   return 0; 
} 

Solution 28 - C++

As pointed out by Coriiander, most (if not all) of the code samples here will be optimized away at compile time, so the generated binaries won't check "endianness" at run time.

It has been observed that a given executable shouldn't run in two different byte orders, but I have no idea if that is always the case, and it seems like a hack to me checking at compilation time. So I coded this function:

#include <stdint.h>
#include <stdlib.h> /* malloc, free */

int* _BE = 0;

int is_big_endian() {
    if (_BE == 0) {
        uint16_t* teste = (uint16_t*)malloc(4);
        *teste = (*teste & 0x01FE) | 0x0100;
        uint8_t teste2 = ((uint8_t*) teste)[0];
        free(teste);
        _BE = (int*)malloc(sizeof(int));
        *_BE = (0x01 == teste2);
    }
    return *_BE;
}

MinGW wasn't able to optimize this code, even though it does optimize the other code samples here away. I believe that is because I leave the "random" value that was allocated in the lower byte of memory as it was (at least 7 of its bits), so the compiler can't know what that random value is and it doesn't optimize the function away.

I've also coded the function so that the check is only performed once, and the return value is stored for next tests.

Solution 29 - C++

I was going through the textbook Computer Systems: A Programmer's Perspective, and there is a problem that asks you to determine the machine's endianness with a C program.

I used a pointer to do that, as follows:

#include <stdio.h>

int main(void){
    int i=1;
    unsigned char* ii = (unsigned char*)&i;

    printf("This computer is %s endian.\n", ((ii[0]==1) ? "little" : "big"));
    return 0;
}

An int takes up 4 bytes and a char takes up only 1 byte, so we can use a char pointer to point to an int with value 1. If the computer is little-endian, the char that the pointer points to has value 1; otherwise, its value is 0.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Jay T | View Question on Stackoverflow
Solution 1 - C++ | David Cournapeau | View Answer on Stackoverflow
Solution 2 - C++ | user3624760 | View Answer on Stackoverflow
Solution 3 - C++ | Eric Petroelje | View Answer on Stackoverflow
Solution 4 - C++ | Andrew Hare | View Answer on Stackoverflow
Solution 5 - C++ | bill | View Answer on Stackoverflow
Solution 6 - C++ | DaveR | View Answer on Stackoverflow
Solution 7 - C++ | Coriiander | View Answer on Stackoverflow
Solution 8 - C++ | sharptooth | View Answer on Stackoverflow
Solution 9 - C++ | Pharap | View Answer on Stackoverflow
Solution 10 - C++ | none | View Answer on Stackoverflow
Solution 11 - C++ | fuzzyTew | View Answer on Stackoverflow
Solution 12 - C++ | samoz | View Answer on Stackoverflow
Solution 13 - C++ | kat | View Answer on Stackoverflow
Solution 14 - C++ | Paolo Brandoli | View Answer on Stackoverflow
Solution 15 - C++ | Jeremy Mayhew | View Answer on Stackoverflow
Solution 16 - C++ | zhaorufei | View Answer on Stackoverflow
Solution 17 - C++ | hanshenrik | View Answer on Stackoverflow
Solution 18 - C++ | Neeraj | View Answer on Stackoverflow
Solution 19 - C++ | fjardon | View Answer on Stackoverflow
Solution 20 - C++ | nmushell | View Answer on Stackoverflow
Solution 21 - C++ | Mark A. Libby | View Answer on Stackoverflow
Solution 22 - C++ | gimel | View Answer on Stackoverflow
Solution 23 - C++ | Jon Bright | View Answer on Stackoverflow
Solution 24 - C++ | Fabel | View Answer on Stackoverflow
Solution 25 - C++ | Abhay | View Answer on Stackoverflow
Solution 26 - C++ | Christoph | View Answer on Stackoverflow
Solution 27 - C++ | yekanchi | View Answer on Stackoverflow
Solution 28 - C++ | Tex Killer | View Answer on Stackoverflow
Solution 29 - C++ | Archimedes520 | View Answer on Stackoverflow