NASM Vs GAS (Practical differences)

Tags: Assembly, x86, NASM, GNU Assembler, AT&T

Assembly Problem Overview


I'm not trying to start an Intel vs AT&T war (a moot point anyway, now that they both support Intel syntax) or to ask which one is "better" per se. I just want to know the practical differences in choosing one or the other.

Basically, when I was picking up some basic x86 assembly a few years back, I used NASM for no reason other than that the book I was reading did too, which put me firmly but involuntarily in the NASM camp. Since then, I've had very little cause to use assembly, so I haven't had the opportunity to try GAS.

Bearing in mind that they both support Intel syntax (which I personally prefer) and should, theoretically at least, produce the same binary (I know they probably won't but the meaning shouldn't be changed), what are the reasons to favour one or the other?

Is it command line options? Macros? Non-mnemonic keywords? Or something else?

Thanks :)

Assembly Solutions


Solution 1 - Assembly

NASM actually uses its own variation of Intel syntax, different from the MASM syntax used in Intel's official documentation. The opcode names and operand order are the same as Intel's, so the instructions look the same at first glance, but any significant program will have differences. For example, with MASM the instruction generated for MOV ax, foo depends on the type of foo, while NASM doesn't have types, so this always assembles to a move-immediate instruction that loads the address of foo. When the size of an operand can't be determined implicitly, MASM requires something like DWORD PTR, where NASM uses plain DWORD to mean the same thing. Most of the syntax beyond the instruction mnemonics and the basic operand format and ordering is different.

In terms of functionality NASM and GAS are pretty much the same. Both have assembler macro facilities, though NASM's is more extensive and more mature. Many GAS source code files use the C preprocessor instead of GAS's own macro support.

The biggest difference between the two assemblers is their support for 16-bit code. GAS doesn't have any support for defining x86 segments. With GAS you're limited to creating simple single-segment 16-bit binary images, basically just boot sectors and .COM files. NASM has full support for segments and supports OMF format object files which you can use with a suitable linker to create segmented 16-bit executables.

In addition to the OMF object file format, NASM supports a number of formats that GAS doesn't. GAS normally only supports the native format for the machine it's running on, basically ELF, PE-COFF, or Mach-O. If you want to support a different format, you need to build a "cross-compiling" version of GAS for that format.

Another notable difference is that GAS has support for creating DWARF and Windows 64-bit unwind information (the latter required by the Windows x64 ABI), while with NASM you have to create the sections and fill in the data yourself.

Solution 2 - Assembly

Intel Syntax: mov eax, 1 (instruction destination, source)

AT&T Syntax: movl $1, %eax (instruction source, destination)

The Intel syntax is pretty self-explanatory. In the above example, the amount of data moved is inferred from the size of the register (32 bits in the case of eax). The addressing mode used is inferred from the operands themselves.

There are some quirks when it comes to the AT&T syntax. Firstly, notice the l suffix at the end of the mov instruction. It stands for long and signifies 32 bits of data. Other instruction suffixes include w for a word (16 bits, not to be confused with the word size of your CPU!), q for a quad-word (64 bits) and b for a single byte. Although the suffix is not always required, assembly code written in AT&T syntax typically states explicitly how much data the instruction operates on.
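The suffix rule above can be modelled in a few lines of Python. This is only an illustrative sketch, not part of any assembler: the helper name split_att_mnemonic and the short list of base mnemonics are assumptions made for the example.

```python
# Illustrative model only: AT&T size suffixes and the operand width they imply.
ATT_SUFFIX_BITS = {"b": 8, "w": 16, "l": 32, "q": 64}

# Hypothetical, deliberately tiny list of base mnemonics for this sketch.
BASE_MNEMONICS = ("mov", "add", "sub", "cmp")


def split_att_mnemonic(mnemonic):
    """Split e.g. 'movl' into ('mov', 32).

    Returns (mnemonic, None) when no size suffix is recognised.
    """
    base, suffix = mnemonic[:-1], mnemonic[-1:]
    if base in BASE_MNEMONICS and suffix in ATT_SUFFIX_BITS:
        return base, ATT_SUFFIX_BITS[suffix]
    return mnemonic, None


if __name__ == "__main__":
    for name in ("movb", "movw", "movl", "movq", "sub"):
        print(name, split_att_mnemonic(name))
```

The base-mnemonic check is there because a mnemonic such as sub already ends in a letter that doubles as a suffix, so the last character alone is not enough to decide.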

More explicitness is required when it comes to the addressing mode used for the source and destination operands. $ signifies immediate addressing, i.e. use the value in the instruction itself. In the above example, if it were written without the $, direct addressing would be used, i.e. the CPU would try to fetch the value at memory address 1 (which will more than likely result in a segmentation fault). The % signifies register addressing. If you didn't include it in the above example, eax would be treated as a symbol, i.e. a labelled memory address, which would more than likely result in an undefined reference at link time. So you must be explicit about the addressing mode used for both the source and the destination operand.
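As a rough illustration of the prefix rules described above, the following Python sketch classifies an AT&T operand string by its leading character. The function name classify_att_operand is hypothetical, and the model ignores less common forms such as the * prefix used for indirect jumps.

```python
def classify_att_operand(operand):
    """Classify an AT&T-syntax operand string (simplified model).

    '$...'  -> immediate value encoded in the instruction
    '%reg'  -> register
    other   -> memory reference (bare number, symbol, or offset(base,...))
    """
    operand = operand.strip()
    if operand.startswith("$"):
        return "immediate"
    # A leading % followed by ':' is a segment override on a memory operand,
    # e.g. %fs:0, so only a plain %name counts as a register here.
    if operand.startswith("%") and ":" not in operand:
        return "register"
    return "memory"


if __name__ == "__main__":
    for op in ("$1", "%eax", "1", "eax", "8(%ebp)"):
        print(f"{op:10} -> {classify_att_operand(op)}")
```

Note how both 1 and eax fall into the memory case once their prefix is dropped, which is the pitfall the paragraph above warns about.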

The way memory operands are specified is also different:

Intel: [base register + index register * scale + offset]

AT&T: offset(base register, index register, scale)

Here scale is the size of each indexed element and must be 1, 2, 4, or 8.

The Intel syntax makes it a little clearer what calculation is taking place to find the memory address. With the AT&T syntax, the result is the same but you are expected to know the calculation taking place.
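To make the equivalence concrete, here is a small Python sketch that renders the same memory operand in both notations and computes the address the CPU would use. The function names and the example registers (ebx, esi) are assumptions chosen for illustration. The code only models the arithmetic and does not assemble anything.

```python
VALID_SCALES = (1, 2, 4, 8)


def effective_address(base, index, scale, offset):
    """Compute base + index * scale + offset from concrete register values."""
    if scale not in VALID_SCALES:
        raise ValueError("scale must be 1, 2, 4 or 8")
    return base + index * scale + offset


def intel_operand(base_reg, index_reg, scale, offset):
    """Render the operand the way NASM-style Intel syntax writes it."""
    return f"[{base_reg} + {index_reg}*{scale} + {offset}]"


def att_operand(base_reg, index_reg, scale, offset):
    """Render the same operand in AT&T syntax: offset(base,index,scale)."""
    return f"{offset}(%{base_reg},%{index_reg},{scale})"


if __name__ == "__main__":
    print(intel_operand("ebx", "esi", 4, 8))
    print(att_operand("ebx", "esi", 4, 8))
    # With ebx = 0x1000 and esi = 3, both forms address the same location.
    print(hex(effective_address(0x1000, 3, 4, 8)))
```

Both strings describe one addressing mode. Only the spelling differs, which is why the choice between them does not change the generated machine code.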

> should, theoretically at least, produce the same binary

This is entirely dependent on your toolchain.

> what are the reasons to favour one or the other?

Personal preference, of course. In my opinion it comes down to which syntax you feel more comfortable with when addressing memory. Do you prefer the forced explicitness of the AT&T syntax? Or do you prefer your assembler figuring out these low-level minutiae for you?

> Is it command line options? Macros? Non-mnemonic keywords?

This has to do with the assembler (GAS, NASM) itself. Again, personal preference.

Solution 3 - Assembly

Why not check this post?

> One of the biggest differences between NASM and GAS is the syntax. GAS uses the AT&T syntax, a relatively archaic syntax that is specific to GAS and some older assemblers, whereas NASM uses the Intel syntax, supported by a majority of assemblers such as TASM and MASM. (Modern versions of GAS do support a directive called .intel_syntax, which allows the use of Intel syntax with GAS.)

It covers:

  • Basic syntactical differences between NASM and GAS
  • Common assembly-level constructs such as variables, loops, labels, and macros
  • A bit about calling external C routines and using functions
  • Assembly mnemonic differences and usage
  • Memory addressing methods

A good exercise is to write hello_world in both dialects to get a concrete feel for the differences.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Elliott | View Question on Stackoverflow
Solution 1 - Assembly | Ross Ridge | View Answer on Stackoverflow
Solution 2 - Assembly | uname01 | View Answer on Stackoverflow
Solution 3 - Assembly | Izana | View Answer on Stackoverflow