Why does the order of '-l' option in gcc matter?

Tags: C, GCC, Linker, ld

C Problem Overview


I am trying to compile a program that uses the udis86 library. In fact, I am using an example program from the library's user manual. Compilation fails with the following errors:

example.c:(.text+0x7): undefined reference to 'ud_init'
example.c:(.text+0x7): undefined reference to 'ud_set_input_file'
.
.
example.c:(.text+0x7): undefined reference to 'ud_insn_asm'

The command I am using is:

$ gcc -ludis86 example.c -o example 

as instructed in the user-manual.

Clearly, the linker is unable to link the udis86 library. But if I change my command to:

$ gcc example.c -ludis86 -o example 

it works. Can someone please explain what is wrong with the first command?

C Solutions


Solution 1 - C

Because that's how the linking algorithm used by the GNU linker works (at least when it comes to linking static libraries). The linker is a single-pass linker and does not revisit libraries once it has seen them.

A library is a collection (an archive) of object files. When you add a library using the -l option, the linker does not unconditionally take all object files from the library. It only takes those object files that are currently needed, i.e. files that resolve some currently unresolved (pending) symbols. After that, the linker completely forgets about that library.

The linker maintains the list of pending symbols continuously as it processes the input object files, one after another, from left to right. As it processes each object file, some symbols get resolved and removed from the list, while newly discovered unresolved symbols get added to it.

So, if you included some library by using -l, the linker uses that library to resolve as many currently pending symbols as it can, and then completely forgets about that library. If it later suddenly discovers that it now needs some additional object file(s) from that library, the linker will not "return" to that library to retrieve those additional object files. It is already too late.

For this reason, it is always a good idea to put the -l option late on the linker's command line, so that by the time the linker gets to that -l it can reliably determine which object files it needs and which it doesn't. Placing the -l option as the very first parameter to the linker generally makes no sense at all: at the very beginning the list of pending symbols is empty (or, more precisely, consists of the single symbol main), meaning that the linker will not take anything from the library at all.

In your case, your object file example.o contains references to the symbols ud_init, ud_set_input_file, etc. The linker should receive that object file first. It will add these symbols to the list of pending symbols. After that you can use the -l option to add your library: -ludis86. The linker will search your library and take everything from it that resolves those pending symbols.

If you place the -ludis86 option first in the command line, the linker will effectively ignore your library, since at the beginning it does not know that it will need ud_init, ud_set_input_file etc. Later, when processing example.o it will discover these symbols and add them to the pending symbol list. But these symbols will remain unresolved to the end, since -ludis86 was already processed (and effectively ignored).
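The two orderings described above can be replayed in a small toy model. This is only a sketch of the single-pass rule, not the GNU linker's actual code: symbols are bits in a mask, the names Object, Input, LinkState and link_single_pass are made up for illustration, only two of the ud_* symbols are modelled, and the layout of libudis86.a (one function per archive member) is an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Toy model: every symbol is one bit in an unsigned mask. */
enum {
    SYM_MAIN        = 1u << 0,
    SYM_UD_INIT     = 1u << 1,
    SYM_UD_INSN_ASM = 1u << 2
};

typedef struct {
    unsigned defines;  /* symbols this object file defines */
    unsigned needs;    /* symbols this object file references */
} Object;

typedef struct {
    int is_archive;         /* 0 = plain object file, 1 = library (-l) */
    const Object *members;  /* the single object, or the archive members */
    size_t count;
} Input;

typedef struct {
    unsigned defined;  /* symbols resolved so far */
    unsigned pending;  /* referenced but still unresolved */
} LinkState;

/* Link one object file in: record its definitions, queue its references. */
static void add_object(LinkState *st, const Object *obj)
{
    st->defined |= obj->defines;
    st->pending = (st->pending | obj->needs) & ~st->defined;
}

/* Visit each input exactly once, left to right.
   Returns the mask of symbols that are still unresolved at the end. */
unsigned link_single_pass(const Input *inputs, size_t n)
{
    LinkState st = { 0u, SYM_MAIN };  /* initially only main is pending */

    for (size_t i = 0; i < n; i++) {
        const Input *in = &inputs[i];
        if (!in->is_archive) {
            add_object(&st, &in->members[0]);  /* objects are always taken */
            continue;
        }
        unsigned taken = 0u;  /* one bit per archive member already pulled in */
        assert(in->count <= sizeof taken * CHAR_BIT);
        int progress = 1;
        while (progress) {  /* repeat until this archive yields nothing new */
            progress = 0;
            for (size_t m = 0; m < in->count; m++) {
                int is_new = !(taken & (1u << m));
                if (is_new && (in->members[m].defines & st.pending)) {
                    add_object(&st, &in->members[m]);
                    taken |= 1u << m;
                    progress = 1;
                }
            }
        }
        /* The archive is now forgotten; later inputs cannot reopen it. */
    }
    return st.pending;
}

/* Assumed contents: example.o from the question, and a libudis86.a
   in which each function lives in its own member. */
static const Object example_o = { SYM_MAIN, SYM_UD_INIT | SYM_UD_INSN_ASM };
static const Object udis86_a[] = {
    { SYM_UD_INIT, 0u },
    { SYM_UD_INSN_ASM, 0u }
};

/* Models: gcc -ludis86 example.o */
unsigned link_library_first(void)
{
    const Input order[] = { { 1, udis86_a, 2 }, { 0, &example_o, 1 } };
    return link_single_pass(order, 2);
}

/* Models: gcc example.o -ludis86 */
unsigned link_object_first(void)
{
    const Input order[] = { { 0, &example_o, 1 }, { 1, udis86_a, 2 } };
    return link_single_pass(order, 2);
}
```

In this model link_library_first() returns a mask with both ud_* bits still set (the "undefined reference" case), while link_object_first() returns 0 because the library is searched only after example.o has queued its references.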

Sometimes, when two (or more) libraries refer to each other in a circular fashion, one might even need to use the -l option twice with the same library, to give the linker two chances to retrieve the necessary object files from that library.
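To make the circular case concrete, here is a minimal sketch of what such a pair of libraries might contain. All names (a_entry, a_leaf, b_helper, libA.a, libB.a, and the file names in the comments) are hypothetical. The three sections are shown in one file so the snippet compiles on its own; in a real project they would be separate source files archived into two static libraries.

```c
/* --- a2.c, would be member a2.o of libA.a: depends on nothing --- */
int a_leaf(int x)
{
    return x + 1;
}

/* --- b.c, would be member b.o of libB.a: depends on libA --- */
int a_leaf(int x);  /* provided by libA.a (a2.o) */

int b_helper(int x)
{
    return 2 * a_leaf(x);
}

/* --- a1.c, would be member a1.o of libA.a: depends on libB --- */
int b_helper(int x);  /* provided by libB.a (b.o) */

int a_entry(int x)
{
    return b_helper(x) + 3;
}
```

Suppose main.o calls only a_entry. With -lA -lB the linker takes a1.o from libA (for a_entry), then b.o from libB (for b_helper), and only then learns that it needs a_leaf, which sits in a different member of the library it has already left behind. Writing -lA -lB -lA gives it the second look at libA. Note that the repeat is only needed because a_leaf and a_entry live in different archive members; if they were in the same object file, a_leaf would already have been pulled in together with a_entry.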

Solution 2 - C

I hit this same issue a while back. The bottom line is that the GNU tools won't always "search back" in the library list to resolve missing symbols. Any of the following is an easy fix:

  1. Just specify the libs and objs in the dependency order (as you have discovered above)

  2. OR if you have a circular dependency (where libA references a function in libB, but libB references a function in libA), then just specify the libs on the command line twice. This is what the manual page suggests as well. E.g.

     gcc foo.c -lfoo -lbar -lfoo
    
  3. Use the -( and -) options to specify a group of archives that have such circular dependencies. See --start-group and --end-group in the GNU linker manual. When linking through gcc, these are passed to the linker as -Wl,--start-group and -Wl,--end-group.

When you use option 2 or 3, you're likely introducing a performance cost for linking. If you don't have that much to link, it may not matter.

Solution 3 - C

Or use rescan

From page 41 of the Oracle Solaris 11.1 Linkers and Libraries Guide:

> Interdependencies between archives can exist, such that the extraction of members from one archive must be resolved by extracting members from another archive. If these dependencies are cyclic, the archives must be specified repeatedly on the command line to satisfy previous references.

$ cc -o prog .... -lA -lB -lC -lA -lB -lC -lA 

> The determination, and maintenance, of repeated archive specifications can be tedious.

> The -z rescan-now option makes this process simpler. The -z rescan-now option is processed by the link-editor immediately when the option is encountered on the command line. All archives that have been processed from the command line prior to this option are immediately reprocessed. This processing attempts to locate additional archive members that resolve symbol references. This archive rescanning continues until a pass over the archive list occurs in which no new members are extracted. The previous example can be simplified as follows.

$ cc -o prog .... -lA -lB -lC -z rescan-now 

> Alternatively, the -z rescan-start and -z rescan-end options can be used to group mutually dependent archives together into an archive group. These groups are reprocessed by the link-editor immediately when the closing delimiter is encountered on the command line. Archives found within the group are reprocessed in an attempt to locate additional archive members that resolve symbol references. This archive rescanning continues until a pass over the archive group occurs in which no new members are extracted. Using archive groups, the previous example can be written as follows.

$ cc -o prog .... -z rescan-start -lA -lB -lC -z rescan-end
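The "rescan until a pass extracts no new members" rule quoted above is a fixed-point loop, and it can be sketched in a few lines. This is a toy model under stated assumptions, not the Solaris link-editor's implementation: symbols are bits in a mask, and Member, Archive, link_archives and the two example archives are hypothetical names invented for the illustration.

```c
#include <stddef.h>

/* Toy model: every symbol is one bit in an unsigned mask. */
enum {
    SYM_A_ENTRY  = 1u << 0,
    SYM_A_LEAF   = 1u << 1,
    SYM_B_HELPER = 1u << 2
};

typedef struct {
    unsigned defines;  /* symbols this archive member defines */
    unsigned needs;    /* symbols this archive member references */
    int extracted;     /* nonzero once the member has been pulled in */
} Member;

typedef struct {
    Member *members;
    size_t count;
} Archive;

/* Scan one archive once; return how many members were newly extracted. */
static int scan_archive(Archive *a, unsigned *defined, unsigned *pending)
{
    int extracted = 0;
    for (size_t m = 0; m < a->count; m++) {
        Member *mem = &a->members[m];
        if (!mem->extracted && (mem->defines & *pending)) {
            mem->extracted = 1;
            *defined |= mem->defines;
            *pending = (*pending | mem->needs) & ~*defined;
            extracted++;
        }
    }
    return extracted;
}

/* Scan the archives in order. With rescan == 0 this is a single pass;
   otherwise passes repeat until one of them extracts no new member.
   Returns the mask of symbols that are still unresolved. */
unsigned link_archives(Archive *archives, size_t n, unsigned pending, int rescan)
{
    unsigned defined = 0u;
    int extracted;
    do {
        extracted = 0;
        for (size_t i = 0; i < n; i++)
            extracted += scan_archive(&archives[i], &defined, &pending);
    } while (rescan && extracted > 0);
    return pending;
}

/* Two hypothetical archives with a cyclic dependency:
   libA's entry point needs libB, and libB needs another member of libA. */
unsigned demo(int rescan)
{
    Member a_members[] = {
        { SYM_A_LEAF,  0u,           0 },
        { SYM_A_ENTRY, SYM_B_HELPER, 0 }
    };
    Member b_members[] = {
        { SYM_B_HELPER, SYM_A_LEAF, 0 }
    };
    Archive group[] = { { a_members, 2 }, { b_members, 1 } };

    /* Pretend the program's own object files reference only a_entry. */
    return link_archives(group, 2, SYM_A_ENTRY, rescan);
}
```

In this model demo(0) leaves SYM_A_LEAF unresolved, which corresponds to listing -lA -lB once, while demo(1) returns 0 after a second pass picks up the missing member of the first archive. The loop always terminates because each member can be extracted at most once.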

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | user1129237 | View Question on Stackoverflow
Solution 1 - C | AnT | View Answer on Stackoverflow
Solution 2 - C | selbie | View Answer on Stackoverflow
Solution 3 - C | flerb | View Answer on Stackoverflow