Is using an outdated C compiler a security risk?

CSecurityGcc

C Problem Overview


We have some build systems in production which no one cares about and these machines run ancient versions of GCC like GCC 3 or GCC 2.

And I can't persuade the management to upgrade it to a more recent: they say, "if ain't broke, don't fix it".

Since we maintain a very old code base (written in the 80s), this C89 code compiles just fine on these compilers.

But I'm not sure it is good idea to use these old stuff.

My question is:

Can using an old C compiler compromise the security of the compiled program?

UPDATE:

The same code is built by Visual Studio 2008 for Windows targets, and MSVC doesn't support C99 or C11 yet (I don't know if newer MSVC does), and I can build it on my Linux box using the latest GCC. So if we would just drop in a newer GCC it would probably build just as fine as before.

C Solutions


Solution 1 - C

Actually I would argue the opposite.

There are a number of cases where behaviour is undefined by the C standard but where it is obvious what would happen with a "dumb compiler" on a given platform. Cases like allowing a signed integer to overflow or accessing the same memory though variables of two different types.

Recent versions of gcc (and clang) have started treating these cases as optimisation opportunities not caring if they change how the binary behaves in the "undefined behaviour" condition. This is very bad if your codebase was written by people who treated C like a "portable assembler". As time went on the optimisers have started looking at larger and larger chunks of code when doing these optimisations increasing the chance the binary will end up doing something other than "what a binary built by a dumb compiler" would do.

There are compiler switches to restore "traditional" behaviour (-fwrapv and -fno-strict-aliasing for the two I mentioned above) , but first you have to know about them.

While in principle a compiler bug could turn compliant code into a security hole I would consider the risk of this to be negligable in the grand scheme of things.

Solution 2 - C

There are risks in both courses of action.


Older compilers have the advantage of maturity, and whatever was broken in them has probably (but there's no guarantee) been worked around successfully.

In this case, a new compiler is a potential source of new bugs.


On the other hand, newer compilers come with additional tooling:

  • GCC and Clang both now feature sanitizers which can instrument the runtime to detect undefined behaviors of various sorts (Chandler Carruth, of the Google Compiler team, claimed last year that he expects them to have reached full coverage)
  • Clang, at least, features hardening, for example Control Flow Integrity is about detecting hi-jacks of control flow, there are also hardening implements to protect against stack smashing attacks (by separating the control-flow part of the stack from the data part); hardening features are generally low overhead (< 1% CPU overhead)
  • Clang/LLVM is also working on libFuzzer, a tool to create instrumented fuzzing unit-tests that explore the input space of the function under test smartly (by tweaking the input to take not-as-yet explored execution paths)

Instrumenting your binary with the sanitizers (Address Sanitizer, Memory Sanitizer or Undefined Behavior Sanitizer) and then fuzzing it (using American Fuzzy Lop for example) has uncovered vulnerabilities in a number of high-profile softwares, see for example this LWN.net article.

Those new tools, and all future tools, are inaccessible to you unless you upgrade your compiler.

By staying on an underpowered compiler, you are putting your head in the sand and crossing fingers that no vulnerability is found. If your product is a high-value target, I urge you to reconsider.


Note: even if you do NOT upgrade the production compiler, you might want to use a new compiler to check for vulnerability anyway; do be aware that since those are different compilers, the guarantees are lessened though.

Solution 3 - C

Your compiled code contains bugs that could be exploited. The bugs come from three sources: Bugs in your source code, bugs in the compiler and libraries, and undefined behaviour in your source code that the compiler turns into a bug. (Undefined behaviour is a bug, but not a bug in the compiled code yet. As an example, i = i++; in C or C++ is a bug, but in your compiled code it may increase i by 1 and be Ok, or set i to some junk and be a bug).

The rate of bugs in your compiled code is presumably low due to testing and to fixing bugs due to customer bug reports. So there may have been a large number of bugs initially, but that has gone down.

If you upgrade to a newer compiler, you may lose bugs that were introduced by compiler bugs. But these bugs would all be bugs that to your knowledge nobody found and nobody exploited. But the new compiler may have bugs on its own, and importantly newer compilers have a stronger tendency to turn undefined behaviour into bugs in the compiled code.

So you will have a whole lot of new bugs in your compiled code; all bugs that hackers could find and exploit. And unless you do a whole lot of testing, and leave your code with customers to find bugs for a long time, it will be less secure.

Solution 4 - C

If it aint broke, don't fix it

Your boss sounds right in saying this, however, the more important factor, is safeguarding of inputs, outputs, buffer overflows. Lack of those is invariably the weakest link in the chain from that standpoint regardless of the compiler used.

However, if the code base is ancient, and work was put in place to mitigate the weaknesses of the K&R C used, such as lacking of type safety, insecure fgets, etc, weigh up the question "Would upgrading the compiler to more modern C99/C11 standards break everything?"

Provided, that there's a clear path to migrate to the newer C standards, which could induce side effects, might be best to attempt a fork of the old codebase, assess it and put in extra type checks, sanity checks, and determine if upgrading to the newer compiler has any effect on input/output datasets.

Then you can show it to your boss, "Here's the updated code base, refactored, more in line with industry accepted C99/C11 standards...".

That's the gamble that would have to be weighed up on, very carefully, resistence to change might show there in that environment and may refuse to touch the newer stuff.

EDIT

Just sat back for a few minutes, realized this much, K&R generated code could be running on a 16bit platform, chances are, upgrading to more modern compiler could actually break the code base, am thinking in terms of architecture, 32bit code would be generated, this could have funny side effects on the structures used for input/output datasets, that is another huge factor to weigh up carefully.

Also, since OP has mentioned using Visual Studio 2008 to build the codebase, using gcc could induce bringing into the environment either MinGW or Cygwin, that could have an impact change on the environment, unless, the target is for Linux, then it would be worth a shot, may have to include additional switches to the compiler to minimize noise on old K&R code base, the other important thing is to carry out a lot of testing to ensure no functionality is broken, may turn out to be a painful exercise.

Solution 5 - C

There is a security risk where a malicious developer can sneak a back-door through a compiler bug. Depending on the quantity of known bugs in the compiler in use, the backdoor may look more or less inconspicuous (in any case, the point is that the code is correct, even if convoluted, at the source level. Source code reviews and tests using a non-buggy compiler will not find the backdoor, because the backdoor does not exist in these conditions). For extra deniability points, the malicious developer may also look for previously-unknown compiler bugs on their own. Again, the quality of the camouflage will depend on the choice of compiler bugs found.

This attack is illustrated on the program sudo in this article. bcrypt wrote a great follow-up for Javascript minifiers.

Apart from this concern, the evolution of C compilers has been to exploit undefined behavior more and more and more aggressively, so old C code that was written in good faith would actually be more secure compiled with a C compiler from the time, or compiled at -O0 (but some new program-breaking UB-exploiting optimizations are introduced in new versions of compilers even at -O0).

Solution 6 - C

> Can using an old C compiler compromise the security of the compiled program?

Of course it can, if the old compiler contains known bugs that you know would affect your program.

The question is, does it? To know for sure, you would have to read the whole change log from your version to present date and check every single bug fixed over the years.

If you find no evidence of compiler bugs that would affect your program, updating GCC just for the sake of it seems a bit paranoid. You would have to keep in mind that newer versions might contain new bugs, that are not yet discovered. Lots of changes were made recently with GCC 5 and C11 support.

That being said, code written in the 80s is most likely already filled to the brim with security holes and reliance on poorly-defined behavior, no matter the compiler. We're talking of pre-standard C here.

Solution 7 - C

Older compilers may not have protection against known hacking attacks. Stack smashing protection, for example, was not introduced until GCC 4.1. So yeah, code compiled with older compilers may be vulnerable in ways that newer compilers protect against.

Solution 8 - C

Another aspect to worry about is the development of new code.

Older compilers may have different behavior for some language features than what is standardized and expected by the programmer. This mismatch can slow development and introduce subtle bugs that can be exploited.

Older compilers offer fewer features (including language features!) and don't optimize as well. Programmers will hack their way around these deficiencies — e.g. by reimplementing missing features, or writing clever code that is obscure but runs faster — creating new opportunities for the creation of subtle bugs.

Solution 9 - C

Nope

The reason is simple, old compiler may have old bugs and exploits, but the new compiler will have new bugs and exploits.

Your not "fixing" any bugs by upgrading to a new compiler. Your switching old bugs and exploits for new bugs and exploits.

Solution 10 - C

Well there is a higher probability that any bugs in the old compiler are well known and documented as opposed to using a new compiler so actions can be taken to avoid those bugs by coding around them. So in a way that is not enough as argument for upgrading. We have the same discussions where I work, we use GCC 4.6.1 on a code base for embedded software and there is a great reluctance (among management) to upgrade to the latest compiler because of fear for new, undocumented bugs.

Solution 11 - C

Your question falls into two parts:

  • Explicit: “Is there a greater risk in using the older compiler” (more or less as in your title)
  • Implicit: “How can I persuade management to upgrade”

Perhaps you can answer both by finding an exploitable flaw in your existing code base and showing that a newer compiler would have detected it. Of course your management may say “you found that with the old compiler”, but you can point out that it cost considerable effort. Or you run it through the new compiler to find the vulnerability, then exploit it, if your are able/allowed to compile the code with the new compiler. You may want help from a friendly hacker, but that depends on trusting them and being able/allowed to show them the code (and use the new compiler).

But if your system is not exposed to hackers, you should perhaps be more interested in whether a compiler upgrade would increase your effectiveness: MSVS 2013 Code Analysis quite often finds potential bugs much sooner than MSVS 2010, and it more or less supports C99/C11 – not sure if it does officially, but declarations can follow statements and you can declare variables in for-loops.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionCalmariusView Question on Stackoverflow
Solution 1 - CplugwashView Answer on Stackoverflow
Solution 2 - CMatthieu M.View Answer on Stackoverflow
Solution 3 - Cgnasher729View Answer on Stackoverflow
Solution 4 - Ct0mm13bView Answer on Stackoverflow
Solution 5 - CPascal CuoqView Answer on Stackoverflow
Solution 6 - CLundinView Answer on Stackoverflow
Solution 7 - CDrMcCleodView Answer on Stackoverflow
Solution 8 - Cuser1084944View Answer on Stackoverflow
Solution 9 - CcoteyrView Answer on Stackoverflow
Solution 10 - CAndersKView Answer on Stackoverflow
Solution 11 - CPJTraillView Answer on Stackoverflow