How to create a C compiler for custom CPU?

C, Compiler Construction, Customization

C Problem Overview


What would be the easiest way to create a C compiler for a custom CPU, assuming of course I already have an assembler for it?

Since a C compiler generates assembly, is there some way to just define standard bits and pieces of assembly code for the various C idioms, rebuild the compiler, and thereby obtain a cross compiler for the target hardware?

Preferably the compiler itself would be written in C, and build as a native executable for either Linux or Windows.

Please note: I am not asking how to write the compiler itself. I did take that course in college, I know about general compiler-compilers, etc. In this situation, I'd just like to configure some existing framework if at all possible. I don't want to modify the language, I just want to be able to target an arbitrary architecture. If the answer turns out to be "it doesn't work that way", that information will be useful to myself and anyone else who might make similar assumptions.

C Solutions


Solution 1 - C

The LLVM documentation includes a quick overview/tutorial, "Writing an LLVM Backend":

> This document describes techniques for writing backends for LLVM which convert the LLVM representation to machine assembly code or other languages.
>
> [ . . . ]
>
> To create a static compiler (one that emits text assembly), you need to implement the following:
>
> - Describe the register set.
> - Describe the instruction set.
> - Describe the target machine.
> - Implement the assembly printer for the architecture.
> - Implement an instruction selector for the architecture.
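To make the quoted checklist more concrete, here is a minimal sketch in Python of what each item amounts to for a made-up target. Nothing here is LLVM's API: the `toy` target, its registers, its mnemonics, and the tuple-based IR are all assumptions for illustration. A real LLVM backend expresses the same information in TableGen files and C++ classes.

```python
from dataclasses import dataclass

# 1. Register set of the hypothetical target.
REGISTERS = ("r0", "r1", "r2", "r3")

# 2. Instruction set: IR opcode -> assembly mnemonic (all names hypothetical).
MNEMONICS = {"add": "ADD", "sub": "SUB", "mul": "MUL", "li": "LDI"}

@dataclass(frozen=True)
class Instr:
    mnemonic: str
    operands: tuple

# 3. Target machine facts that the rest of the compiler may query.
TARGET = {"name": "toy", "word_bits": 16, "registers": REGISTERS}

# 5. Instruction selector: here, one IR operation becomes one instruction.
def select(ir):
    out = []
    for op, *args in ir:
        if op not in MNEMONICS:
            raise ValueError(f"no instruction for IR op {op!r}")
        for a in args:
            if isinstance(a, str) and a not in REGISTERS:
                raise ValueError(f"unknown register {a!r}")
        out.append(Instr(MNEMONICS[op], tuple(args)))
    return out

# 4. Assembly printer: turn selected instructions into text for the assembler.
def print_asm(instrs):
    lines = []
    for ins in instrs:
        # Integers are printed as immediates, strings as register names.
        ops = ", ".join(f"#{o}" if isinstance(o, int) else o for o in ins.operands)
        lines.append(f"    {ins.mnemonic} {ops}")
    return "\n".join(lines)

if __name__ == "__main__":
    ir = [("li", "r0", 2), ("li", "r1", 3), ("add", "r2", "r0", "r1")]
    print(print_asm(select(ir)))
```

The one-to-one selector is the simplest possible choice; real selectors match larger patterns so that several IR operations can collapse into one machine instruction.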

Solution 2 - C

There's the concept of a cross-compiler, i.e., one that runs on one architecture but targets a different one. You can see how GCC does it (for example) and add a new architecture to the set, if that's the compiler you want to extend.

Edit: I just spotted a question a few years ago on a GCC mailing list on how to add a new target and someone pointed to this

Solution 3 - C

The short answer is that it doesn't work that way.

The longer answer is that it does take some effort to write a compiler for a new CPU type. You don't need to create a compiler from scratch, however. Most compilers are structured in several passes; here's a typical architecture (a lot of variations are possible):

  1. Syntactic analysis (lexer and parser, plus preprocessing in the case of C), leading to an abstract syntax tree.
  2. Type checking, leading to an annotated abstract syntax tree.
  3. Intermediate code generation, leading to architecture-independent intermediate code. Some optimizations are performed at this stage.
  4. Machine code generation, leading to assembly or directly to machine code. More optimizations are performed at this stage.

In this description, only step 4 is machine-dependent. So you can take a compiler where step 4 is clearly separated and plug in your own step 4. Doing this requires a deep understanding of the CPU and some understanding of the compiler internals, but you don't need to worry about what happens before.
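The separation described above can be illustrated with a deliberately tiny sketch in Python. The intermediate code format and both targets are invented for this example; the point is only that the same intermediate code can feed two different step-4 functions.

```python
# Architecture-independent intermediate code (hypothetical format):
# each entry is (dest, op, left, right), here for the C statement  x = a + b * c;
IR = [
    ("t0", "mul", "b", "c"),
    ("x", "add", "a", "t0"),
]

def backend_accumulator(ir):
    """Step 4 for an invented single-accumulator CPU."""
    asm = []
    for dest, op, left, right in ir:
        # Load the left operand, apply the operation, store the result.
        asm += [f"LDA {left}", f"{op.upper()} {right}", f"STA {dest}"]
    return asm

def backend_three_address(ir):
    """Step 4 for an invented CPU with three-operand instructions."""
    return [f"{op.upper()} {dest}, {left}, {right}" for dest, op, left, right in ir]

def compile_ir(ir, backend):
    """Steps 1-3 would produce `ir`; only `backend` depends on the target."""
    return "\n".join(backend(ir))

if __name__ == "__main__":
    print(compile_ir(IR, backend_accumulator))
    print(compile_ir(IR, backend_three_address))
```

Retargeting in this picture means writing a new `backend_*` function; a production backend is the same idea at a much larger scale, with register allocation, calling conventions, and target-specific optimizations added.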

Almost all CPUs that are not very small, very rare or very old have a backend (step 4) for GCC. The main documentation for writing a GCC backend is the GCC internals manual, in particular the chapters on machine descriptions and target descriptions. GCC is free software, so there is no licensing cost in using it.

Solution 4 - C

vbcc (at www.compilers.de) is a good and simple retargetable C compiler written in C. It's much simpler than GCC/LLVM. It's so simple that I was able to retarget the compiler to my own CPU with a few weeks of work, without any prior knowledge of compilers.

Solution 5 - C

  1. Short answer:

    "No. There's no such thing as a "compiler framework" where you can just add water (plug in your own assembly set), stir, and it's done."

  2. Longer answer: it's certainly possible. But challenging. And likely expensive.

    If you wanted to do it yourself, I'd start by looking at Gnu CC. It's already available for a large variety of CPUs and platforms.

  3. Take a look at this link for more ideas (including the idea of "just build a library of functions and macros", which would be my first suggestion):

http://www.instructables.com/answers/Custom-C-Compiler-for-homemade-instruction-set/

Solution 6 - C

You can modify existing open source compilers such as GCC or Clang. Other answers have provided you with links about where to learn more. But these compilers are not designed to be easily retargeted; they are only "easier" to retarget than compilers hard-wired for specific targets.

But if you want a compiler that is relatively easy to retarget, you want one in which you can specify the machine architecture in explicit terms, and some tool generates the rest of the compiler (GCC does a bit of this; I don't think Clang/LLVM does much but I could be wrong here).

There's a lot of this in the literature; search for "compiler-compiler".
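A toy version of the description-driven idea, sketched in Python: the target is described as a table of tree patterns paired with assembly templates, and a generic routine turns that table into an instruction selector. The rule format, the `MAC` fused multiply-add instruction, and the temporary naming are assumptions for illustration; GCC's machine descriptions are far richer than this.

```python
# Machine description: (tree pattern, assembly template). "_" matches any
# subtree, which is computed into a register first. Mnemonics are hypothetical.
RULES = [
    (("add", ("mul", "_", "_"), "_"), "MAC {d}, {0}, {1}, {2}"),  # fused multiply-add
    (("add", "_", "_"), "ADD {d}, {0}, {1}"),
    (("mul", "_", "_"), "MUL {d}, {0}, {1}"),
]

def match(pattern, tree, holes):
    """Return True if `tree` fits `pattern`, collecting wildcard subtrees."""
    if pattern == "_":
        holes.append(tree)
        return True
    return (isinstance(tree, tuple) and len(tree) == len(pattern)
            and tree[0] == pattern[0]
            and all(match(p, t, holes) for p, t in zip(pattern[1:], tree[1:])))

def make_selector(rules):
    """'Generate' an instruction selector from a rule table."""
    def select(tree, out, counter):
        if isinstance(tree, str):        # a leaf already names a register
            return tree
        for pattern, template in rules:  # first matching rule wins
            holes = []
            if match(pattern, tree, holes):
                regs = [select(h, out, counter) for h in holes]
                dest = f"t{counter[0]}"  # fresh temporary for the result
                counter[0] += 1
                out.append(template.format(*regs, d=dest))
                return dest
        raise ValueError(f"no rule covers {tree!r}")
    return select

if __name__ == "__main__":
    select = make_selector(RULES)
    code = []
    select(("add", ("mul", "r1", "r2"), "r3"), code, [0])
    print("\n".join(code))
```

Supporting another CPU then means writing a different `RULES` table rather than a different selector, which is the appeal of generating the backend from a description.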

But for a concrete solution for C, you should check out ACE (http://www.ace.nl/), a compiler vendor that generates compilers on demand for customers. Not free, but I hear they produce very good compilers very quickly. I think it produces standard style binaries (ELF?) so it skips the assembler stage. (I have no experience or relationship with ACE.)

If you don't care about code quality, you can likely write a syntax-directed translation of C to assembler using a C AST. You can get C ASTs from GCC, Clang, maybe ANTLR, and from our DMS Software Reengineering Toolkit (http://www.semanticdesigns.com/Products/DMS/DMSToolkit.html).
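As a rough illustration of the syntax-directed approach, the sketch below walks a hand-built expression tree and emits code for an imaginary stack machine. The node classes, the mnemonics (`PUSH`, `LOAD`, `STORE`, `ADD`, and so on), and the target itself are assumptions made for this example; in practice the tree would come from one of the front ends mentioned above. Each node type maps to a fixed instruction template, which is simple to write but yields unoptimized code.

```python
from dataclasses import dataclass

@dataclass
class Num:
    value: int

@dataclass
class Var:
    name: str

@dataclass
class BinOp:
    op: str
    left: object
    right: object

@dataclass
class Assign:
    target: str
    expr: object

# C operator -> hypothetical stack-machine mnemonic.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def gen(node, out):
    """Append stack-machine code for `node` to the list `out` and return it."""
    if isinstance(node, Num):
        out.append(f"PUSH {node.value}")
    elif isinstance(node, Var):
        out.append(f"LOAD {node.name}")
    elif isinstance(node, BinOp):
        gen(node.left, out)           # left operand is pushed first
        gen(node.right, out)          # right operand ends up on top
        out.append(OPCODES[node.op])  # pops two values, pushes the result
    elif isinstance(node, Assign):
        gen(node.expr, out)
        out.append(f"STORE {node.target}")
    else:
        raise TypeError(f"unsupported node: {node!r}")
    return out

if __name__ == "__main__":
    # AST for the C statement:  y = (a + 2) * b;
    tree = Assign("y", BinOp("*", BinOp("+", Var("a"), Num(2)), Var("b")))
    print("\n".join(gen(tree, [])))
```

A register machine needs more work than this (at minimum a scheme for assigning temporaries to registers), but the one-template-per-node structure stays the same.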

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | JustJeff | View Question on Stackoverflow |
| Solution 1 - C | Pubby | View Answer on Stackoverflow |
| Solution 2 - C | Ricardo Cárdenes | View Answer on Stackoverflow |
| Solution 3 - C | Gilles 'SO- stop being evil' | View Answer on Stackoverflow |
| Solution 4 - C | dsula | View Answer on Stackoverflow |
| Solution 5 - C | paulsm4 | View Answer on Stackoverflow |
| Solution 6 - C | Ira Baxter | View Answer on Stackoverflow |