Is it feasible to compile Python to machine code?

PythonCLinkerCompilation

Python Problem Overview


How feasible would it be to compile Python (possibly via an intermediate C representation) into machine code?

Presumably it would need to link to a Python runtime library, and any parts of the Python standard library which were Python themselves would need to be compiled (and linked in) too.

Also, you would need to bundle the Python interpreter if you wanted to do dynamic evaluation of expressions, but perhaps a subset of Python that didn't allow this would still be useful.

Would it provide any speed and/or memory usage advantages? Presumably the startup time of the Python interpreter would be eliminated (although shared libraries would still need loading at startup).

Python Solutions


Solution 1 - Python

As @Greg Hewgill says it, there are good reasons why this is not always possible. However, certain kinds of code (like very algorithmic code) can be turned into "real" machine code.

There are several options:

  • Use Psyco, which emits machine code dynamically. You should choose carefully which methods/functions to convert, though.
  • Use Cython, which is a Python-like language that is compiled into a Python C extension
  • Use PyPy, which has a translator from RPython (a restricted subset of Python that does not support some of the most "dynamic" features of Python) to C or LLVM.
    • PyPy is still highly experimental
    • not all extensions will be present

After that, you can use one of the existing packages (freeze, Py2exe, PyInstaller) to put everything into one binary.

All in all: there is no general answer for your question. If you have Python code that is performance-critical, try to use as much builtin functionality as possible (or ask a "How do I make my Python code faster" question). If that doesn't help, try to identify the code and port it to C (or Cython) and use the extension.

Solution 2 - Python

Try ShedSkin Python-to-C++ compiler, but it is far from perfect. Also there is Psyco - Python JIT if only speedup is needed. But IMHO this is not worth the effort. For speed-critical parts of code best solution would be to write them as C/C++ extensions.

Solution 3 - Python

Nuitka is a Python to C++ compiler that links against libpython. It appears to be a relatively new project. The author claims a speed improvement over CPython on the pystone benchmark.

Solution 4 - Python

PyPy is a project to reimplement Python in Python, using compilation to native code as one of the implementation strategies (others being a VM with JIT, using JVM, etc.). Their compiled C versions run slower than CPython on average but much faster for some programs.

Shedskin is an experimental Python-to-C++ compiler.

Pyrex is a language specially designed for writing Python extension modules. It's designed to bridge the gap between the nice, high-level, easy-to-use world of Python and the messy, low-level world of C.

Solution 5 - Python

Pyrex is a subset of the Python language that compiles to C, done by the guy that first built list comprehensions for Python. It was mainly developed for building wrappers but can be used in a more general context. Cython is a more actively maintained fork of pyrex.

Solution 6 - Python

Some extra references:

Solution 7 - Python

Jython has a compiler targeting JVM bytecode. The bytecode is fully dynamic, just like the Python language itself! Very cool. (Yes, as Greg Hewgill's answer alludes, the bytecode does use the Jython runtime, and so the Jython jar file must be distributed with your app.)

Solution 8 - Python

Psyco is a kind of just-in-time (JIT) compiler: dynamic compiler for Python, runs code 2-100 times faster, but it needs much memory.

In short: it run your existing Python software much faster, with no change in your source but it doesn't compile to object code the same way a C compiler would.

Solution 9 - Python

The answer is "Yes, it is possible". You could take Python code and attempt to compile it into the equivalent C code using the CPython API. In fact, there used to be a Python2C project that did just that, but I haven't heard about it in many years (back in the Python 1.5 days is when I last saw it.)

You could attempt to translate the Python code into native C as much as possible, and fall back to the CPython API when you need actual Python features. I've been toying with that idea myself the last month or two. It is, however, an awful lot of work, and an enormous amount of Python features are very hard to translate into C: nested functions, generators, anything but simple classes with simple methods, anything involving modifying module globals from outside the module, etc, etc.

Solution 10 - Python

This doesn't compile Python to machine code. But allows to create a shared library to call Python code.

If what you are looking for is an easy way to run Python code from C without relying on execp stuff. You could generate a shared library from python code wrapped with a few calls to Python embedding API. Well the application is a shared library, an .so that you can use in many other libraries/applications.

Here is a simple example which create a shared library, that you can link with a C program. The shared library executes Python code.

The python file that will be executed is pythoncalledfromc.py:

# -*- encoding:utf-8 -*-
# this file must be named "pythoncalledfrom.py"

def main(string):  # args must a string
    print "python is called from c"
    print "string sent by «c» code is:"
    print string
    print "end of «c» code input"
    return 0xc0c4  # return something

You can try it with python2 -c "import pythoncalledfromc; pythoncalledfromc.main('HELLO'). It will output:

python is called from c
string sent by «c» code is:
HELLO
end of «c» code input

The shared library will be defined by the following by callpython.h:

#ifndef CALL_PYTHON
#define CALL_PYTHON

void callpython_init(void);
int callpython(char ** arguments);
void callpython_finalize(void);

#endif

The associated callpython.c is:

// gcc `python2.7-config --ldflags` `python2.7-config --cflags` callpython.c -lpython2.7 -shared -fPIC -o callpython.so

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <python2.7/Python.h>

#include "callpython.h"

#define PYTHON_EXEC_STRING_LENGTH 52
#define PYTHON_EXEC_STRING "import pythoncalledfromc; pythoncalledfromc.main(\"%s\")"


void callpython_init(void) {
     Py_Initialize();
}

int callpython(char ** arguments) {
  int arguments_string_size = (int) strlen(*arguments);
  char * python_script_to_execute = malloc(arguments_string_size + PYTHON_EXEC_STRING_LENGTH);
  PyObject *__main__, *locals;
  PyObject * result = NULL;

  if (python_script_to_execute == NULL)
    return -1;

  __main__ = PyImport_AddModule("__main__");
  if (__main__ == NULL)
    return -1;

  locals = PyModule_GetDict(__main__);

  sprintf(python_script_to_execute, PYTHON_EXEC_STRING, *arguments);
  result = PyRun_String(python_script_to_execute, Py_file_input, locals, locals);
  if(result == NULL)
    return -1;
  return 0;
}

void callpython_finalize(void) {
  Py_Finalize();
}

You can compile it with the following command:

gcc `python2.7-config --ldflags` `python2.7-config --cflags` callpython.c -lpython2.7 -shared -fPIC -o callpython.so

Create a file named callpythonfromc.c that contains the following:

#include "callpython.h"

int main(void) {
  char * example = "HELLO";
  callpython_init();
  callpython(&example);
  callpython_finalize();
  return 0;
}

Compile it and run:

gcc callpythonfromc.c callpython.so -o callpythonfromc
PYTHONPATH=`pwd` LD_LIBRARY_PATH=`pwd` ./callpythonfromc

This is a very basic example. It can work, but depending on the library it might be still difficult to serialize C data structures to Python and from Python to C. Things can be automated somewhat...

Nuitka might be helpful.

Also there is numba but they both don't aim to do what you want exactly. Generating a C header from Python code is possible, but only if you specify the how to convert the Python types to C types or can infer that information. See python astroid for a Python ast analyzer.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionAndy BalaamView Question on Stackoverflow
Solution 1 - PythonTorsten MarekView Answer on Stackoverflow
Solution 2 - PythonclegView Answer on Stackoverflow
Solution 3 - PythonbcattleView Answer on Stackoverflow
Solution 4 - PythonpdcView Answer on Stackoverflow
Solution 5 - PythonConcernedOfTunbridgeWellsView Answer on Stackoverflow
Solution 6 - Pythonserge-sans-pailleView Answer on Stackoverflow
Solution 7 - PythonChris Jester-YoungView Answer on Stackoverflow
Solution 8 - PythonPierre-Jean CoudertView Answer on Stackoverflow
Solution 9 - PythonThomas WoutersView Answer on Stackoverflow
Solution 10 - PythonamiroucheView Answer on Stackoverflow