Python equivalence to inline functions or macros

Python, Optimization, Inline Functions

Python Problem Overview


I just realized that doing

x.real*x.real+x.imag*x.imag

is three times faster than doing

abs(x)**2

where x is a numpy array of complex numbers. For code readability, I could define a function like

def abs2(x):
    return x.real*x.real+x.imag*x.imag

which is still far faster than abs(x)**2, but it comes at the cost of a function call. Is it possible to inline such a function, as I would do in C using a macro or the inline keyword?

Python Solutions


Solution 1 - Python

> Is it possible to inline such a function, as I would do in C using a macro or the inline keyword?

No. Before reaching this specific instruction, Python interpreters don't even know if there's such a function, much less what it does.
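A minimal sketch of why the interpreter cannot inline ahead of time: the name abs2 is resolved each time the call executes, so it can be rebound between two calls. The helper name caller and the replacement body are made up for illustration.

```python
def abs2(x):
    return x.real * x.real + x.imag * x.imag


def caller(x):
    # The global name abs2 is looked up every time this line executes,
    # not when caller is defined.
    return abs2(x)


before = caller(3 + 4j)


# Rebind the name to a different (hypothetical) function.
def abs2(x):
    return -1.0


# caller was not modified, yet it now reaches the new function.
after = caller(3 + 4j)

print(before, after)
```

Because such a rebinding is legal at any point, a compiler that pasted the original body into caller would change the program's behavior.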

As noted in the comments, PyPy will inline automatically. The above still holds: it "simply" generates an optimized version at runtime, benefits from it, and breaks out of it when it is invalidated. In this specific case that doesn't help, though, because work on NumPy for PyPy started only recently and isn't even beta level as of this writing. The bottom line is: don't worry about optimizations at this level in Python. Either the implementations optimize it themselves or they don't; it's not your responsibility.

Solution 2 - Python

Not exactly what the OP has asked for, but close:

> Inliner inlines Python function calls. Proof of concept for this blog post
>
>     from inliner import inline
>
>     @inline
>     def add_stuff(x, y):
>         return x + y
>
>     def add_lots_of_numbers():
>         results = []
>         for i in xrange(10):
>             results.append(add_stuff(i, i+1))
>
> In the above code the add_lots_of_numbers function is converted into this:
>
>     def add_lots_of_numbers():
>         results = []
>         for i in xrange(10):
>             results.append(i + i + 1)

Also, anyone interested in this question and the complications involved in implementing such an optimizer in CPython might also want to have a look at:

Solution 3 - Python

I'll agree with everyone else that such optimizations will just cause you pain on CPython, and that if you care about performance you should consider PyPy (though our NumPy may be too incomplete to be useful). However, I'll disagree and say you can care about such optimizations on PyPy. Not this one specifically, since, as has been said, PyPy does that automatically. But if you know PyPy well, you really can tune your code to make PyPy emit the assembly you want, not that you almost ever need to.

Solution 4 - Python

No.

The closest you can get to C macros is a script (awk or other) that you include in a makefile and that substitutes a certain pattern, such as abs(x)**2, in your Python scripts with the long form.
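As a rough sketch of such a preprocessing step, written in Python rather than awk: the function name expand_abs2 and the regular expression are assumptions made for illustration, and the pattern deliberately handles only a bare identifier inside abs(...).

```python
import re

# Hypothetical, deliberately narrow pattern: abs(<identifier>)**2.
# The \b keeps names such as fabs(...) from matching.
PATTERN = re.compile(r"\babs\((\w+)\)\s*\*\*\s*2")


def expand_abs2(source: str) -> str:
    """Replace abs(name)**2 with the expanded real/imag form."""
    return PATTERN.sub(r"(\1.real*\1.real+\1.imag*\1.imag)", source)


line = "y = abs(x)**2 + 1"
expanded = expand_abs2(line)
print(expanded)
```

A textual substitution like this knows nothing about Python syntax, so nested expressions, strings, and comments would need more care than this sketch takes.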

Solution 5 - Python

Actually, it might be even faster to calculate it like this:

x.real** 2+ x.imag** 2

Thus, the extra cost of the function call is likely to become negligible. Let's see:

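In []: from numpy.random import rand, randn
In []: n = 10000  # must be an integer; array dimensions cannot be floats
In []: x = randn(n, 1) + 1j * rand(n, 1)
In []: %timeit x.real * x.real + x.imag * x.imag
10000 loops, best of 3: 100 us per loop
In []: %timeit x.real ** 2 + x.imag ** 2
10000 loops, best of 3: 77.9 us per loop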

And encapsulating the calculation in a function:

In []: def abs2(x):
   ..:     return x.real** 2+ x.imag** 2
   ..: 
In []: %timeit abs2(x)
10000 loops, best of 3: 80.1 us per loop

Anyway, as others have pointed out, this kind of micro-optimization (done only to avoid a function call) is not really a productive way to write Python code.
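For anyone who wants to repeat the comparison outside IPython, here is a sketch using the standard timeit module. The array size matches the session above; the seed, the repeat counts, and the dictionary of candidates are arbitrary choices, and the numbers will differ by machine and NumPy version.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability
n = 10_000
x = rng.standard_normal((n, 1)) + 1j * rng.random((n, 1))


def abs2(x):
    return x.real ** 2 + x.imag ** 2


# Each candidate is wrapped in a lambda so timeit can call it; the wrapper
# adds the same small overhead to every variant.
candidates = {
    "abs(x)**2": lambda: abs(x) ** 2,
    "x.real**2 + x.imag**2": lambda: x.real ** 2 + x.imag ** 2,
    "abs2(x)": lambda: abs2(x),
}

# Check that all variants agree numerically before timing them.
reference = abs(x) ** 2
results_match = all(np.allclose(f(), reference) for f in candidates.values())

calls = 200
timings = {}
for label, func in candidates.items():
    # Best of three runs, converted to seconds per call.
    timings[label] = min(timeit.repeat(func, number=calls, repeat=3)) / calls

for label, seconds in timings.items():
    print(f"{label:>22}: {seconds * 1e6:.1f} us per call")
```

With arrays of this size the per-element work should dominate, which is why the function-call overhead barely shows up in the session above.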

Solution 6 - Python

You can try to use lambda:

abs2 = lambda x : x.real*x.real+x.imag*x.imag

then call it by:

y = abs2(x)
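Note that a lambda is an ordinary function object, so calling it is still a function call and nothing is inlined. A small sketch to illustrate this, with abs2_def and abs2_lambda as names made up for the comparison:

```python
import types


def abs2_def(x):
    return x.real * x.real + x.imag * x.imag


abs2_lambda = lambda x: x.real * x.real + x.imag * x.imag

# Both objects are plain Python functions of the same type.
same_type = type(abs2_def) is type(abs2_lambda) is types.FunctionType

# They also compute the same value for a sample complex number.
value_def = abs2_def(3 + 4j)
value_lambda = abs2_lambda(3 + 4j)

print(same_type, value_def, value_lambda)
```

The lambda form only changes how the function is written, not how it is called.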

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Charles Brunet | View Question on Stackoverflow
Solution 1 - Python | user395760 | View Answer on Stackoverflow
Solution 2 - Python | AXO | View Answer on Stackoverflow
Solution 3 - Python | Alex Gaynor | View Answer on Stackoverflow
Solution 4 - Python | Thaddee Tyl | View Answer on Stackoverflow
Solution 5 - Python | eat | View Answer on Stackoverflow
Solution 6 - Python | Linfeng Mu | View Answer on Stackoverflow