numpy: Efficiently avoid 0s when taking log(matrix)

Numpy Problem Overview

from numpy import *

m = array([[1,0],
           [2,3]])

I would like to compute the element-wise log2(m), but only in the places where m is not 0. Where m is 0, I would like to have 0 as the result.

I am now fighting against:

RuntimeWarning: divide by zero encountered in log2

Try 1: using where

res = where(m != 0, log2(m), 0)

This computes the correct result, but a RuntimeWarning: divide by zero encountered in log2 is still logged. It looks like (and syntactically it is quite obvious) numpy still computes log2(m) on the full matrix, and only afterwards does where pick the values to keep.
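The eager evaluation described above can be made visible with a small sketch. It assumes the standard warnings module is used to record what numpy emits; the names full and caught are only illustrative.

```python
import warnings
import numpy as np

m = np.array([[1, 0],
              [2, 3]])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # record every warning, even repeated ones
    full = np.log2(m)                 # evaluated first, on every element, including the 0
    res = np.where(m != 0, full, 0)   # only selects values from the finished array

print(res)
print([str(w.message) for w in caught])
```

Splitting the expression into two statements does not change the behaviour: by the time where runs, log2 has already processed the zero.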

I would like to avoid this warning.

Try 2: using masks

from numpy import ma

res = ma.filled(log2(ma.masked_equal(m, 0)), 0)

Surely masking away the zeros will prevent log2 from being applied to them, won't it? Unfortunately not: we still get RuntimeWarning: divide by zero encountered in log2.

Even though the matrix is masked, log2 still seems to be applied to every element.

How can I efficiently compute the element-wise log of a numpy array without getting division-by-zero warnings?

Of course I could temporarily disable the logging of these warnings using seterr, but that doesn't look like a clean solution.
And sure, a double for loop would let me treat the 0s specially, but it defeats the efficiency of numpy.

Any ideas?

Numpy Solutions

Solution 1 - Numpy

We can use masked arrays for this:

>>> from numpy import *
>>> m = array([[1,0], [2,3]])
>>> x = ma.log(m)
>>> print(x.filled(0))
[[ 0.          0.        ]
 [ 0.69314718  1.09861229]]
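The session above uses the natural logarithm. Since the question asks for log2, here is a minimal sketch of the same idea with np.ma.log2, which, as far as I know, masks every entry outside the domain of the logarithm (values <= 0) in the same way np.ma.log does.

```python
import numpy as np

m = np.array([[1, 0],
              [2, 3]])

# np.ma.log2 returns a masked array; entries <= 0 come back masked
masked = np.ma.log2(m)
res = masked.filled(0)   # replace the masked entries with 0

print(res)
```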

Solution 2 - Numpy

Another option is to use the where parameter of numpy's ufuncs:

import numpy as np

m = np.array([[1., 0], [2, 3]])
res = np.log2(m, out=np.zeros_like(m), where=(m!=0))

No RuntimeWarning is raised, and zeros are introduced where the log is not computed.
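Note that the example uses a float array (1. in the first entry). With an integer array, np.zeros_like(m) would create an integer buffer, and the ufunc should refuse to write float results into it. A minimal sketch that sidesteps this, assuming a hypothetical helper name log2_or_zero:

```python
import numpy as np

def log2_or_zero(m):
    """Hypothetical helper: element-wise log2, with 0 wherever m == 0."""
    m = np.asarray(m)
    # float buffer pre-filled with the fallback value, independent of m's dtype
    out = np.zeros(m.shape, dtype=float)
    # entries where the condition is False are left untouched in `out`
    return np.log2(m, out=out, where=(m != 0))

res = log2_or_zero([[1, 0], [2, 3]])   # integer input works too
print(res)
```

The out buffer matters: without it, the positions skipped by where would hold uninitialized values rather than zeros.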

Solution 3 - Numpy

Simply disable the warning for that computation:

from numpy import errstate,isneginf,array,log2

m = array([[1,0],[2,3]])
with errstate(divide='ignore'):
    res = log2(m)

And then you can postprocess the -inf if you want:

res[isneginf(res)]=0

EDIT: Here are some comments on the other option, masked arrays, posted in the other answer. You should opt for disabling the warning, for two reasons:

Using masked arrays is far less efficient than momentarily disabling the warning, and you asked for efficiency.
Disabling the specific 'divide by zero' warning does NOT hide the other problem with calculating the log of a number, which is negative input. Negative input is reported as an 'invalid value' warning, and you will have to deal with it.

On the other hand, using masked arrays treats the two errors as the same, so you will not notice a negative number in the input. In other words, a negative number in the input is treated like a zero and gives zero as the result. This is not what you asked for.

As a last point, and as a personal opinion, disabling the warning is very readable: it is obvious what the code is doing, which makes it more maintainable. In that respect, I find this solution cleaner than using masked arrays.
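The difference in how the two approaches treat negative input can be checked with a short sketch. It assumes the standard warnings module is used to record the warnings; the sample array is only illustrative.

```python
import warnings
import numpy as np

m = np.array([-1.0, 0.0, 2.0])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    with np.errstate(divide='ignore'):   # silences only the log(0) warning
        res = np.log2(m)                 # the -1.0 still triggers 'invalid value'

messages = [str(w.message) for w in caught]
print(res)        # nan for the negative entry, -inf for the zero
print(messages)

# The masked-array route masks the negative entry just like the zero
masked_res = np.ma.log2(m).filled(0)
print(masked_res)
```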

Solution 4 - Numpy

The masked array solution and the solution that disables the warning are both fine. For variety, here's another that uses scipy.special.xlogy. np.sign(m) is given as the x argument, so xlogy returns 0 wherever np.sign(m) is 0. The result is divided by np.log(2) to give the base-2 logarithm.

In [4]: from scipy.special import xlogy

In [5]: m = np.array([[1, 0], [2, 3]])

In [6]: xlogy(np.sign(m), m) / np.log(2)
Out[6]: 
array([[ 0.       ,  0.       ],
       [ 1.       ,  1.5849625]])

Solution 5 - Numpy

Problem

Questions: Feb 2014, May 2012

For an array containing zeros or negatives we get the respective warnings.

y = np.log(x)
# RuntimeWarning: divide by zero encountered in log
# RuntimeWarning: invalid value encountered in log

Solution

markroxor suggests np.clip; in my example this creates a horizontal floor. gg349 and others use np.errstate and np.seterr; I think these are clunky and do not solve the problem. As a note, np.complex doesn't work for zeros. user3315095 uses indexing with p=0<x, but np.log has this functionality built in through its where and out parameters. mdeff demonstrates this, but replaces the -inf with 0, which for me was insufficient, and it doesn't handle negatives.

I suggest 0<x and np.nan (or, if needed, -np.inf; the np.NINF alias was removed in NumPy 2.0).

y = np.log(x, where=0<x, out=np.nan*x)

John Zwinck uses the masked-array function np.ma.log. This works but is computationally slower; try App:timeit.
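To see what the where/out suggestion produces on a few values, here is a minimal sketch. It builds the NaN buffer with np.full instead of np.nan*x, which is my substitution and should be equivalent here; the sample values are illustrative.

```python
import numpy as np

x = np.array([-1.0, 0.0, 1.0, np.e])

# Pre-fill the output with NaN; only entries with x > 0 are overwritten
out = np.full(x.shape, np.nan)
y = np.log(x, where=(x > 0), out=out)

print(y)   # NaN for the negative and zero entries, log values elsewhere
```

Because NaN marks the undefined positions, plotting libraries typically leave a gap there instead of drawing a misleading floor.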

Example

import numpy as np
x = np.linspace(-10, 10, 300)

# y = np.log(x)                         # Old
y = np.log(x, where=0<x, out=np.nan*x)  # New

import matplotlib.pyplot as plt
plt.plot(x, y)
plt.show()

App:timeit

Time Comparison for mask and where

import numpy as np
import time

def timeit(fun, xs):
    # time one pass of fun over every row of xs and print the elapsed seconds
    t = time.perf_counter()
    for x in xs:
        fun(x)
    print(time.perf_counter() - t)

xs = np.random.randint(-10, +10, (1000, 10000))
timeit(lambda x: np.ma.log(x).filled(np.nan), xs)         # masked array
timeit(lambda x: np.log(x, where=0<x, out=np.nan*x), xs)  # where/out

Solution 6 - Numpy

What about the following?

from numpy import *
m=array((-1.0,0.0,2.0))
p=m > 0.0
print('positive=',p)
print(m[p])
res=zeros_like(m)
res[p]=log(m[p])
print(res)

Solution 7 - Numpy

You can use something like m = np.clip(m, 1e-12, None) to avoid the log(0) warning. This sets the lower bound of the array values to 1e-12, so zero entries give a large negative log rather than 0.
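A minimal sketch of the clipping approach, showing what it returns for the zero entry. The explicit astype(float) is my addition, so that the result dtype does not depend on how clip promotes an integer array.

```python
import numpy as np

m = np.array([[1, 0],
              [2, 3]])

# Convert to float first, then raise every value below 1e-12 up to 1e-12
clipped = np.clip(m.astype(float), 1e-12, None)
res = np.log2(clipped)

print(res)   # the former zero becomes log2(1e-12), roughly -39.86
```

This avoids the warning, but the result at the zero positions is not 0, so it only fits cases where a large negative floor is acceptable.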

Content Type	Original Author	Original Content on Stackoverflow
Question	nh2	View Question on Stackoverflow
Solution 1 - Numpy	John Zwinck	View Answer on Stackoverflow
Solution 2 - Numpy	mdeff	View Answer on Stackoverflow
Solution 3 - Numpy	gg349	View Answer on Stackoverflow
Solution 4 - Numpy	Warren Weckesser	View Answer on Stackoverflow
Solution 5 - Numpy	A. West	View Answer on Stackoverflow
Solution 6 - Numpy	user3315095	View Answer on Stackoverflow
Solution 7 - Numpy	markroxor	View Answer on Stackoverflow