How to normalize a NumPy array to within a certain range?

Python, Arrays, Numpy, Scipy, Convenience Methods

Python Problem Overview


After doing some processing on an audio or image array, it needs to be normalized within a range before it can be written back to a file. This can be done like so:

# Normalize audio channels to between -1.0 and +1.0
audio[:,0] = audio[:,0]/abs(audio[:,0]).max()
audio[:,1] = audio[:,1]/abs(audio[:,1]).max()

# Normalize image to between 0 and 255
image = image/(image.max()/255.0)

Is there a less verbose, convenience-function way to do this? matplotlib.colors.Normalize() doesn't seem to be related.

Python Solutions


Solution 1 - Python

audio /= np.max(np.abs(audio),axis=0)
image *= (255.0/image.max())

Using /= and *= allows you to eliminate an intermediate temporary array, thus saving some memory. Multiplication is less expensive than division, so

image *= 255.0/image.max()    # Uses 1 division and image.size multiplications

is marginally faster than

image /= image.max()/255.0    # Uses 1+image.size divisions

Since we are using basic numpy methods here, I think this is about as efficient a solution in numpy as can be.
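
As a rough illustration of that difference, here is a hypothetical timing comparison; the array size and repeat count are arbitrary and results will vary by machine:

import timeit

import numpy as np

image = np.random.rand(2048, 2048)   # stand-in image data

# setup makes one copy per timeit call; the statement then runs 100 times on it
mul = timeit.timeit("img *= 255.0 / img.max()", setup="img = image.copy()",
                    globals={"image": image}, number=100)
div = timeit.timeit("img /= img.max() / 255.0", setup="img = image.copy()",
                    globals={"image": image}, number=100)
print(f"in-place multiply: {mul:.3f}s   in-place divide: {div:.3f}s")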


In-place operations do not change the dtype of the container array. Since the desired normalized values are floats, the audio and image arrays need to have a floating-point dtype before the in-place operations are performed. If they are not already of floating-point dtype, you'll need to convert them using astype. For example,

image = image.astype('float64')
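
Putting the two points together, a minimal end-to-end sketch; the random integers and array shape are just stand-ins for real audio samples:

import numpy as np

# Hypothetical stand-in for two-channel int16 audio samples
audio = np.random.randint(-32768, 32767, size=(1000, 2))

audio = audio.astype(np.float64)          # give it a floating-point dtype first
audio /= np.max(np.abs(audio), axis=0)    # per-channel normalization to [-1.0, +1.0]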

Solution 2 - Python

If the array contains both positive and negative data, I'd go with:

import numpy as np

a = np.random.rand(3,2)

# Normalised [0,1]
b = (a - np.min(a))/np.ptp(a)

# Normalised [0,255] as integer: don't forget the parentheses before astype(int)
c = (255*(a - np.min(a))/np.ptp(a)).astype(int)

# Normalised [-1,1]
d = 2.*(a - np.min(a))/np.ptp(a)-1
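
If you need these repeatedly, the same formula can be wrapped in a small helper; the function name and signature below are just a suggestion, not part of the original answer:

import numpy as np

def rescale(a, lo=0.0, hi=1.0):
    # Min-max scale a to the range [lo, hi]
    return lo + (hi - lo) * (a - np.min(a)) / np.ptp(a)

a = np.random.rand(3, 2)
b = rescale(a)                         # [0, 1]
c = rescale(a, 0, 255).astype(int)     # [0, 255] as integer
d = rescale(a, -1.0, 1.0)              # [-1, 1]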

If the array contains nan, one solution could be to simply drop them when computing the range:

def nan_ptp(a):
    return np.ptp(a[np.isfinite(a)])

b = (a - np.nanmin(a))/nan_ptp(a)

However, depending on the context you might want to treat nan differently, e.g. interpolate the value, replace it with e.g. 0, or raise an error.
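
For instance, one hypothetical way is to replace nan before normalising; np.nan_to_num is only an example, and interpolation or raising an error may fit your data better:

import numpy as np

a = np.array([0.5, np.nan, 2.0, -1.0])
a_clean = np.nan_to_num(a, nan=0.0)                  # replace nan with 0
b = (a_clean - np.min(a_clean)) / np.ptp(a_clean)    # then normalise to [0, 1]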

Finally, although it's not OP's question, standardization is worth mentioning:

e = (a - np.mean(a)) / np.std(a)

Solution 3 - Python

You can also rescale using sklearn. The advantages are that, in addition to mean-centering the data, you can choose whether to normalize the standard deviation, and that you can do this on either axis: by features or by records.

from sklearn.preprocessing import scale
X = scale( X, axis=0, with_mean=True, with_std=True, copy=True )

The keyword arguments axis, with_mean, with_std are self-explanatory, and are shown in their default state. The argument copy performs the operation in place if it is set to False. See the documentation for sklearn.preprocessing.scale for details.
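
For instance, a small sketch with made-up data showing the effect of axis: axis=0 standardizes each column (feature), axis=1 each row (record):

import numpy as np
from sklearn.preprocessing import scale

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

by_feature = scale(X, axis=0)   # each column ends up with mean 0, std 1
by_record = scale(X, axis=1)    # each row ends up with mean 0, std 1
print(by_feature.mean(axis=0), by_feature.std(axis=0))
print(by_record.mean(axis=1), by_record.std(axis=1))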

Solution 4 - Python

You are trying to min-max scale the values of audio between -1 and +1 and image between 0 and 255.

Using sklearn.preprocessing.minmax_scale should easily solve your problem.

e.g.:

from sklearn.preprocessing import minmax_scale

audio_scaled = minmax_scale(audio, feature_range=(-1,1))

and

shape = image.shape
image_scaled = minmax_scale(image.ravel(), feature_range=(0,255)).reshape(shape)

note: Not to be confused with the operation that scales the norm (length) of a vector to a certain value (usually 1), which is also commonly referred to as normalization.
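
A self-contained version of the image example above, with the import spelled out. minmax_scale scales each column independently, which is why the image is flattened first and reshaped afterwards; the random array here is just a stand-in for real image data:

import numpy as np
from sklearn.preprocessing import minmax_scale

image = np.random.rand(64, 64)          # stand-in for real image data
shape = image.shape
image_scaled = minmax_scale(image.ravel(), feature_range=(0, 255)).reshape(shape)
print(image_scaled.min(), image_scaled.max())   # approximately 0.0 and 255.0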

Solution 5 - Python

You can use the in-place ("i", as in idiv, imul, ...) version, and it doesn't look half bad:

image /= (image.max()/255.0)

For the other case you can write a function to normalize a 2-D array by columns:

def normalize_columns(arr):
    rows, cols = arr.shape
    for col in range(cols):
        arr[:, col] /= abs(arr[:, col]).max()
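
A usage sketch: the function modifies its argument in place, so it needs a floating-point array (the random data here is only for illustration):

import numpy as np

audio = np.random.uniform(-3.0, 3.0, size=(1000, 2))
normalize_columns(audio)
print(np.abs(audio).max(axis=0))   # approximately [1.0, 1.0]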

Solution 6 - Python

An answer to a similar question solved the problem for me with

np.interp(a, (a.min(), a.max()), (-1, +1))
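
A short, hypothetical check of what the one-liner does: np.interp maps a's [min, max] linearly onto the target range, so it also works for the 0-255 image case:

import numpy as np

a = np.array([2.0, 5.0, 8.0])
print(np.interp(a, (a.min(), a.max()), (-1, +1)))    # [-1.   0.   1. ]
print(np.interp(a, (a.min(), a.max()), (0, 255)))    # [  0.  127.5 255. ]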

Solution 7 - Python

A simple solution is to use the scalers offered by the sklearn.preprocessing library.

from sklearn import preprocessing as sk

scaler = sk.MinMaxScaler(feature_range=(0, 250))
scaler = scaler.fit(X)
X_scaled = scaler.transform(X)
# Checking reconstruction
X_rec = scaler.inverse_transform(X_scaled)

The reconstruction error X_rec - X will be zero. You can adjust the feature_range for your needs, or even use a standard scaler, sk.StandardScaler().
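
A sketch of the StandardScaler alternative mentioned above (zero mean, unit variance per feature); X here is just an assumed 2-D numeric array:

import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.random.rand(100, 3)
X_std = StandardScaler().fit_transform(X)
print(X_std.mean(axis=0), X_std.std(axis=0))   # approximately 0 and 1 per column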

Solution 8 - Python

I tried following this, and got the error

TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''

The numpy array I was trying to normalize was an integer array. It seems numpy tightened the implicit casting rules for in-place operations in versions > 1.10, and you can use numpy.true_divide() to resolve that.

import numpy as np

arr = np.array(img)
arr = np.true_divide(arr, [255.0], out=None)

img was a PIL.Image object.
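
An alternative sketch, assuming the goal is simply a float array in [0, 1]: cast the integer array to a floating-point dtype first, after which the in-place division from Solution 1 works; the random array stands in for np.array(img):

import numpy as np

arr = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)   # stand-in for np.array(img)
arr = arr.astype(np.float64)    # cast to float first
arr /= 255.0                    # in-place division now succeeds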

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | endolith | View Question on Stackoverflow
Solution 1 - Python | unutbu | View Answer on Stackoverflow
Solution 2 - Python | user2821 | View Answer on Stackoverflow
Solution 3 - Python | cjohnson318 | View Answer on Stackoverflow
Solution 4 - Python | fabda01 | View Answer on Stackoverflow
Solution 5 - Python | u0b34a0f6ae | View Answer on Stackoverflow
Solution 6 - Python | Surya Narayanan | View Answer on Stackoverflow
Solution 7 - Python | Pantelis | View Answer on Stackoverflow
Solution 8 - Python | srdg | View Answer on Stackoverflow