Convert structured array to regular NumPy array
Tags: Python, NumPy, Recarray
Python Problem Overview
The answer will be very obvious I think, but I don't see it at the moment.
How can I convert a record array back to a regular ndarray?
Suppose I have following simple structured array:
x = np.array([(1.0, 4.0,), (2.0, -1.0)], dtype=[('f0', '<f8'), ('f1', '<f8')])
then I want to convert it to:
array([[ 1.,  4.],
       [ 2., -1.]])
I tried asarray and astype, but that didn't work.
UPDATE (solved: float32 (f4) instead of float64 (f8))
OK, I tried Robert's solution (x.view(np.float64).reshape(x.shape + (-1,))), and with a simple array it works perfectly. But with the array I actually wanted to convert, it gives a strange result:
data = np.array([ (0.014793682843446732, 0.006681123282760382, 0.0, 0.0, 0.0, 0.0008984912419691682, 0.0, 0.013475529849529266, 0.0, 0.0),
(0.014793682843446732, 0.006681123282760382, 0.0, 0.0, 0.0, 0.0008984912419691682, 0.0, 0.013475529849529266, 0.0, 0.0),
(0.014776384457945824, 0.006656022742390633, 0.0, 0.0, 0.0, 0.0008901208057068288, 0.0, 0.013350814580917358, 0.0, 0.0),
(0.011928378604352474, 0.002819152781739831, 0.0, 0.0, 0.0, 0.0012627150863409042, 0.0, 0.018906937912106514, 0.0, 0.0),
(0.011928378604352474, 0.002819152781739831, 0.0, 0.0, 0.0, 0.001259754877537489, 0.0, 0.01886274479329586, 0.0, 0.0),
(0.011969991959631443, 0.0028706740122288465, 0.0, 0.0, 0.0, 0.0007433745195157826, 0.0, 0.011164642870426178, 0.0, 0.0)],
dtype=[('a_soil', '<f4'), ('b_soil', '<f4'), ('Ea_V', '<f4'), ('Kcc', '<f4'), ('Koc', '<f4'), ('Lmax', '<f4'), ('malfarquhar', '<f4'), ('MRN', '<f4'), ('TCc', '<f4'), ('Vcmax_3', '<f4')])
and then:
data_array = data.view(np.float).reshape(data.shape + (-1,))
gives:
In [8]: data_array
Out[8]:
array([[ 2.28080997e-20, 0.00000000e+00, 2.78023241e-27,
6.24133580e-18, 0.00000000e+00],
[ 2.28080997e-20, 0.00000000e+00, 2.78023241e-27,
6.24133580e-18, 0.00000000e+00],
[ 2.21114197e-20, 0.00000000e+00, 2.55866881e-27,
5.79825816e-18, 0.00000000e+00],
[ 2.04776835e-23, 0.00000000e+00, 3.47457730e-26,
9.32782857e-17, 0.00000000e+00],
[ 2.04776835e-23, 0.00000000e+00, 3.41189244e-26,
9.20222417e-17, 0.00000000e+00],
[ 2.32706550e-23, 0.00000000e+00, 4.76375305e-28,
1.24257748e-18, 0.00000000e+00]])
which is an array with other numbers and another shape. What did I do wrong?
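As the UPDATE above says, the cause was the element type: the fields are float32 ('<f4'), while np.float was an alias for Python's float, i.e. float64 (the alias has since been removed from NumPy). Each pair of 4-byte values was therefore reinterpreted as one 8-byte value, which halves the column count and scrambles the numbers. A minimal sketch of the fix, using a small made-up array in place of the real data:

```python
import numpy as np

# Hypothetical stand-in for the question's data: four float32 fields.
data = np.array(
    [(0.5, 0.25, 0.0, 1.0), (1.5, 2.0, 4.0, 8.0)],
    dtype=[('a_soil', '<f4'), ('b_soil', '<f4'), ('Ea_V', '<f4'), ('Kcc', '<f4')],
)

# Wrong: float64 is twice as wide as the fields, so two fields are
# fused into each output element and only half the columns remain.
wrong = data.view(np.float64).reshape(data.shape + (-1,))
print(wrong.shape)   # (2, 2)

# Right: view with the dtype the fields actually have.
field_dtype = data.dtype[0]   # dtype('float32')
data_array = data.view(field_dtype).reshape(data.shape + (-1,))
print(data_array.shape)   # (2, 4)
```

Reading the dtype from data.dtype[0] instead of hard-coding it avoids the mismatch, provided all fields share that type.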
Python Solutions
Solution 1 - Python
The simplest method is probably
x.view((float, len(x.dtype.names)))
(float must generally be replaced by the type of the elements in x: x.dtype[0]). This assumes that all the elements have the same type.
This method gives you the regular numpy.ndarray version in a single step (as opposed to the two steps required by the view(…).reshape(…) method).
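For the example array from the question, this could look as follows. It is only a sketch and assumes every field has the same dtype; the names n_fields and regular are illustrative.

```python
import numpy as np

x = np.array([(1.0, 4.0), (2.0, -1.0)], dtype=[('f0', '<f8'), ('f1', '<f8')])

# Take the element type from the first field instead of hard-coding float.
n_fields = len(x.dtype.names)
regular = x.view((x.dtype[0], n_fields))

print(regular.shape)   # (2, 2)
print(regular.dtype)   # float64
```

Because this is a view, regular shares its memory with x, so writing to one changes the other.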
Solution 2 - Python
x = np.array([(1.0, 4.0,), (2.0, -1.0)], dtype=[('f0', '<f8'), ('f1', '<f8')])
x.view(np.float64).reshape(x.shape + (-1,))
array([[ 1.,  4.],
       [ 2., -1.]])
Solution 3 - Python
np.array(x.tolist())
array([[ 1.,  4.],
       [ 2., -1.]])
but maybe there is a better method...
Solution 4 - Python
In conjunction with changes to how it handles multi-field indexing, NumPy has provided two new functions that help with converting to and from structured arrays: structured_to_unstructured and unstructured_to_structured, both in numpy.lib.recfunctions. repack_fields is another new function.
From the 1.16 release notes:
> multi-field views return a view instead of a copy
>
> Indexing a structured array with multiple fields, e.g., arr[['f1', 'f3']], returns a view into the original array instead of a copy. The returned view will often have extra padding bytes corresponding to intervening fields in the original array, unlike before, which will affect code such as arr[['f1', 'f3']].view('float64'). This change has been planned since numpy 1.7. Operations hitting this path have emitted FutureWarnings since then. Additional FutureWarnings about this change were added in 1.12.
>
> To help users update their code to account for these changes, a number of functions have been added to the numpy.lib.recfunctions module which safely allow such operations. For instance, the code above can be replaced with structured_to_unstructured(arr[['f1', 'f3']], dtype='float64'). See the “accessing multiple fields” section of the user guide.
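Applied to the array from the question, a short sketch might look like this. The mixed-type array y and the variable names are made up for illustration.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

x = np.array([(1.0, 4.0), (2.0, -1.0)], dtype=[('f0', '<f8'), ('f1', '<f8')])

# Structured -> plain 2-D array, one column per field.
plain = rfn.structured_to_unstructured(x)
print(plain.shape)   # (2, 2)

# Plain -> structured again, reusing the original field layout.
back = rfn.unstructured_to_structured(plain, dtype=x.dtype)
print(back.dtype.names)   # ('f0', 'f1')

# Fields of different types can be cast to one common dtype on the way out.
y = np.array([(1, 2.5)], dtype=[('a', '<i4'), ('b', '<f8')])
mixed = rfn.structured_to_unstructured(y, dtype=np.float64)
print(mixed.shape)   # (1, 2)
```

Unlike the view-based approaches, this does not require all fields to share a dtype.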
Solution 5 - Python
A very simple solution using the function rec2array of root_numpy:
np_array = rec2array(x)
root_numpy is actually deprecated but the rec2array code is useful anyway (source here):
import numpy as np

def rec2array(rec, fields=None):
    simplify = False
    if fields is None:
        fields = rec.dtype.names
    elif isinstance(fields, str):  # the original used six's string_types
        fields = [fields]
        simplify = True
    # Creates a copy and casts all data to the same type
    arr = np.dstack([rec[field] for field in fields])
    # Check for array-type fields. If none, then remove outer dimension.
    # Only need to check first field since np.dstack will anyway raise an
    # exception if the shapes don't match
    # np.dstack will also fail if fields is an empty list
    if not rec.dtype[fields[0]].shape:
        arr = arr[0]
    if simplify:
        # remove last dimension (will be of size 1)
        arr = arr.reshape(arr.shape[:-1])
    return arr
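The core of the function is the np.dstack call followed by dropping the leading axis. A stripped-down sketch of just that step, on the array from the question and assuming scalar (non-array) fields:

```python
import numpy as np

x = np.array([(1.0, 4.0), (2.0, -1.0)], dtype=[('f0', '<f8'), ('f1', '<f8')])

# np.dstack promotes each 1-D field to shape (1, N, 1) and joins the
# fields along the last axis, giving shape (1, N, number_of_fields).
stacked = np.dstack([x[name] for name in x.dtype.names])
print(stacked.shape)   # (1, 2, 2)

# Dropping the leading axis leaves one row per record.
arr = stacked[0]
print(arr.shape)   # (2, 2)
```

As the comment in the function notes, this creates a copy instead of a view.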