Copying a column from one DataFrame to another gives NaN values?

PythonPandasTypesCopy

Python Problem Overview


This question has been asked so many times, and it seemed to work for others, however, I am getting NaN values when I copy a column from a different DataFrame(df1 and df2 are same length).

df1

        date	 hour	   var1
a	2017-05-01	00:00:00   456585
b	2017-05-01	01:00:00   899875
c	2017-05-01	02:00:00   569566
d	2017-05-01	03:00:00   458756
e	2017-05-01	04:00:00   231458
f	2017-05-01	05:00:00   986545

df2

      MyVar1	 MyVar2	
 0	6169.719338	3688.045368
 1	5861.148007	3152.238704
 2	5797.053347	2700.469871
 3	5779.102340	2730.471948
 4	6708.219647	3181.298291
 5	8550.380343	3793.580394

I need like this in my df2

       MyVar1	 MyVar2	       date        hour
 0	6169.719338	3688.045368  2017-05-01  00:00:00
 1	5861.148007	3152.238704  2017-05-01  01:00:00
 2	5797.053347	2700.469871  2017-05-01  02:00:00
 3	5779.102340	2730.471948  2017-05-01  03:00:00
 4	6708.219647	3181.298291  2017-05-01  04:00:00
 5	8550.380343	3793.580394  2017-05-01  05:00:00

I tried the following,

df2['date'] = df1['date']
df2['hour'] = df1['hour']
 
type(df1)
>> pandas.core.frame.DataFrame

type(df2)
>> pandas.core.frame.DataFrame

I am getting the following,

       MyVar1	 MyVar2	     date       hour
 0	6169.719338	3688.045368  NaN        NaN
 1	5861.148007	3152.238704  NaN        NaN
 2	5797.053347	2700.469871  NaN        NaN

Why is this happening? There is another post that discusses merge, but I just need to copy it. Any help would be appreciated.

Python Solutions


Solution 1 - Python

The culprit is unalignable indexes

Your DataFrames' indexes are different (and correspondingly, the indexes for each columns), so when trying to assign a column of one DataFrame to another, pandas will try to align the indexes, and failing to do so, insert NaNs.

Consider the following examples to understand what this means:

# Setup
A = pd.DataFrame(index=['a', 'b', 'c']) 
B = pd.DataFrame(index=['b', 'c', 'd', 'f'])                                  
C = pd.DataFrame(index=[1, 2, 3])

# Example of alignable indexes - A & B (complete or partial overlap of indexes)
A.index B.index
      a        
      b       b   (overlap)
      c       c   (overlap)
              d
              f

# Example of unalignable indexes - A & C (no overlap at all)
A.index C.index
      a        
      b        
      c        
              1
              2
              3

When there are no overlaps, pandas cannot match even a single value between the two DataFrames to put in the result of the assignment, so the output is a column full of NaNs.

If you're working on an IPython notebook, you can check that this is indeed the root cause using,

df1.index.equals(df2.index)
# False
df1.index.intersection(df2.index).empty
# True

You can use any of the following solutions to fix this issue.

Solution 1: Reset both DataFrames' indexes

You may prefer this option if you didn't mean to have different indices in the first place, or if you don't particularly care about preserving the index.

# Optional, if you want a RangeIndex => [0, 1, 2, ...]
# df1.index = pd.RangeIndex(len(df))
# Homogenize the index values,
df2.index = df1.index
# Assign the columns.
df2[['date', 'hour']] = df1[['date', 'hour']]

If you want to keep the existing index, but as a column, you may use reset_index() instead.


Solution 2: Assign NumPy arrays (bypass index alignment)

This solution will only work if the lengths of the two DataFrames match.

# pandas >= 0.24
df2['date'] = df1['date'].to_numpy()
# pandas < 0.24
df2['date'] = df1['date'].values

To assign multiple columns easily, use,

df2[['date', 'hour']] = df1[['date', 'hour']].to_numpy()

Solution 2 - Python

Try this ?

df2['date'] = df1['date'].values
df2['hour'] = df1['hour'].values

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questioni.n.n.mView Question on Stackoverflow
Solution 1 - Pythoncs95View Answer on Stackoverflow
Solution 2 - PythonBENYView Answer on Stackoverflow