Copy all values in a column to a new column in a pandas dataframe


Python Problem Overview


This is a very basic question, but I cannot seem to find an answer.

I have a dataframe like this, called df:

  A     B     C
 a.1   b.1   c.1
 a.2   b.2   c.2
 a.3   b.3   c.3

Then I extract all the rows from df where column 'B' has the value 'b.2', and assign the result to df_2.

df_2 = df[df['B'] == 'b.2']

df_2 becomes:

  A     B     C
 a.2   b.2   c.2

Then I copy all the values in column 'B' to a new column named 'D', so that df_2 becomes:

  A     B     C     D
 a.2   b.2   c.2   b.2

When I perform an assignment like this:

df_2['D'] = df_2['B']

I get the following warning:

> A value is trying to be set on a copy of a slice from a DataFrame.
> Try using .loc[row_indexer,col_indexer] = value instead
>
> See the the caveats in the documentation:
> http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy


I have also tried using .loc when creating df_2 like this:

df_2 = df.loc[df['B'] == 'b.2']

However, I still get the warning.

Any help is greatly appreciated.

Python Solutions


Solution 1 - Python

You can simply assign column B to the new column, like this:

df['D'] = df['B']

Example/Demo:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame([['a.1','b.1','c.1'],['a.2','b.2','c.2'],['a.3','b.3','c.3']],columns=['A','B','C'])

In [3]: df
Out[3]:
     A    B    C
0  a.1  b.1  c.1
1  a.2  b.2  c.2
2  a.3  b.3  c.3

In [4]: df['D'] = df['B']                  #<---What you want.

In [5]: df
Out[5]:
     A    B    C    D
0  a.1  b.1  c.1  b.1
1  a.2  b.2  c.2  b.2
2  a.3  b.3  c.3  b.3

In [6]: df.loc[0,'D'] = 'd.1'

In [7]: df
Out[7]:
     A    B    C    D
0  a.1  b.1  c.1  d.1
1  a.2  b.2  c.2  b.2
2  a.3  b.3  c.3  b.3

Solution 2 - Python

The problem is in the line before the one that raises the warning. Creating df_2 by filtering df is where you get a copy of a slice of the DataFrame. Instead, call .copy() when you create df_2 and you won't get the warning later on.

df_2 = df[df['B'] == 'b.2'].copy()
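As a minimal sketch of this approach, the snippet below rebuilds the question's DataFrame, takes an explicit copy of the filtered rows, and records any warnings raised while adding column 'D'. The `caught` list and the use of the `warnings` module are only there for illustration; they are not part of the original answer.

```python
import warnings

import pandas as pd

df = pd.DataFrame({
    "A": ["a.1", "a.2", "a.3"],
    "B": ["b.1", "b.2", "b.3"],
    "C": ["c.1", "c.2", "c.3"],
})

# Take an explicit copy so that df_2 is an independent DataFrame
df_2 = df[df["B"] == "b.2"].copy()

# Record any warnings emitted while adding the new column
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df_2["D"] = df_2["B"]

print(df_2)
print("warnings raised:", len(caught))
print("columns of df:", list(df.columns))  # df itself is not modified
```

Because df_2 owns its data after .copy(), pandas has no reason to suspect that the assignment was meant to reach df.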

Solution 3 - Python

I think the correct access method is using the index:

df_2.loc[:,'D'] = df_2['B']

Solution 4 - Python

How about:

df['D'] = df['B'].values
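One thing to keep in mind with .values: it hands pandas a plain NumPy array, so the assignment is done by position rather than by index label. The sketch below contrasts the two behaviours using a hypothetical Series called `other` whose index only partly overlaps with the DataFrame; the column names `aligned` and `positional` are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"B": ["b.1", "b.2", "b.3"]}, index=[0, 1, 2])

# Hypothetical Series whose index only shares the label 2 with df
other = pd.Series(["x", "y", "z"], index=[2, 3, 4])

# Assigning a Series aligns on index labels: labels 0 and 1 find no match
df["aligned"] = other

# Assigning the underlying array ignores labels and fills row by row
df["positional"] = other.values

print(df)
```

When copying a column within the same DataFrame the index is identical on both sides, so df['D'] = df['B'] and df['D'] = df['B'].values give the same result.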

Solution 5 - Python

Here is your dataframe:

import pandas as pd
df = pd.DataFrame({
    'A': ['a.1', 'a.2', 'a.3'],
    'B': ['b.1', 'b.2', 'b.3'],
    'C': ['c.1', 'c.2', 'c.3']})

Your answer is in the paragraph "Setting with enlargement" in the section on "Indexing and selecting data" in the documentation on Pandas.

It says:

> A DataFrame can be enlarged on either axis via .loc.

So what you need to do is simply one of these two:

df.loc[:, 'D'] = df.loc[:, 'B']
df.loc[:, 'D'] = df['B']

Solution 6 - Python

You can use the method assign. It returns a new DataFrame so you can use it in chains with other methods.

df.assign(D=df.B)

Output:

     A    B    C    D
0  a.1  b.1  c.1  b.1
1  a.2  b.2  c.2  b.2
2  a.3  b.3  c.3  b.3
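Since assign returns a new DataFrame, it can also be chained directly after the filter from the question. The following is only a sketch of that idea; passing a lambda (here with the arbitrary parameter name `d`) makes assign compute the new column from the filtered frame rather than from df.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["a.1", "a.2", "a.3"],
    "B": ["b.1", "b.2", "b.3"],
    "C": ["c.1", "c.2", "c.3"],
})

# Filter and add the column in one chain; assign works on a copy,
# so df itself is left untouched.
df_2 = df[df["B"] == "b.2"].assign(D=lambda d: d["B"])

print(df_2)
print("D" in df.columns)
```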

Solution 7 - Python

The question was asked a while ago, but my response could help others.

I had a similar situation. After you slice a dataframe into df_2, you need to reset the index:

df_2 = df_2.reset_index(drop = True)  

Now you can run the command without the warning:

df_2['D'] = df_2['B']

Solution 8 - Python

Following up on these solutions, here is some helpful code illustrating them:

#
# Copying columns in pandas without slice warning
#
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 3), columns=list('ABC'))

#
# copies column B into new column D
df.loc[:, 'D'] = df['B']
print(df)

#
# creates new column 'E' with values -99
#
# But the second assignment replaces the whole column: rows where 'B'>0 get
# the value of 'B', while the others become NaN (no matching index label)
df['E'] = -99
print(df)
df['E'] = df[df['B'] > 0]['B'].copy()
print(df)

#
# creates new column 'F' with values -99.0 (float, so it can hold values of 'B')
#
# Assignment through .loc only overwrites rows which meet criteria 'B'>0
df['F'] = -99.0
df.loc[df['B'] > 0, 'F'] = df[df['B'] > 0]['B'].copy()
print(df)

Solution 9 - Python

eval lets you assign B to the new column D right away:

In [8]: df.eval('D=B', inplace=True)

In [9]: df
Out[9]: 
     A    B    C    D
0  a.1  b.1  c.1  b.1
1  a.2  b.2  c.2  b.2
2  a.3  b.3  c.3  b.3

Since inplace=True you don't need to assign it back to df.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Justin Buchanan | View Question on Stackoverflow
Solution 1 - Python | Anand S Kumar | View Answer on Stackoverflow
Solution 2 - Python | Alex | View Answer on Stackoverflow
Solution 3 - Python | Luke | View Answer on Stackoverflow
Solution 4 - Python | KarthikS | View Answer on Stackoverflow
Solution 5 - Python | tommy.carstensen | View Answer on Stackoverflow
Solution 6 - Python | Mykola Zotko | View Answer on Stackoverflow
Solution 7 - Python | Irshad | View Answer on Stackoverflow
Solution 8 - Python | Mark Andersen | View Answer on Stackoverflow
Solution 9 - Python | rachwa | View Answer on Stackoverflow