Add column with constant value to pandas dataframe

PythonPandas

Python Problem Overview


Given a DataFrame:

np.random.seed(0)
df = pd.DataFrame(np.random.randn(3, 3), columns=list('ABC'), index=[1, 2, 3])
df
    
          A         B         C
1  1.764052  0.400157  0.978738
2  2.240893  1.867558 -0.977278
3  0.950088 -0.151357 -0.103219

What is the simplest way to add a new column containing a constant value eg 0?

          A         B         C  new
1  1.764052  0.400157  0.978738    0
2  2.240893  1.867558 -0.977278    0
3  0.950088 -0.151357 -0.103219    0

This is my solution, but I don't know why this puts NaN into 'new' column?

df['new'] = pd.Series([0 for x in range(len(df.index))])

          A         B         C  new
1  1.764052  0.400157  0.978738  0.0
2  2.240893  1.867558 -0.977278  0.0
3  0.950088 -0.151357 -0.103219  NaN

Python Solutions


Solution 1 - Python

#Super simple in-place assignment: df['new'] = 0

For in-place modification, perform direct assignment. This assignment is broadcasted by pandas for each row.

df = pd.DataFrame('x', index=range(4), columns=list('ABC'))
df

   A  B  C
0  x  x  x
1  x  x  x
2  x  x  x
3  x  x  x

df['new'] = 'y'
# Same as,
# df.loc[:, 'new'] = 'y'
df

   A  B  C new
0  x  x  x   y
1  x  x  x   y
2  x  x  x   y
3  x  x  x   y

###Note for object columns

If you want to add an column of empty lists, here is my advice:

  • Consider not doing this. object columns are bad news in terms of performance. Rethink how your data is structured.

  • Consider storing your data in a sparse data structure. More information: sparse data structures

  • If you must store a column of lists, ensure not to copy the same reference multiple times.

      # Wrong
      df['new'] = [[]] * len(df)
      # Right
      df['new'] = [[] for _ in range(len(df))]
    

Generating a copy: df.assign(new=0)

If you need a copy instead, use DataFrame.assign:

df.assign(new='y')

   A  B  C new
0  x  x  x   y
1  x  x  x   y
2  x  x  x   y
3  x  x  x   y

And, if you need to assign multiple such columns with the same value, this is as simple as,

c = ['new1', 'new2', ...]
df.assign(**dict.fromkeys(c, 'y'))

   A  B  C new1 new2
0  x  x  x    y    y
1  x  x  x    y    y
2  x  x  x    y    y
3  x  x  x    y    y
Multiple column assignment

Finally, if you need to assign multiple columns with different values, you can use assign with a dictionary.

c = {'new1': 'w', 'new2': 'y', 'new3': 'z'}
df.assign(**c)

   A  B  C new1 new2 new3
0  x  x  x    w    y    z
1  x  x  x    w    y    z
2  x  x  x    w    y    z
3  x  x  x    w    y    z

Solution 2 - Python

With modern pandas you can just do:

df['new'] = 0

Solution 3 - Python

The reason this puts NaN into a column is because df.index and the Index of your right-hand-side object are different. @zach shows the proper way to assign a new column of zeros. In general, pandas tries to do as much alignment of indices as possible. One downside is that when indices are not aligned you get NaN wherever they aren't aligned. Play around with the reindex and align methods to gain some intuition for alignment works with objects that have partially, totally, and not-aligned-all aligned indices. For example here's how DataFrame.align() works with partially aligned indices:

In [7]: from pandas import DataFrame

In [8]: from numpy.random import randint

In [9]: df = DataFrame({'a': randint(3, size=10)})

In [10]:

In [10]: df
Out[10]:
   a
0  0
1  2
2  0
3  1
4  0
5  0
6  0
7  0
8  0
9  0

In [11]: s = df.a[:5]

In [12]: dfa, sa = df.align(s, axis=0)

In [13]: dfa
Out[13]:
   a
0  0
1  2
2  0
3  1
4  0
5  0
6  0
7  0
8  0
9  0

In [14]: sa
Out[14]:
0     0
1     2
2     0
3     1
4     0
5   NaN
6   NaN
7   NaN
8   NaN
9   NaN
Name: a, dtype: float64

Solution 4 - Python

Here is another one liner using lambdas (create column with constant value = 10)

df['newCol'] = df.apply(lambda x: 10, axis=1)

before

df
	A	        B	        C
1	1.764052	0.400157	0.978738
2	2.240893	1.867558	-0.977278
3	0.950088	-0.151357	-0.103219

after

df
        A	        B	        C	        newCol
    1	1.764052	0.400157	0.978738	10
    2	2.240893	1.867558	-0.977278	10
    3	0.950088	-0.151357	-0.103219	10

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionyemuView Question on Stackoverflow
Solution 1 - Pythoncs95View Answer on Stackoverflow
Solution 2 - PythonRoko MijicView Answer on Stackoverflow
Solution 3 - PythonPhillip CloudView Answer on Stackoverflow
Solution 4 - PythonGrant ShannonView Answer on Stackoverflow