Append an empty row in dataframe using pandas

PythonPython 2.7Pandas

Python Problem Overview


I am trying to append an empty row at the end of dataframe but unable to do so, even trying to understand how pandas work with append function and still not getting it.

Here's the code:

import pandas as pd

excel_names = ["ARMANI+EMPORIO+AR0143-book.xlsx"]
excels = [pd.ExcelFile(name) for name in excel_names]
frames = [x.parse(x.sheet_names[0], header=None,index_col=None).dropna(how='all') for x in excels]
for f in frames:
    f.append(0, float('NaN'))
    f.append(2, float('NaN'))

There are two columns and random number of row.

with "print f" in for loop i Get this:

                             0                 1
0                   Brand Name    Emporio Armani
2                 Model number            AR0143
4                  Part Number            AR0143
6                   Item Shape       Rectangular
8   Dial Window Material Type           Mineral
10               Display Type          Analogue
12                 Clasp Type            Buckle
14               Case Material   Stainless steel
16              Case Diameter    31 millimetres
18               Band Material           Leather
20                 Band Length  Women's Standard
22                 Band Colour             Black
24                 Dial Colour             Black
26            Special Features       second-hand
28                    Movement            Quartz

Python Solutions


Solution 1 - Python

Add a new pandas.Series using pandas.DataFrame.append().

If you wish to specify the name (AKA the "index") of the new row, use:

df.append(pandas.Series(name='NameOfNewRow'))

If you don't wish to name the new row, use:

df.append(pandas.Series(), ignore_index=True)

where df is your pandas.DataFrame.

Solution 2 - Python

You can add it by appending a Series to the dataframe as follows. I am assuming by blank you mean you want to add a row containing only "Nan". You can first create a Series object with Nan. Make sure you specify the columns while defining 'Series' object in the -Index parameter. The you can append it to the DF. Hope it helps!

from numpy import nan as Nan
import pandas as pd

>>> df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
...                     'B': ['B0', 'B1', 'B2', 'B3'],
...                     'C': ['C0', 'C1', 'C2', 'C3'],
...                     'D': ['D0', 'D1', 'D2', 'D3']},
...                     index=[0, 1, 2, 3])

>>> s2 = pd.Series([Nan,Nan,Nan,Nan], index=['A', 'B', 'C', 'D'])
>>> result = df1.append(s2)
>>> result
     A    B    C    D
0   A0   B0   C0   D0
1   A1   B1   C1   D1
2   A2   B2   C2   D2
3   A3   B3   C3   D3
4  NaN  NaN  NaN  NaN

Solution 3 - Python

Assuming df is your dataframe,

df_prime = pd.concat([df, pd.DataFrame([[np.nan] * df.shape[1]], columns=df.columns)], ignore_index=True)

where df_prime equals df with an additional last row of NaN's.

Note that pd.concat is slow so if you need this functionality in a loop, it's best to avoid using it. In that case, assuming your index is incremental, you can use

df.loc[df.iloc[-1].name + 1,:] = np.nan

Solution 4 - Python

You can add a new series, and name it at the same time. The name will be the index of the new row, and all the values will automatically be NaN.

df.append(pd.Series(name='Afterthought'))

Solution 5 - Python

Append "empty" row to data frame and fill selected cells:

Generate empty data frame (no rows just columns a and b):

import pandas as pd    
col_names =  ["a","b"]
df  = pd.DataFrame(columns = col_names)

Append empty row at the end of the data frame:

df = df.append(pd.Series(), ignore_index = True)

Now fill the empty cell at the end (len(df)-1) of the data frame in column a:

df.loc[[len(df)-1],'a'] = 123

Result:

     a    b
0  123  NaN

And of course one can iterate over the rows and fill cells:

col_names =  ["a","b"]
df  = pd.DataFrame(columns = col_names)
for x in range(0,5):
    df = df.append(pd.Series(), ignore_index = True)
    df.loc[[len(df)-1],'a'] = 123

Result:

     a    b
0  123  NaN
1  123  NaN
2  123  NaN
3  123  NaN
4  123  NaN

Solution 6 - Python

The code below worked for me.

df.append(pd.Series([np.nan]), ignore_index = True)

Solution 7 - Python

Assuming your df.index is sorted you can use:

df.loc[df.index.max() + 1] = None

It handles well different indexes and column types.

[EDIT] it works with pd.DatetimeIndex if there is a constant frequency, otherwise we must specify the new index exactly e.g:

df.loc[df.index.max() + pd.Timedelta(milliseconds=1)] = None

long example:

df = pd.DataFrame([[pd.Timestamp(12432423), 23, 'text_field']], 
                    columns=["timestamp", "speed", "text"],
                    index=pd.DatetimeIndex(start='2111-11-11',freq='ms', periods=1))
df.info()

<class 'pandas.core.frame.DataFrame'> DatetimeIndex: 1 entries, 2111-11-11 to 2111-11-11 Freq: L Data columns (total 3 columns): timestamp 1 non-null datetime64[ns] speed 1 non-null int64 text 1 non-null object dtypes: datetime64[ns](1), int64(1), object(1) memory usage: 32.0+ bytes

df.loc[df.index.max() + 1] = None
df.info()

<class 'pandas.core.frame.DataFrame'> DatetimeIndex: 2 entries, 2111-11-11 00:00:00 to 2111-11-11 00:00:00.001000 Data columns (total 3 columns): timestamp 1 non-null datetime64[ns] speed 1 non-null float64 text 1 non-null object dtypes: datetime64[ns](1), float64(1), object(1) memory usage: 64.0+ bytes

df.head()

                            timestamp               	speed	   text
2111-11-11 00:00:00.000	1970-01-01 00:00:00.012432423	23.0	text_field
2111-11-11 00:00:00.001	NaT	NaN	NaN

Solution 8 - Python

You can also use:

your_dataframe.insert(loc=0, value=np.nan, column="")

where loc is your empty row index.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionMansoor AkramView Question on Stackoverflow
Solution 1 - PythonsrcererView Answer on Stackoverflow
Solution 2 - Pythonsilent_devView Answer on Stackoverflow
Solution 3 - PythonDave ReikherView Answer on Stackoverflow
Solution 4 - PythonpocketdoraView Answer on Stackoverflow
Solution 5 - PythonPeterView Answer on Stackoverflow
Solution 6 - Pythonkamal tanwarView Answer on Stackoverflow
Solution 7 - PythonDaniel RView Answer on Stackoverflow
Solution 8 - PythonAlberto GarciaView Answer on Stackoverflow