Convert row to column header for Pandas DataFrame

Python Problem Overview

The data I have to work with is a bit messy: it has header names inside its data. How can I choose a row from an existing pandas DataFrame and make it (rename it to) the column header?

I want to do something like:

header = df[df['old_header_name1'] == 'new_header_name1']

df.columns = header

Python Solutions

Solution 1 - Python

In [21]: df = pd.DataFrame([(1,2,3), ('foo','bar','baz'), (4,5,6)])

In [22]: df
Out[22]: 
     0    1    2
0    1    2    3
1  foo  bar  baz
2    4    5    6

Set the column labels to equal the values in the 2nd row (index location 1):

In [23]: df.columns = df.iloc[1]

If the index has unique labels, you can drop the 2nd row using:

In [24]: df.drop(df.index[1])
Out[24]: 
1 foo bar baz
0   1   2   3
2   4   5   6

If the index is not unique, you could use:

In [133]: df.iloc[pd.RangeIndex(len(df)).drop(1)]
Out[133]: 
1 foo bar baz
0   1   2   3
2   4   5   6

Note that df.drop(df.index[1]) removes every row that shares a label with the second row, not just the second row itself. Because non-unique indexes can lead to stumbling blocks (or potential bugs) like this, it's often better to make sure the index is unique (even though Pandas does not require it).
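To make the difference concrete, here is a minimal sketch that reuses the example frame but gives it a deliberately non-unique index. The labels "a", "b", "b" are an assumption made up for illustration; they are not part of the original data.

```python
import pandas as pd

# Same rows as above, but with a hypothetical non-unique index.
df = pd.DataFrame(
    [(1, 2, 3), ("foo", "bar", "baz"), (4, 5, 6)],
    index=["a", "b", "b"],
)

# Promote the 2nd row (position 1) to column labels.
df.columns = df.iloc[1]

# Label-based drop: removes every row labelled "b", including the data row.
by_label = df.drop(df.index[1])

# Position-based selection: removes only the row at position 1.
by_position = df.iloc[pd.RangeIndex(len(df)).drop(1)]

print(len(by_label))     # rows left after the label-based drop
print(len(by_position))  # rows left after the position-based drop
```

With this index the label-based drop also discards the last data row, while the positional version keeps it.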

Solution 2 - Python

This works (tested with pandas 0.19.2). It returns a new DataFrame, and the header row itself stays in the data:

df.rename(columns=df.iloc[0])
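A short sketch of how this might be used in full, assuming the header values sit in the first row. The sample data is made up for illustration. Since rename returns a new frame, the result is assigned to a new name, and the header row is dropped in a separate step.

```python
import pandas as pd

# Hypothetical sample data with the header values in the first row.
df = pd.DataFrame([("foo", "bar", "baz"), (1, 2, 3), (4, 5, 6)])

# The Series df.iloc[0] acts as a mapping from old label to new label.
renamed = df.rename(columns=df.iloc[0])

# The header row is still present as data, so drop it explicitly.
cleaned = renamed.drop(renamed.index[0])

print(cleaned)
```

The original df keeps its integer column labels, because neither call modifies it in place.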

Solution 3 - Python

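It may be easier to recreate the data frame from the remaining rows. Note that this does not re-infer the column types by itself: df.values is an object array when the frame mixes types, so the new columns stay object until you convert them (for example with infer_objects()).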

headers = df.iloc[0]
new_df  = pd.DataFrame(df.values[1:], columns=headers)
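As a quick check on the column types, the sketch below rebuilds the frame and then asks pandas to infer better dtypes. The sample data is an assumption for illustration, and infer_objects() is one possible follow-up step rather than part of the original answer.

```python
import pandas as pd

# Hypothetical sample data: header values in row 0, integers below.
df = pd.DataFrame([("foo", "bar", "baz"), (1, 2, 3), (4, 5, 6)])

headers = df.iloc[0]
new_df = pd.DataFrame(df.values[1:], columns=headers)

# The values came from an object array, so inspect the dtypes first.
print(new_df.dtypes)

# Ask pandas to infer more specific dtypes for the object columns.
typed = new_df.infer_objects()
print(typed.dtypes)
```

The rebuilt frame also gets a fresh 0-based index, which may or may not be what you want if the original row labels carried meaning.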

Solution 4 - Python

To rename the header without reassigning df:

df.rename(columns=df.iloc[0], inplace = True)

To drop the row without reassigning df:

df.drop(df.index[0], inplace = True)
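Tying this back to the question, the header row can also be located by a value it contains instead of by position. The following is a sketch under assumptions: the placeholder column names c0, c1, c2 and the sample values are hypothetical, and the index is assumed to be unique.

```python
import pandas as pd

# Hypothetical messy frame: the real header sits in the middle of the data.
df = pd.DataFrame(
    [("pear", 4, 0.5), ("name", "qty", "price"), ("plum", 7, 0.3)],
    columns=["c0", "c1", "c2"],  # placeholder headers
)

# Find the label of the first row whose c0 cell holds the wanted header name.
matches = df.index[df["c0"] == "name"]
header_label = matches[0]

# Promote that row to column labels, then remove it, both in place.
df.rename(columns=df.loc[header_label], inplace=True)
df.drop(header_label, inplace=True)

print(df)
```

If no row matches, matches[0] raises an IndexError, so real code would probably want to check len(matches) first.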

Solution 5 - Python

You can specify the header row when reading the data with read_csv or read_html, via the header parameter. It gives the row number(s) to use as the column names and marks the start of the data. This has the advantage of automatically dropping all the preceding rows, which are presumably junk.

import pandas as pd
from io import StringIO

In[1]
    csv = '''junk1, junk2, junk3, junk4, junk5
    junk1, junk2, junk3, junk4, junk5
    pears, apples, lemons, plums, other
    40, 50, 61, 72, 85
    '''

    df = pd.read_csv(StringIO(csv), header=2)
    print(df)

Out[1]
       pears   apples   lemons   plums   other
    0     40       50       61      72      85
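A self-contained variant of the same idea, written as a plain script. Adding skipinitialspace=True is an assumption on top of the original answer: it strips the space after each comma, so the column names come out as "apples" and not " apples".

```python
import pandas as pd
from io import StringIO

# Two junk lines, then the real header on line index 2, then the data.
csv = (
    "junk1, junk2, junk3, junk4, junk5\n"
    "junk1, junk2, junk3, junk4, junk5\n"
    "pears, apples, lemons, plums, other\n"
    "40, 50, 61, 72, 85\n"
)

# header=2 uses the third line as column names and skips everything before it.
df = pd.read_csv(StringIO(csv), header=2, skipinitialspace=True)

print(df.columns.tolist())
print(df)
```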

Solution 6 - Python

Just for fun:

df.T.set_index(0).T

This transposes the DataFrame, sets the index from the first row, and transposes back, so there is no need to remove the first row :)
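A small runnable sketch of the transpose trick, using made-up sample data with default integer labels (the 0 passed to set_index refers to the label of the first row, which becomes a column after transposing).

```python
import pandas as pd

# Hypothetical sample data with the header values in the row labelled 0.
df = pd.DataFrame([("foo", "bar", "baz"), (1, 2, 3), (4, 5, 6)])

# After .T the old row 0 is a column named 0; it becomes the index,
# and the second .T turns that index back into the column labels.
out = df.T.set_index(0).T

print(out)
print(out.dtypes)
```

Transposing a mixed-type frame generally leaves the columns as object dtype, so a conversion step may still be needed afterwards.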

Content Type	Original Author	Original Content on Stackoverflow
Question	E.K.	View Question on Stackoverflow
Solution 1 - Python	unutbu	View Answer on Stackoverflow
Solution 2 - Python	Zachary Wilson	View Answer on Stackoverflow
Solution 3 - Python	shahar_m	View Answer on Stackoverflow
Solution 4 - Python	Govinda	View Answer on Stackoverflow
Solution 5 - Python	ccpizza	View Answer on Stackoverflow
Solution 6 - Python	Arnaud Desplanche	View Answer on Stackoverflow