Remove Unnamed columns in pandas dataframe

Python, Pandas, Dataframe

Python Problem Overview


I have a data file with columns A-G, as shown below, but when I read it with pd.read_csv('data.csv') the result has an extra unnamed column at the end for no apparent reason.

colA	ColB	colC	colD	colE	colF	colG    Unnamed: 7
44	    45	    26	    26	    40	    26	    46	      NaN
47	    16	    38	    47	    48	    22	    37	      NaN
19	    28	    36	    18	    40	    18	    46	      NaN
50	    14	    12	    33	    12	    44	    23	      NaN
39	    47	    16	    42	    33	    48	    38	      NaN

I have checked my data file several times, and there is no extra data in any other column. How should I remove this extra column while reading the file? Thanks.

Python Solutions


Solution 1 - Python

df = df.loc[:, ~df.columns.str.contains('^Unnamed')]

In [162]: df
Out[162]:
   colA  ColB  colC  colD  colE  colF  colG
0    44    45    26    26    40    26    46
1    47    16    38    47    48    22    37
2    19    28    36    18    40    18    46
3    50    14    12    33    12    44    23
4    39    47    16    42    33    48    38
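A minimal sketch of the filter above. It assumes the extra column comes from a trailing delimiter on every line, which is a common cause but is not confirmed by the question, since the raw file is not shown. The in-memory `csv_text` is a hypothetical stand-in for data.csv. Because the question asks about removing the column while reading, the sketch also shows a read-time variant using the `usecols` callable of `read_csv`; that variant is not part of the original answer.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv (the raw file is not shown in the
# question): every line ends with a trailing comma, so read_csv sees an
# empty eighth field and names that column "Unnamed: 7".
csv_text = (
    "colA,ColB,colC,colD,colE,colF,colG,\n"
    "44,45,26,26,40,26,46,\n"
    "47,16,38,47,48,22,37,\n"
)

raw = pd.read_csv(io.StringIO(csv_text))
print(raw.columns.tolist())  # includes "Unnamed: 7"

# Option 1: filter after reading, as in the answer above.
filtered = raw.loc[:, ~raw.columns.str.contains('^Unnamed')]
print(filtered.columns.tolist())

# Option 2: skip the column while reading. usecols accepts a callable that
# is evaluated against each column name; names returning True are kept.
at_read = pd.read_csv(
    io.StringIO(csv_text),
    usecols=lambda name: not name.startswith("Unnamed"),
)
print(at_read.columns.tolist())
```

Both options leave only colA through colG. The read-time variant avoids building the empty column in the first place.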

If the first column in the CSV file holds index values (which typically shows up as "Unnamed: 0"), you can do this instead:

df = pd.read_csv('data.csv', index_col=0)
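The index case usually arises when a frame was saved with to_csv and its default index. A small sketch of that round trip, using a made-up frame and an in-memory buffer in place of a real file:

```python
import io

import pandas as pd

# Made-up frame, written out with its default integer index.
original = pd.DataFrame({"colA": [44, 47], "ColB": [45, 16]})
buffer = io.StringIO()
original.to_csv(buffer)

# Reading it back naively turns the saved index into an "Unnamed: 0" column.
buffer.seek(0)
naive = pd.read_csv(buffer)
print(naive.columns.tolist())

# Telling read_csv that the first column is the index avoids that.
buffer.seek(0)
with_index = pd.read_csv(buffer, index_col=0)
print(with_index.columns.tolist())
```

Note that index_col=0 only helps when the unnamed column is the first one; it does not remove a trailing "Unnamed: 7".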

Solution 2 - Python

First, find the columns whose names contain 'unnamed', then drop those columns. Note: pass inplace=True to .drop as well, so that df itself is modified.

df.drop(df.columns[df.columns.str.contains('unnamed', case=False)], axis=1, inplace=True)
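An equivalent form assigns the result back rather than modifying the frame in place. A sketch with a small made-up frame:

```python
import numpy as np
import pandas as pd

# Made-up frame with one "Unnamed" column, for illustration only.
df = pd.DataFrame(
    {"colA": [44, 47], "ColB": [45, 16], "Unnamed: 2": [np.nan, np.nan]}
)

# Boolean mask over the column labels; case=False matches "Unnamed" and "unnamed".
mask = df.columns.str.contains('unnamed', case=False)

# drop returns a new DataFrame, so the result is assigned back to df.
df = df.drop(columns=df.columns[mask])
print(df.columns.tolist())
```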

Solution 3 - Python

The pandas.DataFrame.dropna function removes missing values (e.g. NaN, NaT).

For example, the following code removes every column of your dataframe in which all of the elements are missing. dropna returns a new dataframe, so assign the result back:

df = df.dropna(how='all', axis='columns')
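Keep in mind that dropna looks at the values, not at the column names. A sketch with made-up data, showing that a named column that is entirely NaN is removed along with the unnamed one:

```python
import numpy as np
import pandas as pd

# Made-up frame for illustration only.
df = pd.DataFrame({
    "colA": [44, 47],
    "colB": [np.nan, np.nan],        # named, but entirely missing
    "Unnamed: 2": [np.nan, np.nan],  # the column we actually want gone
})

# dropna returns a new DataFrame; df itself is left unchanged.
cleaned = df.dropna(how='all', axis='columns')
print(cleaned.columns.tolist())
```

If named columns may legitimately be empty, one of the name-based approaches is the safer choice.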

Solution 4 - Python

The accepted solution didn't work in my case, so my solution is the following:

# The column name in the example case is "Unnamed: 7",
# but this works with any other name ("Unnamed: 0", for example).
df.rename({"Unnamed: 7": "a"}, axis="columns", inplace=True)

# Then, drop the column as usual.
df.drop(["a"], axis=1, inplace=True)

Hope it helps others.
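When the exact label is known, the column can generally also be dropped by name without renaming it first. A sketch with a made-up frame; errors="ignore" makes the call a no-op if the column is absent:

```python
import numpy as np
import pandas as pd

# Made-up frame for illustration only.
df = pd.DataFrame({"colA": [44, 47], "Unnamed: 7": [np.nan, np.nan]})

# Drop the column by its label; nothing happens if the label does not exist.
df = df.drop(columns=["Unnamed: 7"], errors="ignore")
print(df.columns.tolist())

# Running the same call again is safe because of errors="ignore".
df = df.drop(columns=["Unnamed: 7"], errors="ignore")
```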

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type         Original Author              Original Content on Stack Overflow
Question             muazfaiz                     View Question on Stack Overflow
Solution 1 - Python  MaxU - stop genocide of UA   View Answer on Stack Overflow
Solution 2 - Python  Adil Warsi                   View Answer on Stack Overflow
Solution 3 - Python  Susan                        View Answer on Stack Overflow
Solution 4 - Python  Ezarate11                    View Answer on Stack Overflow