Deleting multiple columns based on column names in Pandas

Python, Pandas

Python Problem Overview


I have some data and when I import it, I get the following unneeded columns. I'm looking for an easy way to delete all of these.

'Unnamed: 24', 'Unnamed: 25', 'Unnamed: 26', 'Unnamed: 27',
'Unnamed: 28', 'Unnamed: 29', 'Unnamed: 30', 'Unnamed: 31',
'Unnamed: 32', 'Unnamed: 33', 'Unnamed: 34', 'Unnamed: 35',
'Unnamed: 36', 'Unnamed: 37', 'Unnamed: 38', 'Unnamed: 39',
'Unnamed: 40', 'Unnamed: 41', 'Unnamed: 42', 'Unnamed: 43',
'Unnamed: 44', 'Unnamed: 45', 'Unnamed: 46', 'Unnamed: 47',
'Unnamed: 48', 'Unnamed: 49', 'Unnamed: 50', 'Unnamed: 51',
'Unnamed: 52', 'Unnamed: 53', 'Unnamed: 54', 'Unnamed: 55',
'Unnamed: 56', 'Unnamed: 57', 'Unnamed: 58', 'Unnamed: 59',
'Unnamed: 60'

The columns are 0-indexed, so I tried something like:

df.drop(df.columns[[22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 55]], axis=1, inplace=True)

But this isn't very efficient. I tried writing some for loops, but that struck me as bad Pandas practice. Hence I ask the question here.

I've seen some similar examples (https://stackoverflow.com/questions/26347412/drop-multiple-columns-pandas), but they don't answer my question.

Python Solutions


Solution 1 - Python

By far the simplest approach is:

yourdf.drop(['columnheading1', 'columnheading2'], axis=1, inplace=True)
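
As a minimal sketch of the same call on a small frame (the column names and values here are placeholders, not the asker's data), the version below leaves out inplace=True so the original frame is kept, and uses errors="ignore" to skip labels that are not present:

```python
import pandas as pd

# Hypothetical frame; names and values are placeholders for illustration.
df = pd.DataFrame({
    "a": [1, 2],
    "Unnamed: 24": [None, None],
    "Unnamed: 25": [None, None],
})

# Without inplace=True, drop returns a new DataFrame and leaves df untouched.
cleaned = df.drop(["Unnamed: 24", "Unnamed: 25"], axis=1)

# errors="ignore" skips missing labels instead of raising KeyError.
cleaned_again = cleaned.drop(["Unnamed: 24"], axis=1, errors="ignore")

print(list(cleaned.columns))
```

Assigning the result to a new name makes it easy to compare the frame before and after the drop.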

Solution 2 - Python

I don't know what you mean by inefficient, but if you mean in terms of typing, it could be easier to select just the columns of interest and assign the result back to df:

df = df[cols_of_interest]

Where cols_of_interest is a list of the columns you care about.

Or you can slice the columns and pass this to drop:

df.drop(df.loc[:, 'Unnamed: 24':'Unnamed: 60'].head(0).columns, axis=1)

The call to head just selects 0 rows, as we're only interested in the column names rather than the data.
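
A small runnable sketch of the label-slice idea, assuming the unwanted columns are contiguous and uniquely named (the frame below is made up to mimic the question's column names):

```python
import pandas as pd

# Hypothetical frame: two real columns followed by 'Unnamed: 24' .. 'Unnamed: 60'.
columns = ["a", "b"] + [f"Unnamed: {i}" for i in range(24, 61)]
df = pd.DataFrame([[0] * len(columns)], columns=columns)

# Label slices with .loc include both endpoints.
to_drop = df.loc[:, "Unnamed: 24":"Unnamed: 60"].columns

# drop returns a new frame; df itself is unchanged.
trimmed = df.drop(to_drop, axis=1)

print(list(trimmed.columns))
```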

Update

Another method: it is simpler to take the boolean mask from str.contains and invert it to select the columns to keep:

In [2]:
df = pd.DataFrame(columns=['a','Unnamed: 1', 'Unnamed: 1','foo'])
df

Out[2]:
Empty DataFrame
Columns: [a, Unnamed: 1, Unnamed: 1, foo]
Index: []

In [4]:
~df.columns.str.contains('Unnamed:')

Out[4]:
array([ True, False, False,  True], dtype=bool)

In [5]:
df[df.columns[~df.columns.str.contains('Unnamed:')]]

Out[5]:
Empty DataFrame
Columns: [a, foo]
Index: []
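
A variation on the mask approach, offered as a sketch: str.contains treats its argument as a regular expression by default, so str.startswith may be a safer fit when the unwanted names share a literal prefix. The frame here is hypothetical:

```python
import pandas as pd

# Hypothetical frame with two unwanted columns.
df = pd.DataFrame([[1, 2, 3, 4]], columns=["a", "Unnamed: 1", "Unnamed: 2", "foo"])

# True for the columns to keep, False for names starting with 'Unnamed:'.
keep = ~df.columns.str.startswith("Unnamed:")

# A boolean mask passed to .loc selects columns without building a label list.
result = df.loc[:, keep]

print(list(result.columns))
```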

Solution 3 - Python

My personal favorite, and easier than the answers I have seen here (for multiple columns):

df.drop(df.columns[22:56], axis=1, inplace=True)
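
One thing to keep in mind with positional slices is that the stop position is excluded, so 22:56 covers positions 22 through 55. A smaller sketch on a made-up ten-column frame shows the same behaviour:

```python
import pandas as pd

# Hypothetical frame with columns c0 .. c9.
df = pd.DataFrame([list(range(10))], columns=[f"c{i}" for i in range(10)])

# df.columns[3:6] holds the labels at positions 3, 4 and 5 (stop is excluded).
trimmed = df.drop(df.columns[3:6], axis=1)

print(list(trimmed.columns))
```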

Solution 4 - Python

This is probably a good way to do what you want. It will delete all columns that contain 'Unnamed' in their header.

for col in df.columns:
	if 'Unnamed' in col:
		del df[col]

Solution 5 - Python

You can do this in one line and one go:

df.drop([col for col in df.columns if "Unnamed" in col], axis=1, inplace=True)

This avoids reassigning the DataFrame, unlike the solutions above, although pandas may still copy data internally when dropping in place.
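
The membership test "Unnamed" in col assumes every column label is a string. If some labels might be integers or other types, a guarded version along these lines could be used (the mixed-label frame is a made-up example):

```python
import pandas as pd

# Hypothetical frame mixing string and integer column labels.
df = pd.DataFrame([[1, 2, 3]], columns=["a", 0, "Unnamed: 2"])

# Only test string labels; "Unnamed" in 0 would raise TypeError.
unnamed = [col for col in df.columns if isinstance(col, str) and "Unnamed" in col]

df.drop(unnamed, axis=1, inplace=True)

print(list(df.columns))
```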

Solution 6 - Python

I'm not sure whether this solution has been mentioned yet, but one way to do it is with pandas.Index.difference.

>>> df = pd.DataFrame(columns=['A','B','C','D'])
>>> df
Empty DataFrame
Columns: [A, B, C, D]
Index: []
>>> to_remove = ['A','C']
>>> df = df[df.columns.difference(to_remove)]
>>> df
Empty DataFrame
Columns: [B, D]
Index: []
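
One caveat worth checking: Index.difference tries to sort its result by default, so the remaining columns may come back in a different order than in the original frame. The sketch below, on a made-up frame whose columns are deliberately out of alphabetical order, compares it with an Index.isin mask that keeps the original order:

```python
import pandas as pd

# Hypothetical frame whose columns are not in sorted order.
df = pd.DataFrame(columns=["D", "A", "C", "B"])
to_remove = ["A", "C"]

# difference sorts the surviving labels by default.
sorted_result = df[df.columns.difference(to_remove)]

# An inverted isin mask keeps the columns in their original order.
ordered_result = df.loc[:, ~df.columns.isin(to_remove)]

print(list(sorted_result.columns), list(ordered_result.columns))
```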

Solution 7 - Python

You can pass the column names as a list and specify the axis as 0 or 1:

  • axis=0: drop labels from the index (rows)

  • axis=1: drop labels from the columns

  • By default axis=0

    data.drop(["Colname1","Colname2","Colname3","Colname4"],axis=1)
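
To see what each axis value does in drop, here is a minimal sketch on a made-up two-by-two frame. The columns= keyword is an equivalent spelling of axis=1 that avoids having to remember the numbers:

```python
import pandas as pd

# Hypothetical frame with labelled rows and columns.
df = pd.DataFrame({"x": [1, 2], "y": [3, 4]}, index=["r0", "r1"])

# axis=0 looks the labels up in the index, so a row is removed.
no_row = df.drop(["r0"], axis=0)

# axis=1 looks the labels up in the columns, so a column is removed.
no_col = df.drop(["y"], axis=1)

# columns= is equivalent to passing the labels with axis=1.
same = df.drop(columns=["y"])

print(list(no_row.index), list(no_col.columns))
```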

Solution 8 - Python

Simple and easy. Remove all columns after the 22nd:

df = df.drop(columns=df.columns[22:])  # keeps only the first 22 columns

Solution 9 - Python

The below worked for me:

for col in df:
	if 'Unnamed' in col:
		print(col)  # report the column about to be dropped
		try:
			df.drop(col, axis=1, inplace=True)
		except KeyError:
			# already removed, e.g. a duplicated column name
			pass
		

Solution 10 - Python

df = df[[col for col in df.columns if 'Unnamed' not in col]]

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Peadar Coyle | View Question on Stackoverflow
Solution 1 - Python | Philipp Schwarz | View Answer on Stackoverflow
Solution 2 - Python | EdChum | View Answer on Stackoverflow
Solution 3 - Python | sheldonzy | View Answer on Stackoverflow
Solution 4 - Python | knightofni | View Answer on Stackoverflow
Solution 5 - Python | Peter | View Answer on Stackoverflow
Solution 6 - Python | px06 | View Answer on Stackoverflow
Solution 7 - Python | Maddu Swaroop | View Answer on Stackoverflow
Solution 8 - Python | Niedson | View Answer on Stackoverflow
Solution 9 - Python | Shivgan | View Answer on Stackoverflow
Solution 10 - Python | Sarah | View Answer on Stackoverflow