How to select all columns, except one column in pandas?

PythonPandas

Python Problem Overview


I have a dataframe look like this:

import pandas
import numpy as np
df = DataFrame(np.random.rand(4,4), columns = list('abcd'))
df
      a         b         c         d
0  0.418762  0.042369  0.869203  0.972314
1  0.991058  0.510228  0.594784  0.534366
2  0.407472  0.259811  0.396664  0.894202
3  0.726168  0.139531  0.324932  0.906575

How I can get all columns except column b?

Python Solutions


Solution 1 - Python

When the columns are not a MultiIndex, df.columns is just an array of column names so you can do:

df.loc[:, df.columns != 'b']

          a         c         d
0  0.561196  0.013768  0.772827
1  0.882641  0.615396  0.075381
2  0.368824  0.651378  0.397203
3  0.788730  0.568099  0.869127

Solution 2 - Python

Don't use ix. It's deprecated. The most readable and idiomatic way of doing this is df.drop():

>>> df

          a         b         c         d
0  0.175127  0.191051  0.382122  0.869242
1  0.414376  0.300502  0.554819  0.497524
2  0.142878  0.406830  0.314240  0.093132
3  0.337368  0.851783  0.933441  0.949598

>>> df.drop('b', axis=1)

          a         c         d
0  0.175127  0.382122  0.869242
1  0.414376  0.554819  0.497524
2  0.142878  0.314240  0.093132
3  0.337368  0.933441  0.949598

Note that by default, .drop() does not operate inplace; despite the ominous name, df is unharmed by this process. If you want to permanently remove b from df, do df.drop('b', inplace=True).

df.drop() also accepts a list of labels, e.g. df.drop(['a', 'b'], axis=1) will drop column a and b.

Solution 3 - Python

df[df.columns.difference(['b'])]

Out: 
          a         c         d
0  0.427809  0.459807  0.333869
1  0.678031  0.668346  0.645951
2  0.996573  0.673730  0.314911
3  0.786942  0.719665  0.330833

Solution 4 - Python

You can use df.columns.isin()

df.loc[:, ~df.columns.isin(['b'])]

When you want to drop multiple columns, as simple as:

df.loc[:, ~df.columns.isin(['col1', 'col2'])]

Solution 5 - Python

You can drop columns in index:

df[df.columns.drop('b')]

or

df.loc[:, df.columns.drop('b')]

Output:

          a         c         d
0  0.418762  0.869203  0.972314
1  0.991058  0.594784  0.534366
2  0.407472  0.396664  0.894202
3  0.726168  0.324932  0.906575

Solution 6 - Python

Here is another way:

df[[i for i in list(df.columns) if i != '<your column>']]

You just pass all columns to be shown except of the one you do not want.

Solution 7 - Python

Here is a one line lambda:

df[map(lambda x :x not in ['b'], list(df.columns))]

before:

import pandas
import numpy as np
df = pd.DataFrame(np.random.rand(4,4), columns = list('abcd'))
df

       a	       b	       c	       d
0	0.774951	0.079351	0.118437	0.735799
1	0.615547	0.203062	0.437672	0.912781
2	0.804140	0.708514	0.156943	0.104416
3	0.226051	0.641862	0.739839	0.434230

after:

df[map(lambda x :x not in ['b'], list(df.columns))]

	    a	       c	      d
0	0.774951	0.118437	0.735799
1	0.615547	0.437672	0.912781
2	0.804140	0.156943	0.104416
3	0.226051	0.739839	0.434230

Solution 8 - Python

I think the best way to do is the way mentioned by @Salvador Dali. Not that the others are wrong.

Because when you have a data set where you just want to select one column and put it into one variable and the rest of the columns into another for comparison or computational purposes. Then dropping the column of the data set might not help. Of course there are use cases for that as well.

x_cols = [x for x in data.columns if x != 'name of column to be excluded']

Then you can put those collection of columns in variable x_cols into another variable like x_cols1 for other computation.

ex: x_cols1 = data[x_cols]

Solution 9 - Python

Another slight modification to @Salvador Dali enables a list of columns to exclude:

df[[i for i in list(df.columns) if i not in [list_of_columns_to_exclude]]]

or

df.loc[:,[i for i in list(df.columns) if i not in [list_of_columns_to_exclude]]]

Solution 10 - Python

I think a nice solution is with the function filter of pandas and regex (match everything except "b"):

df.filter(regex="^(?!b$)")

Solution 11 - Python

Similar to @Toms answer, it is also possible to select all columns except "b" without using .loc, like so:

df[df.columns[~df.columns.isin(['b'])]]

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questionmarkov zainView Question on Stackoverflow
Solution 1 - PythonMariusView Answer on Stackoverflow
Solution 2 - PythonmikeView Answer on Stackoverflow
Solution 3 - PythonayhanView Answer on Stackoverflow
Solution 4 - PythonSON_7093436View Answer on Stackoverflow
Solution 5 - PythonMykola ZotkoView Answer on Stackoverflow
Solution 6 - PythonSalvador DaliView Answer on Stackoverflow
Solution 7 - PythonGrant ShannonView Answer on Stackoverflow
Solution 8 - PythonSudhiView Answer on Stackoverflow
Solution 9 - Pythonuser1718097View Answer on Stackoverflow
Solution 10 - PythonBilly BonarosView Answer on Stackoverflow
Solution 11 - PythonDataBachView Answer on Stackoverflow