Is there a simple way to change a column of yes/no to 1/0 in a Pandas dataframe?

Python Problem Overview

I read a csv file into a pandas dataframe, and would like to convert the columns with binary answers from strings of yes/no to integers of 1/0. Below, I show one of such columns ("sampleDF" is the pandas dataframe).

In [13]: sampleDF.housing[0:10]
Out[13]:
0     no
1     no
2    yes
3     no
4     no
5     no
6     no
7     no
8    yes
9    yes
Name: housing, dtype: object

Help is much appreciated!

Python Solutions

Solution 1 - Python

method 1

sample.housing.eq('yes').mul(1)

method 2

pd.Series(np.where(sample.housing.values == 'yes', 1, 0),
          sample.index)

method 3

sample.housing.map(dict(yes=1, no=0))

method 4

pd.Series(map(lambda x: dict(yes=1, no=0)[x],
              sample.housing.values.tolist()), sample.index)

method 5

pd.Series(np.searchsorted(['no', 'yes'], sample.housing.values), sample.index)

All yield

timing
given sample

timing
long sample
sample = pd.DataFrame(dict(housing=np.random.choice(('yes', 'no'), size=100000)))

Solution 2 - Python

Try this:

sampleDF['housing'] = sampleDF['housing'].map({'yes': 1, 'no': 0})

Solution 3 - Python

# produces True/False
sampleDF['housing'] = sampleDF['housing'] == 'yes'

The above returns True/False values which are essentially 1/0, respectively. Booleans support sum functions, etc. If you really need it to be 1/0 values, you can use the following.

housing_map = {'yes': 1, 'no': 0}
sampleDF['housing'] = sampleDF['housing'].map(housing_map)

Solution 4 - Python

%timeit
sampleDF['housing'] = sampleDF['housing'].apply(lambda x: 0 if x=='no' else 1)

1.84 ms ± 56.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

>>> Replaces 'yes' with 1, 'no' with 0 for the df column specified.

Solution 5 - Python

Use sklearn's LabelEncoder

from sklearn.preprocessing import LabelEncoder

lb = LabelEncoder() 
sampleDF['housing'] = lb.fit_transform(sampleDF['housing'])

Source

Solution 6 - Python

yes there is you can change yes/no values of your column to 1/0 by using following code snippet

sampleDF = sampleDF.replace(to_replace = ['yes','no'],value = ['1','0'])
sampleDF

by using first line you can replace the values with 1/0 by using second line you can see the changes by printing it

Solution 7 - Python

For a dataset names data and a column named Paid;

data = data.replace({'Paid': {'yes': 1, 'no': 0}})

all the yes will change to 1 and all the no will be replaced by 0

Solution 8 - Python

Generic way:

import pandas as pd
string_data = string_data.astype('category')
numbers_data = string_data.cat.codes

reference: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.astype.html

Solution 9 - Python

You can convert a series from Boolean to integer explicitly:

sampleDF['housing'] = sampleDF['housing'].eq('yes').astype(int)

Solution 10 - Python

The easy way to do that use pandas as below:

housing = pd.get_dummies(sampleDF['housing'],drop_first=True)

after that drop this filed from main df

sampleDF.drop('housing',axis=1,inplace=True)

now merge new one in you df

sampleDF= pd.concat([sampleDF,housing ],axis=1)

Solution 11 - Python

A simple and intuitive way to convert the whole dataframe to 0's and 1's might be:

sampleDF = sampleDF.replace(to_replace = "yes", value = 1)
sampleDF = sampleDF.replace(to_replace = "no", value = 0)

Solution 12 - Python

sampleDF['housing'] = sampleDF['housing'].map(lambda x: 1 if x == 'yes' else 0)
sampleDF['housing'] = sampleDF['housing'].astype(int)

This will work.

Solution 13 - Python

Try the following:

sampleDF['housing'] = sampleDF['housing'].str.lower().replace({'yes': 1, 'no': 0})

Solution 14 - Python

I used the preprocesssing function from sklearn. First you create an encoder.

e = preprocessing.LabelEncoder()

Then for each attribute or characteristic in the data use the label encoder to transform it into an integer value

size = le.fit_transform(list(data["size"]))
color = le.fit_transform(list(data["color"]))

It's converting a list of all the "size" or "color" attributes, and converting that into a list of their corresponding integer values. To put all of this into one list, use the zip function.

It is not going to be in the same format as the csv file; it will be a giant list of everything.

data = list(zip(buying, size))

Hopefully I explained that somewhat clearly.

Solution 15 - Python

You can also try :

sampleDF["housing"] = (sampleDF["housing"]=="Yes")*1

Solution 16 - Python

This is just a bool to int.

Try this.

sampleDF.housing = (sampleDF.housing == 'yes').astype(int)

Solution 17 - Python

Try this, it will work.

sampleDF.housing.replace(['no', 'yes'], [0,1], inplace = True)

Solution 18 - Python

use pandas.Series.map

sampleDF.map({'yes':1,'no':0})

Solution 19 - Python

comprehension array

sampleDF['housing'] = [int(v == 'yes') for v in sampleDF['housing']]

Content Type	Original Author	Original Content on Stackoverflow
Question	Mushu909	View Question on Stackoverflow
Solution 1 - Python	piRSquared	View Answer on Stackoverflow
Solution 2 - Python	gold_cy	View Answer on Stackoverflow
Solution 3 - Python	3novak	View Answer on Stackoverflow
Solution 4 - Python	SriramKRaju	View Answer on Stackoverflow
Solution 5 - Python	Freek Nortier	View Answer on Stackoverflow
Solution 6 - Python	praveen kumar	View Answer on Stackoverflow
Solution 7 - Python	muli	View Answer on Stackoverflow
Solution 8 - Python	Siddaram H	View Answer on Stackoverflow
Solution 9 - Python	jpp	View Answer on Stackoverflow
Solution 10 - Python	Eslamspot	View Answer on Stackoverflow
Solution 11 - Python	Josmy Faure	View Answer on Stackoverflow
Solution 12 - Python	Benny06	View Answer on Stackoverflow
Solution 13 - Python	Sazzad	View Answer on Stackoverflow
Solution 14 - Python	Turkey	View Answer on Stackoverflow
Solution 15 - Python	Şenol Kurt	View Answer on Stackoverflow
Solution 16 - Python	DeepBlue	View Answer on Stackoverflow
Solution 17 - Python	Nija I Pillai	View Answer on Stackoverflow
Solution 18 - Python	Manikiran Bodepudi	View Answer on Stackoverflow
Solution 19 - Python	Tolstoïevski	View Answer on Stackoverflow