Given a pandas Series that represents frequencies of a value, how can I turn those frequencies into percentages?

Python, Pandas

Python Problem Overview


I was experimenting with the kaggle.com Titanic data set (passenger data from the Titanic) and came up with a gender breakdown like this:

import pandas as pd

df = pd.DataFrame({'sex': ['male'] * 577 + ['female'] * 314})
gender = df.sex.value_counts()
gender

male      577
female    314

I would like to find out the percentage of each gender on the Titanic.

My approach is slightly less than ideal:

# Only needed on Python 2; Python 3 already uses true division for /
from __future__ import division
pcts = gender / gender.sum()
pcts

male      0.647587
female    0.352413

Is there a better (more idiomatic) way?

Python Solutions


Solution 1 - Python

pandas already implements this, in value_counts() itself, so there is no need to calculate it by hand :)

Just use:

df.sex.value_counts(normalize=True)

which gives exactly the desired output.

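Please note that value_counts() excludes NA values by default, so the proportions are relative to the non-missing entries only and will not match a division by the total number of rows. Pass dropna=False to count NA as its own category. See here: http://pandas-docs.github.io/pandas-docs-travis/generated/pandas.Series.value_counts.html (A column of a DataFrame is a Series)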
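A minimal sketch of how missing values affect the normalized result. The four-row Series `s` is hypothetical toy data, not the Titanic set:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two 'male', one 'female', one missing value
s = pd.Series(['male', 'male', 'female', np.nan])

# Default (dropna=True): NA is ignored, so fractions are relative
# to the 3 non-missing rows
shares = s.value_counts(normalize=True)

# dropna=False: NA becomes its own category, so fractions are
# relative to all 4 rows
shares_with_na = s.value_counts(normalize=True, dropna=False)

print(shares)
print(shares_with_na)
```

Which variant is appropriate depends on whether missing entries should count toward the total.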

Solution 2 - Python

If you want to show proportions, one option is value_counts(normalize=True), as answered by @fanfabbb.

That said, for many purposes you might want to show the result as a percentage out of a hundred.

That can be achieved like so:

gender = df.sex.value_counts(normalize=True).mul(100).round(1).astype(str) + '%'

In this case, we multiply the results by a hundred, round them to one decimal place, and append the percent sign. Note that the result is a Series of strings, not numbers.
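Because the formatted values are strings, they can no longer be sorted or summed numerically. One possible variation, sketched here with the hypothetical names `pct` and `labels`, keeps a numeric Series and builds the display strings separately:

```python
import pandas as pd

df = pd.DataFrame({'sex': ['male'] * 577 + ['female'] * 314})

# Numeric percentages, rounded to one decimal place
pct = df.sex.value_counts(normalize=True).mul(100).round(1)

# Display-only strings; pct itself stays numeric for further calculations
labels = pct.map('{:.1f}%'.format)

print(labels)
```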

Solution 3 - Python

If you want to combine counts and percentages in one table, you can use:

c = df.sex.value_counts(dropna=False)
p = df.sex.value_counts(dropna=False, normalize=True)
pd.concat([c,p], axis=1, keys=['counts', '%'])
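Note that the '%' column in the snippet above holds fractions between 0 and 1. A possible variation, assuming you want values out of a hundred in that column (the name `summary` is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'sex': ['male'] * 577 + ['female'] * 314})

counts = df.sex.value_counts(dropna=False)
fractions = df.sex.value_counts(dropna=False, normalize=True)

# Scale the fractions to percentages before joining them with the counts;
# keys= sets the column names of the resulting DataFrame
summary = pd.concat(
    [counts, fractions.mul(100).round(1)],
    axis=1,
    keys=['counts', '%'],
)

print(summary)
```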

Solution 4 - Python

I think I would probably do this in one go (without the __future__ division import):

1. * df.sex.value_counts() / len(df.sex)

or perhaps, remembering you want a percentage:

100. * df.sex.value_counts() / len(df.sex)

Much of a muchness really, your way looks fine too.
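As a quick check, the following sketch compares the manual division with normalize=True on the sample data from the question. The names `by_len`, `by_normalize` and `same` are only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'sex': ['male'] * 577 + ['female'] * 314})

# Manual percentage: counts divided by the total number of rows
by_len = 100. * df.sex.value_counts() / len(df.sex)

# Built-in normalization, scaled to a percentage
by_normalize = 100 * df.sex.value_counts(normalize=True)

# Align both results by label before comparing the values
same = np.allclose(by_len.sort_index(), by_normalize.sort_index())
print(same)
```

The two agree here because the column has no missing values. If it did, len(df.sex) would still count those rows while value_counts() skips them by default, so the results would differ.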

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | Tim Stewart | View Question on Stack Overflow
Solution 1 - Python | fanfabbb | View Answer on Stack Overflow
Solution 2 - Python | Shahar | View Answer on Stack Overflow
Solution 3 - Python | Layla | View Answer on Stack Overflow
Solution 4 - Python | Andy Hayden | View Answer on Stack Overflow