Group dataframe and get sum AND count?

PythonPandasDataframeGroup ByPandas Groupby

Python Problem Overview


I have a dataframe that looks like this:

              Company Name              Organisation Name  Amount
10118  Vifor Pharma UK Ltd  Welsh Assoc for Gastro & Endo 2700.00
10119  Vifor Pharma UK Ltd    Welsh IBD Specialist Group,  169.00
10120  Vifor Pharma UK Ltd             West Midlands AHSN 1200.00
10121  Vifor Pharma UK Ltd           Whittington Hospital   63.00
10122  Vifor Pharma UK Ltd                 Ysbyty Gwynedd   75.93

How do I sum the Amount and count the Organisation Name, to get a new dataframe that looks like this?

              Company Name             Organisation Count   Amount
10118  Vifor Pharma UK Ltd                              5 11000.00

I know how to sum or count:

df.groupby('Company Name').sum()
df.groupby('Company Name').count()

But not how to do both!

Python Solutions


Solution 1 - Python

try this:

In [110]: (df.groupby('Company Name')
   .....:    .agg({'Organisation Name':'count', 'Amount': 'sum'})
   .....:    .reset_index()
   .....:    .rename(columns={'Organisation Name':'Organisation Count'})
   .....: )
Out[110]:
          Company Name   Amount  Organisation Count
0  Vifor Pharma UK Ltd  4207.93                   5

or if you don't want to reset index:

df.groupby('Company Name')['Amount'].agg(['sum','count'])

or

df.groupby('Company Name').agg({'Amount': ['sum','count']})

Demo:

In [98]: df.groupby('Company Name')['Amount'].agg(['sum','count'])
Out[98]:
                         sum  count
Company Name
Vifor Pharma UK Ltd  4207.93      5

In [99]: df.groupby('Company Name').agg({'Amount': ['sum','count']})
Out[99]:
                      Amount
                         sum count
Company Name
Vifor Pharma UK Ltd  4207.93     5

Solution 2 - Python

Just in case you were wondering how to rename columns during aggregation, here's how for

pandas >= 0.25: Named Aggregation
df.groupby('Company Name')['Amount'].agg(MySum='sum', MyCount='count')

Or,

df.groupby('Company Name').agg(MySum=('Amount', 'sum'), MyCount=('Amount', 'count'))

                       MySum  MyCount
Company Name                       
Vifor Pharma UK Ltd  4207.93        5

Solution 3 - Python

If you have lots of columns and only one is different you could do:

In[1]: grouper = df.groupby('Company Name')
In[2]: res = grouper.count()
In[3]: res['Amount'] = grouper.Amount.sum()
In[4]: res
Out[4]:
                      Organisation Name   Amount
Company Name                                   
Vifor Pharma UK Ltd                  5  4207.93

Note you can then rename the Organisation Name column as you wish.

Solution 4 - Python

df.groupby('Company Name').agg({'Organisation name':'count','Amount':'sum'})\
    .apply(lambda x: x.sort_values(['count','sum'], ascending=False))

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionRichardView Question on Stackoverflow
Solution 1 - PythonMaxU - stop genocide of UAView Answer on Stackoverflow
Solution 2 - Pythoncs95View Answer on Stackoverflow
Solution 3 - PythonJSharmView Answer on Stackoverflow
Solution 4 - PythoncvsnowView Answer on Stackoverflow