MySQL GROUP BY two columns

SqlMysqlDatabase

Sql Problem Overview


I'm trying to group by multiple columns here - one on each table.
It's a scenario where I want to find the top portfolio value for each client by adding their current portfolio and cash together but a client may have more than one portfolio, so I need the top portfolio for each client.

At the moment, with the code below I'm getting the same clients multiple times for each of their top portfolios (it's not grouping by client id).

SELECT clients.id, clients.name, portfolios.id, SUM ( portfolios.portfolio +  portfolios.cash ) AS total
FROM clients, portfolios
WHERE clients.id = portfolios.client_id
GROUP BY portfolios.id, clients.id
ORDER BY total DESC
LIMIT 30 

Sql Solutions


Solution 1 - Sql

First, let's make some test data:

create table client (client_id integer not null primary key auto_increment,
                     name varchar(64));
create table portfolio (portfolio_id integer not null primary key auto_increment,
                        client_id integer references client.id,
                        cash decimal(10,2),
                        stocks decimal(10,2));
insert into client (name) values ('John Doe'), ('Jane Doe');
insert into portfolio (client_id, cash, stocks) values (1, 11.11, 22.22),
                                                       (1, 10.11, 23.22),
                                                       (2, 30.30, 40.40),
                                                       (2, 40.40, 50.50);

If you didn't need the portfolio ID, it would be easy:

select client_id, name, max(cash + stocks)
from client join portfolio using (client_id)
group by client_id

+-----------+----------+--------------------+
| client_id | name     | max(cash + stocks) |
+-----------+----------+--------------------+
|         1 | John Doe |              33.33 | 
|         2 | Jane Doe |              90.90 | 
+-----------+----------+--------------------+

Since you need the portfolio ID, things get more complicated. Let's do it in steps. First, we'll write a subquery that returns the maximal portfolio value for each client:

select client_id, max(cash + stocks) as maxtotal
from portfolio
group by client_id

+-----------+----------+
| client_id | maxtotal |
+-----------+----------+
|         1 |    33.33 | 
|         2 |    90.90 | 
+-----------+----------+

Then we'll query the portfolio table, but use a join to the previous subquery in order to keep only those portfolios the total value of which is the maximal for the client:

 select portfolio_id, cash + stocks from portfolio 
 join (select client_id, max(cash + stocks) as maxtotal 
       from portfolio
       group by client_id) as maxima
 using (client_id)
 where cash + stocks = maxtotal

+--------------+---------------+
| portfolio_id | cash + stocks |
+--------------+---------------+
|            5 |         33.33 | 
|            6 |         33.33 | 
|            8 |         90.90 | 
+--------------+---------------+

Finally, we can join to the client table (as you did) in order to include the name of each client:

select client_id, name, portfolio_id, cash + stocks
from client
join portfolio using (client_id)
join (select client_id, max(cash + stocks) as maxtotal
      from portfolio 
      group by client_id) as maxima
using (client_id)
where cash + stocks = maxtotal

+-----------+----------+--------------+---------------+
| client_id | name     | portfolio_id | cash + stocks |
+-----------+----------+--------------+---------------+
|         1 | John Doe |            5 |         33.33 | 
|         1 | John Doe |            6 |         33.33 | 
|         2 | Jane Doe |            8 |         90.90 | 
+-----------+----------+--------------+---------------+

Note that this returns two rows for John Doe because he has two portfolios with the exact same total value. To avoid this and pick an arbitrary top portfolio, tag on a GROUP BY clause:

select client_id, name, portfolio_id, cash + stocks
from client
join portfolio using (client_id)
join (select client_id, max(cash + stocks) as maxtotal
      from portfolio 
      group by client_id) as maxima
using (client_id)
where cash + stocks = maxtotal
group by client_id, cash + stocks

+-----------+----------+--------------+---------------+
| client_id | name     | portfolio_id | cash + stocks |
+-----------+----------+--------------+---------------+
|         1 | John Doe |            5 |         33.33 | 
|         2 | Jane Doe |            8 |         90.90 | 
+-----------+----------+--------------+---------------+

Solution 2 - Sql

Using Concat on the group by will work

SELECT clients.id, clients.name, portfolios.id, SUM ( portfolios.portfolio + portfolios.cash ) AS total
FROM clients, portfolios
WHERE clients.id = portfolios.client_id
GROUP BY CONCAT(portfolios.id, "-", clients.id)
ORDER BY total DESC
LIMIT 30

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionChrisSView Question on Stackoverflow
Solution 1 - SqlVebjorn LjosaView Answer on Stackoverflow
Solution 2 - SqlJose' VargasView Answer on Stackoverflow