Postgresql aggregate array

SqlArraysPostgresql

Sql Problem Overview


I have a two tables

Student
--------
Id	Name
1	John	
2	David
3	Will

Grade
---------
Student_id	Mark
1			A
2			B
2			B+
3			C
3			A

Is it possible to make native Postgresql SELECT to get results like below:

Name	Array of marks
-----------------------
'John',     {'A'}
'David',    {'B','B+'}
'Will',     {'C','A'}

But not like below

Name	Mark
----------------
'John',     'A'
'David',    'B'
'David',    'B+'
'Will',     'C'
'Will',     'A'

Sql Solutions


Solution 1 - Sql

Use array_agg: http://www.sqlfiddle.com/#!1/5099e/1

SELECT s.name,  array_agg(g.Mark) as marks        
FROM student s
LEFT JOIN Grade g ON g.Student_id = s.Id
GROUP BY s.Id

By the way, if you are using Postgres 9.1, you don't need to repeat the columns on SELECT to GROUP BY, e.g. you don't need to repeat the student name on GROUP BY. You can merely GROUP BY on primary key. If you remove the primary key on student, you need to repeat the student name on GROUP BY.

CREATE TABLE grade
	(Student_id int, Mark varchar(2));

INSERT INTO grade
	(Student_id, Mark)
VALUES
	(1, 'A'),
	(2, 'B'),
	(2, 'B+'),
	(3, 'C'),
	(3, 'A');


CREATE TABLE student
	(Id int primary key, Name varchar(5));

INSERT INTO student
	(Id, Name)
VALUES
	(1, 'John'),
	(2, 'David'),
	(3, 'Will');

Solution 2 - Sql

What I understand you can do something like this:

SELECT p.p_name, 
    STRING_AGG(Grade.Mark, ',' ORDER BY Grade.Mark) As marks
FROM Student
LEFT JOIN Grade ON Grade.Student_id = Student.Id
GROUP BY Student.Name;

EDIT

I am not sure. But maybe something like this then:

SELECT p.p_name, 
    array_to_string(ARRAY_AGG(Grade.Mark),';') As marks
FROM Student
LEFT JOIN Grade ON Grade.Student_id = Student.Id
GROUP BY Student.Name;

Reference here

Solution 3 - Sql

You could use the following:

SELECT Student.Name as Name,
       (SELECT array(SELECT Mark FROM Grade WHERE Grade.Student_id = Student.Id))
       AS ArrayOfMarks 
FROM Student

As described here: http://www.mkyong.com/database/convert-subquery-result-to-array/

Solution 4 - Sql

@Michael Buen got it right. I got what I needed using array_agg.

Here just a basic query example in case it helps someone:

SELECT directory, ARRAY_AGG(file_name) FROM table WHERE type = 'ZIP' GROUP BY directory;

And the result was something like:

parent_directory | array_agg | ------------------------+----------------------------------------+ /home/postgresql/files | {zip_1.zip,zip_2.zip,zip_3.zip} | /home/postgresql/files2 | {file1.zip,file2.zip} |


This post also helped me a lot: "Group By" in SQL and Python Pandas. It basically says that it is more convenient to use only PSQL when possible, but that Python Pandas can be useful to achieve extra functionalities in the filtering process.

I hope it helps

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionEazyView Question on Stackoverflow
Solution 1 - SqlMichael BuenView Answer on Stackoverflow
Solution 2 - SqlArionView Answer on Stackoverflow
Solution 3 - SqlnikoView Answer on Stackoverflow
Solution 4 - SqlJaviView Answer on Stackoverflow