how to exclude null values in array_agg like in string_agg using postgres?

SqlPostgresqlPostgresql 9.1Postgresql 8.4

Sql Problem Overview


If I use array_agg to collect names, I get my names separated by commas, but in case there is a null value, that null is also taken as a name in the aggregate. For example :

SELECT g.id,
       array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END) canonical_users,
       array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END) non_canonical_users
FROM groups g
GROUP BY g.id;

it returns ,Larry,Phil instead of just Larry,Phil (in my 9.1.2, it shows NULL,Larry,Phil).

Instead, if I use string_agg(), it shows me only the names (without empty commas or nulls).

The problem is that I have Postgres 8.4 installed on the server, and string_agg() doesn't work there. Is there any way to make array_agg work similar to string_agg() ?

Sql Solutions


Solution 1 - Sql

With postgresql-9.3 one can do this;

SELECT g.id,
   array_remove(array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END), NULL) canonical_users,
   array_remove(array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END), NULL) non_canonical_users
FROM groups g 
GROUP BY g.id;

Update: with postgresql-9.4;

SELECT g.id,
   array_agg(g.users) FILTER (WHERE g.canonical = 'Y') canonical_users,
   array_agg(g.users) FILTER (WHERE g.canonical = 'N') non_canonical_users
FROM groups g 
GROUP BY g.id;

Update (2022-02-19): also with postgresql-9.4;

This results in an empty array when all values in an array are null instead of returning null;

SELECT g.id,
  coalesce( array_agg(g.users) FILTER (WHERE g.canonical = 'Y'), '{}' ) canonical_users,
  coalesce( array_agg(g.users) FILTER (WHERE g.canonical = 'N'), '{}' ) non_canonical_users
FROM groups g 
GROUP BY g.id;

Solution 2 - Sql

If you are looking for a modern answer to the general question of how to remove a NULL from an array, it is:

array_remove(your_array, NULL)

I was specifically curious about performance and wanted to compare this to the best possible alternative:

CREATE OR REPLACE FUNCTION strip_nulls(
	IN array_in ANYARRAY
)
RETURNS anyarray AS
'
SELECT
	array_agg(a)
FROM unnest(array_in) a
WHERE
	a IS NOT NULL
;
'
LANGUAGE sql
;

Doing a pgbench test proved (with high confidence) that array_remove() is a little more than twice as fast. I did my test on double precision numbers with a variety of array sizes (10, 100 and 1000 elements) and random NULLs in between.


It's also worth noting that this can be used to remove blanks ('' != NULL). But the second parameter accepts anyelement, and since it is most likely they you'd be indicating a blank with a string literal, make sure to cast it to the form you want, usually a non-array.

For example:

select array_remove(array['abc', ''], ''::text);

If you try:

select array_remove(array['abc', ''], '');

it will assume that the '' is TEXT[] (array) and will throw this error:

> ERROR: malformed array literal: ""

Solution 3 - Sql

select
    id,
    (select array_agg(a) from unnest(canonical_users) a where a is not null) canonical_users,
    (select array_agg(a) from unnest(non_canonical_users) a where a is not null) non_canonical_users
from (
    SELECT g.id,
           array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END) canonical_users,
           array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END) non_canonical_users
    FROM groups g
    GROUP BY g.id
) s

Or, simpler and may be cheaper, using array_to_string which eliminates nulls:

SELECT
    g.id,
    array_to_string(
        array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END)
        , ','
    ) canonical_users,
    array_to_string(
        array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END)
        , ','
    ) non_canonical_users
FROM groups g
GROUP BY g.id

Solution 4 - Sql

In solving the general question of removing nulls from array aggregates there are two main ways of attacking the problem: either doing array_agg(unnest(array_agg(x)) or creating a custom aggregate.

The first is of the form shown above:

SELECT 
    array_agg(u) 
FROM (
    SELECT 
        unnest(
            array_agg(v)
        ) as u 
    FROM 
        x
    ) un
WHERE 
    u IS NOT NULL;

The second:

/*
With reference to
http://ejrh.wordpress.com/2011/09/27/denormalisation-aggregate-function-for-postgresql/
*/
CREATE OR REPLACE FUNCTION fn_array_agg_notnull (
    a anyarray
    , b anyelement
) RETURNS ANYARRAY
AS $$
BEGIN

    IF b IS NOT NULL THEN
        a := array_append(a, b);
    END IF;

    RETURN a;

END;
$$ IMMUTABLE LANGUAGE 'plpgsql';

CREATE AGGREGATE array_agg_notnull(ANYELEMENT) (
    SFUNC = fn_array_agg_notnull,
    STYPE = ANYARRAY,
    INITCOND = '{}'
);

Calling the second is (naturally) a little nicer looking than the first:

> select array_agg_notnull(v) from x;

Solution 5 - Sql

I am adding this even though this this thread is quite old, but I ran into this neat trick that works quite well on small arrays. It runs on Postgres 8.4+ without additional libraries or functions.

string_to_array(array_to_string(array_agg(my_column)))::int[]

The array_to_string() method actually gets rid of the nulls.

Solution 6 - Sql

You should wrap your array_agg with array_remove.

SELECT g.id,
       array_remove(array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END), NULL) canonical_users,
       array_remove(array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END), NULL) non_canonical_users
FROM groups g
GROUP BY g.id;

Solution 7 - Sql

As has been suggested in the comments you can write a function to replace nulls in an array, however as also pointed out in the thread linked to in the comments, this kind of defeats the efficiency of the aggregate function if you have to create an aggregate, split it then aggregate it again.

I think keeping nulls in the array is just a (perhaps unwanted) feature of Array_Agg. You could use subqueries to avoid this:

SELECT	COALESCE(y.ID, n.ID) ID,
		y.Users,
		n.Users
FROM	(	SELECT	g.ID, ARRAY_AGG(g.Users) AS Users
			FROM	Groups g
			WHERE	g.Canonical = 'Y'
			GROUP BY g.ID
		) y
		FULL JOIN 
		(	SELECT	g.ID, ARRAY_AGG(g.Users) AS Users
			FROM	Groups g
			WHERE	g.Canonical = 'N'
			GROUP BY g.ID
		) n
			ON n.ID = y.ID

http://sqlfiddle.com/#!1/7f0e9/3">SQL FIDDLE

Solution 8 - Sql

It is very simple, just first of all create a new - (minus) operator for text[]:

CREATE OR REPLACE FUNCTION diff_elements_text
	(
		text[], text[] 
	)
RETURNS text[] as 
$$
	SELECT array_agg(DISTINCT new_arr.elem)
	FROM
		unnest($1) as new_arr(elem)
		LEFT OUTER JOIN
		unnest($2) as old_arr(elem)
		ON new_arr.elem = old_arr.elem
	WHERE old_arr.elem IS NULL
$$ LANGUAGE SQL IMMUTABLE;

CREATE OPERATOR - (
	PROCEDURE = diff_elements_text,
	leftarg = text[],
	rightarg = text[]
);

And simply subtract the array[null]:

select 
	array_agg(x)-array['']
from
	(	select 'Y' x union all
		select null union all
		select 'N' union all
		select '' 
	) x;

That's all:

{Y, N}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionDaudView Question on Stackoverflow
Solution 1 - SqlDale O'BrienView Answer on Stackoverflow
Solution 2 - SqlAlexi TheodoreView Answer on Stackoverflow
Solution 3 - SqlClodoaldo NetoView Answer on Stackoverflow
Solution 4 - SqlroryclView Answer on Stackoverflow
Solution 5 - Sqlced-bView Answer on Stackoverflow
Solution 6 - SqlLuluView Answer on Stackoverflow
Solution 7 - SqlGarethDView Answer on Stackoverflow
Solution 8 - SqlMiklosView Answer on Stackoverflow