Split column into multiple rows in Postgres
Tags: Sql, Postgresql, Split, Set Returning-Functions
Sql Problem Overview
Suppose I have a table like this:
    subject     | flag
----------------+------
 this is a test |    2
subject is of type text, and flag is of type int. I would like to transform this table to something like this in Postgres:
 token | flag
-------+------
 this  |    2
 is    |    2
 a     |    2
 test  |    2
Is there an easy way to do this?
Sql Solutions
Solution 1 - Sql
In Postgres 9.3+ use a LATERAL join. Minimal form:
SELECT token, flag
FROM tbl, unnest(string_to_array(subject, ' ')) token
WHERE flag = 2;
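The comma in the FROM list is (almost) equivalent to CROSS JOIN, and LATERAL is automatically assumed for set-returning functions (SRF) in the FROM list. Why "almost"? Explicit JOIN syntax binds more tightly than the comma, which matters as soon as more joins are added to the FROM list.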
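The "almost" can be illustrated with a minimal sketch. The tables a, b and c are hypothetical and exist only for this example; the point is that an explicit JOIN is resolved before the comma, so a table listed before the comma is not visible inside a later ON clause.

```sql
-- Hypothetical tables, for illustration only.
CREATE TEMP TABLE a (id int);
CREATE TEMP TABLE b (id int);
CREATE TEMP TABLE c (a_id int);

-- Works: explicit joins are processed left to right,
-- so "a" is visible in the ON clause of the second join.
SELECT *
FROM   a
CROSS  JOIN b
JOIN   c ON c.a_id = a.id;

-- Raises an error if uncommented: "b JOIN c" is formed first,
-- and "a" cannot be referenced inside its ON clause.
-- SELECT *
-- FROM   a, b
-- JOIN   c ON c.a_id = a.id;
```

With only two items in the FROM list, as in the query above, the comma and CROSS JOIN give the same result.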
The alias "token" for the derived table also serves as column alias for its single anonymous column, and we assume distinct column names across the query. Equivalent, more verbose and less error-prone:
SELECT s.token, t.flag
FROM tbl t
CROSS JOIN LATERAL unnest(string_to_array(subject, ' ')) AS s(token)
WHERE t.flag = 2;
Or move the SRF to the SELECT list, which is allowed in Postgres (but not in standard SQL), to the same effect:
SELECT unnest(string_to_array(subject, ' ')) AS token, flag
FROM tbl
WHERE flag = 2;
The last one seems acceptable since the behavior of SRFs in the SELECT list was sanitized in Postgres 10.
If unnest() does not return any rows (empty or NULL subject), the (implicit) join eliminates the row from the result. Use LEFT JOIN ... ON true to keep qualifying rows from tbl.
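A minimal sketch of the LEFT JOIN ... ON true variant, assuming the same table name tbl as above. The sample rows are made up for illustration; two of them produce no tokens.

```sql
-- Hypothetical sample data; table and column names follow the question.
CREATE TEMP TABLE tbl (subject text, flag int);
INSERT INTO tbl VALUES
  ('this is a test', 2)
, (NULL, 2)   -- unnest() returns no rows for a NULL array
, ('', 2);    -- nor for the empty array produced here

-- LEFT JOIN ... ON true keeps rows of tbl that yield no tokens;
-- token is NULL for those rows.
SELECT t.subject, s.token, t.flag
FROM   tbl t
LEFT   JOIN LATERAL unnest(string_to_array(t.subject, ' ')) AS s(token) ON true
WHERE  t.flag = 2;
```

With CROSS JOIN LATERAL instead, the rows with NULL and empty subject would be missing from the result.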
We could also use regexp_split_to_table(), but that's typically slower because regular expressions cost a bit more.
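One case where the extra cost of a regular expression may be worth paying, shown as a sketch with a made-up input string: string_to_array() splits on every single occurrence of the delimiter, while a pattern like '\s+' treats a run of whitespace as one separator.

```sql
-- Two consecutive spaces between 'a' and 'b' (hypothetical input).

-- Splits on each single space: three rows, 'a', '' (empty string), 'b'.
SELECT unnest(string_to_array('a  b', ' ')) AS token;

-- Splits on runs of whitespace: two rows, 'a' and 'b'.
SELECT regexp_split_to_table('a  b', '\s+') AS token;
```

If the input is known to contain exactly one space between words, as in the question, the two approaches return the same tokens.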
Solution 2 - Sql
I think it's not necessary to use a join, just the unnest() function in conjunction with string_to_array() should do it:
SELECT unnest(string_to_array(subject, ' ')) as "token", flag FROM test;
 token | flag
-------+------
 this  |    2
 is    |    2
 a     |    2
 test  |    2
Solution 3 - Sql
Using the regexp_split_to_table() function with an implicit lateral join:
SELECT s.token, flag
FROM tbl t, regexp_split_to_table(t.subject, ' ') s(token)
WHERE flag = 2;
Refer to https://www.postgresql.org/docs/9.3/functions-string.html for details on the function.