How to estimate the size of one column in a Postgres table?

Postgresql

Postgresql Problem Overview


There is a column of type text in a table in Postgres 9.1. I'd like to know the impact of just that column on the disk space needed. It doesn't need to be precise, but I'd like to get an idea if that column is responsible for 20%/30%/... of the disk space consumed by the database.

I know pg_relation_size, but it only operates at table level.

I have many databases with this same schema. I dumped a smaller one and cut out the column with grep and cut and compared the size of the plain text dumps. But this is not necessarily a good indicator of space requirements in the live db, and it's also more difficult to do that for large databases.

Postgresql Solutions


Solution 1 - Postgresql

select
    sum(pg_column_size(the_text_column)) as total_size,
    avg(pg_column_size(the_text_column)) as average_size,
    sum(pg_column_size(the_text_column)) * 100.0 / pg_relation_size('t') as percentage
from t;

Solution 2 - Postgresql

Slight improvement on the accepted answer: pretty print the size and use pg_total_relation_size to be more accurate.

select
    pg_size_pretty(sum(pg_column_size(column_name))) as total_size,
    pg_size_pretty(avg(pg_column_size(column_name))) as average_size,
    sum(pg_column_size(column_name)) * 100.0 / pg_total_relation_size('table_name') as percentage
from table_name;

Solution 3 - Postgresql

If you want a report for all the columns in a database sorted by size then here is the way

BEGIN;
CREATE FUNCTION tc_column_size(table_name text, column_name text)
    RETURNS BIGINT AS
$$
    declare response BIGINT;
BEGIN
    EXECUTE 'select sum(pg_column_size(t."' || column_name || '")) from ' || table_name || ' t ' into response;
    return response;
END;
$$
    LANGUAGE plpgsql;

SELECT
	z.table_name,
	z.column_name,
	pg_size_pretty(z.size)
FROM (
	SELECT
		table_name,
		column_name,
		tc_column_size(table_name, column_name) size
	FROM
		information_schema.columns
	WHERE
		table_schema = 'public') AS z
WHERE
	size IS NOT NULL
	-- and z.table_name = 'my_table' -- <--- uncomment to filter a table
ORDER BY
	z.size DESC;

ROLLBACK; -- <--- You may not want to keep that function

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionThomas KapplerView Question on Stackoverflow
Solution 1 - PostgresqlClodoaldo NetoView Answer on Stackoverflow
Solution 2 - PostgresqlDenis VermylenView Answer on Stackoverflow
Solution 3 - PostgresqlStan SokolovView Answer on Stackoverflow