how to find size of database, schema, table in redshift

Amazon Web-Services Problem Overview

Team,

my redshift version is:

PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.735

how to find out database size, tablespace, schema size & table size ?

but below are not working in redshift ( for above version )

SELECT pg_database_size('db_name');
SELECT pg_size_pretty( pg_relation_size('table_name') );

Is there any alternate to find out like oracle ( from DBA_SEGMENTS )

for tble size, i have below query, but not sure about exact menaing of MBYTES. FOR 3rd row, MBYTES = 372. it means 372 MB ?

select trim(pgdb.datname) as Database, trim(pgn.nspname) as Schema,
trim(a.name) as Table, b.mbytes, a.rows
from ( select db_id, id, name, sum(rows) as rows from stv_tbl_perm a group by db_id, id, name ) as a
join pg_class as pgc on pgc.oid = a.id
join pg_namespace as pgn on pgn.oid = pgc.relnamespace
join pg_database as pgdb on pgdb.oid = a.db_id
join (select tbl, count(*) as mbytes
from stv_blocklist group by tbl) b on a.id=b.tbl
order by a.db_id, a.name;
   database    |    schema    |      table       | mbytes |   rows
---------------+--------------+------------------+--------+----------
      postgres | public       | company          |      8 |        1
      postgres | public       | table_data1_1    |      7 |        1
      postgres | proj_schema1 | table_data1	   |    372 | 33867540
      postgres | public       | table_data1_2    |     40 |  2000001
 
(4 rows)

Amazon Web-Services Solutions

Solution 1 - Amazon Web-Services

The above answers don't always give correct answers for table space used. AWS support have given this query to use:

SELECT   TRIM(pgdb.datname) AS Database,
         TRIM(a.name) AS Table,
         ((b.mbytes/part.total::decimal)*100)::decimal(5,2) AS pct_of_total,
         b.mbytes,
         b.unsorted_mbytes
FROM     stv_tbl_perm a
JOIN     pg_database AS pgdb
  ON     pgdb.oid = a.db_id
JOIN     ( SELECT   tbl,
                    SUM( DECODE(unsorted, 1, 1, 0)) AS unsorted_mbytes,
                    COUNT(*) AS mbytes
           FROM     stv_blocklist
           GROUP BY tbl ) AS b
       ON a.id = b.tbl
JOIN     ( SELECT SUM(capacity) AS total
           FROM   stv_partitions
           WHERE  part_begin = 0 ) AS part
      ON 1 = 1
WHERE    a.slice = 0
ORDER BY 4 desc, db_id, name;

Solution 2 - Amazon Web-Services

Yes, mbytes in your example is 372Mb. Here's what I've been using:

select
  cast(use2.usename as varchar(50)) as owner, 
  pgc.oid,
  trim(pgdb.datname) as Database,
  trim(pgn.nspname) as Schema,
  trim(a.name) as Table,
  b.mbytes,
  a.rows
from 
 (select db_id, id, name, sum(rows) as rows
  from stv_tbl_perm a
  group by db_id, id, name
  ) as a
 join pg_class as pgc on pgc.oid = a.id
 left join pg_user use2 on (pgc.relowner = use2.usesysid)
 join pg_namespace as pgn on pgn.oid = pgc.relnamespace 
    and pgn.nspowner > 1
 join pg_database as pgdb on pgdb.oid = a.db_id
 join 
   (select tbl, count(*) as mbytes
    from stv_blocklist
    group by tbl
   ) b on a.id = b.tbl
 order by mbytes desc, a.db_id, a.name;

Solution 3 - Amazon Web-Services

I'm not sure about grouping by database and scheme, but here's a short way to get usage by table,

SELECT tbl, name, size_mb FROM
(
  SELECT tbl, count(*) AS size_mb
  FROM stv_blocklist
  GROUP BY tbl
)
LEFT JOIN
(select distinct id, name FROM stv_tbl_perm)
ON id = tbl
ORDER BY size_mb DESC
LIMIT 10;

Solution 4 - Amazon Web-Services

you can checkout this repository, i'm sure you'll find useful stuff there.

https://github.com/awslabs/amazon-redshift-utils

to answer your question you can use this view: https://github.com/awslabs/amazon-redshift-utils/blob/master/src/AdminViews/v_space_used_per_tbl.sql

and then query as you like. e.g: select * from admin.v_space_used_per_tbl;

Solution 5 - Amazon Web-Services

Modified versions of one of the other answers. This includes database name, schema name, table name, total row count, size on disk and unsorted size:

-- sort by row count
select trim(pgdb.datname) as Database, trim(pgns.nspname) as Schema, trim(a.name) as Table,
    c.rows, ((b.mbytes/part.total::decimal)*100)::decimal(5,3) as pct_of_total, b.mbytes, b.unsorted_mbytes
    from stv_tbl_perm a
    join pg_class as pgtbl on pgtbl.oid = a.id
    join pg_namespace as pgns on pgns.oid = pgtbl.relnamespace
    join pg_database as pgdb on pgdb.oid = a.db_id
    join (select tbl, sum(decode(unsorted, 1, 1, 0)) as unsorted_mbytes, count(*) as mbytes from stv_blocklist group by tbl) b on a.id=b.tbl
    join (select id, sum(rows) as rows from stv_tbl_perm group by id) c on a.id=c.id
    join (select sum(capacity) as total from stv_partitions where part_begin=0) as part on 1=1
    where a.slice=0
    order by 4 desc, db_id, name;


-- sort by space used
select trim(pgdb.datname) as Database, trim(pgns.nspname) as Schema, trim(a.name) as Table,
    c.rows, ((b.mbytes/part.total::decimal)*100)::decimal(5,3) as pct_of_total, b.mbytes, b.unsorted_mbytes
    from stv_tbl_perm a
    join pg_class as pgtbl on pgtbl.oid = a.id
    join pg_namespace as pgns on pgns.oid = pgtbl.relnamespace
    join pg_database as pgdb on pgdb.oid = a.db_id
    join (select tbl, sum(decode(unsorted, 1, 1, 0)) as unsorted_mbytes, count(*) as mbytes from stv_blocklist group by tbl) b on a.id=b.tbl
    join (select id, sum(rows) as rows from stv_tbl_perm group by id) c on a.id=c.id
    join (select sum(capacity) as total from stv_partitions where part_begin=0) as part on 1=1
    where a.slice=0
    order by 6 desc, db_id, name;

Solution 6 - Amazon Web-Services

This query is much easier:

-- List the Top 30 largest tables on your cluster

SELECT 
 "schema"
,"table"  AS table_name
,ROUND((size/1024.0),2) AS "Size in Gigabytes"
,pct_used AS "Physical Disk Used by This Table"
FROM svv_table_info
ORDER BY pct_used DESC
LIMIT 30;

Solution 7 - Amazon Web-Services

SVV_TABLE_INFO is a Redshift systems table that shows information about user-defined tables (not other system tables) in a Redshift database. The table is only visible to superusers.

To get the size of each table, run the following command on your Redshift cluster:

SELECT "table", size, tbl_rows 
FROM SVV_TABLE_INFO

The table column is the table name.
The size column is the size of the table in MB.
The tbl_rows column is the total number of rows in the table, including rows that have been marked for deletion but not yet vacuumed.

Source

Look at SVV_TABLE_INFO Redshift documentation for other interesting columns to retrieve from this system table.

Solution 8 - Amazon Web-Services

This is what I am using(please change the databasename from 'mydb' to your database name) :

SELECT CAST(use2.usename AS VARCHAR(50)) AS OWNER
 ,TRIM(pgdb.datname) AS DATABASE
 ,TRIM(pgn.nspname) AS SCHEMA
 ,TRIM(a.NAME) AS TABLE
 ,(b.mbytes) / 1024 AS Gigabytes
 ,a.ROWS
FROM (
 SELECT db_id
 ,id
 ,NAME
 ,SUM(ROWS) AS ROWS
 FROM stv_tbl_perm a
 GROUP BY db_id
 ,id
 ,NAME
 ) AS a
JOIN pg_class AS pgc ON pgc.oid = a.id
LEFT JOIN pg_user use2 ON (pgc.relowner = use2.usesysid)
JOIN pg_namespace AS pgn ON pgn.oid = pgc.relnamespace
 AND pgn.nspowner > 1
JOIN pg_database AS pgdb ON pgdb.oid = a.db_id
JOIN (
 SELECT tbl
 ,COUNT(*) AS mbytes
 FROM stv_blocklist
 GROUP BY tbl
 ) b ON a.id = b.tbl
WHERE pgdb.datname = 'mydb'
ORDER BY mbytes DESC
 ,a.db_id
 ,a.NAME;

src: https://aboutdatabases.wordpress.com/2015/01/24/amazon-redshift-how-to-get-the-sizes-of-all-tables/

Content Type	Original Author	Original Content on Stackoverflow
Question	user3258784	View Question on Stackoverflow
Solution 1 - Amazon Web-Services	imcdnzl	View Answer on Stackoverflow
Solution 2 - Amazon Web-Services	mike_pdb	View Answer on Stackoverflow
Solution 3 - Amazon Web-Services	gatoatigrado	View Answer on Stackoverflow
Solution 4 - Amazon Web-Services	kerbelp	View Answer on Stackoverflow
Solution 5 - Amazon Web-Services	Nate Sammons	View Answer on Stackoverflow
Solution 6 - Amazon Web-Services	Qiushuo Yu	View Answer on Stackoverflow
Solution 7 - Amazon Web-Services	Vzzarr	View Answer on Stackoverflow
Solution 8 - Amazon Web-Services	Rakesh Singh	View Answer on Stackoverflow