Copy a table (including indexes) in postgres

SqlPostgresqlIndexing

Sql Problem Overview


I have a postgres table. I need to delete some data from it. I was going to create a temporary table, copy the data in, recreate the indexes and the delete the rows I need. I can't delete data from the original table, because this original table is the source of data. In one case I need to get some results that depends on deleting X, in another case, I'll need to delete Y. So I need all the original data to always be around and available.

However it seems a bit silly to recreate the table and copy it again and recreate the indexes. Is there anyway in postgres to tell it "I want a complete separate copy of this table, including structure, data and indexes"?

Unfortunately PostgreSQL does not have a "CREATE TABLE .. LIKE X INCLUDING INDEXES'

Sql Solutions


Solution 1 - Sql

New PostgreSQL ( since 8.3 according to docs ) can use "INCLUDING INDEXES":

# select version();
                                             version
-------------------------------------------------------------------------------------------------
 PostgreSQL 8.3.7 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) 4.2.4 (Ubuntu 4.2.4-1ubuntu3)
(1 row)

As you can see I'm testing on 8.3.

Now, let's create table:

# create table x1 (id serial primary key, x text unique);
NOTICE:  CREATE TABLE will create implicit sequence "x1_id_seq" for serial column "x1.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "x1_pkey" for table "x1"
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "x1_x_key" for table "x1"
CREATE TABLE

And see how it looks:

# \d x1
                         Table "public.x1"
 Column |  Type   |                    Modifiers
--------+---------+-------------------------------------------------
 id     | integer | not null default nextval('x1_id_seq'::regclass)
 x      | text    |
Indexes:
    "x1_pkey" PRIMARY KEY, btree (id)
    "x1_x_key" UNIQUE, btree (x)

Now we can copy the structure:

# create table x2 ( like x1 INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES );
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "x2_pkey" for table "x2"
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "x2_x_key" for table "x2"
CREATE TABLE

And check the structure:

# \d x2
                         Table "public.x2"
 Column |  Type   |                    Modifiers
--------+---------+-------------------------------------------------
 id     | integer | not null default nextval('x1_id_seq'::regclass)
 x      | text    |
Indexes:
    "x2_pkey" PRIMARY KEY, btree (id)
    "x2_x_key" UNIQUE, btree (x)

If you are using PostgreSQL pre-8.3, you can simply use pg_dump with option "-t" to specify 1 table, change table name in dump, and load it again:

=> pg_dump -t x2 | sed 's/x2/x3/g' | psql
SET
SET
SET
SET
SET
SET
SET
SET
CREATE TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE

And now the table is:

# \d x3
                         Table "public.x3"
 Column |  Type   |                    Modifiers
--------+---------+-------------------------------------------------
 id     | integer | not null default nextval('x1_id_seq'::regclass)
 x      | text    |
Indexes:
    "x3_pkey" PRIMARY KEY, btree (id)
    "x3_x_key" UNIQUE, btree (x)

Solution 2 - Sql

[CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } ] TABLE table_name
    [ (column_name [, ...] ) ]
    [ WITH ( storage_parameter [= value] [, ... ] ) | WITH OIDS | WITHOUT OIDS ]
    [ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
    [ TABLESPACE tablespace ]
    AS query][1]  

Here is an example

CREATE TABLE films_recent AS
  SELECT * FROM films WHERE date_prod >= '2002-01-01';

The other way to create a new table from the first is to use

    CREATE TABLE films_recent (LIKE films INCLUDING INDEXES);  

    INSERT INTO films_recent
         SELECT *
           FROM books
          WHERE date_prod >= '2002-01-01';  

Note that Postgresql has a patch out to fix tablespace issues if the second method is used

Solution 3 - Sql

There are many answers on the web, one of them can be found here.

I ended up doing something like this:

create table NEW ( like ORIGINAL including all);
insert into NEW select * from ORIGINAL

This will copy the schema and the data including indexes, but not including triggers and constraints. Note that indexes are shared with original table so when adding new row to either table the counter will increment.

Solution 4 - Sql

> I have a postgres table. I need to > delete some data from it.

I presume that ...

delete from yourtable
where <condition(s)>

... won't work for some reason. (Care to share that reason?)

> I was going to create a temporary > table, copy the data in, recreate the > indexes and the delete the rows I > need.

Look into pg_dump and pg_restore. Using pg_dump with some clever options and perhaps editing the output before pg_restoring might do the trick.


Since you are doing "what if"-type analysis on the data, I wonder if might you be better off using views.

You could define a view for each scenario you want to test based on the negation of what you want to exclude. I.e., define a view based on what you want to INclude. E.g., if you want a "window" on the data where you "deleted" the rows where X=Y, then you would create a view as rows where (X != Y).

Views are stored in the database (in the System Catalog) as their defining query. Every time you query the view the database server looks up the underlying query that defines it and executes that (ANDed with any other conditions you used). There are several benefits to this approach:

  1. You never duplicate any portion of your data.
  2. The indexes already in use for the base table (your original, "real" table) will be used (as seen fit by the query optimizer) when you query each view/scenario. There is no need to redefine or copy them.
  3. Since a view is a "window" (NOT a shapshot) on the "real" data in the base table, you can add/update/delete on your base table and simply re-query the view scenarios with no need to recreate anything as the data changes over time.

There is a trade-off, of course. Since a view is a virtual table and not a "real" (base) table, you're actually executing a (perhaps complex) query every time you access it. This may slow things down a bit. But it may not. It depends on many issues (size and nature of the data, quality of the statistics in the System Catalog, speed of the hardware, usage load, and much more). You won't know until you try it. If (and only if) you actually find that the performance is unacceptably slow, then you might look at other options. (Materialized views, copies of tables, ... anything that trades space for time.)

Solution 5 - Sql

A simple way is include all:

CREATE TABLE new_table (LIKE original_table INCLUDING ALL);

Solution 6 - Sql

Create a new table using a select to grab the data you want. Then swap the old table with the new one.

create table mynewone as select * from myoldone where ...
mess (re-create) with indexes after the table swap.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionAmandasaurusView Question on Stackoverflow
Solution 1 - Sqluser80168View Answer on Stackoverflow
Solution 2 - SqlWolfmanDragonView Answer on Stackoverflow
Solution 3 - SqloshaiView Answer on Stackoverflow
Solution 4 - SqlAlanView Answer on Stackoverflow
Solution 5 - SqlRingtailView Answer on Stackoverflow
Solution 6 - SqljimView Answer on Stackoverflow