How to import CSV file data into a PostgreSQL table


Postgresql Problem Overview


How can I write a stored procedure that imports data from a CSV file and populates the table?

Postgresql Solutions


Solution 1 - Postgresql

Take a look at this short article.


The solution is paraphrased here:

Create your table:

CREATE TABLE zip_codes
(ZIP char(5), LATITUDE double precision, LONGITUDE double precision,
CITY varchar, STATE char(2), COUNTY varchar, ZIP_CLASS varchar);

Copy data from your CSV file to the table:

COPY zip_codes FROM '/path/to/csv/ZIP_CODES.txt' WITH (FORMAT csv);

Solution 2 - Postgresql

If you don't have permission to use COPY (which works on the db server), you can use \copy instead (which works in the db client). Using the same example as Bozhidar Batsov:

Create your table:

CREATE TABLE zip_codes
(ZIP char(5), LATITUDE double precision, LONGITUDE double precision,
CITY varchar, STATE char(2), COUNTY varchar, ZIP_CLASS varchar);

Copy data from your CSV file to the table:

\copy zip_codes FROM '/path/to/csv/ZIP_CODES.txt' DELIMITER ',' CSV

Note that \copy ... must be written on a single line and without a ; at the end!

You can also specify the columns to read:

\copy zip_codes(ZIP,CITY,STATE) FROM '/path/to/csv/ZIP_CODES.txt' DELIMITER ',' CSV

See the documentation for COPY:

> Do not confuse COPY with the psql instruction \copy. \copy invokes COPY FROM STDIN or COPY TO STDOUT, and then fetches/stores the data in a file accessible to the psql client. Thus, file accessibility and access rights depend on the client rather than the server when \copy is used.

And note:

> For identity columns, the COPY FROM command will always write the column values provided in the input data, like the INSERT option OVERRIDING SYSTEM VALUE.
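The client-side behaviour described in the quote can also be reproduced from application code. Below is a minimal sketch assuming a psycopg2 connection: `copy_expert` wraps COPY FROM STDIN, so the file is opened and read on the client, just like \copy. The connection object, table name, and path are illustrative, not from the original answer.

```python
def copy_sql(table):
    # COPY ... FROM STDIN: the server reads the stream the client sends,
    # so file accessibility depends on the *client*, exactly like psql's \copy
    return "COPY {} FROM STDIN WITH (FORMAT csv)".format(table)

def client_side_copy(conn, table, csv_path):
    """Stream a local CSV file to the server.

    `conn` is assumed to be a psycopg2 connection; copy_expert is the
    psycopg2 cursor method that wraps COPY FROM STDIN / TO STDOUT.
    """
    with open(csv_path) as f:
        with conn.cursor() as cur:
            cur.copy_expert(sql=copy_sql(table), file=f)
    conn.commit()
```

Usage would be `client_side_copy(conn, 'zip_codes', '/path/to/csv/ZIP_CODES.txt')` on an open connection.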

Solution 3 - Postgresql

One quick way of doing this is with the Python pandas library (version 0.15 or above works best). This will handle creating the columns for you, although obviously the choices it makes for data types might not be what you want. If it doesn't quite do what you want, you can always use the 'create table' code it generates as a template.

Here's a simple example:

import pandas as pd
df = pd.read_csv('mypath.csv')
df.columns = [c.lower() for c in df.columns] # PostgreSQL doesn't like capitals or spaces

from sqlalchemy import create_engine
engine = create_engine('postgresql://username:password@localhost:5432/dbname')

df.to_sql("my_table_name", engine)
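The lower-casing in the snippet above can be generalized. A small stdlib-only helper (the name is illustrative) that turns arbitrary CSV headers into safe, unquoted PostgreSQL identifiers might look like:

```python
import re

def pg_identifier(name):
    # lower-case, and collapse anything that isn't [a-zA-Z0-9_] into '_'
    clean = re.sub(r'\W+', '_', name.strip().lower()).strip('_')
    return clean or 'col'  # fall back if the header was all punctuation

# then: df.columns = [pg_identifier(c) for c in df.columns]
```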

And here's some code that shows you how to set various options:

# Set it so the raw SQL output is logged
import logging
import sqlalchemy  # needed below for sqlalchemy.types
logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)

df.to_sql("my_table_name2",
          engine,
          if_exists="append",  # Options are ‘fail’, ‘replace’, ‘append’, default ‘fail’
          index = False, # Do not output the index of the dataframe
          dtype = {'col1': sqlalchemy.types.NUMERIC,
                   'col2': sqlalchemy.types.String}) # Datatypes should be SQLAlchemy types

Solution 4 - Postgresql

Most other solutions here require that you create the table in advance/manually. This may not be practical in some cases (e.g., if you have a lot of columns in the destination table). So, the approach below may come in handy.

Given the path and the column count of your CSV file, you can use the following function to load your data into a temp table, which is then renamed to target_table:

The top row is assumed to have the column names.

create or replace function data.load_csv_file
(
    target_table text,
    csv_path text,
    col_count integer
)

returns void as $$

declare

iter integer; -- dummy integer to iterate columns with
col text; -- variable to keep the column name at each iteration
col_first text; -- first column name, e.g., top left corner on a csv file or spreadsheet

begin
    create table temp_table ();

    -- add just enough number of columns
    for iter in 1..col_count
    loop
        execute format('alter table temp_table add column col_%s text;', iter);
    end loop;

    -- copy the data from csv file
    execute format('copy temp_table from %L with delimiter '','' quote ''"'' csv ', csv_path);

    iter := 1;
    col_first := (select col_1 from temp_table limit 1);

    -- update the column names based on the first row which has the column names
    for col in execute format('select unnest(string_to_array(trim(temp_table::text, ''()''), '','')) from temp_table where col_1 = %L', col_first)
    loop
        execute format('alter table temp_table rename column col_%s to %s', iter, col);
        iter := iter + 1;
    end loop;

    -- delete the columns row
    execute format('delete from temp_table where %s = %L', col_first, col_first);

    -- change the temp table name to the name given as parameter, if not blank
    if length(target_table) > 0 then
        execute format('alter table temp_table rename to %I', target_table);
    end if;

end;

$$ language plpgsql;
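The same idea (derive the columns from the header, everything typed as text) can be sketched client-side in Python with just the standard library. The function below only builds the DDL string; table and CSV names are illustrative:

```python
import csv
import io

def create_table_sql(table, csv_text):
    """Build a CREATE TABLE statement with one text column per header
    field, mirroring what the PL/pgSQL function above does dynamically."""
    header = next(csv.reader(io.StringIO(csv_text)))
    cols = ', '.join('"{}" text'.format(h.strip()) for h in header)
    return 'CREATE TABLE {} ({});'.format(table, cols)
```

You would execute the returned statement and then run a plain COPY into the new table.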

Solution 5 - Postgresql

You could also use pgAdmin, which offers a GUI to do the import. That's shown in this SO thread. The advantage of using pgAdmin is that it also works for remote databases.

Much like the previous solutions though, you would need to have your table on the database already. Everyone has their own approach, but I usually open the CSV file in Excel, copy the headers, paste special with transposition on a different worksheet, place the corresponding data type in the next column, and then just copy and paste that into a text editor together with the appropriate SQL table creation query like so:

CREATE TABLE my_table (
    /* Paste data from Excel here for example ... */
    col_1 bigint,
    col_2 bigint,
    /* ... */
    col_n bigint
)

Solution 6 - Postgresql

COPY table_name FROM 'path/to/data.csv' DELIMITER ',' CSV HEADER;

Solution 7 - Postgresql

As Paul mentioned, import works in pgAdmin:

Right-click on table → Import

Select a local file, format and coding.

Here is a German pgAdmin GUI screenshot:

(Screenshot: pgAdmin import GUI)

You can do a similar thing with DbVisualizer (I have a license; I'm not sure about the free version).

Right-click on a table → Import Table Data...

(Screenshot: DbVisualizer import GUI)

Solution 8 - Postgresql

  1. Create a table first

  2. Then use the copy command to copy the table details:

     copy table_name (C1,C2,C3....)
     from 'path to your CSV file' delimiter ',' csv header;
    

NOTE:

  • The columns and their order are specified by C1, C2, C3, ... in the SQL
  • The HEADER option simply skips the first line of the input; it does not match columns by name
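That last point can be seen in a tiny stdlib simulation: with HEADER, the first line is simply discarded, and the remaining fields are still matched to the target columns by position, never by name:

```python
import csv
import io

# header names are deliberately in the *reverse* order of the target columns
csv_text = "c2,c1\n1,2\n3,4\n"

rows = list(csv.reader(io.StringIO(csv_text)))
data = rows[1:]   # what HEADER does: skip one line, nothing else

# the first field of each row still goes to the *first* target column,
# regardless of what the header line said
```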

Solution 9 - Postgresql

Use this SQL code:

copy table_name(attribute1,attribute2,attribute3...)
from 'E:\test.csv' delimiter ',' csv header

The header keyword lets the DBMS know that the CSV file has a header row with attribute names.

For more, visit Import CSV File Into PostgreSQL Table.

Solution 10 - Postgresql

This is a personal experience with PostgreSQL, and I am still waiting for a faster way.

  1. Create a table skeleton first if the file is stored locally:

     drop table if exists ur_table;
     CREATE TABLE ur_table
     (
         id serial NOT NULL,
         log_id numeric,
         proc_code numeric,
         date timestamp,
         qty    int,
         name varchar,
         price money
     );
     COPY
         ur_table(id, log_id, proc_code, date, qty, name, price)
     FROM '\path\xxx.csv' DELIMITER ',' CSV HEADER;
    
  2. When PostgreSQL doesn't have permission to access the \path\xxx.csv file (for example, because the file is on your local machine rather than on the server), you will have to import the .csv file through pgAdmin's built-in functionality.

    Right click the table name and choose import.


If you still have a problem, please refer to this tutorial: Import CSV File Into PostgreSQL Table

Solution 11 - Postgresql

How to import CSV file data into a PostgreSQL table

Steps:

  1. Connect to a PostgreSQL database in the terminal

     psql -U postgres -h localhost
    
  2. Create a database

     create database mydb;
    
  3. Create a user

     create user siva with password 'mypass';
    
  4. Connect to the database

     \c mydb;
    
  5. Create a schema

     create schema trip;
    
  6. Create a table

     create table trip.test(
         VendorID int, passenger_count int, trip_distance decimal,
         RatecodeID int, store_and_fwd_flag varchar, PULocationID int,
         DOLocationID int, payment_type decimal, fare_amount decimal,
         extra decimal, mta_tax decimal, tip_amount decimal,
         tolls_amount int, improvement_surcharge decimal,
         total_amount decimal
     );
    
  7. Import the CSV file data into PostgreSQL

     COPY trip.test(VendorID, passenger_count, trip_distance, RatecodeID, store_and_fwd_flag, PULocationID, DOLocationID, payment_type, fare_amount, extra, mta_tax, tip_amount, tolls_amount, improvement_surcharge, total_amount) FROM '/home/Documents/trip.csv' DELIMITER ',' CSV HEADER;
    
  8. Query the imported data

     select * from trip.test;
    

Solution 12 - Postgresql

IMHO, the most convenient way is to follow "Import CSV data into postgresql, the comfortable way ;-)", using csvsql from csvkit, which is a Python package installable via pip.

Solution 13 - Postgresql

You can also use pgfutter, or, even better, pgcsv.

These tools create the table columns for you, based on the CSV header.

pgfutter is quite buggy, and I'd recommend pgcsv.

Here's how to do it with pgcsv:

sudo pip install pgcsv
pgcsv --db 'postgresql://localhost/postgres?user=postgres&password=...' my_table my_file.csv

Solution 14 - Postgresql

In Python, you can use this code for automatic PostgreSQL table creation with column names:

import pandas, csv

from io import StringIO
from sqlalchemy import create_engine

def psql_insert_copy(table, conn, keys, data_iter):
    dbapi_conn = conn.connection
    with dbapi_conn.cursor() as cur:
        s_buf = StringIO()
        writer = csv.writer(s_buf)
        writer.writerows(data_iter)
        s_buf.seek(0)
        columns = ', '.join('"{}"'.format(k) for k in keys)
        if table.schema:
            table_name = '{}.{}'.format(table.schema, table.name)
        else:
            table_name = table.name
        sql = 'COPY {} ({}) FROM STDIN WITH CSV'.format(table_name, columns)
        cur.copy_expert(sql=sql, file=s_buf)

engine = create_engine('postgresql://user:password@localhost:5432/my_db')

df = pandas.read_csv("my.csv")
df.to_sql('my_table', engine, schema='my_schema', method=psql_insert_copy)

It's also relatively fast. I can import more than 3.3 million rows in about 4 minutes.

Solution 15 - Postgresql

You can use the Pandas library if the file is not very large.

Be careful when iterating over Pandas dataframes; I do it here only to demonstrate the possibility. Consider the pd.DataFrame.to_sql() function when copying from a dataframe to an SQL table.

Assuming you have created the table you want, you could:

import psycopg2
import pandas as pd
data=pd.read_csv(r'path\to\file.csv', delimiter=' ')

#prepare your data and keep only relevant columns

data.drop(['col2', 'col4','col5'], axis=1, inplace=True)
data.dropna(inplace=True)
print(data.iloc[:3])


conn=psycopg2.connect("dbname=db user=postgres password=password")
cur=conn.cursor()

for index, row in data.iterrows():
    # 'table' is a reserved word in SQL; use your actual table name here
    cur.execute('''insert into my_table (col1, col3, col6)
                   values (%s, %s, %s)''',
                (row['col1'], row['col3'], row['col6']))

conn.commit()
cur.close()
conn.close()
print('\n db connection closed.')

Solution 16 - Postgresql

Create a table with the required columns, matching those in the CSV file.

  1. Open your Postgres GUI and right-click on the target table which you want to load. Select Import, then fill in the following in the File Options section:

  2. Browse to your file in Filename

  3. Select CSV as the Format

  4. Set Encoding to ISO_8859_5

Now go to Misc. Options, check Header, and click on Import.

Solution 17 - Postgresql

If you need a simple mechanism to import from text/parse multiline CSV content, you could use:

CREATE TABLE t   -- OR INSERT INTO tab(col_names)
AS
SELECT
   t.f[1] AS col1
  ,t.f[2]::int AS col2
  ,t.f[3]::date AS col3
  ,t.f[4] AS col4
FROM (
  SELECT regexp_split_to_array(l, ',') AS f
  FROM regexp_split_to_table(
$$a,1,2016-01-01,bbb
c,2,2018-01-01,ddd
e,3,2019-01-01,eee$$, '\n') AS l) t;

DBFiddle Demo
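For comparison, the same inline parsing can be done in Python with the standard library; the casts mirror the `::int` and text columns in the query above:

```python
text = """a,1,2016-01-01,bbb
c,2,2018-01-01,ddd
e,3,2019-01-01,eee"""

# split lines, then fields, and cast column 2 to int (like t.f[2]::int)
rows = [line.split(',') for line in text.splitlines()]
parsed = [(f[0], int(f[1]), f[2], f[3]) for f in rows]
```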

Solution 18 - Postgresql

I created a small tool that imports a CSV file into PostgreSQL super easily. It is just one command, and it will create and populate the tables. Unfortunately, at the moment, all automatically created fields use the type TEXT:

csv2pg users.csv -d ";" -H 192.168.99.100 -U postgres -B mydatabase

The tool can be found on https://github.com/eduardonunesp/csv2pg

Solution 19 - Postgresql

DBeaver Community Edition (dbeaver.io) makes it trivial to connect to a database, then import a CSV file for upload to a PostgreSQL database. It also makes it easy to issue queries, retrieve data, and download result sets to CSV, JSON, SQL, or other common data formats.

It is a FOSS multi-platform database tool for SQL programmers, DBAs and analysts that supports all popular databases: MySQL, PostgreSQL, SQLite, Oracle, DB2, SQL Server, Sybase, MS Access, Teradata, Firebird, Hive, Presto, etc. It's a viable FOSS competitor to TOAD for Postgres, TOAD for SQL Server, or Toad for Oracle.

I have no affiliation with DBeaver. I love the price (free!) and the full functionality, but I wish they would open up this DBeaver/Eclipse application more and make it easy to add analytics widgets to it, rather than requiring users to pay the $199 annual subscription just to create graphs and charts directly within the application. My Java coding skills are rusty, and I don't feel like taking weeks to relearn how to build Eclipse widgets, only to find that DBeaver has probably disabled the ability to add third-party widgets to the Community Edition.

Solution 20 - Postgresql

You can create a Bash script, import.sh (assuming your CSV file is tab-delimited):

#!/usr/bin/env bash

USER="test"
DB="postgres"
TABLE_NAME="user"
CSV_DIR="$(pwd)/csv"
FILE_NAME="user.txt"

echo $(psql -d $DB -U $USER -c "\copy $TABLE_NAME from '$CSV_DIR/$FILE_NAME' DELIMITER E'\t' csv" 2>&1 |tee /dev/tty)

And then run this script.

Solution 21 - Postgresql

You have 3 options to import CSV files to PostgreSQL: First, using the COPY command through the command line.


Second, using the pgAdmin tool’s import/export.


Third, using a cloud solution like Skyvia which gets the CSV file from an online location like an FTP source or a cloud storage like Google Drive.


You can check out the article that explains all of these from here.

Solution 22 - Postgresql

Using any client (I used DataGrip), I created a new database; then, within the default schema (public) of the database, right-click the database and do Import Data from File.

Choose the CSV file from its location, select the matching format (CSV or TSV), and make sure each column name in the CSV file's header carries over to the column name of the table.

(Screenshot: Import CSV file)

Solution 23 - Postgresql

My idea is to convert your CSV file into SQL queries:

  1. Open tool Convert CSV to Insert SQL Online
  2. Paste or upload your CSV file in the Data Source pane
  3. Scroll to the Table Generator panel
  4. Click Copy to clipboard or Download

Example:

id,name
1,Roberta
2,Oliver

The output of SQL queries:

CREATE TABLE tableName
(
    id    varchar(300),
    name  varchar(300)
);

INSERT INTO tableName (id,name)
VALUES
    ('1', 'Roberta'),
    ('2', 'Oliver');
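The same conversion can be scripted. Here is a stdlib-only sketch that reproduces the output above; the varchar(300) default mimics the online tool and is an arbitrary choice:

```python
import csv
import io

def csv_to_sql(table, csv_text):
    """Turn CSV text into a CREATE TABLE plus one multi-row INSERT."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    ddl = 'CREATE TABLE {}\n(\n    {}\n);'.format(
        table, ',\n    '.join('{} varchar(300)'.format(h) for h in header))
    values = ',\n    '.join(
        '({})'.format(', '.join(
            "'{}'".format(v.replace("'", "''")) for v in row))  # escape quotes
        for row in data)
    dml = 'INSERT INTO {} ({})\nVALUES\n    {};'.format(
        table, ','.join(header), values)
    return ddl, dml

ddl, dml = csv_to_sql('tableName', 'id,name\n1,Roberta\n2,Oliver\n')
```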

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | vardhan | View Question on Stackoverflow
Solution 1 - Postgresql | Bozhidar Batsov | View Answer on Stackoverflow
Solution 2 - Postgresql | bjelli | View Answer on Stackoverflow
Solution 3 - Postgresql | RobinL | View Answer on Stackoverflow
Solution 4 - Postgresql | mehmet | View Answer on Stackoverflow
Solution 5 - Postgresql | Paul | View Answer on Stackoverflow
Solution 6 - Postgresql | timxor | View Answer on Stackoverflow
Solution 7 - Postgresql | Andreas L. | View Answer on Stackoverflow
Solution 8 - Postgresql | user9130085 | View Answer on Stackoverflow
Solution 9 - Postgresql | djdere | View Answer on Stackoverflow
Solution 10 - Postgresql | flowera | View Answer on Stackoverflow
Solution 11 - Postgresql | sivamani | View Answer on Stackoverflow
Solution 12 - Postgresql | sal | View Answer on Stackoverflow
Solution 13 - Postgresql | Vlad Dinulescu | View Answer on Stackoverflow
Solution 14 - Postgresql | Marc | View Answer on Stackoverflow
Solution 15 - Postgresql | Buffcoder | View Answer on Stackoverflow
Solution 16 - Postgresql | suriruler | View Answer on Stackoverflow
Solution 17 - Postgresql | Lukasz Szozda | View Answer on Stackoverflow
Solution 18 - Postgresql | Eduardo Pereira | View Answer on Stackoverflow
Solution 19 - Postgresql | Rich Lysakowski PhD | View Answer on Stackoverflow
Solution 20 - Postgresql | Mohammad Fallah | View Answer on Stackoverflow
Solution 21 - Postgresql | Dewa Manandhar | View Answer on Stackoverflow
Solution 22 - Postgresql | Sandy | View Answer on Stackoverflow
Solution 23 - Postgresql | Raos | View Answer on Stackoverflow