Starting and populating a Postgres container in Docker

DatabasePostgresqlDocker

Database Problem Overview


I have a Docker container that contains my Postgres database. It's using the official Postgres image which has a CMD entry that starts the server on the main thread.

I want to populate the database by running RUN psql –U postgres postgres < /dump/dump.sql before it starts listening to queries.

I don't understand how this is possible with Docker. If I place the RUN command after CMD, it will of course never be reached because Docker has finished reading the Dockerfile. But if I place it before the CMD, it will run before psql even exists as a process.

How can I prepopulate a Postgres database in Docker?

Database Solutions


Solution 1 - Database

After a lot of fighting, I have found a solution ;-)

For me was very useful a comment posted here: https://registry.hub.docker.com/_/postgres/ from "justfalter"

Anyway, I have done in this way:

# Dockerfile
FROM postgres:9.4

RUN mkdir -p /tmp/psql_data/

COPY db/structure.sql /tmp/psql_data/
COPY scripts/init_docker_postgres.sh /docker-entrypoint-initdb.d/

db/structure.sql is a sql dump, useful to initialize the first tablespace.

Then, the init_docker_postgres.sh

#!/bin/bash

# this script is run when the docker container is built
# it imports the base database structure and create the database for the tests

DATABASE_NAME="db_name"
DB_DUMP_LOCATION="/tmp/psql_data/structure.sql"

echo "*** CREATING DATABASE ***"

# create default database
gosu postgres postgres --single <<EOSQL
  CREATE DATABASE "$DATABASE_NAME";
  GRANT ALL PRIVILEGES ON DATABASE "$DATABASE_NAME" TO postgres;
EOSQL

# clean sql_dump - because I want to have a one-line command

# remove indentation
sed "s/^[ \t]*//" -i "$DB_DUMP_LOCATION"

# remove comments
sed '/^--/ d' -i "$DB_DUMP_LOCATION"

# remove new lines
sed ':a;N;$!ba;s/\n/ /g' -i "$DB_DUMP_LOCATION"

# remove other spaces
sed 's/  */ /g' -i "$DB_DUMP_LOCATION"

# remove firsts line spaces
sed 's/^ *//' -i "$DB_DUMP_LOCATION"

# append new line at the end (suggested by @Nicola Ferraro)
sed -e '$a\' -i "$DB_DUMP_LOCATION"

# import sql_dump
gosu postgres postgres --single "$DATABASE_NAME" < "$DB_DUMP_LOCATION";


echo "*** DATABASE CREATED! ***"

So finally:

# no postgres is running
[myserver]# psql -h 127.0.0.1 -U postgres
psql: could not connect to server: Connection refused
	Is the server running on host "127.0.0.1" and accepting
	TCP/IP connections on port 5432?

[myserver]# docker build -t custom_psql .
[myserver]# docker run -d --name custom_psql_running -p 5432:5432 custom_psql

[myserver]# docker ps -a
CONTAINER ID        IMAGE                COMMAND                CREATED             STATUS              PORTS                    NAMES
ce4212697372        custom_psql:latest   "/docker-entrypoint.   9 minutes ago       Up 9 minutes        0.0.0.0:5432->5432/tcp   custom_psql_running

[myserver]# psql -h 127.0.0.1 -U postgres
psql (9.2.10, server 9.4.1)
WARNING: psql version 9.2, server version 9.4.
         Some psql features might not work.
Type "help" for help.

postgres=# 

# postgres is now initialized with the dump

Hope it helps!

Solution 2 - Database

For those who want to initialize a PostgreSQL DB with millions of records during the first run.

Import using *.sql dump

You can do simple sql dump and copy the dump.sql file into /docker-entrypoint-initdb.d/. The problem is speed. My dump.sql script is about 17MB (small DB - 10 tables with 100k rows in only one of them) and the initialization takes over a minute (!). That is unacceptable for local development / unit test, etc.

Import using binary dump

The solution is to make a binary PostgreSQL dump and use shell scripts initialization support. Then the same DB is initialized in about 500ms instead of 1 minute.

1. Create the dump.pgdata binary dump of a DB named "my-db" directly from within a container or your local DB

pg_dump -U postgres --format custom my-db > "dump.pgdata"

Or from host from running container (postgres-container)

docker exec postgres-container pg_dump -U postgres --format custom my-db > "dump.pgdata"

2. Create a Docker image with a given dump and initialization script

$ tree
.
├── Dockerfile
└── docker-entrypoint-initdb.d
    ├── 01-restore.sh
    ├── 02-small-updates.sql
    └── dump.pgdata
$ cat Dockerfile
FROM postgres:11

COPY ./docker-entrypoint-initdb.d/ /docker-entrypoint-initdb.d/
$ cat docker-entrypoint-initdb.d/01-restore.sh
#!/bin/bash

file="/docker-entrypoint-initdb.d/dump.pgdata"
dbname=my-db

echo "Restoring DB using $file"
pg_restore -U postgres --dbname=$dbname --verbose --single-transaction < "$file" || exit 1
$ cat docker-entrypoint-initdb.d/02-small-updates.sql
-- some updates on your DB, for example for next application version
-- this file will be executed on DB during next release
UPDATE ... ;

3. Build an image and run it

$ docker build -t db-test-img .
$ docker run -it --rm --name db-test db-test-img

Solution 3 - Database

Alternatively, you can just mount a volume to /docker-entrypoint-initdb.d/ that contains all your DDL scripts. You can put in ***.sh, .sql, or .sql.gz files and it will take care of executing those on start-up.

e.g. (assuming you have your scripts in /tmp/my_scripts)

docker run -v /tmp/my_scripts:/docker-entrypoint-initdb.d postgres

Solution 4 - Database

There is yet another option available that utilises Flocker:

> Flocker is a container data volume manager that is designed to allow databases like PostgreSQL to easily run in containers in production. When running a database in production, you have to think about things like recovering from host failure. Flocker provides tools for managing data volumes across a cluster of machines like you have in a production environment. For example, as a Postgres container is scheduled between hosts in response to server failure, Flocker can automatically move its associated data volume between hosts at the same time. This means that when your Postgres container starts up on a new host, it has its data. This operation can be accomplished manually using the Flocker API or CLI, or automatically by a container orchestration tool that Flocker is integrates with, for example Docker Swarm, Kubernetes or Mesos.

Solution 5 - Database

I Followed the same solution which @damoiser , The only situation which was different was I wanted to import all dump data.

Please follow the solution below.(I have not done any kind of checks)

Dockerfile

FROM postgres:9.5

RUN mkdir -p /tmp/psql_data/

COPY db/structure.sql /tmp/psql_data/
COPY scripts/init_docker_postgres.sh /docker-entrypoint-initdb.d/

then the init_docker_postgres.sh script

#!/bin/bash

DB_DUMP_LOCATION="/tmp/psql_data/structure.sql"

echo "*** CREATING DATABASE ***"

psql -U postgres < "$DB_DUMP_LOCATION";

echo "*** DATABASE CREATED! ***"

and then you can build your image as

docker build -t abhije***/postgres-data .

docker run -d abhije***/postgres-data 

Solution 6 - Database

My solution is inspired by Alex Dguez's answer which unfortunately doesn't work for me because:

  1. I used pg-9.6 base image, and the RUN /docker-entrypoint.sh --help never ran through for me, which always complained with The command '/bin/sh -c /docker-entrypoint.sh -' returned a non-zero code: 1
  2. I don't want to pollute the /docker-entrypoint-initdb.d dir

The following answer is originally from my reply in another post: https://stackoverflow.com/a/59303962/4440427. It should be noted that the solution is for restoring from a binary dump instead of from a plain SQL as asked by the OP. But it can be modified slightly to adapt to the plain SQL case

Dockerfile:

FROM postgres:9.6.16-alpine

LABEL maintainer="[email protected]"
LABEL org="Cobrainer GmbH"

ARG PG_POSTGRES_PWD=postgres
ARG DBUSER=someuser
ARG DBUSER_PWD=P@ssw0rd
ARG DBNAME=sampledb
ARG DB_DUMP_FILE=example.pg

ENV POSTGRES_DB launchpad
ENV POSTGRES_USER postgres
ENV POSTGRES_PASSWORD ${PG_POSTGRES_PWD}
ENV PGDATA /pgdata

COPY wait-for-pg-isready.sh /tmp/wait-for-pg-isready.sh
COPY ${DB_DUMP_FILE} /tmp/pgdump.pg

RUN set -e && \
    nohup bash -c "docker-entrypoint.sh postgres &" && \
    /tmp/wait-for-pg-isready.sh && \
    psql -U postgres -c "CREATE USER ${DBUSER} WITH SUPERUSER CREATEDB CREATEROLE ENCRYPTED PASSWORD '${DBUSER_PWD}';" && \
    psql -U ${DBUSER} -d ${POSTGRES_DB} -c "CREATE DATABASE ${DBNAME} TEMPLATE template0;" && \
    pg_restore -v --no-owner --role=${DBUSER} --exit-on-error -U ${DBUSER} -d ${DBNAME} /tmp/pgdump.pg && \
    psql -U postgres -c "ALTER USER ${DBUSER} WITH NOSUPERUSER;" && \
    rm -rf /tmp/pgdump.pg

HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
  CMD pg_isready -U postgres -d launchpad

where the wait-for-pg-isready.sh is:

#!/bin/bash
set -e

get_non_lo_ip() {
  local _ip _non_lo_ip _line _nl=$'\n'
  while IFS=$': \t' read -a _line ;do
    [ -z "${_line%inet}" ] &&
        _ip=${_line[${#_line[1]}>4?1:2]} &&
        [ "${_ip#127.0.0.1}" ] && _non_lo_ip=$_ip
    done< <(LANG=C /sbin/ifconfig)
  printf ${1+-v} $1 "%s${_nl:0:$[${#1}>0?0:1]}" $_non_lo_ip
}

get_non_lo_ip NON_LO_IP
until pg_isready -h $NON_LO_IP -U "postgres" -d "launchpad"; do
  >&2 echo "Postgres is not ready - sleeping..."
  sleep 4
done

>&2 echo "Postgres is up - you can execute commands now"

The above scripts together with a more detailed README are available at https://github.com/cobrainer/pg-docker-with-restored-db

Solution 7 - Database

I was able to load the data in by pre-pending the run command in the docker file with /etc/init.d/postgresql. My docker file has the following line which is working for me:

RUN /etc/init.d/postgresql start && /usr/bin/psql -a < /tmp/dump.sql

Solution 8 - Database

We for E2E test in which we need a database with structure and data already saved in the Docker image we have done the following:

Dockerfile:

FROM postgres:9.4.24-alpine
ENV POSTGRES_USER postgres
ENV POSTGRES_PASSWORD postgres
ENV PGDATA /pgdata
COPY database.backup /tmp/
COPY database_restore.sh /docker-entrypoint-initdb.d/
RUN /docker-entrypoint.sh --help
RUN rm -rf /docker-entrypoint-initdb.d/database_restore.sh
RUN rm -rf /tmp/database.backup

database_restore.sh:

#!/bin/sh
set -e 
pg_restore -C -d postgres /tmp/database.backup

To create the image:

docker build .

To start the container:

docker run --name docker-postgres -d -p 5432:5432 <Id-docker-image>

This does not restore the database every time the container is booted. The structure and data of the database is already contained in the created Docker image.

We have based on this article, but eliminating the multistage: Creating Fast, Lightweight Testing Databases in Docker

> Edit: With version 9.4-alpine does not work now because it does not > run the database_restore.sh scrips. Use version 9.4.24-alpine

Solution 9 - Database

My goal was to have an image that contains the database - i. e. saving the time to rebuild it everytime I do docker run oder docker-compose up.

We would just have to manage to get the line exec "$@" out of docker-entrypoint.sh. So I added into my Dockerfile:

#Copy my ssql scripts into the image to /docker-entrypoint-initdb.d:
COPY ./init_db /docker-entrypoint-initdb.d

#init db
RUN grep -v 'exec "$@"' /usr/local/bin/docker-entrypoint.sh > /tmp/docker-entrypoint-without-serverstart.sh && \
    chmod a+x /tmp/docker-entrypoint-without-serverstart.sh && \
    /tmp/docker-entrypoint-without-serverstart.sh postgres && \
    rm -rf /docker-entrypoint-initdb.d/* /tmp/docker-entrypoint-without-serverstart.sh

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionMigwellView Question on Stackoverflow
Solution 1 - DatabasedamoiserView Answer on Stackoverflow
Solution 2 - DatabasePetr ÚjezdskýView Answer on Stackoverflow
Solution 3 - DatabasedarthbinamiraView Answer on Stackoverflow
Solution 4 - DatabaseMax DesiatovView Answer on Stackoverflow
Solution 5 - DatabaseAbhijeet KambleView Answer on Stackoverflow
Solution 6 - DatabaseLu LiuView Answer on Stackoverflow
Solution 7 - DatabaseacknappView Answer on Stackoverflow
Solution 8 - DatabaseAlex DguezView Answer on Stackoverflow
Solution 9 - Databasefjf2002View Answer on Stackoverflow