What's a PostgreSQL "Cluster" and how do I create one?

Sql, Database, Postgresql, Postgresql 9.1

Sql Problem Overview


I am very new to databases and haven't worked with them much. Now I want to understand the term "database cluster". I googled a lot and found many useful links, but I could not understand them, maybe because I have very little basic knowledge about databases and because they were written in very technical language.

I need advice on these points:

  1. What are database clusters in PostgreSQL?
  2. How to create clusters in PostgreSQL?

Sql Solutions


Solution 1 - Sql

A PostgreSQL database "cluster" is a postmaster and a group of subsidiary processes, all managing a shared data directory that contains one or more databases.

The term "cluster" in PostgreSQL is a historical quirk*, and is completely different from the general meaning of "compute cluster", which normally refers to groups of computers that work together to achieve higher performance and/or availability. It is also unrelated to the PostgreSQL command CLUSTER, which is about organizing tables.


If you're reading this you might actually be looking for information on high availability, replication or pooling, in which case you should read the Replication, Clustering and High Availability wiki article and the high availability section of the PostgreSQL manual, then look into tools like repmgr.


A cluster is normally created for you when you install PostgreSQL; the installation will usually initdb a new cluster for you. It is quite unusual for a basic or intermediate user to ever need to create clusters or manage multiple clusters, so it would help if you explained why you want to do this, and what underlying problem you are trying to solve. The user manual could probably explain this better: it assumes you're installing PostgreSQL from source, and relatively few people actually do that.

Each cluster's data directory is created with initdb and managed by a postmaster that's started via a system service (Windows service, launchd, init, upstart, systemd, etc., depending on operating system and version) or directly via pg_ctl.
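
As a rough sketch of the manual route, the shell function below initializes a data directory with initdb and then starts a postmaster for it with pg_ctl. The function name, directory and port are placeholders chosen for illustration, and it assumes the PostgreSQL binaries are on your PATH (on Debian/Ubuntu they usually are not).

```shell
# Sketch only: the function name, directory and port are illustrative assumptions.
# Run as the unprivileged user that should own the cluster (initdb refuses to run as root).
create_and_start_cluster() {
    data_dir="$1"   # where the new cluster's data directory will live
    port="$2"       # TCP port this cluster's postmaster should listen on

    # Create the data directory, including the built-in databases.
    initdb -D "$data_dir" || return 1

    # Start a postmaster for that data directory on the chosen port,
    # logging to a file inside the data directory.
    pg_ctl -D "$data_dir" -o "-p $port" -l "$data_dir/server.log" start
}

# Example (uncomment to run):
# create_and_start_cluster "$HOME/pg_demo" 5433
```

Stopping it again would be `pg_ctl -D "$HOME/pg_demo" stop`. If PostgreSQL was installed from a package, the system service is normally the better way to start and stop the default cluster.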

The cluster has built-in databases template0, template1 and postgres; other databases are created by the user.

The postmaster for a cluster accepts incoming connections by listening on a TCP port, and hands those off to worker backends. Only one postmaster may run on a given port, so each cluster must have a different port.
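
To see that two ports really lead to two different clusters, you can ask each server which data directory it manages. This is a minimal sketch: the function name and ports are made up for the example, and it assumes a server is running on each port and that you can connect as a superuser such as postgres.

```shell
# Sketch: ask the cluster listening on a given port which data directory it manages.
# Assumes a running server on that port; the default user "postgres" is an assumption.
show_cluster_dir() {
    port="$1"
    user="${2:-postgres}"

    # -A: unaligned output, -t: rows only, -c: run one command and exit.
    psql -h localhost -p "$port" -U "$user" -d postgres -Atc "SHOW data_directory;"
}

# Example (ports are placeholders; uncomment to run):
# show_cluster_dir 5432
# show_cluster_dir 5433
```

Each call should print a different directory if the ports belong to different clusters.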

I wrote more about PostgreSQL's structure in this previous answer. See the sub-heading "Relations? Schema? Huh?".

How to "create" clusters in Pg depends entirely on how you are running it. Since you're asking, I suspect you're on an Ubuntu system that uses pg_wrapper, in which case you'd use the pg_wrapper commands like pg_createcluster.
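
On Debian/Ubuntu that might look roughly like the sketch below. The function name, cluster name, version and port are placeholders; it assumes the postgresql-common tools are installed and that the given major version is already installed on the machine.

```shell
# Sketch for Debian/Ubuntu; version, name and port are illustrative assumptions.
create_debian_cluster() {
    version="$1"   # installed PostgreSQL major version, e.g. 12
    name="$2"      # name for the new cluster, e.g. demo
    port="$3"      # port the new cluster should listen on

    # Create the cluster and start it right away.
    sudo pg_createcluster --start -p "$port" "$version" "$name"
}

# Example (uncomment to run):
# create_debian_cluster 12 demo 5433
```

Removing such a cluster again is done with pg_dropcluster, which deletes its data, so use it with care.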


* The clash between "cluster" in PostgreSQL terminology and the common usage of the term is a confusing and regrettable historical oddity, especially when discussing clustering of PostgreSQL instances. You can have a cluster of PostgreSQL clusters, which is just painful.

Solution 2 - Sql

This answer might be quite late, but it might help someone who is a beginner.

What is a cluster in the most basic sense:

In the most basic terms, a postgres cluster is a group of databases that share one configuration. For example, you might have a cluster that uses postgres v9 and has 2 databases in it; all of its databases use the configuration set by the cluster, e.g. buffer size and the number of connections allowed. Similarly, you can have another cluster that uses postgres 12 and also holds multiple databases. You can also have multiple clusters with the same version but different configurations.

The commands below were tested on Ubuntu only and might not work on other operating systems.

To check how many clusters you have you can run the command

pg_lsclusters

This gives you a list of clusters with their status, port, name, data directory location, etc. The status tells you whether the cluster is online. You can't connect to an offline cluster.
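
If you only care about the clusters you can actually connect to, a small filter over that list helps. This sketch assumes the usual column order of pg_lsclusters (version, cluster name, port, status, ...); the function name is made up for the example.

```shell
# Sketch: read pg_lsclusters-style lines (without header) on stdin and print
# "version/name port" for every cluster whose status starts with "online".
# Assumed column order: version, name, port, status, owner, data directory, log file.
online_clusters() {
    awk '$4 ~ /^online/ { print $1 "/" $2, $3 }'
}

# Example (needs the Debian/Ubuntu tools; uncomment to run):
# pg_lsclusters --no-header | online_clusters
```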

To create a new cluster, run this command

initdb -D /usr/local/pgsql/data

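This tells postgres to initialize a new database cluster and where to create its data directory. Of course, the user running the command must have permission to create this directory. initdb also writes the default configuration files (postgresql.conf, pg_hba.conf) into that data directory. On Ubuntu, initdb is usually not on the PATH; clusters there are normally created with the wrapper pg_createcluster, which keeps the data in /var/lib/postgresql/version/clusterName and the configuration in /etc/postgresql/version/clusterName.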

To connect to a cluster use this command

psql -U postgres -p 5436 -h localhost

Each cluster has a unique port number, so make sure you select the correct port.

You can also start, stop or check the status of a cluster

pg_ctlcluster 12 main stop

Here 12 is the postgres version and main is the name of the cluster.
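
The other actions follow the same pattern. As a small sketch (the function name is made up, and sudo is assumed because pg_ctlcluster has to run as root or as the cluster owner), this restarts a cluster and then reports its status:

```shell
# Sketch: restart a Debian/Ubuntu cluster, then print its status.
# Version and name are whatever pg_lsclusters shows for your cluster.
restart_and_check() {
    version="$1"
    name="$2"

    # Only ask for the status if the restart itself succeeded.
    sudo pg_ctlcluster "$version" "$name" restart &&
        sudo pg_ctlcluster "$version" "$name" status
}

# Example (uncomment to run):
# restart_and_check 12 main
```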

Creating a new database in cluster

To create a new database, first connect to the cluster (using the command mentioned above), then run this command.

CREATE DATABASE mynewdb;
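
The same thing can be done from the shell without opening a psql session, using the createdb client program. In this sketch the function name is made up, and the host, port and user simply mirror the psql example above; adjust them to your cluster.

```shell
# Sketch: create a database in the cluster on the given port, then list
# all databases in that cluster to confirm it exists.
create_and_list_db() {
    port="$1"
    dbname="$2"

    # createdb issues CREATE DATABASE on your behalf.
    createdb -h localhost -p "$port" -U postgres "$dbname" || return 1

    # pg_database holds one row per database in the cluster.
    psql -h localhost -p "$port" -U postgres -d postgres -Atc "SELECT datname FROM pg_database ORDER BY datname;"
}

# Example (uncomment to run):
# create_and_list_db 5436 mynewdb
```

The list will also include the built-in databases postgres, template0 and template1.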

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Sunil Kumar | View Question on Stackoverflow
Solution 1 - Sql | Craig Ringer | View Answer on Stackoverflow
Solution 2 - Sql | Omer Farooq | View Answer on Stackoverflow