5.2.2 PostgreSQL

Table of Contents

Understanding PostgreSQL in a Linux Server Context

PostgreSQL is a powerful, standards-compliant, open source relational database. In a Linux server role, it’s typically used as the primary data store for web applications, analytics systems, and internal business tools. This chapter focuses on practical PostgreSQL administration on Linux.

We’ll assume you already understand general database concepts from the parent “Database Servers” chapter and focus on PostgreSQL specifics: installation, basic configuration, security, and day‑to‑day operations.

Installing PostgreSQL on Major Distributions

PostgreSQL is available in most distributions’ repositories, but there are two common approaches:

Use the distribution’s packages (stable, often older).
Use the official PostgreSQL repositories (more current versions).

Debian/Ubuntu (APT)

On recent Debian/Ubuntu, the default postgresql package typically installs the latest version available in that distro.

sudo apt update
sudo apt install postgresql postgresql-contrib

Key points:

The service is usually named postgresql.
A system user postgres is created.
Cluster management is often done with pg_ctlcluster and pg_lsclusters on Debian-based systems.

To enable and check status:

sudo systemctl enable postgresql
sudo systemctl start postgresql
sudo systemctl status postgresql

For newer major versions than what the distribution provides, you can use the official PGDG (PostgreSQL Global Development Group) APT repository, which you’d configure according to PostgreSQL’s documentation.

RHEL/CentOS/Rocky/Alma/Fedora (DNF/YUM)

On RHEL-compatible systems, you often use either the AppStream modules or the official PGDG Yum repo.

Basic installation (Fedora or RHEL-family with AppStream):

sudo dnf install postgresql-server postgresql-contrib

Initialize the database cluster (if not done automatically by the package):

# RHEL/CentOS-style
sudo postgresql-setup --initdb
# Some variants
sudo /usr/bin/postgresql-setup initdb

Enable and start the service (service name may include version, e.g. postgresql-16 in PGDG packages):

sudo systemctl enable postgresql
sudo systemctl start postgresql
sudo systemctl status postgresql

Using PGDG (for more recent PostgreSQL):

Install PGDG repo package from postgresql.org.
Install specific version, e.g. postgresql16-server.
Initialize using the versioned init script or pg_ctl as instructed by the package.

Arch Linux (Pacman)

On Arch:

sudo pacman -S postgresql

Initialize the cluster (as postgres user):

sudo -iu postgres
initdb -D /var/lib/postgres/data
exit

Enable and start:

sudo systemctl enable postgresql
sudo systemctl start postgresql

PostgreSQL Cluster Layout and Basic Concepts

On a server, “one PostgreSQL installation” can host one or more clusters. A cluster is a running PostgreSQL instance with its own data directory, configuration, and listening port.

Typical locations:

Debian/Ubuntu: /var/lib/postgresql/<version>/main
RHEL family: /var/lib/pgsql/data or /var/lib/pgsql/<version>/data
Arch: /var/lib/postgres/data

Key directories/files in a data directory:

base/ – actual database files.
pg_wal/ (or older pg_xlog/) – write-ahead log (WAL) segments for crash recovery and replication.
postgresql.conf – main configuration.
pg_hba.conf – client authentication rules.
pg_ident.conf – user name mapping, rarely needed in simple setups.

You normally do not manipulate files inside the data directory directly; use SQL and administrative tools instead.

Working with the `postgres` Superuser

By default a PostgreSQL superuser named postgres is created. On Linux, there is usually also a matching system user.

Switch to the postgres system account and open a psql shell:

sudo -iu postgres
psql

Inside psql, the prompt typically looks like:

postgres=#

\q exits psql.

This postgres role has full privileges. For production, it’s better to:

Use postgres only for administration.
Create separate roles for applications and limited administration.

Basic `psql` Usage for Administrators

psql is the primary command-line client for PostgreSQL. Some essential commands from an admin perspective:

Connect to a database (as the current system user, if matching DB role exists):

psql                              # connects to 'postgres' or current user DB, depending on distro
psql -U myuser -d mydb -h 127.0.0.1

Useful meta-commands inside psql (note the leading backslash):

\l – list databases.
\c dbname – connect to another database.
\du – list roles.
\dt – list tables in the current schema.
\dn – list schemas.
\df – list functions.
\x – toggle expanded display.
\? – help on meta-commands.
\h – help on SQL commands, e.g. \h CREATE TABLE.

Run a single command from shell:

psql -U postgres -c "SELECT version();"

Creating Databases and Roles

PostgreSQL distinguishes between roles (users/groups) and databases.

Roles (Users and Groups)

Create a login role (user):

CREATE ROLE app_user LOGIN PASSWORD 'StrongPasswordHere';

Grant privileges:

GRANT CONNECT ON DATABASE myappdb TO app_user;
GRANT USAGE ON SCHEMA public TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_user;

Create a role that acts like a group (no login):

CREATE ROLE app_readonly;
GRANT app_readonly TO app_user;    -- app_user inherits app_readonly privileges

Use ALTER ROLE to adjust attributes, e.g.:

ALTER ROLE app_user SET search_path = public;

Databases

Create a database:

CREATE DATABASE myappdb OWNER app_user;

Drop a database (irreversible):

DROP DATABASE myappdb;

From the shell, you can also use the wrapper commands (for the default cluster):

createdb -O app_user myappdb
dropdb myappdb
createuser --pwprompt app_user

Core Configuration Files

PostgreSQL’s behavior is controlled primarily via:

postgresql.conf – server settings.
pg_hba.conf – client authentication.
pg_ident.conf – username mapping (optional).

These files reside in the data directory or in a config directory referenced by the service unit. Use:

SHOW config_file;
SHOW hba_file;

inside psql to see their paths.

`postgresql.conf` Basics

This file controls:

Networking: listen_addresses, port.
Memory: shared_buffers, work_mem, maintenance_work_mem.
Checkpoint behavior: checkpoint_timeout, max_wal_size.
Logging: logging_collector, log_directory, log_filename, log_line_prefix.

Example minimal network/logging settings snippet:

# postgresql.conf
listen_addresses = 'localhost'          # or '0.0.0.0' to accept connections on all IPs
port = 5432
logging_collector = on
log_directory = 'log'
log_filename = 'postgresql-%Y-%m-%d.log'
log_line_prefix = '%m [%p] %u@%d '

After changes, reload (no restart needed for many settings):

sudo systemctl reload postgresql
# or for versioned service, e.g.
sudo systemctl reload postgresql-16

Alternatively:

SELECT pg_reload_conf();

Note that some parameters require a full restart to take effect.

`pg_hba.conf`: Client Authentication

pg_hba.conf controls who can connect, from where, to which database, and how they authenticate. Rules are checked top-to-bottom; the first match is used.

Common fields:

TYPE – local, host, hostssl, hostnossl.
DATABASE – database name, all, or patterns.
USER – role name, all, or patterns.
ADDRESS – IP/CIDR for host entries, e.g. 192.168.1.0/24.
METHOD – trust, reject, md5, scram-sha-256, peer, ident, etc.

Example:

# Allow local connections as Unix socket using peer auth
local   all             postgres                                peer
local   all             all                                     peer
# Allow LAN clients with password
host    all             all             192.168.1.0/24         md5
# Deny everything else explicitly (optional but good practice)
host    all             all             0.0.0.0/0              reject

On reconfiguration, reload:

sudo systemctl reload postgresql

Or from psql:

SELECT pg_reload_conf();

Enabling Remote Connections Securely

To permit remote connections:

Set listen_addresses in postgresql.conf:

   listen_addresses = '0.0.0.0'        # or a specific IP, e.g. '192.168.1.10'
   port = 5432

Configure firewall rules (covered in “Firewalls in depth” / “Basic System Security” chapters) to open port 5432 for appropriate networks.
Add appropriate rules to pg_hba.conf, e.g.:

   host    myappdb        app_user       203.0.113.50/32      md5
   host    all            all            10.0.0.0/24          scram-sha-256

Restart or reload as required:

   sudo systemctl reload postgresql

In production, you should use encrypted connections (SSL/TLS). Enabling SSL involves certificate configuration in postgresql.conf (ssl = on, ssl_cert_file, ssl_key_file) and possibly hostssl entries in pg_hba.conf.

Backup and Restore with `pg_dump` and `pg_restore`

PostgreSQL provides logical backup tools. (Snapshot-based backups and advanced strategies are handled in the “Backup and Restore” chapter; here we focus on PostgreSQL‑specific tools.)

Logical Backup with `pg_dump`

Dump a single database:

pg_dump -U postgres myappdb > /backups/myappdb.sql

Use custom format (recommended for large DBs or selective restore):

pg_dump -U postgres -Fc myappdb > /backups/myappdb.dump

Include roles and tablespaces (global objects) with pg_dumpall:

pg_dumpall -U postgres > /backups/all_databases.sql

Schedule these commands via cron or another scheduler for regular automatic backups.

Restore from Dump

For plain SQL dumps:

Create target database if needed:

   createdb -U postgres myappdb

Restore:

   psql -U postgres -d myappdb -f /backups/myappdb.sql

For custom-format dumps:

createdb -U postgres myappdb
pg_restore -U postgres -d myappdb /backups/myappdb.dump

You can list dump contents and restore selectively:

pg_restore -l /backups/myappdb.dump
pg_restore -U postgres -d myappdb -t mytable /backups/myappdb.dump

Basic Maintenance: Vacuum, Analyze, and Reindex

PostgreSQL uses MVCC (Multi-Version Concurrency Control), which means rows are not updated in place; this requires periodic maintenance to reclaim space and keep query plans accurate.

Autovacuum

Autovacuum runs automatically, but you should understand the basic operations:

VACUUM – cleans up dead tuples (reclaims space).
ANALYZE – updates statistics for the optimizer.
VACUUM ANALYZE – does both.

Manual examples:

VACUUM (VERBOSE) mytable;
ANALYZE mytable;
VACUUM FULL mytable;

VACUUM FULL locks the table and rewrites it, so use with caution on production systems.

Autovacuum is controlled in postgresql.conf with parameters like:

autovacuum = on
autovacuum_vacuum_scale_factor
autovacuum_analyze_scale_factor
autovacuum_max_workers

Adjusting these belongs to performance tuning, but as an admin you must at least confirm that autovacuum is enabled and not being disabled by mistake.

Reindexing

Bloat can occur in indexes. To rebuild:

REINDEX INDEX my_index;
REINDEX TABLE mytable;
REINDEX DATABASE mydb;

PostgreSQL also provides REINDEX CONCURRENTLY in modern versions to reduce locking, at the cost of longer runtime.

Monitoring and Basic Performance Checks

Day-to-day PostgreSQL administration involves checking performance and spotting problems early.

Useful Views and Tools

Some helpful built-in catalog views:

pg_stat_activity – current connections and running queries.
pg_stat_database – per-database statistics.
pg_stat_user_tables – statistics for user tables.
pg_locks – current locks.

Examples:

SELECT pid, usename, datname, state, query
FROM pg_stat_activity
ORDER BY query_start NULLS LAST
LIMIT 20;

To see blocking queries vs. blocked queries:

SELECT
  blocked.pid       AS blocked_pid,
  blocked.query     AS blocked_query,
  blocking.pid      AS blocking_pid,
  blocking.query    AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_locks blocked_locks
  ON blocked.pid = blocked_locks.pid
JOIN pg_locks blocking_locks
  ON blocked_locks.locktype = blocking_locks.locktype
 AND blocked_locks.database IS NOT DISTINCT FROM blocking_locks.database
 AND blocked_locks.relation IS NOT DISTINCT FROM blocking_locks.relation
 AND blocked_locks.page IS NOT DISTINCT FROM blocking_locks.page
 AND blocked_locks.tuple IS NOT DISTINCT FROM blocking_locks.tuple
 AND blocked_locks.virtualxid IS NOT DISTINCT FROM blocking_locks.virtualxid
 AND blocked_locks.transactionid IS NOT DISTINCT FROM blocking_locks.transactionid
 AND blocked_locks.classid IS NOT DISTINCT FROM blocking_locks.classid
 AND blocked_locks.objid IS NOT DISTINCT FROM blocking_locks.objid
 AND blocked_locks.objsubid IS NOT DISTINCT FROM blocking_locks.objsubid
 AND blocked_locks.mode = blocking_locks.mode
JOIN pg_stat_activity blocking
  ON blocking_locks.pid = blocking.pid
WHERE NOT blocked_locks.granted
  AND blocking_locks.granted;

For high-level stats per database:

SELECT datname,
       numbackends AS connections,
       xact_commit,
       xact_rollback,
       blks_read,
       blks_hit
FROM pg_stat_database;

These are useful for capacity planning and quick troubleshooting.

External Monitoring

In production environments, PostgreSQL is often monitored via:

pg_stat_statements extension for tracking slow and frequent queries.
Exporters for Prometheus/Grafana.
Third-party tools or managed services.

Enabling such extensions or integrating with external monitoring systems is part of broader monitoring strategy, but aware admins should at least know that pg_stat_statements typically requires:

# in postgresql.conf
shared_preload_libraries = 'pg_stat_statements'
# then in psql:
CREATE EXTENSION pg_stat_statements;

Security Considerations for PostgreSQL

Beyond general Linux security practices and database hardening concepts, PostgreSQL has specific aspects:

Password Methods and Encryption

Common authentication methods in pg_hba.conf:

md5 – MD5-hashed password; widely supported but older.
scram-sha-256 – stronger password hashing; recommended where supported.
peer – checks OS user == DB role on local socket.
reject – explicitly deny.

To enable SCRAM for new passwords, set:

password_encryption = 'scram-sha-256'

in postgresql.conf, then change user passwords:

ALTER ROLE app_user PASSWORD 'NewStrongPassword';

Limiting Privileges

Some best practices specific to PostgreSQL:

Do not grant SUPERUSER to application accounts.
Use REVOKE on default public schema privileges if appropriate.
Create schemas per application or per logical area.
Limit CREATE privileges to trusted roles.

Examples:

REVOKE CREATE ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON DATABASE myappdb FROM PUBLIC;

Row-Level Security (RLS)

PostgreSQL supports row-level security to restrict which rows a role can see or modify. Defining RLS policies is an advanced topic that intersects with application design, but an admin might enable it:

ALTER TABLE sensitive_table ENABLE ROW LEVEL SECURITY;
CREATE POLICY p1 ON sensitive_table
  FOR SELECT TO app_user
  USING (owner_id = current_setting('myapp.current_user_id')::int);

RLS is powerful but requires careful planning with developers.

Major Version Upgrades

PostgreSQL treats major versions separately (e.g. 13 → 14). Databases must be migrated; you can’t just point a new PostgreSQL binary at an old data directory.

Common approaches:

pg_upgrade – in-place migration that reuses data files when possible, very fast.
Logical dump/restore – using pg_dump and pg_restore between versions.
Replication-based upgrades – using logical replication or pglogical.

A typical pg_upgrade flow (high-level):

Install new PostgreSQL version alongside the old one.
Stop old cluster.
Run pg_upgrade pointing from old data directory to new.
Analyze and vacuum as recommended.
Switch services to the new version.

Because upgrades are disruptive and have many details, they are usually rehearsed on staging environments.

Simple Replication Overview

While full replication/HA design is covered elsewhere, PostgreSQL’s built-in replication is central to serious deployments.

Concepts:

Primary (or master) node – accepts writes.
Standby (or replica) nodes – replay WAL from primary; usually read-only.
Synchronous vs. asynchronous replication.
Physical vs. logical replication.

A minimal physical streaming replication setup involves:

Configuring wal_level = replica (or higher) on primary.
Setting max_wal_senders, max_replication_slots, etc.
Allowing replication connections in pg_hba.conf:

   host    replication     repl_user    10.0.0.2/32   md5

Cloning the standby from the primary’s data via pg_basebackup.
Configuring primary_conninfo (e.g. in standby.signal/recovery config).

Full HA setups also incorporate failover management tools, virtual IPs, or load balancers.

Common Administrative Tasks Summary

Typical daily/weekly tasks for a PostgreSQL administrator on Linux include:

Checking service health:

  systemctl status postgresql
  journalctl -u postgresql

Monitoring connections and slow queries via pg_stat_activity, logs, and pg_stat_statements.
Managing users/roles and databases (CREATE ROLE, CREATE DATABASE).
Reviewing and tuning postgresql.conf in conjunction with system monitoring.
Ensuring regular pg_dump/pg_dumpall backups run and are tested.
Investigating locks, bloat, or unusual growth using system catalogs.
Ensuring pg_hba.conf remains strict and correct, especially after network changes.
Planning and executing version upgrades with minimal downtime.

This chapter gives you the core operational knowledge to run PostgreSQL as a Linux server administrator; deeper performance tuning, high availability, and advanced backup strategies build on these fundamentals.

Comments

Please login to add a comment.

Don't have an account? Register now!