Tuesday, 31 March 2026

PostgreSQL Backup Guide: pg_dump vs pg_dumpall

If you have ever been in a production outage where a single schema was corrupted or a developer dropped the wrong table, you already know one thing: your backup strategy is only as good as your restore flexibility.

In PostgreSQL, logical backups using pg_dump and pg_dumpall are your first line of defense for granular recovery. Unlike physical backups, they give you the ability to restore specific objects, migrate databases across environments, and even troubleshoot data inconsistencies without touching the entire cluster.

Over the years, I have seen teams rely heavily on physical backups and completely overlook logical ones - until they need to recover just one table and realize they are stuck restoring terabytes of data.

In this article, I will walk you through how pg_dump and pg_dumpall actually behave in real environments, when to use each, and what pitfalls to avoid. This is not just syntax: you will get practical, production-focused insights that matter.


Understanding pg_dump in Real Environments

pg_dump is your go-to tool when you want control. It operates at the database level and gives you flexibility to back up exactly what you need.

From a production perspective, this is extremely useful when:

  • You need to back up a single database without touching others
  • You want to extract only specific schemas or tables
  • You are preparing data for migration or testing

A simple example:

pg_dump -U admin -d company -f company_backup.sql

This creates a plain SQL file that can recreate the database structure and data.

Choosing the Right Output Format

One of the most overlooked decisions is the dump format. It directly impacts restore flexibility.

  • Plain SQL (default)
    Easy to read, editable, restored via psql
  • Custom format (-F c)
    Compressed, flexible restore using pg_restore
  • Directory format (-F d)
    Best for large databases and parallel operations
  • Tar format (-F t)
    Portable but less flexible than directory

Example:

pg_dump -U admin -d company -F d -f company_backup


When to Use It in Production

  • Schema refresh between environments
  • Partial restores (e.g., only one table)
  • Data migration across versions or platforms


Caveat

Plain SQL dumps look simple, but restoring them on large databases can be painfully slow.


Selective Backup: The Feature That Saves You

One of the biggest advantages of pg_dump is selective backup. You can target specific objects instead of dumping everything.

Example: dump only tables whose names contain "order":

pg_dump -U admin -d company -t '*order*' -f company_backup.sql


Schema-only backup:

pg_dump -U admin -d company --schema-only -f schema.sql


Data-only backup:

pg_dump -U admin -d company --data-only -f data.sql


When to Use It

In one production scenario, a reporting schema was corrupted due to a bad deployment. Instead of restoring the full database, we restored only that schema -- saving hours of downtime.


What's the Risk

Selective restores can break dependencies if you are not careful (foreign keys, sequences, extensions).


Remote and Parallel Dumps: Speed vs Stability

Remote Backups

You can run pg_dump from a different server:

pg_dump -h db-host -U admin -d company -f backup.sql

This is useful when:

  • You do not want load on the DB server
  • You centralize backup operations


Parallel Dumps

For large databases, parallelism is a game changer:

pg_dump -U admin -d company -F d -j 4 -f backup_dir


What Happens in Reality

  • Faster backups
  • Higher CPU and memory usage
  • More connections to the database


My Recommendation

Never enable parallel jobs blindly in production. Test the impact during peak load.


pg_dumpall: When You Need Everything

pg_dumpall operates at the cluster level. It is not just about databases - it includes:

  • All databases
  • Roles (users)
  • Tablespaces

Basic usage:

pg_dumpall -U postgres > all_databases.sql


When It Makes Sense

  • Full cluster backups
  • Migration to a new server
  • Disaster recovery preparation


Important Behavior

  • Always produces plain SQL
  • Must be restored using psql (not pg_restore)
  • Requires superuser privileges


Example: Backup roles only

pg_dumpall --roles-only > roles.sql


Compression and Streaming Backups

In real systems, storage and transfer time matter. You can compress backups on the fly:

Example:

pg_dumpall | gzip > backup.sql.gz

Restore:

gunzip -c backup.sql.gz | psql -U postgres


Why This Matters

  • Reduces storage footprint
  • Speeds up backup transfers
  • Useful for cloud uploads

Caveat: Compression saves space but increases CPU usage.
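You can rehearse this exact pipeline shape without a live cluster. A minimal sketch, in which sample.sql is a stand-in for real pg_dumpall output (the file names here are hypothetical):

```shell
#!/bin/sh
# Rehearse the compress-and-restore pipeline with a stand-in file.
# sample.sql stands in for pg_dumpall output; the pipe shape is identical.
printf 'CREATE TABLE t (id int);\nINSERT INTO t VALUES (1);\n' > sample.sql

# Compress on the fly, as you would with: pg_dumpall | gzip > backup.sql.gz
cat sample.sql | gzip > backup.sql.gz

# Verify archive integrity before you ever need it for a restore
gzip -t backup.sql.gz && echo "archive OK"

# Decompress to stdout, as you would pipe into psql
gunzip -c backup.sql.gz | head -n 1
```

The gzip -t step is worth keeping in any real script: a backup that cannot be decompressed is not a backup.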


Restore Strategy: Where Most DBAs Go Wrong

Backing up is easy. Restoring under pressure is not.

Restore Plain SQL

psql -U postgres -f backup.sql


Key Things to Remember

  • pg_dumpall backups cannot use pg_restore
  • Order of restore matters (roles → schemas → data)
  • Permissions and ownership can break if not handled
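Before any restore, it pays to sanity-check the file you are about to feed into psql. A minimal sketch: check_dump is a hypothetical helper, and it assumes plain-format dumps carry a "PostgreSQL database" header comment near the top, which pg_dump and pg_dumpall both emit:

```shell
#!/bin/sh
# check_dump: minimal pre-restore sanity checks (hypothetical helper).
# Checks the file is non-empty, that .gz archives pass an integrity test,
# and that plain dumps look like dumps at all.
check_dump() {
  f="$1"
  [ -s "$f" ] || { echo "FAIL: $f missing or empty"; return 1; }
  case "$f" in
    *.gz) gzip -t "$f" || { echo "FAIL: $f corrupt"; return 1; } ;;
    *)    head -n 5 "$f" | grep -q 'PostgreSQL database' \
            || { echo "FAIL: $f lacks dump header"; return 1; } ;;
  esac
  echo "OK: $f"
}

# Stand-in file mimicking the top of a plain pg_dump file
printf '%s\n' '--' '-- PostgreSQL database dump' '--' > backup.sql
check_dump backup.sql
```

Two seconds of checking here beats discovering a zero-byte file halfway through an outage.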

Quick Takeaways

  • pg_dump is best for database-level and selective backups
  • Use directory format for large databases and parallel dumps
  • pg_dumpall is ideal for full cluster backups
  • Always test restore, not just backup
  • Parallel jobs improve speed but increase resource usage
  • Compression is useful but CPU intensive
  • Selective restores can introduce dependency issues


Where Backups Really Fail

Let me be blunt: most backup failures are not due to tools, but due to assumptions.

  • Teams take backups but never validate restores
  • Roles and permissions are often ignored until restore time
  • Parallel dumps can overload already stressed systems
  • Large SQL dumps can take hours to restore, even if backup was fast

Monitoring Tips

  • Track backup duration trends
  • Monitor WAL growth during long dumps
  • Watch connection spikes during parallel jobs
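Tracking duration trends can start as simply as wrapping the dump command and appending to a log. A sketch under stated assumptions: run_backup is a hypothetical wrapper, and sleep stands in for the actual pg_dump invocation:

```shell
#!/bin/sh
# run_backup: hypothetical wrapper that records how long a backup command
# takes, so duration trends can be compared over time.
run_backup() {
  start=$(date +%s)
  "$@"
  status=$?
  end=$(date +%s)
  echo "$(date -u +%FT%TZ) cmd='$*' status=$status duration=$((end - start))s" \
      >> backup.log
  return $status
}

# sleep stands in for e.g.: pg_dump -U admin -d company -F d -j 4 -f backup_dir
run_backup sleep 1
tail -n 1 backup.log
```

A backup that suddenly takes twice as long as last week is an early warning worth investigating, even if it still succeeds.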


Common Mistake

Running pg_dumpall in production without understanding its impact -- this can hold locks longer than expected.


Mini Case Study: When Partial Restore Saved Hours

We had a 2 TB PostgreSQL database where a developer accidentally truncated a critical table.

Instead of restoring the full database:

  1. We used a previous pg_dump backup
  2. Extracted only the affected table
  3. Restored it into production

Downtime avoided: ~4 hours. Actual recovery time: 20 minutes.

Lesson learnt: Granular backups are lifesavers.
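Step 2 above, extracting one table from a plain-format dump, can be sketched with awk. The sample dump and the "orders" table below are fabricated stand-ins, and the data rows are simplified, but a real plain dump uses the same COPY ... FROM stdin; ... \. structure:

```shell
#!/bin/sh
# Sketch: pull a single table's COPY block out of a plain-format dump.
# sample_dump.sql is a fabricated miniature dump for illustration.
cat > sample_dump.sql <<'EOF'
COPY public.customers (id, name) FROM stdin;
1 alice
\.
COPY public.orders (id, item) FROM stdin;
100 widget
101 gadget
\.
EOF

# Print from the target table's COPY line through its terminating \.
awk '/^COPY public\.orders /{grab=1} grab{print} grab && /^\\\.$/{grab=0}' \
    sample_dump.sql > orders_only.sql

cat orders_only.sql
```

For custom- or directory-format backups the same selectivity comes built in via pg_restore's table options, which is one more reason to prefer those formats for large databases.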


Conclusion

Logical backups in PostgreSQL are not just a checkbox -- they are a strategic tool in your recovery plan.

pg_dump gives you precision. pg_dumpall gives you completeness. Knowing when to use each is what separates a reactive DBA from a prepared one.

In production, your focus should not just be on taking backups, but on how quickly and safely you can restore. That means:

  • Testing restores regularly
  • Choosing the right format
  • Understanding dependencies
  • Monitoring system impact

If you have not reviewed your backup strategy recently, now is a good time. Always run a restore test, validate your assumptions, document them as KB articles, and make sure your backups actually work when it matters.


