Sunday, 7 July 2024

The Curious Case of "No Space Left on Device" in PostgreSQL

 Hey there, PostgreSQL adventurers! 🌟 Ever encountered the mysterious error message "No space left on device" while trying to start your Postgres instance or connect to your beloved database,Only to find that your disk space is as abundant as ever? Fear not, for I'am about to embark on a thrilling journey to uncover the hidden culprit behind this puzzling phenomenon. Let's dive in!


The Strange Encounter

Picture this: You're all set to fire up your Postgres instance or welcome users to connect to your database, but alas! The dreaded error message strikes: "No space left on device." 😨Panic sets in as you frantically check your disk space, only to find that there's plenty to spare. What in the world could be going on? πŸ€”


Enter the Inode Mystery

Ah-ha! The plot thickens as we uncover the truth behind this enigma. It turns out that our Postgres instance has run out of inodes, those elusive creatures responsible for managing file metadata. But fear not, dear reader, for we're about to shed light on this shadowy realm of logical replication and snapshot files. πŸ•΅️‍♂️πŸ’‘


Investigating the Scene

First, let's take a peek under the hood with a couple of trusty commands:

df -i /mnt/disks/postgresql 

Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sdb 12127100 12127100 0  100%   /mnt/disks/postgresql 



Lo and behold! Our inodes are maxed out at 100%. But fear not, for there's hope yet!

$ df -h | grep postgresql 

/dev/sdb 250G 72G 178G 29% /mnt/disks/pgsql


Ah, there it is – our disk space, still as spacious as ever at 28%. The plot thickens!


Unveiling the Culprit

But wait, what's this? A trail of old pglogical snapshot files littering the landscape, each one a clue to our inode conundrum. Could it be that these remnants of logical replication are clogging up our precious inodes?


ls /mnt/disks/postgresql/pg_logical/snapshots | wc -l

A staggering 17,470,492  files – the smoking gun in our investigation!


Finding the Solution

But fear not, fellow database detectives, for we hold the key to resolving this mystery:


Fix the Replication Issue: Check pg_stat_replicationpg_replication_slots, and friends for any anomalies.

Drop Replication Slots: If your downstream subscriber is lagging or no longer needs to consume changes, bid farewell to those replication slots.

Expand Your Disk Space: When all else fails, a little extra room never hurts.

Delete Old Snapshot Files: Clear out those dusty old snapshots to free up those precious inodes once and for all.


Postgres 15 to the Rescue

And for those daring souls venturing into the world of Postgres 15, fear not – help is on the way with pg_ls_logicalsnapdir(),  your trusty companion for navigating snapshot files with ease.


Conclusion

And there you have it, dear readers! The mystery of "No Space Left on Device" unraveled before our very eyes. Armed with this newfound knowledge, may you venture forth into the world of PostgreSQL with confidence and clarity. Until next time, happy querying!


Stay tuned for more PostgreSQL adventures, and don’t forget to share your thoughts and questions in the comments below! 


Regards,

Nikhil





No comments:

Post a Comment