Friday, 12 June 2026

PostgreSQL HugePages Explained for DBAs

   If you have spent years tuning Oracle Database, HugePages is probably second nature to you. You would not even think about running a large SGA without it.

Now, when moving into PostgreSQL, many DBAs assume memory works differently or that HugePages are optional. Technically, they are optional. Practically, ignoring them in a serious production system is a mistake I have seen more than once.


PostgreSQL relies heavily on shared memory, especially for its buffer cache. As systems scale and memory grows into tens or hundreds of GB, the way Linux manages memory pages starts to matter a lot. That is where HugePages step in.

In this article, I will walk through how HugePages behave in PostgreSQL, how they differ from Oracle, and what actually matters when you enable them in real production environments. More importantly, I will share the kind of operational lessons you only learn after seeing systems misbehave at 2 AM.


How PostgreSQL Uses HugePages

Unlike Oracle, PostgreSQL does not apply HugePages everywhere. Its usage is focused and intentional.

The primary consumer is the shared buffer cache, controlled by shared_buffers. This is where PostgreSQL caches data blocks, and in most production systems, this is a significant chunk of memory.

When HugePages are enabled, PostgreSQL allocates this shared memory using large pages instead of standard 4 KB pages. The impact is subtle but powerful. Fewer pages mean smaller page tables, fewer TLB misses, and lower CPU overhead.

In addition, newer versions of PostgreSQL can extend HugePages usage to dynamic shared memory areas, especially for parallel query execution. This becomes relevant in systems that rely heavily on parallelism.


When does this matter in production?
If your database is running with large shared_buffers (say 16 GB or more), or if you are running analytics workloads with parallel queries, HugePages can noticeably reduce CPU pressure.


What it does not cover:
Memory allocated via work_mem or maintenance_work_mem is not backed by HugePages. That memory is still handled through regular allocation, which surprises many Oracle DBAs.



HugePages vs Oracle SGA: What Is Different

The biggest conceptual difference is scope.

In Oracle, HugePages essentially back the entire SGA. It is a core part of memory architecture. In PostgreSQL, it is more selective.

Another important difference is behavior during startup. PostgreSQL has a safety mechanism built in. By default, it tries to use HugePages, but if the system is not configured properly, it quietly falls back to normal pages.

This behaviour is controlled by the huge_pages parameter:

  • try ; attempt to use HugePages but continue if unavailable
  • on ; enforce HugePages and fail startup if not available
  • off ; do not use HugePages

In production, I strongly recommend using on once you are confident in your configuration. Silent fallback can hide misconfiguration for months.

Operational caveat:
We have seen systems running for weeks assuming HugePages are active, only to discover they were not due to a minor miscalculation in reserved pages.


Why HugePages Improve Performance

This is where things get interesting from a performance standpoint.

Linux memory management uses page tables to map virtual to physical memory. With standard 4 KB pages, large memory systems require massive page tables. This increases overhead and can impact CPU efficiency.

HugePages, typically 2 MB (or even 1 GB), drastically reduce the number of pages required.

From a DBA perspective, the benefits show up as:

  • Reduced CPU usage under memory-heavy workloads
  • Lower latency due to fewer TLB misses
  • More stable performance during peak load

In some environments, especially analytics-heavy systems, I have seen improvements in the range of 10 to 20 percent. Not because queries got smarter, but because memory handling became more efficient.

There is also a protection angle. HugePages are not swappable. This reduces the risk of critical PostgreSQL memory being pushed out under memory pressure, which in turn reduces the chance of the Linux OOM killer terminating your database.


Getting the Configuration Right

This is where most issues happen.


Step 1: Estimate Required HugePages

Modern PostgreSQL versions make this easier:

postgres -C shared_memory_size_in_huge_pages

This gives you a direct estimate of how many HugePages are required for  your configuration.


Step 2: Reserve HugePages at OS Level

You need to configure the Linux kernel:

sysctl -w vm.nr_hugepages=XXXXXs


To persist:

echo "vm.nr_hugepages=XXXXX" >> /etc/sysctl.conf

If this value is too low, PostgreSQL will not be able to allocate HugePages.

Step 3: Enable in PostgreSQL

huge_pages = on ( Restart is required.)


Key Parameters to Watch

  • shared_buffers ; must fit into HugePages allocation
  • huge_page_size ; allows selecting page size in newer versions
  • min_dynamic_shared_memory ; enables HugePages for parallel workloads

Common mistake: DBAs configure PostgreSQL correctly but forget OS-level reservation. PostgreSQL does not fix this for you.



Transparent HugePages: Why You Should Disable Them

Linux has a feature called Transparent HugePages (THP). It sounds helpful, but in database workloads, it often introduces unpredictable latency.

THP tries to dynamically allocate large pages, which can cause pauses and memory defragmentation overhead.

For PostgreSQL, this results in performance jitter rather than improvement.

Best practice is simple: disable THP at the OS level and rely on explicitly configured HugePages.



Takeaways

HugePages in PostgreSQL are not mandatory, but ignoring them in large systems is risky. They primarily benefit shared memory, especially shared_buffers, and reduce CPU overhead in memory-intensive workloads. Unlike Oracle, their scope is limited, which can confuse DBAs transitioning between platforms. Proper OS configuration is critical, and silent fallback behavior can hide misconfigurations. Finally, always disable Transparent HugePages to avoid performance instability.


Bonus 

In real environments, HugePages issues rarely show up during installation. They show up under pressure.

One common problem is partial allocation. PostgreSQL may start, but not all memory is backed by HugePages. You think everything is fine until CPU usage spikes during peak hours.

Another issue is fragmentation. If you try to configure HugePages on a running system without rebooting, allocation may fail even if memory is technically available.

Monitoring is often overlooked. You should actively verify HugePages usage using /proc/meminfo and PostgreSQL logs, not assume it is working.

Also, keep an eye on containerized environments. Kubernetes and container runtimes require explicit HugePages configuration, and defaults usually do not work out of the box.


Mini Case Study: When HugePages Saved a Production System

We had a reporting database running on a 128 GB server. Everything looked fine until month-end processing kicked in. CPU usage would spike, and query latency became unpredictable.

Initial suspicion was poor query plans. After digging deeper, we found excessive page table activity and high TLB misses.

HugePages were configured in PostgreSQL but not properly reserved at the OS level. PostgreSQL had silently fallen back to standard pages.

After correctly reserving HugePages and enforcing huge_pages = on, CPU usage dropped significantly, and batch processing stabilized.

No query changes. No hardware upgrade. Just proper memory configuration.


Conclusion

HugePages in PostgreSQL are one of those features that quietly make a big difference. They do not change query logic, indexing strategy, or execution plans, but they significantly improve how efficiently your database uses memory.

For small systems, you might get away without them. But for any serious production workload, especially those with large memory footprints, they should be part of your standard configuration checklist.

The key is not just enabling them, but verifying that they are actually in use. That means aligning PostgreSQL settings with OS-level configuration and validating after every restart.

If you are running PostgreSQL in production today, take a few minutes to check your HugePages configuration. It is one of the simplest optimizations that can deliver measurable stability and performance gains.


FAQs

1. Does PostgreSQL require HugePages to start?
No, unless huge_pages = on is set. Otherwise, it will fall back silently.

2. Do HugePages improve query performance directly?
Not directly. They improve memory efficiency, which indirectly boosts performance.

3. Are HugePages used for work_mem?
No. Only shared memory areas like shared_buffers use HugePages.

4. What happens if HugePages are misconfigured?
PostgreSQL may either fail to start or silently revert to normal pages depending on configuration.

5. Should HugePages be used in cloud environments?
Yes, but you must ensure the cloud VM and OS support proper reservation.





What's Your Experience?

Have you enabled HugePages in your PostgreSQL environments? 

Did you notice measurable performance gains or run into configuration challenges?

Would be interesting to hear how others are handling this in production.




Thursday, 11 June 2026

Oracle 26ai Read-Only Users and Sessions

One very common production request sounds simple:

"Can we give this user access, but make sure they cannot change anything?"

Before Oracle AI Database 26ai, DBAs usually handled this by creating a separate reporting user, granting only SELECT, removing DML privileges, or depending on carefully designed roles. That approach works when access is cleanly designed from the beginning. But in real production systems, users often collect privileges over time. Support users get emergency grants, application users may have broader access than expected, and batch accounts sometimes have privileges that nobody wants to touch during an incident.



Sunday, 7 June 2026

PostgreSQL Performance Tuning That Survives Production

    A PostgreSQL performance issue rarely starts with one bad setting.

In production, it usually looks like this: the application team says the database is slow, CPU is not always high, storage graphs look confusing, and nobody changed anything “major”. Then we check deeper and find long transactions, dead tuples, stale statistics, unused indexes, chatty application queries, or checkpoint pressure.



Thursday, 4 June 2026

PostgreSQL VACUUM: Bloat, Autovacuum and Real Fixes

    A PostgreSQL table can grow quietly for weeks before anyone notices. The application team says they already deleted old data. Storage still looks high. Queries are touching more blocks than expected. Autovacuum is running, but the table does not seem to become smaller. This is where many DBAs first realize that DELETE in PostgreSQL is not the same as physically removing rows from the table file.

PostgreSQL uses MVCC, so old row versions remain inside the table until VACUUM can clean them. This is normal behavior, not a bug. The problem starts when dead tuples grow faster than VACUUM can remove them, or when long-running transactions prevent cleanup. Then you get table bloat, index bloat, stale statistics, poor plans, unnecessary I/O, and sometimes transaction ID wraparound pressure.



Wednesday, 3 June 2026

My Oracle ACE Journey: From Practical Sharing to Oracle ACE Pro

  Some journeys do not begin with a plan. They begin with curiosity, consistency, and a simple intention to share what we learn.

My Oracle ACE journey is one such journey.



Sunday, 31 May 2026

Recognized Among FeedSpot’s Top 70 Database Blogs to Follow in 2026

I am happy to share that Learn DBA : A Life Long Learning Experience has been selected by FeedSpot as one of the Top 70 Database Blogs to Follow in 2026.



Saturday, 9 May 2026

RMAN Restartable Backups Explained for Production DBAs

Most DBAs have seen this at least once. You wake up, check the overnight backup report, and RMAN failed halfway through a 14 TB database backup because the backup filesystem filled up, a network mount disconnected, or one RAC node crashed during the run.



Sunday, 26 April 2026

ORA-01017 in RAC 12c and above ? Stepwise Permission Fix & Cause identification

  As an Oracle DBA, few things are more frustrating than a sudden loss of remote connectivity right after a routine SYS password reset. You type in the credentials, and bam -- ORA-01017 greets you, even though your local connections work fine. In production RAC environments, this isn’t just about a mistyped password.

Recently, I faced a tricky scenario in an Oracle 19c RAC setup with proper role separation between the grid and oracle OS users. What seemed like a simple password mismatch quickly unraveled into a multi-layered “permission deadlock,” involving rogue listeners, contaminated IPC sockets, and GPnP directory access issues. It took a careful, stepwise approach to restore connectivity across all nodes without compromising the cluster.



Saturday, 18 April 2026

Oracle RAC Load Balancing Internals Explained

 Most RAC clusters look healthy until the workload shifts suddenly.

A reporting job starts hammering one node. Connection pools keep sending sessions to the same instance. CPU climbs, gc waits spike, application response times become unpredictable, and suddenly everyone starts blaming storage, SQL plans, or the network.

But many times, the real problem sits in the RAC connection routing layer itself.

I have seen large RAC environments where all nodes were technically UP, yet one instance was drowning while another sat nearly idle. The cluster wasn’t failing. The load balancing strategy was.

Oracle RAC load balancing is often misunderstood because people assume SCAN alone magically distributes workload intelligently. It does not.



Saturday, 4 April 2026

Oracle Data Pump Migrations: What Breaks in Real Upgrades

Most DBAs discover the real complexity of Oracle database migrations only after the first large-scale Export/Import cutover goes sideways. On paper, Oracle Data Pump looks straightforward. Export the database, import it into a newer release, validate the objects, switch applications, done.

Reality is usually messier.



Tuesday, 31 March 2026

PostgreSQL Backup Guide: pg_dump vs pg_dumpall

 If you have ever been in a production outage where a single schema was corrupted or a developer dropped the wrong table, you already know one thing ; your backup strategy is only as good as your restore flexibility.

In PostgreSQL, logical backups using pg_dump and pg_dumpall are your first line of defense for granular recovery. Unlike physical backups, they give you the ability to restore specific objects, migrate databases across environments, and even troubleshoot data inconsistencies without touching the entire cluster.



Friday, 20 March 2026

Essential Oracle Database Keywords for DBAs - Part 2

When you are preparing for an Oracle DBA interview — or troubleshooting a production issue at 2 AM — definitions alone are not enough. You need clarity. You need context. And most importantly, you need to understand how Oracle behaves under pressure.



Sunday, 15 March 2026

When Oracle Uses PGA Instead of SGA for Large Table Scans

Most Oracle DBAs learn early in their careers that the System Global Area (SGA) is the primary memory structure used for caching data blocks. The assumption is simple: data blocks are read from disk, placed in the buffer cache, and reused by future sessions.

However, that assumption doesn’t always hold true in production.

In many real-world workloads , especially analytics queries, reporting jobs, or large batch processing-- . Oracle intentionally bypasses the buffer cache and reads data directly into the Program Global Area (PGA). This behaviour surprises many DBAs when they notice unusual wait events like direct path read, or when frequently queried tables seem to generate repeated disk I/O.



Monday, 9 March 2026

Adding an Instance to Oracle RAC: Step-by-Step Guide for DBAs

 Expanding your Oracle Real Application Clusters (RAC) database by adding a new instance can be intimidating, especially if you want to ensure zero downtime and seamless integration. Whether you're introducing a new node or simply adding an instance to an existing one, the process involves careful planning, configuration, and execution. In this guide, we’ll walk you through every step, from preparing the environment to verifying the new instance’s operation. By the end of this article, you will have a clear understanding of both DBCA and SRVCTL methods for adding instances, best practices for shared storage and network setup, and tips to avoid common pitfalls.



Thursday, 5 March 2026

Oracle Utilities for DBAs: Complete Guide to Essential Tools

 Managing an Oracle Database efficiently requires mastering a variety of utilities designed for administration, data movement, performance tuning, and troubleshooting. Whether you’re performing backups, moving data, monitoring performance, or tuning SQL queries, Oracle provides tools to streamline every task. In this guide, we’ll cover the most commonly used Oracle utilities, grouped by functionality, with practical examples and best practices for real-world DBA scenarios.



Monday, 2 March 2026

Understanding Oracle 19c DML Internals for OLTP Performance

For Oracle DBAs managing high-volume OLTP (Online Transaction Processing) systems, understanding how core DML (Data Manipulation Language) operations function under the hood is essential. SELECT, INSERT, UPDATE, and DELETE statements are the backbone of any database, but their internal mechanics—parsing, execution, undo/redo generation, buffer cache interactions, and transaction control—can significantly impact system performance.



Saturday, 28 February 2026

Oracle 26ai ASC: Fast Loads Without HCC Pain

 Most DBAs who run large Oracle data warehouses already know the trade-off.

The ETL team wants maximum direct-path load speed because nightly ingestion windows keep shrinking. The storage team wants Hybrid Columnar Compression (HCC) everywhere because storage costs are exploding. Meanwhile, DBAs get stuck in the middle trying to balance ingest performance, archive growth, Smart Scan efficiency, backup footprint, and query response times.



Oracle Backup Success Story : Predictable Backups, Confident Restores - Oracle Features for Modern VLDBs

 When we talk about database growth, we usually celebrate it. But growth without backup redesign is silent risk.

This was the story of a 60.5 TB Oracle production database, where individual datafiles had grown between 500GB to 800GB+, and backups were quietly destabilizing the entire ecosystem.

What started as a "long backup" issue turned into something much deeper.



Friday, 27 February 2026

Essential Oracle Database Keywords Every DBA Should Know - Part 1

 Whether you are just stepping into the world of Oracle databases or have years of experience managing complex environments, understanding the foundational keywords and concepts is crucial. Oracle databases come with a rich ecosystem of terms, from memory structures and wait events to transaction control and performance monitoring. This guide walks you through essential Oracle database keywords, explaining each term in plain language with practical examples. You’ll learn not only what these concepts mean but also how they impact daily database operations, troubleshooting, and performance tuning. By mastering these terms, freshers gain a strong starting point, while seasoned DBAs can refresh and refine their knowledge. 

In this first installment, we cover critical keywords ranging from Buffer, Cache, and Parsing, to Data Pump and SQL Plan Baselines, giving you a solid foundation for Oracle administration.



Monday, 23 February 2026

SQL Query Tuning in Oracle: A Practical Guide for DBAs

 If you're an Oracle DBA, you already know this feeling: a message pops up — “The application is slow.” No context. No logs. Just urgency.

And more often than not, the root cause comes down to a poorly performing SQL query.

SQL tuning in Oracle isn’t just about adding an index or running the SQL Tuning Advisor. It’s about following a structured, evidence-based approach that eliminates guesswork. Over the years, I’ve realized that the biggest difference between average and effective SQL query tuning lies in discipline — knowing what to check, in what order, and why.



Monday, 16 February 2026

Oracle Data Archiving Best Practices Guide

As an Oracle DBA, you already know this feeling - the database keeps growing, storage keeps expanding, backups take longer, maintenance windows shrink, and suddenly performance complaints start coming in.

Handling large data volumes isn't just about adding more disks or increasing SGA. It's about implementing Oracle Data Archiving best practices that balance performance, cost, compliance, and scalability.



Monday, 9 February 2026

How to Boost Oracle Data Pump Performance for Faster IMPDP Operations

 Oracle Data Pump (IMPDP) is a powerful utility for moving data between Oracle databases, but large-scale imports and exports can often be slow and resource-intensive. Whether you’re managing a standalone database or handling complex LOB-heavy schemas, improving Data Pump performance is key to reducing downtime and ensuring smooth operations. In this guide, we’ll explore practical strategies to enhance IMPDP performance, including parallelism, network-based imports, LOB optimizations, and buffer tuning.



Wednesday, 4 February 2026

VLDB Backup Optimization – RMAN Concepts Revision for Interviews

 

Q 1: What is the main challenge when backing up a multi-terabyte Oracle database with very large datafiles?



Monday, 26 January 2026

Republic Day Reflections for DBAs: Oracle, PostgreSQL, MSSQL, and the Art of Governance

As India celebrates 26th January - Republic Day, it’s a perfect moment for DBAs to draw parallels between national governance and database management. Just as the Constitution defines rules, responsibilities, and structures for our nation, robust databases rely on architecture, policies, and governance to thrive.



Friday, 23 January 2026

Oracle Hybrid Partitioned Tables in Real Systems

 Most large Oracle environments eventually hit the same wall.

The database keeps growing, storage costs climb every quarter, backup windows become unpredictable, and suddenly DBAs are spending more time managing historical data than supporting the actual application workload. 

Usually the first reaction is archive-and-delete.

Then the reporting team complains because last year's data disappeared.