Tuesday 7 July 2020

Configure Streaming Replication in PostgreSQL


By default, Streaming replication is asynchronous meaning, once a transaction is committed on the primary, there is a slight delay when that same transaction is committed, and written on the replica. There is potential for data loss with this type of configuration.


Streaming replication used as distributing data (especially in analytics) provides a 'read-only' replica to take the load off of the primary server.
Within seconds, the data in the standby machine will become consistent with the master machine. Some types of applications can accept this kind of data inconsistency.  


In synchronous replication, none of the transaction is considered complete until it is confirmed by both the master and replica server.
Sync replication is no data loss replication. Each commit of a write transaction will wait until confirmation is received from the replica, that the commit has been written to the write-ahead log on disk of both the primary and standby server.
This means writes on the master have also been confirmed and written on the replica.

Here is the official explanation from section 26.2.8. Synchronous Replication in the official documentation.


We have chosen one of these setup configuration. i.e, Asynchronous replication in this article.


1. Create the replication user and setup replication slots


It is very important that the access privileges for replication are set up so that only trusted users can read the WAL stream
We will create the replication user set the replication permission, set the user to be able to log in, and encrypt the password in the database


postgres=# CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'replicator';
CREATE ROLE

postgres=# ALTER SYSTEM SET listen_addresses TO '*';
ALTER SYSTEM


postgres=# SELECT pg_create_physical_replication_slot('slot2');
ERROR:  replication slots can only be used if max_replication_slots > 0



postgres=# show max_replication_slots ;

 max_replication_slots
-----------------------
 0

(1 row)




postgres=# SELECT pg_create_physical_replication_slot('slot1');

 pg_create_physical_replication_slot
-------------------------------------
 (slot1,)

(1 row)




postgres=# SHOW data_directory;

     data_directory
-------------------------
 /var/lib/pgsql/9.6/data

(1 row)



2. Configure the HBA file on server


Now that we have our user we need to give them permission to actually connect to the primary Postgres server.

We do that in the pg_hba.conf file. In the file, we declare what user, database, and where they are connecting from.
add pg_hba.conf
$ echo "host replication replicator 127.0.0.1/32 md5" >> pg_hba.conf


3. Configuration for Creating Archive Files for Replication


One of the most critical parts of configuring replication is the saving of data to archive files so they can be pulled over to the replica


listen_addresses = '*'
wal_level = 'hot_standby'
archive_mode = on
archive_command = 'cp %p /var/lib/pgsql/9.6/data_replica/pg_wal/%f'
max_wal_senders = 1
hot_standby = on


4. Restart the primary server so that changes take place



[root@oracleasm1 bin]# service postgresql-9.6 restart

Stopping postgresql-9.6 service:                           [  OK  ]
Starting postgresql-9.6 service:                           [  OK  ]

[root@oracleasm1 bin]# su - postgres


5. Replicating the Initial database:


Backup the master from the slave server. 

The backup copies settings from the master, including its TCP port values at slave (i.e, /var/lib/pgsql/9.6/data_replica )

on slave :

-bash-4.1$ PGPASSWORD=replicator pg_basebackup -S slot1 -h 127.0.0.1 -U replicator -p 5432 -D /var/lib/pgsql/9.6/data_replica -Fp -P -Xs -Rv

pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed

transaction log start point: 0/15000058 on timeline 2
pg_basebackup: starting background WAL receiver
403273/403273 kB (100%), 1/1 tablespace

transaction log end point: 0/1500014C
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: base backup completed



Note: Make sure you have edit/changed the below 2 parameters in postgresql.conf of the new cluster.

port = 5434
standby_mode = on


Else it will fail with the below errors :


-bash-4.1$ /usr/pgsql-9.6/bin/pg_ctl  -D /var/lib/pgsql/9.6/data_replica start

user=,db=,9757LOG:  could not bind IPv6 socket: Address already in use
user=,db=,9757HINT:  Is another postmaster already running on port 5432? If not, wait a few seconds and retry.


-bash-4.1$ psql -p 5434

psql: FATAL:  the database system is starting up




6. Start the server



-bash-4.1$  /usr/pgsql-9.6/bin/pg_ctl  -D /var/lib/pgsql/9.6/data_replica start

server starting

-bash-4.1$ user=,db=,10033LOG:  redirecting log output to logging collector process

user=,db=,10033HINT:  Future log output will appear in directory "pg_log".


7. Validate if Test-Replication is working



postgres=# SELECT redo_location, slot_name,checkpoint_location,
postgres-# round((redo_location-checkpoint_location) / 1024 / 1024 / 1024, 2) AS GB_behind
postgres-# FROM pg_control_checkpoint(), pg_replication_slots;

+-[ RECORD 1 ]--------+------------+

| redo_location       | 0/38B58FE0 |
| slot_name           | slot1      |
| chekpoint_location  | 0/38B59014 |
| gb_behind           | 0.00       |
+---------------------+------------+



5432 postgres@ postgres# select application_name, backend_start,state, sync_priority, sync_state from pg_stat_replication;

+-[ RECORD 1 ]-----+----------------------------------+

| application_name | walreceiver                      |
| backend_start    | 2020-03-25 17:14:27.338801+05:30 |
| state            | streaming                        |
| sync_priority    | 0                                |
| sync_state       | async                            |
+------------------+----------------------------------+





Excerpt of recovery.conf file


-bash-4.1$ cat recovery.conf

standby_mode = 'on'

primary_conninfo = 'user=replicator password=replicator host=127.0.0.1 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres'

primary_slot_name = 'slot1'


recovery_target_timeline

If we use recovery_target_timeline='latest' then it will use the timeline which was found in the archive, but in my environment archive is not a shared mount point

If you want a switch-back functionality, you will have to set recovery_target_timeline='latest', as any promotion will increment the timeline.

The recovery_target_timeline is used to specify a timeline in the event that you are recovering after having already previously performed a point-in-time recovery, which creates divergent timelines.

Now let’s see if we can insert  data into the table on our slave server:


5434 postgres@ postgres# INSERT INTO Course_Index VALUES ('12');

ERROR:  25006: cannot execute INSERT in a read-only transaction

LOCATION:  PreventCommandIfReadOnly, utility.c:236

5434 postgres@ postgres#


As you can see, we are unable to insert data into the slave.  This is because the data is only being transferred in one direction. 
In order to keep the databases consistent, postgres must make the slave read-only.


This is just the tip of the iceberg with regards to what you can do with replication of postgresql.
You might also want to take a look at

#archive_cleanup_command = '/usr/pgsql-9.6/bin/pg_archivecleanup /var/lib/pgsql/9.6/data/pg_xlog/ %r'




Important system variables

archive_mode is not required for streaming replication with slots.

wal_level – is a replica by default

max_replication_slots – 10 by default.

archive_command – not important for streaming replication with slots

hot_standby – set to on by default, important to enable reads on slave

listen_addresses – the only option that it’s necessary to change, to allow remote slaves to connect

max_wal_senders – set to 10 by default, a minimum of three for one slave, plus two for each additional slave

wal_keep_segments – 32 by default, not important because PostgreSQL will keep all segments required by slot







Summary


- Always use meaningful names for slots, as that will simplify debug.

- Avoid long downtime for slave servers with slots configured.

- Check pg_replication_slots restart_lsn value and compare it with current redo_lsn.

- Be careful with orphaned slots. PostgreSQL will not remove WAL segments for inactive slots with initialized restart_lsn.

- Physical replication setup is really easy with slots. By default in pg10, all settings are already prepared for replication setup.

6 comments:

  1. Standard visits recorded here are the simplest strategy to value your vitality, which is the reason why I am heading off to the site regular, looking for new, fascinating information. Many, bless your heart!

    PMP

    ReplyDelete
  2. I like this post,And I figure that they making some incredible memories to scrutinize this post,they may take a good site to make an information,thanks for sharing it to me
    https://360digitmg.com/course/project-management-professional-pmp

    ReplyDelete
  3. This is a great motivational article. In fact, I am happy with your good work. They publish very supportive data, really. Continue. Continue blogging. Hope you explore your next post
    hrdf contribution

    ReplyDelete
  4. I see the best substance on your blog and I unbelievably love getting them.
    https://360digitmg.com/india/data-science-using-python-and-r-programming-noida

    ReplyDelete
  5. very good article 5 stars..

    ReplyDelete