Switch-over of a master database to replica is pretty easy. Lets see how we do it..
I have 2 servers : 5522(data_replica2) and 5432(data), and at this time, 5522 is the master server and 5432 is a warm standby with streaming replication.
The initial setup looks something like this as below image. I can write the data to 5522, where 5432 is used for read-only and I cannot perform any operation.
I have 2 servers : 5522(data_replica2) and 5432(data), and at this time, 5522 is the master server and 5432 is a warm standby with streaming replication.
The initial setup looks something like this as below image. I can write the data to 5522, where 5432 is used for read-only and I cannot perform any operation.
("warm standby" is usually
used to refer to a second server which is keeping up to date, but not accepting
connection, used as read-only)
The proper way to promote a standby to
a primary is by using the trigger file or running pg_ctl promote
-bash-4.1$ /usr/pgsql-9.6/bin/pg_ctl promote server promoting Standby has been promoted as master and a new timeline followed which you can notice in logs user=,db=,28043LOG: received promote request user=,db=,28043LOG: redo done at 0/77000024 user=,db=,28043LOG: last completed transaction was at log time 2020-04-08 01:20:40.119967+05:30 user=,db=,28043LOG: selected new timeline ID: 4 user=,db=,28043LOG: archive recovery complete user=,db=,28043LOG: MultiXact member wraparound protections are now enabled user=,db=,28044LOG: checkpoint starting: force user=,db=,28041LOG: database system is ready to accept connections user=,db=,30934LOG: autovacuum launcher started user=,db=,28044LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.000 s, sync=0.000 s, total=0.040 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=15119 kB
Promoting the standby will make it pick a new timeline, so we will need to add the parameter - recovery_target_timeline = 'latest' if we want the new standby to follow that timeline switch.
Otherwise, You may end up with errors as below
when a new timeline is forked.
(Read here)
(Read here)
cp: cannot stat `pg_xlog/RECOVERYXLOG': No such file or directory cp: cannot stat `pg_xlog/RECOVERYXLOG': No such file or directory user=,db=,9262FATAL: requested timeline 3 is not a child of this server's history user=,db=,9262DETAIL: Latest checkpoint is at 0/6A000024 on timeline 2, but in the history of the requested timeline, the server forked off from that timeline at 0/690000C0. user=,db=,9259LOG: startup process (PID 9262) exited with exit code 1
In this situation, The basic thing is you can't switch
these other than:
- Turn off primary
- Promote secondary
- Rebuild the former primary as new secondary
off the new primary.
Step 1: Perform a clean shutdown of Primary[5522]
-bash-4.1$ /usr/pgsql-9.6/bin/pg_ctl -D
/var/lib/pgsql/9.6/data_replica2 stop
Check for sync status and recovery
status of Standby[5432] before promoting it:
-bash-4.1$ psql -p 5432 -c 'select pg_last_xlog_receive_location() "receive_location", > pg_last_xlog_replay_location() "replay_location",pg_is_in_recovery() "recovery_status"
;'
receive_location | replay_location | recovery_status ------------------+-----------------+----------------- 0/7700008C | 0/7700008C | t (1 row)
Standby in complete sync. At this stage
and we are safe to promote it as Primary.
Step 2: Open the Standby as new Primary by
pg_ctl promote or creating a trigger file.
-bash-4.1$ /usr/pgsql-9.6/bin/pg_ctl promote server promoting user=,db=,28043LOG: received promote request user=,db=,28043LOG: redo done at 0/77000024 user=,db=,28043LOG: last completed transaction was at log time 2020-04-08 01:20:40.119967+05:30 user=,db=,28043LOG: selected new timeline ID: 4 user=,db=,28043LOG: archive recovery complete user=,db=,28043LOG: MultiXact member wraparound protections are now enabled user=,db=,28044LOG: checkpoint starting: force user=,db=,28041LOG: database system is ready to accept connections user=,db=,30934LOG: autovacuum launcher started user=,db=,28044LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.000 s, sync=0.000 s, total=0.040 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=15119 kB
Standby has been promoted as master and
a new timeline followed which you can notice in logs
Step 3: Edit the recovery.conf file
Restart old Primary as standby and allow to
follow the new timeline by specifying "recovery_target_timline='latest' "
in $PGDATA/recovery.conf file
-bash-4.1$ pwd /var/lib/pgsql/9.6/data_replica2 -bash-4.1$ cat /var/lib/pgsql/9.6/data_replica2/recovery.conf recovery_target_timeline='latest' standby_mode = 'on' primary_conninfo = 'user=replicator password=replicator host=127.0.0.1 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres' primary_slot_name = 'slot1'
In the above recovery.conf file it is clear that old Primary will now try
to connect to 5432 (primary) port as new Standby pointing to common WAL
Archives location.
Step 4: Now, Start the standby and verify the new
standby log :
-bash-4.1$ ls -ltr -rwxrwxrwx 1 postgres postgres 141687 Apr 8 02:14 postgresql-Wed.log user=,db=,31220DETAIL: End of WAL reached on timeline 3 at 0/7700008C. user=,db=,31220LOG: restarted WAL streaming at 0/77000000 on timeline 3 user=,db=,31289LOG: entering standby mode user=,db=,31289LOG: consistent recovery state reached at 0/7700008C user=,db=,31287LOG: database system is ready to accept read only connections user=,db=,31293LOG: started streaming WAL from primary at 0/77000000 on timeline 4 user=,db=,31289LOG: redo starts at 0/7700008C user=[unknown],db=[unknown],31297LOG: connection received: host=[local] . . user=,db=,31290LOG: recovery restart point at 0/770280E4 user=,db=,31290DETAIL: last completed transaction was at log time 2020-04-08 01:49:21.588317+05:30 user=,db=,31290LOG: restartpoint starting: time user=,db=,31290LOG: restartpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.003 s, sync=0.000 s, total=0.422 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=144 kB
Step 5. Validate the new Standby status. We
can use pg_controldata
Perfect.. without any re-setup we have
brought back old Primary as new Standby 😉
Let us try to input some data and see if it replicates to new standby now.
Note :
trigger_file :
While doing Role-switch using trigger
file, specify the parameter in recovery.conf file as below.
trigger_file='/var/lib/pgsql/9.6/data/primary_down.txt'
-bash-4.1$ touch
/var/lib/pgsql/9.6/data/primary_down.txt
This will cause Postgres to:
Notice the file. It finishes off the
slave process and recovers any log files
transferred from the main server that still needs recovery.
Further Renames the 'recovery.conf'
file to 'recovery.done'… when that happens, it is now the primary server.
recovery_target_timeline :
- If we use recovery_target_timeline='latest' then it will use the timeline which was found in the archive.
- If you want the switch-back functionality, you will have to set recovery_target_timeline='latest', as any promotion will increment the timeline.
- The recovery_target_timeline is used to specify a timeline in the event that you are recovering after having already previously performed a point-in-time recovery, which creates divergent timelines.
Further, if you wish to promote your replica to
master (into open; read/write mode),
you will not need to change the hot_standby parameter in postgresql.conf ,
because it will not affect the database
with the primary role.
The rule "host replication
all" in pg_hba.conf file influences
only who could connect to your new primary to create a hot standby database /
streaming replication.
These days pg_rewind can be used to
turn the former master into a standby of the new master much more quickly than
a complete rebuild. Previously rsync or similar would have been a quicker way
to do this.
https://help.theatremanager.com/book/export/html/3679
Read more
Configure Streaming Replication in PostgreSQL
All about Physical Replication and Log shipping in Postgres
Streaming-Replication Sync and Async, benefits of streaming replication over Log-shipping
https://help.theatremanager.com/book/export/html/3679
https://www.enterprisedb.com/blog/switchover-switchback-in-postgresql-9-3
No comments:
Post a Comment