Wednesday, 3 August 2016

How to Recover archive gaps in standby database - using 2 methods




Using Both Methods


1.       Manually Log Shipping (when the missing logs are very less)
2.       Performing Incremental Backup (When there is a very huge gap)

METHOD 1:

When the logs missing or corrupt is less in number (say below 15), we can ship the logs which were missing in the standby site from the primary site (scp/sftp/ftp) and then we can register the log file in the standby so that the gap can be resolved.
This is easy process if you have missing or corrupt logs in lesser number.

Otherwise we can use the incremental backup strategy, and perform the recovery at standby site.


Lets go through the Archive log Shipping process

First, Find the archives which are missing by issuing the following command. This would give the gap sequences
SQL> select * from v$archive_gap


Or you can use the v$managed_standby view to find where the log apply stuck.
SQL> select sequence#,process,status from v$managed_standby;

Now, Copy the logs to the standby site from the primary site
Using the below command

At standby site, Do the log file registration at the standby site until all the missing log files are registered, Use this below command.
SQL> alter database register logfile '/log/file/location/log_file_name_n.arc';

Now apply would take place and your standby will become sync with the primary.


METHOD 2 :
when the difference is huge (say around 500 logs) the above method is very time consuming and not a proper approach. Else you have to rebuild the standby database from scratch.
As an enhancement from 10g, an incremental backup created with BACKUP INCREMENTAL... FROM SCN can be used to refresh the standby database with changes at the primary database since the last SCN at Standby and then managed recovery can resume.

Step 1:
Use this below command to find the SCN difference, on both the database – Primary DB & Standby DB
SQL> select current_scn from v$database;


Step 2 :
Stop the managed standby apply process:
SQL> alter database recover managed standby database cancel;

Step 3: 
Now Shutdown the standby database
SQL> shut immediate

Step 4:  
On the primary, take an incremental backup from the SCN number where the standby has been stuck:
RMAN> run {
allocate channel c1 type disk format '/u01/backup/%U.bkp';
backup incremental from scn ********* database;
}

Step 5:  On the primary, create a new standby controlfile and copy this file to standby side:
SQL> alter database create standby controlfile as '/u01/backup/for_standby.ctl';

$ scp * oracle@dataguard : /u01/backup


Step 6 :
Bring up the Standby instance in nomount mode:
SQL> startup nomount

Step 7
Now, replace the previous controlfile with this new one created at primary , and Bring the database to MOUNT state.
(Because, this Primary Side control file has the information of the SCN and we have to perform the recovery using this controlfile)

SQL> alter database mount standby database;

Step 8 :
Open the RMAN prompt and Catalog the backup piece.
(Because , RMAN does not know about these files yet; so you must let it know – by a process called cataloging)
$ rman target=/
RMAN> catalog start with '/u01/backup';

Step 9 :
Recover these files:
RMAN> recover database;

Step 10 :
After performing the recovery , exit RMAN and start managed recovery process:
SQL> alter database recover managed standby database disconnect from session;

Step 11 :

Again Check the SCN’s in primary and standby to make sure that both are in SYNc:
SQL> select current_scn from v$database;


 See Also

Configure Standby in case you re-create redologs of Primary site





1 comment:

  1. Dear Nikhil,

    I get the following error, please help.

    RMAN> recover database;

    Starting recover at 23-AUG-19
    using target database control file instead of recovery catalog
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: SID=11 device type=DISK
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of recover command at 08/23/2019 11:05:12
    RMAN-06094: datafile 1 must be restored

    RMAN>

    ReplyDelete