Learn DBA : A Life Long Learning Experience: DataGuard


Friday 4 February 2022

Tracing Oracle Data Guard

Tracing can be enabled in Data Guard with the LOG_ARCHIVE_TRACE initialization parameter.

The value can be set through DG Broker (if configured) or at the SQL prompt.
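Individual trace levels can be combined into a single LOG_ARCHIVE_TRACE value (for example, levels 2 and 4 give 6). The following is a minimal sketch, not the author's method: the function name log_archive_trace_sql is hypothetical, and it only prints the statement rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: combine individual LOG_ARCHIVE_TRACE levels and print
# the matching ALTER SYSTEM statement. It prints SQL only; nothing is executed.
log_archive_trace_sql() {
    local total=0 level
    for level in "$@"; do
        # Accept plain non-negative integers only (no signs, no leading zeros).
        case "$level" in
            ''|*[!0-9]*|0?*) echo "invalid level: $level" >&2; return 1 ;;
        esac
        # Bitwise OR, so passing the same level twice does not change the result.
        total=$((total | level))
    done
    printf 'ALTER SYSTEM SET LOG_ARCHIVE_TRACE=%d;\n' "$total"
}

log_archive_trace_sql 2 4
# prints: ALTER SYSTEM SET LOG_ARCHIVE_TRACE=6;
```

Calling the function with no arguments prints a value of 0, which turns tracing off. With the broker, the equivalent is assumed to be the LogArchiveTrace database property (for example, edit database bcpdb1 set property LogArchiveTrace=6), provided your release still supports that property.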

RMAN-08591: WARNING: invalid archived log deletion policy

For one of our databases we were getting backup failure alerts; the backup was configured on the standby database.

We saw the following error message in the backup log:

RMAN-08591: WARNING: invalid archived log deletion policy

Cause: If the archive log location is the FRA (fast recovery area), archived logs may be deleted automatically when the FRA comes under space pressure.

In that case, at least one standby destination must be set as a "MANDATORY" destination.

Solution: To eliminate the RMAN warning message, at least one archive destination must be set as a mandatory destination.

Steps :

1. Log in to the broker command line utility

[oracle@host1 log]$ dgmgrl /
DGMGRL for Linux: Version 12.1.0.2.0 - 64bit Production
Copyright (c) 2000, 2013, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
Connected as SYSDG.
DGMGRL> show configuration

Configuration - primdb1
  Protection Mode: MaxPerformance
  Members:

  primdb1 - Primary database
    bcpdb1 - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 7 seconds ago)

2. For the standby site, check the "Binding" property

DGMGRL> show database verbose bcpdb1

Database - bcpdb1

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      11 minutes (computed 0 seconds ago)
  Apply Lag:          0 seconds (computed 0 seconds ago)
  Average Apply Rate: 625.00 KByte/s
  Active Apply Rate:  523.00 KByte/s
  Maximum Apply Rate: 18.62 MByte/s
  Real Time Query:    ON
  Instance(s):
    bcpdb1

Properties:

----------
----------


Binding        = 'optional'

----------
----------

3. Set the "Binding" property to MANDATORY

// This property controls whether the destination is mandatory or optional

DGMGRL> edit database bcpdb1 set property Binding='mandatory';

Property "binding" updated

DGMGRL> exit

Once the broker configuration has set a standby site as a mandatory destination, the RMAN configuration can be altered to set the archivelog deletion policy to applied on standby.

RMAN> configure archivelog deletion policy to applied on standby;

With a standby site set as a mandatory destination, RMAN does not report this warning again.

Reference: Data Guard Physical Standby - RMAN configure archivelog deletion policy reports RMAN-08591 (Doc ID 1984064.1)


“Be like a tree. Stay grounded. Connect with your roots. Turn over a new leaf. Bend before you break. Enjoy your unique beauty. Keep growing.” -- Joanne Rapits.

Sunday 12 August 2018

MRP terminated with ORA-00600: internal error code, arguments: [3020] | For Standby database

Today, one of our standby databases was lagging, and the MRP process kept terminating with the internal error ORA-00600, arguments: [3020].

I checked the standby database; the gap was increasing rapidly.

SQL:hostname_standby01:(MYPROD):PHYSICAL STANDBY> SELECT ARCH.THREAD# "Thread", ARCH.SEQUENCE# "Last Sequence Received", APPL.SEQUENCE# "Last Sequence Applied", (ARCH.SEQUENCE# - APPL.SEQUENCE#) "Difference"
  2  FROM (SELECT THREAD# ,SEQUENCE# FROM V$ARCHIVED_LOG WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$ARCHIVED_LOG GROUP BY THREAD#)) ARCH, (SELECT THREAD# ,SEQUENCE# FROM V$LOG_HISTORY WHERE (THREAD#,FIRST_TIME )
  3  IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$LOG_HISTORY GROUP BY THREAD#)) APPL WHERE ARCH.THREAD# = APPL.THREAD# ;

    Thread Last Sequence Received Last Sequence Applied Difference
---------- ---------------------- --------------------- ----------
         1                  18223                 17969        254

SQL:hostname_standby01:(MYPROD):PHYSICAL STANDBY>
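When the gap query is run repeatedly, the "Difference" column can be pulled out of saved output rather than read by eye. This is a minimal sketch that assumes the output has the four-column layout shown above; the function name standby_gap is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read gap-query output on stdin and print
# "<thread> <difference>" for every data row (a row of exactly four integers).
standby_gap() {
    awk 'NF == 4 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ &&
         $3 ~ /^[0-9]+$/ && $4 ~ /^-?[0-9]+$/ { print $1, $4 }'
}

# Sample input copied from the query output above.
printf '%s\n' \
  '    Thread Last Sequence Received Last Sequence Applied Difference' \
  '---------- ---------------------- --------------------- ----------' \
  '         1                  18223                 17969        254' | standby_gap
# prints: 1 254
```

The header and separator lines are skipped because they are not made of four integers, so spooled SQL*Plus output can be fed in unchanged.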

To find out what went wrong and why MRP kept terminating, I went through the alert log and found the details below.

hostname_standby01(oracle):MYPROD:trace$ tail -400f alert_MYPROD.log



Errors in file /app/ora/local/admin/MYPROD/diag/rdbms/myprod_hostname_129/MYPROD/trace/MYPROD_pr0s_3151989.trc:

ORA-00600: internal error code, arguments: [3020], [2], [16431], [8405039], [], [], [], [], [], [], [], []

ORA-10567: Redo is inconsistent with data block (file# 2, block# 16431, file offset is 134602752 bytes)
ORA-10564: tablespace SYSAUX

ORA-01110: data file 2: '+DATA01/myprod_hostname_129/datafile/sysaux.256.914736089'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 6478
Errors in file /app/ora/local/admin/MYPROD/diag/rdbms/myprod_hostname_129/MYPROD/trace/MYPROD_mrp0_3151683.trc  (incident=17881):

Log in to the primary database and back up the affected datafile (file# 2 in the errors above). We will then restore this backup on the standby database.

RMAN> backup format '/db/dump01/backup_stdby/sysaux.256.914736089' datafile 2 ;


Starting backup at 19-AUG-17
using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=156 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=+DATA01/myprod_hostname_129/datafile/sysaux.257.914670317
channel ORA_DISK_1: starting piece 1 at 19-AUG-17
channel ORA_DISK_1: finished piece 1 at 19-AUG-17
piece handle=/db/files/backup_stdby/sysaux.256.914736089 tag=TAG20170219T103456 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:07

Finished backup at 19-AUG-17
Starting Control File and SPFILE Autobackup at 19-AUG-17

piece handle=/app/ora/local/admin/MYPROD/files/PRIMARY_MYPROD_c-218898855-20170219-01.ctl comment=NONE

Finished Control File and SPFILE Autobackup at 19-AUG-17

RMAN> exit

Now transfer the backup piece to the standby server and perform the recovery.

Once the file is copied to the standby server, log in to the standby database and restore the datafile to remediate the issue.

First, catalog the backup piece using RMAN on the standby database.

hostname_standby01(oracle):MYPROD:backup_stdby$ rman target /

RMAN> catalog start with '/db/files/backup_stdby' ;

using target database control file instead of recovery catalog

searching for all files that match the pattern /db/files/backup_stdby

List of Files Unknown to the Database

=====================================

File Name: /db/files/backup_stdby/sysaux.256.914736089

Do you really want to catalog the above files (enter YES or NO)? YES

cataloging files...
cataloging done

List of Cataloged Files

=======================

File Name: /db/files/backup_stdby/sysaux.256.914736089

RMAN> exit



SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY> shut immediate ;

ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.

SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY> startup mount;

ORACLE instance started.

Total System Global Area 1068937216 bytes

Fixed Size                  2235208 bytes
Variable Size             494929080 bytes
Database Buffers          566231040 bytes
Redo Buffers                5541888 bytes
Database mounted.

SQL: hostname_standby01:(MYPROD):PHYSICAL STANDBY> !rman target /

Recovery Manager: Release 11.2.0.3.0 - Production on Sun AUG 19 10:46:20 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: MYPROD(DBID=218895632, not open)

RMAN> restore datafile 2 ;

Starting restore at 19-AUG-17
using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=78 device type=DISK
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00002 to +DATA01/myprod_files/datafile/sysaux.256.914736089
channel ORA_DISK_1: reading from backup piece /db/dump01/backup_stdby/sysaux.256.914736089
channel ORA_DISK_1: piece handle=/db/files/backup_stdby/sysaux.256.914736089 tag=TAG20170219T103456
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

Finished restore at 19-AUG-17



RMAN> exit

Recovery Manager complete.

Once the RMAN restore is complete, restart MRP and check its behaviour.

SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY> alter database recover managed standby database cancel ;

Database altered.

SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;

Database altered.

Check whether MRP is running now. All looks good!



SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY>  !ps -ef|grep mrp

oracle   3500966       1  0 10:47 ?        00:00:00 ora_mrp0_MYPROD

oracle   3501928 3456846  0 10:48 pts/10   00:00:00 /bin/ksh -c ps -ef|grep mrp

oracle   3501930 3501928  0 10:48 pts/10   00:00:00 grep mrp

Check whether the lag is shrinking and the standby is catching up with the primary database:

SQL: hostname_ standby01:( MYPROD):PRIMARY> archive log list ;

Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /app/ora/local/admin/myprod/arch1
Oldest online log sequence     18233
Next log sequence to archive   18235                
Current log sequence           18235                


SQL: hostname_ standby01:( MYPROD):PHYSICAL STANDBY>

SELECT ARCH.THREAD# "Thread", ARCH.SEQUENCE# "Last Sequence Received", APPL.SEQUENCE# "Last Sequence Applied", (ARCH.SEQUENCE# - APPL.SEQUENCE#) "Difference"

FROM (SELECT THREAD# ,SEQUENCE# FROM V$ARCHIVED_LOG WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$ARCHIVED_LOG GROUP BY THREAD#)) ARCH, (SELECT THREAD# ,SEQUENCE# FROM V$LOG_HISTORY WHERE (THREAD#,FIRST_TIME )

IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$LOG_HISTORY GROUP BY THREAD#)) APPL WHERE ARCH.THREAD# = APPL.THREAD# ;



SQL:hostname_standby01:( MYPROD):PHYSICAL STANDBY> /

    Thread Last Sequence Received Last Sequence Applied Difference

---------- ---------------------- --------------------- ----------

         1                  18234                 18215         19



SQL:hostname_standby01:( MYPROD):PHYSICAL STANDBY> /

    Thread Last Sequence Received Last Sequence Applied Difference

---------- ---------------------- --------------------- ----------

         1                  18234                 18228          6



SQL:hostname_standby01:( MYPROD):PHYSICAL STANDBY> /

    Thread Last Sequence Received Last Sequence Applied Difference

---------- ---------------------- --------------------- ----------

         1                  18234                 18234          0



SQL: hostname_standby01:( MYPROD):PHYSICAL STANDBY>

The standby is now in sync with the primary database.

"Do something (anything). If you don't do anything, you won't get anywhere. Make it your hobby, not a chore, but above all have fun!"

Saturday 20 August 2016

How to recover from the Loss of a Datafile on a Standby Database ?

Use the following steps to recover if you have lost a datafile on a standby database:

1. First, stop Redo Apply using the ALTER DATABASE command:

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

2. Now start RMAN and connect to both the standby (as target) and the recovery catalog:

rman TARGET / CATALOG rcat/passwd@RCAT

3. Issue the following commands to restore and recover datafiles on the standby database:

RMAN> RESTORE DATAFILE 5;

RMAN> RECOVER DATAFILE 5;

4. Now restart Redo Apply:

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;


Tuesday 2 August 2016

How to Recover archive gaps in standby database - using 2 methods

The two methods are:

1. Manual log shipping (when only a few logs are missing)

2. Incremental backup (when the gap is very large)

METHOD 1:

When only a few logs are missing or corrupt (say fewer than 15), we can copy the missing logs from the primary site to the standby site (scp/sftp/ftp) and then register them on the standby so that the gap is resolved.

This is an easy process when the number of missing or corrupt logs is small.

Otherwise, we can use the incremental backup strategy and perform the recovery at the standby site.

Let's go through the archive log shipping process.

First, find the missing archives by issuing the following command, which returns the gap sequences:

SQL> select * from v$archive_gap;

Or you can use the v$managed_standby view to find where log apply is stuck.

SQL> select sequence#,process,status from v$managed_standby;

Now copy the logs from the primary site to the standby site using the command below:

$ scp log_file_name_n.arc oracle@standby:/log/file/location/log_file_name_n.arc

At the standby site, register each log file with the command below, repeating it until all the missing log files are registered.

SQL> alter database register logfile '/log/file/location/log_file_name_n.arc';

Redo apply will then proceed and your standby will come back in sync with the primary.
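Registering a dozen files one by one is tedious, so the REGISTER LOGFILE statements can be generated from the directory the logs were copied to. This is a minimal sketch, not part of the original procedure: the function name register_logfile_sql is hypothetical, the .arc extension is taken from the example file name above, and paths are assumed to contain no single quotes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one REGISTER LOGFILE statement per *.arc file
# in the given directory. It prints SQL only; nothing is executed.
register_logfile_sql() {
    local dir=$1 f
    for f in "$dir"/*.arc; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$f" ] || continue
        printf "ALTER DATABASE REGISTER LOGFILE '%s';\n" "$f"
    done
}

# Example call; /log/file/location is the placeholder path used above.
register_logfile_sql /log/file/location
```

After reviewing the generated statements, they could be piped into SQL*Plus on the standby, for example: register_logfile_sql /log/file/location | sqlplus -s "/ as sysdba".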

METHOD 2 :

When the gap is huge (say around 500 logs), the above method is very time-consuming and not practical. The alternative would be to rebuild the standby database from scratch, but an incremental backup avoids that.

Since 10g, an incremental backup created with BACKUP INCREMENTAL ... FROM SCN can be used to refresh the standby database with the changes made on the primary since the standby's last SCN, after which managed recovery can resume.

Step 1:

Use the command below on both databases (primary and standby) to find the SCN difference:

SQL> select current_scn from v$database;

Step 2 :

Stop the managed standby apply process:

SQL> alter database recover managed standby database cancel;

Step 3:

Now shut down the standby database:

SQL> shut immediate

Step 4:

On the primary, take an incremental backup from the SCN number where the standby has been stuck:

RMAN> run {

allocate channel c1 type disk format '/u01/backup/%U.bkp';

backup incremental from scn ********* database;

}
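Because the starting SCN differs every time, the run block above can be generated from a parameter. This is a minimal sketch under stated assumptions: the function name incremental_backup_rman is hypothetical, the default directory is the /u01/backup path used in this post, and the SCN 1234567 in the example call is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the RMAN run block for a given starting SCN.
# Arguments: <scn> [backup directory, default /u01/backup]
incremental_backup_rman() {
    local scn=$1 dest=${2:-/u01/backup}
    # The SCN must be a plain non-negative integer.
    case "$scn" in
        ''|*[!0-9]*) echo "SCN must be a non-negative integer" >&2; return 1 ;;
    esac
    cat <<EOF
run {
  allocate channel c1 type disk format '${dest}/%U.bkp';
  backup incremental from scn ${scn} database;
}
EOF
}

# Example call with a made-up SCN; substitute the SCN found on the standby.
incremental_backup_rman 1234567
```

The printed block can be reviewed and then pasted into the RMAN prompt on the primary.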

Step 5: On the primary, create a new standby controlfile and copy this file to standby side:

SQL> alter database create standby controlfile as '/u01/backup/for_standby.ctl';

$ scp * oracle@dataguard:/u01/backup

Step 6 :

Bring up the Standby instance in nomount mode:

SQL> startup nomount

Step 7

Now replace the old controlfile with the new one created at the primary, and bring the database to MOUNT state.

(This controlfile from the primary side carries the SCN information, and the recovery has to be performed using it.)

SQL> alter database mount standby database;

Step 8 :

Open the RMAN prompt and catalog the backup pieces.

(RMAN does not know about these files yet, so you must make them known to it through a process called cataloging.)

$ rman target=/

RMAN> catalog start with '/u01/backup';

Step 9 :

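Recover the standby from the cataloged incremental backup. NOREDO makes RMAN apply only the incremental backup instead of also looking for archived redo logs:

RMAN> recover database noredo;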

Step 10 :

After the recovery completes, exit RMAN and start the managed recovery process:

SQL> alter database recover managed standby database disconnect from session;

Step 11 :

Check the SCNs on the primary and standby again to make sure that both are in sync: