Learn DBA : A Life Long Learning Experience: 2018

Tuesday, 2 October 2018

RMAN-08591: WARNING: invalid archived log deletion policy

For one of database we were getting backup failure alerts where the backup was configured on standby database.

We saw below Error messsages in backup-log :

RMAN-08591: WARNING: invalid archived log deletion policy

Cause: If archive log location is set to FRA, then there might be a chance of the deleting the archives automatically when the space pressure in FRA.

In that case at least one of the destination standby must be set as a "MANDATORY" destination.

Solution: To eliminate the RMAN warning message, at least one archive destination must be set as a mandatory destination.

Steps :

1. Log in to the broker command line utility

[oracle@host1 log]$ dgmgrl /
DGMGRL for Linux: Version 12.1.0.2.0 - 64bit Production
Copyright (c) 2000, 2013, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
Connected as SYSDG.
DGMGRL> show configuration

Configuration – primdb1
  Protection Mode: MaxPerformance
  Members:

  primdb1 - Primary database
    bcpdb1 - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 7 seconds ago)

2. For the standby site check the "Binding" property

DGMGRL> show database verbose bcpdb1

Database - bcpdb1

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      11 minutes (computed 0 seconds ago)
  Apply Lag:          0 seconds (computed 0 seconds ago)
  Average Apply Rate: 625.00 KByte/s
  Active Apply Rate:  523.00 KByte/s
  Maximum Apply Rate: 18.62 MByte/s
  Real Time Query:    ON
  Instance(s):
    bcpdb1

Properties:

----------
----------


Binding        = 'optional'

----------
----------

3. Set the "binding" Property as - MANDATORY

//This parameter controls whether the destination is mandatory or not

DGMGRL> edit database bcpdb1 set property Binding='mandatory';

Property "binding" updated

DGMGRL> exit

Once the broker configuration has set a standby site as a mandatory destination, the RMAN configuration can be altered to set the archivelog deletion policy to applied on standby.

RMAN> configure archivelog deletion policy to applied on standby;

Once the broker configuration has set a standby site as a mandatory destination, then RMAN will not report this error again.

Reference: Data Guard Physical Standby - RMAN configure archivelog deletion policy reports RMAN-08591 (Doc ID 1984064.1)

Found this post interesting? Subscribe us 😊😉

Installation and configure Postgres 13 on RHEL

All about Physical Replication and Log shipping in Postgres

Possible ways to recover space from deleted rows with insufficient disk space

Streaming-Replication Sync and Async, benefits of streaming replication over Log-shipping

“Be like a tree. Stay grounded. Connect with your roots. Turn over a new leaf. Bend before you break. Enjoy your unique beauty. Keep growing.” -- Joanne Rapits.

Query to find the session creating more redo

How to find the session generating more number of Redo or we can say Archivelogs as well.
Using the view v$sess_io, we can find out the sessions which are performing the block_changes in database. From this we can determine the sessions.

Below is the query we can use to find the Session IDs

set pages 1000 lines 1000
SELECT s.sid, s.serial#, s.username, s.program,
i.block_changes
FROM v$session s, v$sess_io i
WHERE s.sid = i.sid
ORDER BY 5 desc;

Tuesday, 14 August 2018

Linux Important Files and Description

/etc/passwd - User Settings

/etc/group - Group settings for users.

/etc/hosts - Host name lookup information

/etc/sysctl.conf - Kernel parameters for Linux.

/etc/fstab - Files to check for File System entries

/etc/oratab - Oracle Registered instance (DBCA)

/etc/redhat- release - get the OS release information

/etc/selinux/config - Enable or disable security feature.

/proc/meminfo - To determine To determine the RAM size

/var/log/messages - Check System and error logs and messages

/home/oracle/.bash_profile - Oracle user profile setting file in Linux.

/etc/security/limits.conf - Specify process and open files related limits

Attack every problem with enthusiasm... as if your survival depends upon it

ASM Interview Questions | PDF Download

How to Crack DBA Interviews | Tips

Frequently Asked Questions for 2+ Years

30+ Oracle Dataguard Questions | PDF Download

Monday, 13 August 2018

Troubleshooting Internal Errors and Error-Look-up Tool on MOS

If you’re an Oracle DBA, you are likely to have come across an error message in your Oracle Database alert.log files prefixed by either ORA-600 or ORA-7445. Let’s see when, how and why does these appears.

Example: One of the ORA 600 could look like as below

ORA 600 [ktfbtgex-7], [1015817] ORA 600 [3600]

ORA 7445 [kewa_dump_time_diff()+157]

The internal error messages include no attached explanation in the way that other error messages do (for example ORA-01017: invalid username/password" while connecting ), it is difficult to assess the seriousness of the error and whether it is cause for concern.

ORA-600 is a message that indicates an error internal to the database code. Oracle Database performs checks to confirm that the information being used in internal processing is healthy, the variables being used for codes are within a valid range, that changes are being made to a consistent structure, and that a change won’t put a structure into an unstable state. If a check fails, Oracle Database signals an ORA-600 error and, if necessary, terminates the operation to protect the health of the database.

The first argument to the ORA-600 error message indicates the location in the code where the check is performed.The subsequent arguments have different meanings, depending on the particular check.

Whereas, An ORA-7445 error, on the other hand, traps a notification the operating system has sent to a process and returns that notification to the user. Unlike the ORA-600 error, the ORA-7445 error is an unexpected failure rather than a handled failure.

Both ORA-600 and ORA-7445 errors will Write the error message to the alert.log, along with details about the location of a trace containing further information.

Assessing and resolving ORA-600 and ORA-7445 errors

ORA-600 and ORA-7445 errors, you can either identify the cause and resolve the error on your own or find ways to avoid the error. The information provided in this section will help you resolve or work around some of the more common errors.

Example 1: ORA-600 [729]. The first argument to this ORA-600 error message, 729, indicates a memory-handling issue. The error message text will always include the words space leak, but the number after 729 will vary:

ORA-00600: internal error code, arguments: [729], [800], [space leak], [], [],

The number in the second set of brackets (800) is the number of bytes of memory discovered. A space leak occurs when some code doesn't completely release the memory it used while executing. In this example, when that process disconnected from the database, it discovered that some memory was not cleaned up at some point during its life and reported ORA-600 [729]. You cannot determine the cause of the space leak by checking your application code, because the error is internal to Oracle Database. OR you can enable the tracing in database to investigate the issue

e.g.

SQL>alter system set events '10262 trace name context forever, level xxxx' scope=spfile;

Note : You will also need to shut the database down and restart it to enable the event to take effect.

Replace xxxx with a number greater than the value in the second set of brackets in the ORA-600 [729] error message. In the example above, you could set the number to 1000, in which case the event instructs the database to ignore all user space leaks that are smaller than 1000 bytes.

Example 2 :

ORA-00600: internal error code, arguments: [3020], [117], [125002], [352446538], [], [], [], [], [], [], [], []

ORA-10567: Redo is inconsistent with data block (file# 117, block# 125002, file offset is 1024016384 bytes)

ORA-10564: tablespace SYSTEM

ORA-01110: data file 02: '+DATA01/myprod_host129 /datafile/sysaux_01.422.914282521'

ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 119838Incident details in: /app/ora/local/admin/myprod/diag/rdbms/myprod_host129/myprod/incident/incdir_63521/myprod_mrp0_2616275_i63521.trc

This error tells that there is issue with datafile #02. It could be corruption on file system or a block corruption on DB level. How I solved this one, Check this post.

Example 3 :

ORA-600 [6033]. This error is reported with no additional arguments, as shown in the following alert.log file describes :

bdump/prod_ora_4345618.trc:

ORA-600: internal error code, arguments: [6033], [], [], [], [], [], [], []

The ORA-600 [6033] error often indicates an index corruption. To identify the affected index, you’ll need to look at the associated trace file whose name is provided in the alert.log file, just above the error message. In this alert.log excerpt, the trace file you need to look at is called prod_ora_4345618.trc which is located in /bdump directory. There are two possible ways to identify the table on which the affected index is built: Look for the SQL statement that was executing at the time of the error. This statement should appear at the top of the trace file, under the heading “Current SQL Statement.” The affected index will belong to one of the tables accessed by that statement.

Example 4 :

ERROR at line 1:

ORA-00600: internal error code, arguments: [kfgpCreate_60], [10], [2], [65535], [65535], [65535], [65535], [], [], [], [], []

This will issue appear while dropping disk in ASM.

As per MOS it’s a bug (metalink doc 2031394.1). Here is the post how I solved it

Oracle Database users with Oracle support introduces support for the ORA-600/ORA-7445 lookup tool Knowledge on this Article 153788.1 it enables you to enter the first argument to an ORA-600 or ORA-7445 error message and use that information to identify known defects, workarounds, and other knowledge targeted specifically to that error/argument combination.

ORA-00600: internal error code, arguments [kfgpCreate_60] | ASM disks

ERROR at line 1:

ORA-00600: internal error code, arguments: [kfgpCreate_60], [10], [2], [65535], [65535], [65535], [65535], [], [], [], [], []

This will issue appear while dropping disk in ASM.

As per MOS it’s a bug (metalink doc 2031394.1). Here is the post how I solved it

Alternate way

SQL> alter diskgroup <diskgroup_name> set attribute 'appliance.mode'='FALSE';

SQL> Run <DROP the disk here>

For example:

SQL> alter diskgroup RECO_myexa set attribute 'appliance.mode'='FALSE';

Diskgroup altered.

SQL> alter diskgroup DISK42 drop disk DISK_myexad_DR_09_diskexecel01 force;

Diskgroup altered.

ORA 00600 For standby database - Check here.. !!

ORA-609 : opiodr aborting process unknown ospid

As a general error, the ORA-609 error indicates that a client connection failed to complete. This can be an ORA-609 from an abort or killing an Oracle session.

To diagnose any error, you start by using the OERR UTILITY to display the ORA-609 error:

Example :

$ oerr ora 609
00609, 00000, "could not attach to incoming connection"
// *Cause: Oracle process could not answer incoming connection
// *Action: If the situation described in the next error on the stack
// can be corrected, do so; otherwise contact Oracle Support.

Cause:

The ORA-609 error is thrown when a client connection of any kind failed to complete or aborted the connection
process before the server process was completely spawned.
Beginning with 10gR2, a default value for inbound connect timeout has been set at 60 seconds.

This is also triggered, when a DB session is killed/aborted manually from the OS prompt.

Solution:
Increase the values for INBOUND_CONNECT_TIMEOUT at both listener and server side sqlnet.ora file as a preventive measure.
If the problem is due to connection timeouts,an increase in the following parameters should eliminate or reduce the occurrence of the ORA-609s.
Sqlnet.ora: SQLNET.INBOUND_CONNECT_TIMEOUT=180
Listener.ora: INBOUND_CONNECT_TIMEOUT_listener_name=120

Reference metalink Doc ID 1121357.1

MRP terminated with ORA-00600: internal error code, arguments: [3020] | For Standby database

Today, One of the database was having lag an MRP process was terminating with Internal errors ORA 600 arguments: [3020]

Here, Checked the standby database, the gap was increasing rapidly.

SQL:hostname_standby01:(MYPROD):PHYSICAL STANDBY> SELECT ARCH.THREAD# "Thread", ARCH.SEQUENCE# "Last Sequence Received", APP                  L.SEQUENCE# "Last Sequence Applied", (ARCH.SEQUENCE# - APPL.SEQUENCE#) "Difference"
  2  FROM (SELECT THREAD# ,SEQUENCE# FROM V$ARCHIVED_LOG WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME)                   FROM V$ARCHIVED_LOG GROUP BY THREAD#)) ARCH, (SELECT THREAD# ,SEQUENCE# FROM V$LOG_HISTORY WHERE (THREAD#,FIRST_TIME )
  3  IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$LOG_HISTORY GROUP BY THREAD#)) APPL WHERE ARCH.THREAD# = APPL.THREAD# ;

 

    Thread Last Sequence Received Last Sequence Applied Difference

---------- ---------------------- --------------------- ----------

         1                  18223                 17969        254

 

SQL:hostname_standby01:(MYPROD):PHYSICAL STANDBY>

I was curious to check the alert log to check and know what went wrong and why MRP process keeps on terminating. So, I went through the alert log and I found below details.

hostname_standby01(oracle):MYPROD:trace$ tail -400f alert_MYPROD.log



Errors in file /app/ora/local/admin/MYPROD/diag/rdbms/myprod_hostname_129/MYPROD/trace/MYPROD_pr0s_3151989.trc:

ORA-00600: internal error code, arguments: [3020], [2], [16431], [8405039], [], [], [], [], [], [], [], []

ORA-10567: Redo is inconsistent with data block (file# 2, block# 16431, file offset is 134602752 bytes)
ORA-10564: tablespace SYSAUX

ORA-01110: data file 2: '+DATA01/myprod_hostname_129/datafile/sysaux.256.914736089'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 6478
Errors in file /app/ora/local/admin/MYPROD/diag/rdbms/myprod_hostname_129/MYPROD/trace/MYPROD_mrp0_3151683.trc  (incident=17881):

Login to Primary database and perform the backup of datafile, Here we will backup the datafile and restore the datafile to standby database.

RMAN> backup format '/db/dump01/backup_stdby/sysaux.256.914736089' datafile 2 ;


Starting backup at 19-AUG-17
using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=156 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=+DATA01/myprod_hostname_129/datafile/sysaux.257.914670317
channel ORA_DISK_1: starting piece 1 at 19-AUG-17
channel ORA_DISK_1: finished piece 1 at 19-AUG-17
piece handle=/db/files/backup_stdby/sysaux.256.914736089 tag=TAG20170219T103456 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:07

Finished backup at 19-AUG-17
Starting Control File and SPFILE Autobackup at 19-AUG-17

piece handle=/app/ora/local/admin/MYPROD/files/PRIMARY_MYPROD_c-218898855-20170219-01.ctl comment=NONE

Finished Control File and SPFILE Autobackup at 19-AUG-17

RMAN> exit

Now transfer the backup piece to standby server and perform the recovery :

Once the files are copied to standby server, Login to Standby database and start the restore of datafile to remediate the issue.

Catalog the backup piece using rman on standby database.

hostname_ standby01 (oracle):MYPROD:backup_stdby$ rman target /

RMAN> catalog start with '/db/files/backup_stdby' ;

using target database control file instead of recovery catalog

searching for all files that match the pattern /db/files/backup_stdby

List of Files Unknown to the Database

=====================================

File Name: /db/files/backup_stdby/sysaux.256.914736089

Do you really want to catalog the above files (enter YES or NO)? YES

cataloging files...
cataloging done

List of Cataloged Files

=======================

File Name: /db/files/backup_stdby/sysaux.256.914736089

RMAN> exit



SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY> shut immediate ;

ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.

SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY> startup mount;

ORACLE instance started.

Total System Global Area 1068937216 bytes

Fixed Size                  2235208 bytes
Variable Size             494929080 bytes
Database Buffers          566231040 bytes
Redo Buffers                5541888 bytes
Database mounted.

SQL: hostname_ standby01:(MYPRD):PHYSICAL STANDBY> !rman target /

Recovery Manager: Release 11.2.0.3.0 - Production on Sun AUG 19 10:46:20 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: MYPROD(DBID=218895632, not open)

RMAN> restore datafile 2 ;

Starting restore at 19-AUG-17
using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=78 device type=DISK
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00002 to +DATA01/myprod_files/datafile/sysaux.256.914736089
channel ORA_DISK_1: reading from backup piece /db/dump01/backup_stdby/sysaux.256.914736089
channel ORA_DISK_1: piece handle=/db/files/backup_stdby/sysaux.256.914736089 tag=TAG20170219T103456
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

Finished restore at 19-AUG-17



RMAN> exit

Recovery Manager complete.

Once the restore via RMAN is completed. Bounce the MRP and check the behaviour.

SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY> alter database recover managed standby database cancel ;

Database altered.

SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;

Database altered.

Check if MRP is running now. All looks good.. ! J



SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY>  !ps -ef|grep mrp

oracle   3500966       1  0 10:47 ?        00:00:00 ora_mrp0_MYPROD

oracle   3501928 3456846  0 10:48 pts/10   00:00:00 /bin/ksh -c ps -ef|grep mrp

oracle   3501930 3501928  0 10:48 pts/10   00:00:00 grep mrp

Check if lag is reducing and is in Sync with Primary database:

SQL: hostname_ standby01:( MYPROD):PRIMARY> archive log list ;

Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /app/ora/local/admin/myprod/arch1
Oldest online log sequence     18233
Next log sequence to archive   18235                
Current log sequence           18235                


SQL: hostname_ standby01:( MYPROD):PHYSICAL STANDBY>

SELECT ARCH.THREAD# "Thread", ARCH.SEQUENCE# "Last Sequence Received", APPL.SEQUENCE# "Last Sequence Applied", (ARCH.SEQUENCE# - APPL.SEQUENCE#) "Difference"

FROM (SELECT THREAD# ,SEQUENCE# FROM V$ARCHIVED_LOG WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$ARCHIVED_LOG GROUP BY THREAD#)) ARCH, (SELECT THREAD# ,SEQUENCE# FROM V$LOG_HISTORY WHERE (THREAD#,FIRST_TIME )

IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$LOG_HISTORY GROUP BY THREAD#)) APPL WHERE ARCH.THREAD# = APPL.THREAD# ;



SQL:xstm6551bor:( MYPROD):PHYSICAL STANDBY> /

    Thread Last Sequence Received Last Sequence Applied Difference

---------- ---------------------- --------------------- ----------

         1                  18234                 18215         19



SQL:hostname_standby01:( MYPROD):PHYSICAL STANDBY> /

    Thread Last Sequence Received Last Sequence Applied Difference

---------- ---------------------- --------------------- ----------

         1                  18234                 18228          6



SQL:hostname_standby01:( MYPROD):PHYSICAL STANDBY> /

    Thread Last Sequence Received Last Sequence Applied Difference

---------- ---------------------- --------------------- ----------

         1                  18234                 18234          0



SQL: hostname_standby01:( MYPROD):PHYSICAL STANDBY>
SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY>