Sunday 12 August 2018

ORA-00600: internal error code, arguments [kfgpCreate_60] | ASM disks


ERROR at line 1:
ORA-00600: internal error code, arguments: [kfgpCreate_60], [10], [2], [65535], [65535], [65535], [65535], [], [], [], [], []

This error can appear while dropping a disk in ASM.
As per MOS it is a bug (Metalink Doc ID 2031394.1). Here is how I worked around it.


Alternate way


SQL> alter diskgroup <diskgroup_name> set attribute 'appliance.mode'='FALSE';

SQL> <run the DROP DISK statement here>


For example:

SQL> alter diskgroup RECO_myexa set attribute 'appliance.mode'='FALSE';

Diskgroup altered.

SQL> alter diskgroup DISK42 drop disk DISK_myexad_DR_09_diskexecel01  force;

Diskgroup altered.






ORA-609 : opiodr aborting process unknown ospid



In general, the ORA-609 error indicates that a client connection failed to complete. It can also be raised when an Oracle session is aborted or killed.

To diagnose any Oracle error, start with the oerr utility. For ORA-609 it shows:

Example :

$ oerr ora 609
00609, 00000, "could not attach to incoming connection"
// *Cause:  Oracle process could not answer incoming connection
// *Action: If the situation described in the next error on the stack
// can be corrected, do so; otherwise contact Oracle Support.


Cause:

The ORA-609 error is raised when a client connection of any kind fails to complete, or the client aborts the connection process, before the server process has been completely spawned.
Beginning with 10gR2, the inbound connect timeout has a default value of 60 seconds.

It is also triggered when a DB session is killed or aborted manually from the OS prompt.

Solution:
Increase INBOUND_CONNECT_TIMEOUT on both the listener side (listener.ora) and the server side (sqlnet.ora) as a preventive measure.
If the problem is due to connection timeouts, increasing the following parameters should eliminate or reduce the occurrence of ORA-609 errors:
sqlnet.ora: SQLNET.INBOUND_CONNECT_TIMEOUT=180
listener.ora: INBOUND_CONNECT_TIMEOUT_listener_name=120
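
Before raising the timeouts it helps to see what is currently configured. Below is a minimal sketch, assuming both files sit in one directory (commonly $TNS_ADMIN or $ORACLE_HOME/network/admin). The function name show_inbound_timeouts is made up for this post and is not an Oracle tool.

```shell
# List INBOUND_CONNECT_TIMEOUT settings found in sqlnet.ora and listener.ora
# of the directory passed as first argument. Hypothetical helper.
show_inbound_timeouts() {
    net_dir="$1"
    for f in "$net_dir/sqlnet.ora" "$net_dir/listener.ora"; do
        # Skip files that do not exist in this directory
        [ -f "$f" ] || continue
        # Case-insensitive match; lines commented out with # are ignored
        grep -i 'INBOUND_CONNECT_TIMEOUT' "$f" | grep -v '^[[:space:]]*#' |
            while IFS= read -r line; do
                printf '%s: %s\n' "$(basename "$f")" "$line"
            done
    done
}
```

Usage: show_inbound_timeouts "$ORACLE_HOME/network/admin". No output means neither file sets the parameter, so the default applies.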


Reference: Metalink (MOS) Doc ID 1121357.1


MRP terminated with ORA-00600: internal error code, arguments: [3020] | For Standby database


Today one of our standby databases was lagging, and its MRP process kept terminating with the internal error ORA-00600, arguments: [3020].

I checked the standby database and the gap was increasing rapidly.

SQL:hostname_standby01:(MYPROD):PHYSICAL STANDBY> SELECT ARCH.THREAD# "Thread", ARCH.SEQUENCE# "Last Sequence Received", APPL.SEQUENCE# "Last Sequence Applied", (ARCH.SEQUENCE# - APPL.SEQUENCE#) "Difference"
  2  FROM (SELECT THREAD# ,SEQUENCE# FROM V$ARCHIVED_LOG WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$ARCHIVED_LOG GROUP BY THREAD#)) ARCH, (SELECT THREAD# ,SEQUENCE# FROM V$LOG_HISTORY WHERE (THREAD#,FIRST_TIME )
  3  IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$LOG_HISTORY GROUP BY THREAD#)) APPL WHERE ARCH.THREAD# = APPL.THREAD# ;

 

    Thread Last Sequence Received Last Sequence Applied Difference

---------- ---------------------- --------------------- ----------

         1                  18223                 17969        254

 

SQL:hostname_standby01:(MYPROD):PHYSICAL STANDBY>



I wanted to know what went wrong and why the MRP process kept terminating, so I went through the alert log and found the details below.


hostname_standby01(oracle):MYPROD:trace$ tail -400f alert_MYPROD.log



Errors in file /app/ora/local/admin/MYPROD/diag/rdbms/myprod_hostname_129/MYPROD/trace/MYPROD_pr0s_3151989.trc:

ORA-00600: internal error code, arguments: [3020], [2], [16431], [8405039], [], [], [], [], [], [], [], []

ORA-10567: Redo is inconsistent with data block (file# 2, block# 16431, file offset is 134602752 bytes)
ORA-10564: tablespace SYSAUX

ORA-01110: data file 2: '+DATA01/myprod_hostname_129/datafile/sysaux.256.914736089'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 6478
Errors in file /app/ora/local/admin/MYPROD/diag/rdbms/myprod_hostname_129/MYPROD/trace/MYPROD_mrp0_3151683.trc  (incident=17881):
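
The ORA-10567 line names the datafile and block that the redo could not be applied to, and that is what decides which datafile to back up in the next step. Here is a small sketch for pulling those numbers out of an alert log. The function name is hypothetical and the pattern assumes the message layout shown above.

```shell
# Print "file# block#" for every ORA-10567 line in the given alert log.
ora10567_blocks() {
    sed -n 's/.*ORA-10567:.*(file# \([0-9][0-9]*\), block# \([0-9][0-9]*\),.*/\1 \2/p' "$1"
}
```

Usage: ora10567_blocks alert_MYPROD.log | sort -u. For the excerpt above this reports file 2, block 16431.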


Log in to the primary database and back up the affected datafile (file# 2, SYSAUX, as reported in the alert log). We will then restore this backup on the standby database.



RMAN> backup format '/db/dump01/backup_stdby/sysaux.256.914736089' datafile 2 ;


Starting backup at 19-AUG-17
using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=156 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=+DATA01/myprod_hostname_129/datafile/sysaux.257.914670317
channel ORA_DISK_1: starting piece 1 at 19-AUG-17
channel ORA_DISK_1: finished piece 1 at 19-AUG-17
piece handle=/db/files/backup_stdby/sysaux.256.914736089 tag=TAG20170219T103456 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:07

Finished backup at 19-AUG-17
Starting Control File and SPFILE Autobackup at 19-AUG-17

piece handle=/app/ora/local/admin/MYPROD/files/PRIMARY_MYPROD_c-218898855-20170219-01.ctl comment=NONE

Finished Control File and SPFILE Autobackup at 19-AUG-17

RMAN> exit


Now transfer the backup piece to the standby server and perform the recovery:

Once the files are copied to the standby server, log in to the standby database and restore the datafile to remediate the issue.
First, catalog the backup piece using RMAN on the standby database.


hostname_ standby01 (oracle):MYPROD:backup_stdby$ rman target /

RMAN> catalog start with '/db/files/backup_stdby' ;

using target database control file instead of recovery catalog

searching for all files that match the pattern /db/files/backup_stdby

List of Files Unknown to the Database

=====================================

File Name: /db/files/backup_stdby/sysaux.256.914736089

Do you really want to catalog the above files (enter YES or NO)? YES

cataloging files...
cataloging done

List of Cataloged Files

=======================

File Name: /db/files/backup_stdby/sysaux.256.914736089

RMAN> exit



SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY> shut immediate ;

ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.

SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY> startup mount;

ORACLE instance started.

Total System Global Area 1068937216 bytes

Fixed Size                  2235208 bytes
Variable Size             494929080 bytes
Database Buffers          566231040 bytes
Redo Buffers                5541888 bytes
Database mounted.

SQL: hostname_ standby01:(MYPRD):PHYSICAL STANDBY> !rman target /

Recovery Manager: Release 11.2.0.3.0 - Production on Sun AUG 19 10:46:20 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: MYPROD(DBID=218895632, not open)

RMAN> restore datafile 2 ;

Starting restore at 19-AUG-17
using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=78 device type=DISK
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00002 to +DATA01/myprod_files/datafile/sysaux.256.914736089
channel ORA_DISK_1: reading from backup piece /db/dump01/backup_stdby/sysaux.256.914736089
channel ORA_DISK_1: piece handle=/db/files/backup_stdby/sysaux.256.914736089 tag=TAG20170219T103456
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

Finished restore at 19-AUG-17



RMAN> exit

Recovery Manager complete.


Once the RMAN restore is complete, restart MRP and check its behaviour.




SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY> alter database recover managed standby database cancel ;

Database altered.

SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;

Database altered.

Check if MRP is running now. All looks good!



SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY>  !ps -ef|grep mrp

oracle   3500966       1  0 10:47 ?        00:00:00 ora_mrp0_MYPROD

oracle   3501928 3456846  0 10:48 pts/10   00:00:00 /bin/ksh -c ps -ef|grep mrp

oracle   3501930 3501928  0 10:48 pts/10   00:00:00 grep mrp
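
The listing above also contains the grep command itself and the ksh wrapper. Here is a sketch of a filter that keeps only the real MRP background process, assuming it is named ora_mrp<N>_<SID> as in the output above. filter_mrp is a hypothetical helper name.

```shell
# Read a "ps -ef" style listing on stdin and print "PID process_name" for
# lines whose last column starts with ora_mrp<digits>_ .
filter_mrp() {
    awk '$NF ~ /^ora_mrp[0-9]+_/ { print $2, $NF }'
}
```

Usage: ps -ef | filter_mrp. Empty output means no MRP process was found.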


Check that the lag is reducing and the standby is catching up with the primary database:


SQL: hostname_ standby01:( MYPROD):PRIMARY> archive log list ;

Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /app/ora/local/admin/myprod/arch1
Oldest online log sequence     18233
Next log sequence to archive   18235                
Current log sequence           18235                


SQL: hostname_ standby01:( MYPROD):PHYSICAL STANDBY>

SELECT ARCH.THREAD# "Thread", ARCH.SEQUENCE# "Last Sequence Received", APPL.SEQUENCE# "Last Sequence Applied", (ARCH.SEQUENCE# - APPL.SEQUENCE#) "Difference"

FROM (SELECT THREAD# ,SEQUENCE# FROM V$ARCHIVED_LOG WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$ARCHIVED_LOG GROUP BY THREAD#)) ARCH, (SELECT THREAD# ,SEQUENCE# FROM V$LOG_HISTORY WHERE (THREAD#,FIRST_TIME )

IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$LOG_HISTORY GROUP BY THREAD#)) APPL WHERE ARCH.THREAD# = APPL.THREAD# ;



SQL:xstm6551bor:( MYPROD):PHYSICAL STANDBY> /

    Thread Last Sequence Received Last Sequence Applied Difference

---------- ---------------------- --------------------- ----------

         1                  18234                 18215         19



SQL:hostname_standby01:( MYPROD):PHYSICAL STANDBY> /

    Thread Last Sequence Received Last Sequence Applied Difference

---------- ---------------------- --------------------- ----------

         1                  18234                 18228          6



SQL:hostname_standby01:( MYPROD):PHYSICAL STANDBY> /

    Thread Last Sequence Received Last Sequence Applied Difference

---------- ---------------------- --------------------- ----------

         1                  18234                 18234          0



SQL: hostname_standby01:( MYPROD):PHYSICAL STANDBY>
SQL: hostname_ standby01:(MYPROD):PHYSICAL STANDBY>




The standby is now in sync with the primary database.



"Do something (anything). If you don't do anything, you won't get anywhere. Make it your hobby, not a chore, but above all have fun!"




Sunday 10 June 2018

Troubleshooting Issues with Undo Tablespace


Commonly seen problems with the undo tablespace are of the following nature:

• ORA-01555: snapshot too old
• ORA-30036: unable to extend segment by ... in undo tablespace 'UNDO1'

These errors can be caused by many different issues, such as incorrect sizing of the undo tablespace or poorly written SQL or PL/SQL code.

Causes :


Frequent commits can be the cause of ORA-01555. It is all about read consistency. When a query starts, Oracle notes the point in time (SCN) at which it began, and the result of the query must not be altered by DML that takes place in the meantime (for example, your big transaction). To achieve that, Oracle uses the undo (rollback) segments to reconstruct the before image of any data changed after the query started.

By committing inside your big transaction, you tell Oracle that the undo data of that transaction may be overwritten. If your query then needs undo data that has already been overwritten, you get this error. The less often you commit, the smaller the chance that the undo data you need has been overwritten. Typically this occurs when users run PL/SQL procedures whose code commits inside a cursor loop.

Actions :


    1. Check if Undo Is Correctly Sized:

The below query checks for issues that have occurred within the last day :

select to_char(begin_time,'MM-DD-YYYY HH24:MI') begin_time
,ssolderrcnt ORA_01555_cnt, nospaceerrcnt no_space_cnt
,txncount max_num_txns, maxquerylen max_query_len
,expiredblks blck_in_expired
from v$undostat where begin_time > sysdate - 1 order by begin_time; 

Output :

BEGIN_TIME         ORA_01555_CNT   NO_SPACE_CNT   MAX_NUM_TXNS   BLCK_IN_EXPIRED
----------------   -------------   ------------   ------------   ---------------
06-10-2018 14:52               0              0             42                 0
02-10-2018 07:24               0              0              0                 0


If the ORA_01555_CNT column reports a non-zero value, you need to do one or more of the following tasks.

The most effective way is to increase the UNDO_RETENTION initialization parameter.

2. Other resolutions that can be applied:

• Commit less often; ideally commit only at the end of the transaction.
• Ensure that code does not contain COMMIT statements within cursor loops.
• Reschedule long-running queries to off-peak hours, when the system has less DML load.
• Check which SQL statements consume the most undo, and tune the SQL statement throwing the errors.
• Finally, you may proceed to add extra rollback segments (undo logs) to make more transaction slots available.


NOTE : A maximum of 4 days’ worth of information is stored in the V$UNDOSTAT view. The statistics are gathered every 10 minutes, for a maximum of 576 rows in the table. If you’ve stopped and started your database within the last 4 days, this view will only contain information from the time you last started your database.
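
The 576-row figure follows directly from the two numbers in the note. As a quick check of the arithmetic (the helper name is hypothetical):

```shell
# rows = days * 24 hours * 60 minutes / sampling interval in minutes
undostat_max_rows() {
    days="$1"
    interval_min="$2"
    echo $(( days * 24 * 60 / interval_min ))
}

undostat_max_rows 4 10   # prints 576
```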


The following query displays the current undo size and the size the Undo Advisor recommends for a given retention in seconds:

select sum(bytes)/1024/1024 cur_mb_size,
dbms_undo_adv.required_undo_size(900) req_mb_size
from dba_data_files
where tablespace_name = (select
upper(value) from v$parameter where name = 'undo_tablespace');



Output:

CUR_MB_SIZE   REQ_MB_SIZE
-----------   -----------

51200         35840


The output shows that the undo tablespace currently has 50 GB allocated to it.
The query above passes 900 seconds as the length of time to retain undo information. To retain undo for 900 seconds, the Oracle Undo Advisor estimates that the undo tablespace should be around 35 GB. In this example the undo tablespace is sized adequately. If it were not, you would have to either add space to an existing data file or add a data file to the undo tablespace.

Here is a handy query from the Akadia site that reports the current undo retention and the optimal undo retention:

SELECT d.undo_size/(1024*1024) "ACTUAL UNDO SIZE [MByte]",
       SUBSTR(e.value,1,25) "UNDO RETENTION [Sec]",
       ROUND((d.undo_size / (to_number(f.value) *
       g.undo_block_per_sec))) "OPTIMAL UNDO RETENTION [Sec]"
  FROM (SELECT SUM(a.bytes) undo_size
          FROM v$datafile a, v$tablespace b, dba_tablespaces c
         WHERE c.contents = 'UNDO'
           AND c.status = 'ONLINE'
           AND b.name = c.tablespace_name
           AND a.ts# = b.ts#) d,
       v$parameter e,
       v$parameter f,
       (SELECT MAX(undoblks/((end_time-begin_time)*3600*24)) undo_block_per_sec
          FROM v$undostat) g
 WHERE e.name = 'undo_retention'
   AND f.name = 'db_block_size'
/


Output :

ACTUAL UNDO SIZE [MByte]
------------------------
51200

UNDO RETENTION [Sec]
--------------------
10800

OPTIMAL UNDO RETENTION [Sec]
----------------------------

14580
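
The Akadia query boils down to one division: optimal retention = undo size / (block size x peak undo blocks generated per second). Here is a sketch of the same arithmetic with made-up inputs. The 8 KB block size and the rate of 400 blocks per second are assumptions for illustration, not values taken from the system above, and the helper name is hypothetical.

```shell
# optimal_undo_retention <undo_size_mb> <block_size_bytes> <undo_blocks_per_sec>
# Prints the retention, in whole seconds, that the undo tablespace could sustain.
optimal_undo_retention() {
    awk -v mb="$1" -v bs="$2" -v ups="$3" \
        'BEGIN { printf "%.0f\n", (mb * 1024 * 1024) / (bs * ups) }'
}

# 51200 MB of undo, 8192-byte blocks, 400 undo blocks/sec (assumed figures)
optimal_undo_retention 51200 8192 400   # prints 16384
```

Because the query takes MAX over v$undostat, the rate is the busiest interval recorded there, so the result is a conservative estimate.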


Use the views v$session and v$transaction to find the sessions that are consuming undo segments:


select s.sid, s.serial#, s.osuser, s.logon_time ,s.status, s.machine
,t.used_ublk, t.used_ublk*16384/1024/1024 undo_usage_mb
from v$session s ,v$transaction t where t.addr = s.taddr;
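
USED_UBLK counts undo blocks, so the 16384 in the query is only right for a 16 KB block size. Here is a sketch of the conversion with the block size passed in explicitly. The helper is hypothetical; take the real block size from the db_block_size parameter, as the Akadia query does.

```shell
# undo_usage_mb <used_ublk> <block_size_bytes>
undo_usage_mb() {
    awk -v blocks="$1" -v bs="$2" \
        'BEGIN { printf "%.2f\n", blocks * bs / 1024 / 1024 }'
}

undo_usage_mb 640 16384   # prints 10.00
undo_usage_mb 640 8192    # prints 5.00
```

The same 640 used blocks amount to half the space on an 8 KB database, so adjust the multiplier in the query to match your block size.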




The query below adds the v$sql view to show the SQL statement associated with each session that is consuming undo space:

select s.sid, s.serial#, s.osuser, s.logon_time, s.status ,s.machine, t.used_ublk ,
t.used_ublk*16384/1024/1024 undo_usage_mb ,q.sql_text from v$session s,
v$transaction t ,v$sql q where t.addr = s.taddr and s.sql_id = q.sql_id;




