一 环境:
主库:Oracle 双节点RAC 19.6C(Linux)
备库:Oracle 双节点RAC 19.6C(Linux)
二 详细报错:
Errors with log +ORA19CB_ARCH/ORA19CB/ARCHIVELOG/2021_02_07/thread_1_seq_809.315.1063907805
2021-02-16T11:07:34.513934+08:00
Standby Crash Recovery aborted due to error 16016.
2021-02-16T11:07:34.621378+08:00
Errors in file /oracle/app/oracle/diag/rdbms/ora19cb/ora19cb1/trace/ora19cb1_ora_9831.trc:
ORA-16016: archived log for thread 2 sequence# 809 unavailable
2021-02-16T11:07:37.026592+08:00
Recovery interrupted!
Some recovered datafiles maybe left media fuzzy
Media recovery may continue but open resetlogs may fail
stopping change tracking
2021-02-16T11:07:38.154299+08:00
Completed Standby Crash Recovery.
2021-02-16T11:07:44.616632+08:00
Errors in file /oracle/app/oracle/diag/rdbms/ora19cb/ora19cb1/trace/ora19cb1_ora_9831.trc:
ORA-10458: standby database requires recovery
ORA-01196: file 1 is inconsistent due to a failed media recovery session
ORA-01110: data file 1: '+ORA19CB_DATA/ORA19CB/DATAFILE/system.262.1046531145'
ORA-10458 signalled during: ALTER DATABASE OPEN /* db agent *//* {1:27996:2} */...
2021-02-16T11:07:47.129461+08:00
故障原因:
由于测试环境主机频繁重启,导致数据不一致(猜测)
处理方法:
分析原因后,找了好多处理思路,都是建议重新duplicate,但Oracle ADG 归档丢失后备库无法应用的场景下有一种处理方法,我们通过此思路继续一步一步处理,首先在备库查询当前的SCN 然后在主库基于SCN 进行备份数据和控制文件,在备库进行恢复,处理报错:
1)主库通过Broker 停止日志传输和日志应用:
edit database 'ora19cb' set STATE='APPLY-OFF' ;
edit database 'ora19c' set STATE='TRANSPORT-OFF' ;
2)备库确认SCN:
SELECT CURRENT_SCN FROM V$DATABASE;
select min(checkpoint_change#) from v$datafile_header
3)主库备份:
BACKUP INCREMENTAL FROM SCN 3162298 DATABASE FORMAT '/tmp/ForStandby_%U' tag 'FORSTANDBY';
BACKUP CURRENT CONTROLFILE FOR STANDBY FORMAT '/tmp/ForStandbyCTRL.bck';
4) 备库恢复:
RMAN> SHUTDOWN IMMEDIATE ;
RMAN> STARTUP NOMOUNT;
RMAN> RESTORE STANDBY CONTROLFILE FROM '/tmp/ForStandbyCTRL.bck';
RMAN> ALTER DATABASE MOUNT;
RMAN> RESTORE STANDBY CONTROLFILE FROM '/oracle/software/ForStandbyCTRL.bck';
Starting restore at 2021-02-20 16:04:15
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=65 instance=ora19cb1 device type=DISK
channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:02
output file name=+ORA19CB_DATA/ORA19CB/CONTROLFILE/current.265.1046531113
Finished restore at 2021-02-20 16:04:17
RMAN> ALTER DATABASE MOUNT;
released channel: ORA_DISK_1
Statement processed
RMAN> list incarnation;
List of Database Incarnations
DB Key Inc Key DB Name DB ID STATUS Reset SCN Reset Time
------- ------- -------- ---------------- --- ---------- ----------
1 1 ORA19C 1068822584 PARENT 1 2019-04-17 00:55:59
2 2 ORA19C 1068822584 CURRENT 1920977 2019-08-07 00:42:40
此时后台会有大量的文件rename 告警:
*****************************************
WARNING: The converted filename '+ORA19CB_DATA/ora19c/aaf3729430b47162e0531e38a8c01874/datafile/ts_swing.294.1046413467'
is an ASM fully qualified filename.
Changing the filename to '+ORA19CB_DATA/MUST_RENAME_THIS_DATAFILE_20.4294967295.4294967295'.
Please rename it accordingly.
*****************************************
2020-07-23T15:05:44.180520+08:00
*****************************************
WARNING: The converted filename '+ORA19CB_DATA/ora19c/datafile/system.258.1015634281'
is an ASM fully qualified filename.
Changing the filename to '+ORA19CB_DATA/MUST_RENAME_THIS_DATAFILE_1.4294967295.4294967295'.
Please rename it accordingly.
*****************************************
继续
RMAN> SWITCH DATABASE TO COPY;
RMAN> SWITCH DATABASE TO COPY;
datafile 1 switched to datafile copy "+ORA19CB_DATA/ORA19CB/DATAFILE/system.262.1046531145"
datafile 3 switched to datafile copy "+ORA19CB_DATA/ORA19CB/DATAFILE/sysaux.261.1046531261"
datafile 4 switched to datafile copy "+ORA19CB_DATA/ORA19CB/DATAFILE/undotbs1.260.1046531357"
datafile 5 switched to datafile copy "+ORA19CB_DATA/ORA19CB/8F8000E513704E63E0531E38A8C0B3E8/DATAFILE/system.259.1046531373"
datafile 6 switched to datafile copy "+ORA19CB_DATA/ORA19CB/8F8000E513704E63E0531E38A8C0B3E8/DATAFILE/sysaux.257.1046531389"
datafile 7 switched to datafile copy "+ORA19CB_DATA/ORA19CB/DATAFILE/users.271.1046531405"
datafile 8 switched to datafile copy "+ORA19CB_DATA/ORA19CB/8F8000E513704E63E0531E38A8C0B3E8/DATAFILE/undotbs1.270.1046531407"
datafile 9 switched to datafile copy "+ORA19CB_DATA/ORA19CB/DATAFILE/undotbs2.269.1046531417"
datafile 24 switched to datafile copy "+ORA19CB_DATA/ORA19CB/B359E4EB6F914FF6E0531E38A8C0072D/DATAFILE/system.284.1063557913"
datafile 25 switched to datafile copy "+ORA19CB_DATA/ORA19CB/B359E4EB6F914FF6E0531E38A8C0072D/DATAFILE/sysaux.263.1063557935"
datafile 26 switched to datafile copy "+ORA19CB_DATA/ORA19CB/B359E4EB6F914FF6E0531E38A8C0072D/DATAFILE/undotbs1.285.1063557951"
datafile 27 switched to datafile copy "+ORA19CB_DATA/ORA19CB/B359E4EB6F944FF6E0531E38A8C0072D/DATAFILE/system.286.1063557971"
datafile 28 switched to datafile copy "+ORA19CB_DATA/ORA19CB/B359E4EB6F944FF6E0531E38A8C0072D/DATAFILE/sysaux.272.1063557981"
datafile 29 switched to datafile copy "+ORA19CB_DATA/ORA19CB/B359E4EB6F944FF6E0531E38A8C0072D/DATAFILE/undotbs1.288.1063557993"
datafile 30 switched to datafile copy "+ORA19CB_DATA/ORA19CB/B359E4EB6F914FF6E0531E38A8C0072D/DATAFILE/undo_2.289.1063557997"
datafile 31 switched to datafile copy "+ORA19CB_DATA/ORA19CB/B359E4EB6F944FF6E0531E38A8C0072D/DATAFILE/undo_2.290.1063558003
如果有报错可以使用 switch datafile file# to copy;
5) 处理报错
数据库open 后台还有大量的 WARNING,主要是redo 的路径和temp。我通过adg 备库调好redo 的方法把redo 和temp 替换了:
SQL> alter database recover managed standby database cancel;
Database altered.
SQL> alter system set standby_file_management=manual;
System altered.
SQL> alter database open;
Database altered.
SQL> alter database drop logfile group 2;
Database altered.
SQL> alter database drop logfile group 3;
Database altered.
SQL> alter database drop logfile group 4;
Database altered.
SQL>
SQL>
SQL>
SQL> alter database add logfile thread 1 group 2 ('+ORA19CB_DATA') size 209715200;
alter database add logfile thread 2 group 3 ('+ORA19CB_DATA') size 209715200;
alter database add logfile thread 2 group 4 ('+ORA19CB_DATA') size 209715200;
Database altered.
SQL>
Database altered.
SQL>
Database altered.
SQL> select name from v$tempfile;
NAME
----------------------------------------------------------------------------------------------------------------
+ORA19CB_DATA/ORA19CB/TEMPFILE/temp.304.1065025445
+ORA19CB_DATA/ORA19CB/8F8000E513704E63E0531E38A8C0B3E8/TEMPFILE/temp.305.1065025455
+ORA19CB_DATA/MUST_RENAME_THIS_TEMPFILE_3.4294967295.4294967295
+ORA19CB_DATA/MUST_RENAME_THIS_TEMPFILE_4.4294967295.4294967295
SQL> alter tablespace temp add tempfile '+ORA19CB_DATA' size 50M;
Tablespace altered.
SQL> alter tablespace temp drop tempfile 3;
Tablespace altered.
SQL> conn / as sysdba
Connected.
SQL> show pdbs;
CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
2 PDB$SEED READ ONLY NO
3 PDBTS1 READ ONLY NO
4 PDBTS2 READ ONLY NO
SQL> alter session set container=&PDBNAME ;
Enter value for pdbname: PDBTS2
old 1: alter session set container=&PDBNAME
new 1: alter session set container=PDBTS2
Session altered.
SQL> alter tablespace temp add tempfile '+ORA19CB_DATA' size 50M;
Tablespace altered.
SQL> alter tablespace temp drop tempfile 4;
Tablespace altered.
SQL> select name from v$tempfile;
NAME
--------------------------------------------------------------------------------
+ORA19CB_DATA/ORA19CB/B359E4EB6F944FF6E0531E38A8C0072D/TEMPFILE/temp.309.1065026937
6) 至此数据恢复完成,继续ADG 关系恢复:
备库:
SQL> alter system set standby_file_management=auto;
System altered.
主库:
edit database 'ora19cb' set STATE='APPLY-ON' ;
edit database 'ora19c' set STATE='TRANSPORT-ON' ;
oracle@ora19c1:[/home/oracle]$dgmgrl /
DGMGRL for Linux: Release 19.0.0.0.0 - Production on Sat Feb 20 16:15:14 2021
Version 19.7.0.0.0
Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved.
Welcome to DGMGRL, type "help" for information.
Connected to "ora19c"
Connected as SYSDG.
DGMGRL> edit database 'ora19cb' set STATE='APPLY-ON' ;
Succeeded.
DGMGRL> show database verbose ora19cb;
Database - ora19cb
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: (unknown)
Apply Lag: (unknown)
Average Apply Rate: (unknown)
Active Apply Rate: (unknown)
Maximum Apply Rate: (unknown)
Real Time Query: OFF
Instance(s):
ora19cb1 (apply instance)
ora19cb2
Properties:
DGConnectIdentifier = 'yydbb_s'
ObserverConnectIdentifier = ''
FastStartFailoverTarget = ''
PreferredObserverHosts = ''
LogShipping = 'ON'
RedoRoutes = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'OPTIONAL'
MaxFailure = '0'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyLagThreshold = '30'
TransportLagThreshold = '30'
TransportDisconnectedThreshold = '30'
ApplyParallel = 'AUTO'
ApplyInstances = '0'
StandbyFileManagement = ''
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '0'
LogArchiveMinSucceedDest = '0'
DataGuardSyncLatency = '0'
LogArchiveTrace = '0'
LogArchiveFormat = ''
DbFileNameConvert = ''
LogFileNameConvert = ''
ArchiveLocation = ''
AlternateLocation = ''
StandbyArchiveLocation = ''
StandbyAlternateLocation = ''
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
LogXptStatus = '(monitor)'
SendQEntries = '(monitor)'
RecvQEntries = '(monitor)'
HostName(*)
StaticConnectIdentifier(*)
TopWaitEvents(*)
SidName(*)
(*) - Please check specific instance for the property value
Log file locations(*):
(*) - Check specific instance for log file locations.
Database Status:
SUCCESS
DGMGRL> show database verbose ora19c;
Database - ora19c
Role: PRIMARY
Intended State: TRANSPORT-OFF
Instance(s):
ora19c1
Error: ORA-16739: redo transport service for member "ora19cb" is running
ora19c2
Error: ORA-16739: redo transport service for member "ora19cb" is running
Properties:
DGConnectIdentifier = 'yydb_p'
ObserverConnectIdentifier = ''
FastStartFailoverTarget = ''
PreferredObserverHosts = ''
LogShipping = 'ON'
RedoRoutes = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'optional'
MaxFailure = '0'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyLagThreshold = '30'
TransportLagThreshold = '30'
TransportDisconnectedThreshold = '30'
ApplyParallel = 'AUTO'
ApplyInstances = '0'
StandbyFileManagement = ''
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '0'
LogArchiveMinSucceedDest = '0'
DataGuardSyncLatency = '0'
LogArchiveTrace = '0'
LogArchiveFormat = ''
DbFileNameConvert = ''
LogFileNameConvert = ''
ArchiveLocation = ''
AlternateLocation = ''
StandbyArchiveLocation = ''
StandbyAlternateLocation = ''
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
LogXptStatus = '(monitor)'
SendQEntries = '(monitor)'
RecvQEntries = '(monitor)'
HostName(*)
StaticConnectIdentifier(*)
TopWaitEvents(*)
SidName(*)
(*) - Please check specific instance for the property value
Log file locations(*):
(*) - Check specific instance for log file locations.
Database Status:
ERROR
DGMGRL> edit database 'ora19c' set STATE='TRANSPORT-ON' ;
Succeeded.
DGMGRL> show database verbose ora19c;
Database - ora19c
Role: PRIMARY
Intended State: TRANSPORT-ON
Instance(s):
ora19c1
ora19c2
Properties:
DGConnectIdentifier = 'yydb_p'
ObserverConnectIdentifier = ''
FastStartFailoverTarget = ''
PreferredObserverHosts = ''
LogShipping = 'ON'
RedoRoutes = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'optional'
MaxFailure = '0'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyLagThreshold = '30'
TransportLagThreshold = '30'
TransportDisconnectedThreshold = '30'
ApplyParallel = 'AUTO'
ApplyInstances = '0'
StandbyFileManagement = ''
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '0'
LogArchiveMinSucceedDest = '0'
DataGuardSyncLatency = '0'
LogArchiveTrace = '0'
LogArchiveFormat = ''
DbFileNameConvert = ''
LogFileNameConvert = ''
ArchiveLocation = ''
AlternateLocation = ''
StandbyArchiveLocation = ''
StandbyAlternateLocation = ''
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
LogXptStatus = '(monitor)'
SendQEntries = '(monitor)'
RecvQEntries = '(monitor)'
HostName(*)
StaticConnectIdentifier(*)
TopWaitEvents(*)
SidName(*)
(*) - Please check specific instance for the property value
Log file locations(*):
(*) - Check specific instance for log file locations.
Database Status:
SUCCESS
7) 重启备库集群,查看数据库正常传输应用,至此问题解决!