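Understanding the basic principles of Oracle backup and recovery through changes in the SCN
The SCN is the clock hanging on Oracle's wall. Getting up in the morning is the "wake-up SCN"; eating breakfast is the "breakfast SCN"; leaving for work is the "out-the-door SCN". Every activity corresponds to an SCN. We can use one of Oracle's built-in packages to get the system SCN (note: this is only the system SCN, because Oracle also has the commit SCN, checkpoint SCN, select SCN, and so on).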
[sql]
SQL> select dbms_flashback.get_system_change_number "system's scn" from dual;
system's scn
------------
555956
Internally Oracle has only one SCN; all the others are derived from it. We can also look at the smallest SCN in the database.
[sql]
SQL> select creation_change# "min scn in oracle" from v$datafile where file#=1;
min scn in oracle
-----------------
9
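Whatever we do to Oracle, good or bad, she records it by SCN in her memory (the redo log) and dares not forget it. Because the SCN only ever increases, once we know the relevant SCN we can find that moment and what we did to Oracle then. That is why the SCN matters.
Whatever we do to Oracle is recorded in the current log group, which we can query through v$log.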
[sql]
SQL> select group#,sequence#,status from v$log;
GROUP# SEQUENCE# STATUS
---------- ---------- ----------------
1 5 CURRENT
2 3 INACTIVE
3 4 INACTIVE
Next, let's do something to Oracle. We create a table t with two columns; the scn column is approximately the SCN at which the transaction started.
[sql]
SQL> create table t(id int,scn number) tablespace users;
Table created.
SQL> insert into t values(1,dbms_flashback.get_system_change_number);
1 row created.
SQL> commit;
Commit complete.
SQL> select * from t;
ID SCN
---------- ----------
1 585887
Let's set that aside for a moment and look at first_change# in v$log.
[sql]
SQL> alter session set nls_date_format='yyyy/mm/dd hh24:mi:ss';
Session altered.
SQL> select group#,status,first_change#,first_time from v$log;
GROUP# STATUS FIRST_CHANGE# FIRST_TIME
---------- ---------------- ------------- -------------------
1 CURRENT 583374 2012/07/17 19:59:23
2 INACTIVE 560959 2012/07/17 17:13:32
3 INACTIVE 560981 2012/07/17 17:14:33
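Here first_change# and first_time are the same thing: two representations of one point on the SCN clock. first_change# is the system SCN taken when a log group becomes the current group, and it serves as that group's lowest, or starting, SCN. The SCN of anything we do while the group is current will be greater than first_change#.
Back to our work: we switch log files so that the current log group gets archived.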
[sql]
SQL> alter system switch logfile;
System altered.
Look at first_change# in v$log again.
[sql]
SQL> select group#,status,first_change#,first_time from v$log;
GROUP# STATUS FIRST_CHANGE# FIRST_TIME
---------- ---------------- ------------- -------------------
1 ACTIVE 583374 2012/07/17 19:59:23
2 CURRENT 586090 2012/07/18 09:35:40
3 INACTIVE 560981 2012/07/17 17:14:33
The current log group is now group 2, and first_change# has changed as well.
Now we continue our unfinished work.
[sql]
SQL> insert into t values(2,dbms_flashback.get_system_change_number);
1 row created.
SQL> commit;
Commit complete.
SQL> select * from t;
ID SCN
---------- ----------
1 585887
2 586129
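Here we can see that 586129 is greater than the first_change# (586090) of the current log group 2. This confirms that first_change# is the lowest SCN in the current log group: every SCN generated by what we do afterwards is greater than it.
We switch logs again so that log group 2 is archived.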
[sql]
SQL> alter system switch logfile;
System altered.
SQL> select group#,status,first_change#,first_time from v$log;
GROUP# STATUS FIRST_CHANGE# FIRST_TIME
---------- ---------------- ------------- -------------------
1 ACTIVE 583374 2012/07/17 19:59:23
2 ACTIVE 586090 2012/07/18 09:35:40
3 CURRENT 586181 2012/07/18 09:39:21
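Now log group 3 has become the current group, and its first_change# has changed accordingly.
We carry on. To produce more archived logs, we keep inserting, committing, and switching.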
[sql]
SQL> insert into t values(3,dbms_flashback.get_system_change_number);
1 row created.
SQL> commit;
Commit complete.
SQL> alter system switch logfile;
System altered.
SQL> insert into t values (4,dbms_flashback.get_system_change_number);
1 row created.
SQL> commit;
Commit complete.
SQL> alter system switch logfile;
System altered.
SQL> insert into t values (5,dbms_flashback.get_system_change_number);
1 row created.
SQL> commit;
Commit complete.
SQL> alter system switch logfile;
System altered.
SQL> insert into t values(6,dbms_flashback.get_system_change_number);
1 row created.
SQL> commit;
Commit complete.
SQL> alter system switch logfile;
System altered.
SQL> insert into t values (7,dbms_flashback.get_system_change_number);
1 row created.
SQL> commit;
Commit complete.
SQL> alter system switch logfile;
System altered.
SQL> insert into t values (8,dbms_flashback.get_system_change_number);
1 row created.
SQL> commit;
Commit complete.
SQL> alter system switch logfile;
System altered.
SQL> select * from t;
ID SCN
---------- ----------
1 585887
2 586129
3 586643
4 586666
5 586692
6 586722
7 586751
8 586805
Let's check again which log group is current.
[sql]
SQL> select group#,status,first_change#,first_time from v$log;
GROUP# STATUS FIRST_CHANGE# FIRST_TIME
---------- ---------------- ------------- -------------------
1 ACTIVE 586734 2012/07/18 09:45:12
2 ACTIVE 586762 2012/07/18 09:46:15
3 CURRENT 586816 2012/07/18 09:47:15
The current log group is 3. So we insert again.
[sql]
SQL> insert into t values(9,dbms_flashback.get_system_change_number);
1 row created.
SQL> commit;
Commit complete.
This time we did not switch log groups. Then we insert once more.
[sql]
SQL> insert into t values(10,dbms_flashback.get_system_change_number);
1 row created.
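Note that this time we neither committed nor switched. So the redo for rows 9 and 10 is all in log group 3.
Now we run an experiment to illustrate the basic principles of backup and recovery.
Experiment: complete recovery from lost data files after a clean shutdown.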
[sql]
[oracle@localhost ~]$ sqlplus /nolog
SQL*Plus: Release 10.2.0.1.0 - Production on Tue Jul 17 20:48:19 2012
Copyright (c) 1982, 2005, Oracle. All rights reserved.
SQL> conn / as sysdba
Connected.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
[oracle@localhost ORCL]$ cd datafile/
[oracle@localhost datafile]$ ls
o1_mf_example_8050jhm7_.dbf o1_mf_temp_8050j34j_.tmp
o1_mf_sysaux_8050fk3w_.dbf o1_mf_undotbs1_8050fkc6_.dbf
o1_mf_system_8050fk2z_.dbf o1_mf_users_8050fkdh_.dbf
[oracle@localhost datafile]$ rm o1_mf_system_8050fk2z_.dbf
[oracle@localhost datafile]$ rm o1_mf_sysaux_8050fk3w_.dbf
[oracle@localhost datafile]$ rm o1_mf_users_8050fkdh_.dbf
[oracle@localhost datafile]$ rm o1_mf_undotbs1_8050fkc6_.dbf
What error will we get if we try to start the database now?
[sql]
[oracle@localhost ~]$ sqlplus /nolog
SQL*Plus: Release 10.2.0.1.0 - Production on Tue Jul 17 20:57:00 2012
Copyright (c) 1982, 2005, Oracle. All rights reserved.
SQL> conn / as sysdba
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 419430400 bytes
Fixed Size 1219760 bytes
Variable Size 142607184 bytes
Database Buffers 272629760 bytes
Redo Buffers 2973696 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 1 - see DBWR trace file
ORA-01110: data file 1:
'/u01/app/oracle/oradata/ORCL/datafile/o1_mf_system_8050fk2z_.dbf'
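Because the control file is still there, we reach the mount state, an intermediate state in which we can do many things. Oracle reports that it cannot identify or lock file 1. So we take the files one at a time, starting by copying file 1 back from the cold backup.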
[oracle@localhost datafile]$ cp o1_mf_system_8050fk2z_.dbf /u01/app/oracle/oradata/ORCL/datafile
Then we try to open the database and see what error we get.
[sql]
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1:
'/u01/app/oracle/oradata/ORCL/datafile/o1_mf_system_8050fk2z_.dbf'
The error is different this time: file 1 needs media recovery. What does Oracle base this error on? To find out, we need two views.
[sql]
SQL> select file#,checkpoint_change# from v$datafile;
FILE# CHECKPOINT_CHANGE#
---------- ------------------
1 587004
2 587004
3 587004
4 587004
5 587004
SQL> select file#,checkpoint_change# from v$datafile_header;
FILE# CHECKPOINT_CHANGE#
---------- ------------------
1 583375
2 0
3 0
4 0
5 587004
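These two views take their information from entirely different sources. v$datafile comes from the control file; v$datafile_header comes from the header of each data file. We have just copied file 1 back, so Oracle can read the SCN in its header, while files 2, 3, and 4 have been deleted and cannot be read. But the SCN of file 1 differs between the two places. Remember: Oracle compares horizontally, not vertically. That is, it compares each file's header with the control file's record for that same file; it does not compare file 1 with file 3. A necessary condition for opening the database is that the SCN in the control file and the SCN in each data file header agree. The changes with SCNs above 583375 and up to 587004 are in the redo logs (archived logs for the most part, with the most recent changes still in the online redo logs), and each SCN corresponds to the operations recorded with it.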
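The horizontal comparison described above can be sketched in a few lines of Python. This is only a toy model, not how Oracle implements the check: the two dictionaries are copied from the v$datafile and v$datafile_header output above, and the function name `classify_files` and the status strings are made up for illustration. A header SCN of 0 is taken to mean that the header could not be read, as with the deleted files 2, 3, and 4.

```python
# Toy model of the "horizontal" check: each file's header SCN is compared
# with the control file's SCN for that same file, never with another file.

# checkpoint_change# per file#, as shown by v$datafile (control file)
control_file = {1: 587004, 2: 587004, 3: 587004, 4: 587004, 5: 587004}
# checkpoint_change# per file#, as shown by v$datafile_header (file headers)
file_headers = {1: 583375, 2: 0, 3: 0, 4: 0, 5: 587004}


def classify_files(control, headers):
    """Return a hypothetical status string for every file in the control file."""
    status = {}
    for file_no, ctl_scn in sorted(control.items()):
        hdr_scn = headers.get(file_no, 0)
        if hdr_scn == 0:
            # Assumption: 0 means the header could not be read at all.
            status[file_no] = "header unreadable"
        elif hdr_scn < ctl_scn:
            status[file_no] = "needs media recovery"
        elif hdr_scn == ctl_scn:
            status[file_no] = "consistent"
        else:
            status[file_no] = "header newer than control file"
    return status


if __name__ == "__main__":
    for file_no, state in classify_files(control_file, file_headers).items():
        print(file_no, state)
```

In this model file 1 is the only readable file that lags behind the control file, which matches the ORA-01113 error raised for file 1.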
[sql]
SQL> select sequence#,first_change#,next_change# from v$archived_log;
SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
---------- ------------- ------------
5 544404 558719
6 558719 559931
7 559931 560709
8 560709 560959
9 560959 560981
10 560981 583374
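What is next_change#? It is the system SCN taken when a log group switches from current to non-current, and it serves as the upper bound of that log's SCNs. first_change# marks the start of its time as current; next_change# marks the end of it.
We know that every change with an SCN lower than 583375 has already been written to the data file. Now we need to determine which first_change#/next_change# pair 583375 falls between, which gives us the starting point for roll-forward in the broad sense.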
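To make the relationship between first_change# and next_change# concrete, here is a small Python simulation. It is a sketch under loud assumptions: the class `ToyRedoStream` and its methods are invented for this illustration, the SCN simply advances by one per event (a real SCN advances for many other reasons), and nothing here reflects Oracle internals.

```python
class ToyRedoStream:
    """Hypothetical model: one SCN counter and one CURRENT log at a time."""

    def __init__(self, start_scn, start_sequence=1):
        self.scn = start_scn
        self.sequence = start_sequence
        self.first_change = start_scn  # SCN taken when this log became CURRENT
        self.archived = []             # (sequence, first_change, next_change)

    def commit(self):
        """Advance the SCN for one change and return the SCN it received."""
        self.scn += 1
        return self.scn

    def switch_logfile(self):
        """End the CURRENT log: the boundary SCN closes it and opens the next."""
        self.scn += 1
        boundary = self.scn
        self.archived.append((self.sequence, self.first_change, boundary))
        self.sequence += 1
        self.first_change = boundary
        return boundary


if __name__ == "__main__":
    redo = ToyRedoStream(start_scn=100)
    a = redo.commit()
    redo.switch_logfile()
    b = redo.commit()
    redo.switch_logfile()
    print(a, b, redo.archived)
```

In this model the next_change# of one archived entry is always the first_change# of the following one, and every change made while a log was current falls at or above its first_change# and below its next_change#.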
[sql]
SQL> select sequence#,first_change#,next_change# from v$archived_log
2 where 583375>=first_change# and
3 583375<=next_change#;
SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
---------- ------------- ------------
11 583374 586090
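The same lookup can be modelled in Python, using the v$archived_log rows shown so far. This is an illustrative sketch and `find_sequence` is a made-up helper name. One deliberate difference from the query: the query uses `>=` and `<=` on both ends, so an SCN exactly equal to a boundary such as 583374 would match two rows, whereas the sketch treats each log as the half-open range from first_change# up to, but not including, next_change#.

```python
# (sequence#, first_change#, next_change#) rows copied from v$archived_log
archived_logs = [
    (5, 544404, 558719),
    (6, 558719, 559931),
    (7, 559931, 560709),
    (8, 560709, 560959),
    (9, 560959, 560981),
    (10, 560981, 583374),
    (11, 583374, 586090),
]


def find_sequence(scn, logs):
    """Return the sequence# whose [first_change#, next_change#) range holds scn."""
    for sequence, first_change, next_change in logs:
        if first_change <= scn < next_change:
            return sequence
    return None  # the SCN is not covered by any listed log


if __name__ == "__main__":
    print(find_sequence(583375, archived_logs))
```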
So 583375 falls between the first_change# and next_change# of archived log 11, and recovery starts from archived log 11. How many archived logs do we need in total?
[sql]
SQL> select sequence#,first_change#,next_change# from v$archived_log
2 where sequence#>=11;
SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
---------- ------------- ------------
11 583374 586090
12 586090 586181
13 586181 586656
14 586656 586676
15 586676 586704
16 586704 586734
17 586734 586762
18 586762 586816
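From the output above, to get all the data back we need the archived logs up to sequence 18. What is special about these first_change# and next_change# values?
The next_change# of archived log 11 is the first_change# of archived log 12, and so on. Logically, then, all these archived logs form a single log. That is why the archived logs must be contiguous! If archived log 13 is damaged, you can only recover up to the next_change# of log 12, and any number of later archived logs is useless.
Next, we start the recovery.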
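The contiguity requirement can be checked with a short Python sketch. The rows are copied from the v$archived_log output above; the function `recoverable_until` and its `missing` parameter are hypothetical names introduced only for this illustration, and the walk is a simplified model of roll-forward, not Oracle's algorithm.

```python
# (sequence#, first_change#, next_change#) for archived logs 11 to 18
archived_logs = [
    (11, 583374, 586090),
    (12, 586090, 586181),
    (13, 586181, 586656),
    (14, 586656, 586676),
    (15, 586676, 586704),
    (16, 586704, 586734),
    (17, 586734, 586762),
    (18, 586762, 586816),
]


def recoverable_until(logs, start_scn, missing=frozenset()):
    """Walk the chain from start_scn and return the SCN where it must stop."""
    reached = start_scn
    for sequence, first_change, next_change in sorted(logs):
        if next_change <= reached:
            continue  # this log lies entirely before the starting SCN
        if sequence in missing or first_change > reached:
            break     # lost log or gap in the chain: later logs are useless
        reached = next_change
    return reached


if __name__ == "__main__":
    print(recoverable_until(archived_logs, 583375))
    print(recoverable_until(archived_logs, 583375, missing={13}))
```

With every log present the walk reaches 586816, the next_change# of log 18. With log 13 marked as missing it stops at 586181, the next_change# of log 12, even though logs 14 to 18 are intact.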
[sql]
SQL> recover datafile 1;
ORA-00279: change 583375 generated at 07/17/2012 19:59:23 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_11_%u_.ar
c
ORA-00280: change 583375 for thread 1 is in sequence #11
Specify log: {=suggested | filename | AUTO | CANCEL}
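Oracle tells us that change 583375 is needed for thread 1 and that archived log 11 should be in the flash recovery area. There are four ways to answer. Pressing Enter accepts Oracle's suggestion, and Oracle looks for the file in the flash recovery area itself; that is what we do here. The second option is for when the log is not in the default location: you tell Oracle where it is by giving the absolute path and file name. The third option, AUTO, saves answering one prompt at a time when there are many archived logs. The fourth option, CANCEL, is for stopping partway through or when there are no more archived logs.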
[sql]
ORA-00279: change 586090 generated at 07/18/2012 09:35:40 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_12_%u_.ar
c
ORA-00280: change 586090 for thread 1 is in sequence #12
ORA-00278: log file
'/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_11_80d4q
dmh_.arc' no longer needed for this recovery
Specify log: {=suggested | filename | AUTO | CANCEL}
Oracle now tells us that log 12 is needed for thread 1. Let's put that on hold for a moment and first look at the SCN in the data file headers.
[sql]
SQL> select file#,checkpoint_change# from v$datafile_header;
FILE# CHECKPOINT_CHANGE#
---------- ------------------
1 586090
2 0
3 0
4 0
5 587004
See that? The SCN of file 1 has become 586090, which is the first_change# of archived log 12. No wonder Oracle tells us log 12 is required. Next, we type auto.
[sql]
Specify log: {=suggested | filename | AUTO | CANCEL}
auto
ORA-00279: change 586181 generated at 07/18/2012 09:39:21 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_13_%u_.ar
c
ORA-00280: change 586181 for thread 1 is in sequence #13
ORA-00278: log file
'/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_12_80d4y
9fp_.arc' no longer needed for this recovery
ORA-00279: change 586656 generated at 07/18/2012 09:42:06 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_14_%u_.ar
c
ORA-00280: change 586656 for thread 1 is in sequence #14
ORA-00278: log file
'/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_13_80d53
g42_.arc' no longer needed for this recovery
ORA-00279: change 586676 generated at 07/18/2012 09:42:59 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_15_%u_.ar
c
ORA-00280: change 586676 for thread 1 is in sequence #15
ORA-00278: log file
'/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_14_80d55
41h_.arc' no longer needed for this recovery
ORA-00279: change 586704 generated at 07/18/2012 09:44:03 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_16_%u_.ar
c
ORA-00280: change 586704 for thread 1 is in sequence #16
ORA-00278: log file
'/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_15_80d57
315_.arc' no longer needed for this recovery
Log applied.
Media recovery complete.
See that? Oracle only prompted for archived logs up to 16; there was no prompt for 17 or 18. Why? Let's look at the SCN in the header of file 1 again.
[sql]
SQL> select file#,checkpoint_change# from v$datafile_header;
FILE# CHECKPOINT_CHANGE#
---------- ------------------
1 587003
2 0
3 0
4 0
5 587004
587003 is greater than the next_change# of archived log 18 (586816), isn't it? Let's see which log group is current.
[sql]
SQL> select group#,sequence#,status,first_change# from v$log;
GROUP# SEQUENCE# STATUS FIRST_CHANGE#
---------- ---------- ---------------- -------------
1 17 INACTIVE 586734
3 19 CURRENT 586816
2 18 INACTIVE 586762
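As we can see, the first_change# of log sequence 19 is 586816, while the SCN in the data file header is 587003. When we typed recover datafile 1, Oracle performed a complete recovery, whose start and end points are already determined: the start comes from the data file header, the end from the control file. There was no prompt for 17 and 18 because archived redo logs 17 and 18 are copies of what is in online redo log groups 1 and 2, and Oracle looks in the online redo log files first. Put another way, in a complete recovery Oracle finds the online redo log files by itself; in an incomplete recovery we can type in the absolute path and name of an online redo log file. The current log group is 3, its first_change# is 586816, and 587003 is greater than that, so Oracle evidently used the current log file as well.
We continue the recovery. This time we copy all the remaining data files back. Then they all move forward together, and only when they are in step can they stop at the same time, which leaves Oracle in a consistent state.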
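The behaviour observed in this session (prompts for archived logs 11 to 16, then silence) can be reproduced with a toy Python model. It assumes, as the text argues, that redo still present in an online log is read from there in preference to the archived copy. The function `plan_redo`, the dictionaries, and the use of `None` for the open end of the current log are all inventions for this sketch, not Oracle's actual algorithm.

```python
# sequence# -> (first_change#, next_change#), copied from v$archived_log
archived = {
    11: (583374, 586090), 12: (586090, 586181), 13: (586181, 586656),
    14: (586656, 586676), 15: (586676, 586704), 16: (586704, 586734),
    17: (586734, 586762), 18: (586762, 586816),
}
# Online log sequences from v$log; None marks the still-open CURRENT log.
online = {17: (586734, 586762), 18: (586762, 586816), 19: (586816, None)}


def plan_redo(start_scn, target_scn, archived, online):
    """List (sequence#, source) pairs needed to go from start_scn to target_scn."""
    plan = []
    for sequence in sorted(set(archived) | set(online)):
        # Assumption: an online copy, when it exists, is preferred.
        source = "online" if sequence in online else "archived"
        first_change, next_change = (
            online[sequence] if sequence in online else archived[sequence]
        )
        if next_change is not None and next_change <= start_scn:
            continue  # redo entirely older than the data file header
        if first_change >= target_scn:
            break     # redo entirely newer than the control file's SCN
        plan.append((sequence, source))
    return plan


if __name__ == "__main__":
    # Start: header SCN of file 1; end: checkpoint SCN in the control file.
    for sequence, source in plan_redo(583375, 587004, archived, online):
        print(sequence, source)
```

In this model only sequences 11 to 16 come from archived logs, which is where the prompts stopped; sequences 17, 18, and 19 are served by the online redo logs.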
[sql]
SQL> recover database;
ORA-00279: change 583375 generated at 07/17/2012 19:59:23 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_11_%u_.ar
c
ORA-00280: change 583375 for thread 1 is in sequence #11
Specify log: {=suggested | filename | AUTO | CANCEL}
auto
ORA-00279: change 586090 generated at 07/18/2012 09:35:40 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_12_%u_.ar
c
ORA-00280: change 586090 for thread 1 is in sequence #12
ORA-00278: log file
'/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_11_80d4q
dmh_.arc' no longer needed for this recovery
ORA-00279: change 586181 generated at 07/18/2012 09:39:21 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_13_%u_.ar
c
ORA-00280: change 586181 for thread 1 is in sequence #13
ORA-00278: log file
'/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_12_80d4y
9fp_.arc' no longer needed for this recovery
ORA-00279: change 586656 generated at 07/18/2012 09:42:06 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_14_%u_.ar
c
ORA-00280: change 586656 for thread 1 is in sequence #14
ORA-00278: log file
'/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_13_80d53
g42_.arc' no longer needed for this recovery
ORA-00279: change 586676 generated at 07/18/2012 09:42:59 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_15_%u_.ar
c
ORA-00280: change 586676 for thread 1 is in sequence #15
ORA-00278: log file
'/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_14_80d55
41h_.arc' no longer needed for this recovery
ORA-00279: change 586704 generated at 07/18/2012 09:44:03 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_16_%u_.ar
c
ORA-00280: change 586704 for thread 1 is in sequence #16
ORA-00278: log file
'/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2012_07_18/o1_mf_1_15_80d57
315_.arc' no longer needed for this recovery
Log applied.
Media recovery complete.
Now we can open the database.
[sql]
SQL> alter database open;
Database altered.
Author: linwaterbin