oracle基于rman实现坏块介质的恢复

文档课题:oracle基于rman实现坏块介质的恢复.
数据库:oracle 11.2.0.4
对于物理损坏的数据块,可以通过RMAN块介质恢复(BLOCK MEDIA RECOVERY)功能来完成受损块的恢复,而不需要恢复整个数据库或所有文件来修复少量受损的数据块.前提是要有一个可用的RMAN备份.以下演示使用rman实现坏块恢复的实验过程.
1、前期准备
1.1、建测试表
SYS@orcl 16:19:19> select * from v$version where rownum<2;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
--创建用于演示的datafile.
SYS@orcl 16:38:29> create tablespace tbs_tmp datafile '/u01/app/oracle/oradata/orcl/tbs_tmp01.dbf' size 10m autoextend on;

Tablespace created.

SYS@orcl 16:40:48> grant select on dba_objects to scott;

Grant succeeded.

SYS@orcl 16:39:17> conn scott/tiger;
Connected.

Session altered.
--基于新的数据文件创建tb_tmp表.
SCOTT@orcl 16:41:53> create table tb_tmp tablespace tbs_tmp as select * from dba_objects;

Table created.
SYS@orcl 16:42:56> col file_name for a60
SYS@orcl 16:43:03> select file_id,file_name from dba_data_files where tablespace_name='TBS_TMP';

FILE_ID FILE_NAME
---------- ------------------------------------------------------------
8 /u01/app/oracle/oradata/orcl/tbs_tmp01.dbf
--查表tb_tmp的信息,如文件信息、头部块、总块数.
SYS@orcl 16:46:38> col segment_name for a15
SYS@orcl 16:46:45> select segment_name,header_file,header_block,blocks from dba_segments where segment_name='TB_TMP' and owner='SCOTT'

SEGMENT_NAME HEADER_FILE HEADER_BLOCK BLOCKS
--------------- ----------- ------------ ----------
TB_TMP 8 130 1408
1.2、备份数据文件
--使用rman备份对应的数据文件.
SYS@orcl 16:47:56> select file#,name from v$datafile;

FILE# NAME
---------- --------------------------------------------------
1 /u01/app/oracle/oradata/orcl/system01.dbf
2 /u01/app/oracle/oradata/orcl/sysaux01.dbf
3 /u01/app/oracle/oradata/orcl/undotbs01.dbf
4 /u01/app/oracle/oradata/orcl/users01.dbf
5 /u01/app/oracle/oradata/orcl/leo_ts01.dbf
6 /u01/app/oracle/oradata/orcl/tcp_acc01.dbf
7 /u01/app/oracle/oradata/orcl/tcp_fin01.dbf
8 /u01/app/oracle/oradata/orcl/tbs_tmp01.dbf

8 rows selected.
[oracle@leo-11g-ogg ~]$ rman target /

Recovery Manager: Release 11.2.0.4.0 - Production on Sun Dec 25 16:47:26 2022

Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.

connected to target database: ORCL (DBID=1650612057)

RMAN> backup datafile 8 tag=health;

Starting backup at 25-DEC-22
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=11 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00008 name=/u01/app/oracle/oradata/orcl/tbs_tmp01.dbf
channel ORA_DISK_1: starting piece 1 at 25-DEC-22
channel ORA_DISK_1: finished piece 1 at 25-DEC-22
piece handle=/u01/app/oracle/fast_recovery_area/ORCL/backupset/2022_12_25/o1_mf_nnndf_HEALTH_ktj3mxby_.bkp tag=HEALTH comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 25-DEC-22
2、模拟块损坏以及恢复
2.1、单块损坏及恢复
2.1.1、模拟单块损坏
--使用linux自带的dd命令损坏单块数据块.
[oracle@leo-11g-ogg ~]$ dd of=/u01/app/oracle/oradata/orcl/tbs_tmp01.dbf bs=8192 conv=notrunc seek=130 <<EOF
> Corrupted block!
> EOF
0+1 records in
0+1 records out
17 bytes (17 B) copied, 7.07e-05 s, 240 kB/s
--清空buffer cache.
SYS@orcl 16:48:07> alter system flush buffer_cache;

System altered.
说明:此处需要清空缓冲区缓存,否则后续查询会正常显示结果.
--查表tb_tmp,收到ora-01578告警.
SYS@orcl 18:28:14> select count(*) from scott.tb_tmp;
select count(*) from scott.tb_tmp
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 8, block # 130)
ORA-01110: data file 8: '/u01/app/oracle/oradata/orcl/tbs_tmp01.dbf'
说明:如上所示块号130被损坏后,无法显示查询结果.
--查询视图v$database_block_corruption,提示有坏块,注意:该视图可能不会返回任何数据,如无返回,先执行backup validate.
SYS@orcl 18:28:24> select * from v$database_block_corruption;

FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
---------- ---------- ---------- ------------------ ---------
8 130 1 0 CORRUPT
2.1.2、单块损坏恢复
--下面使用blockrecover来恢复单块的损坏.
[oracle@leo-11g-ogg orcl]$ rman target /

Recovery Manager: Release 11.2.0.4.0 - Production on Sun Dec 25 20:18:39 2022

Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.

connected to target database: ORCL (DBID=1650612057)

RMAN> blockrecover datafile 8 block 130;

Starting recover at 25-DEC-22
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1523 device type=DISK

channel ORA_DISK_1: restoring block(s)
channel ORA_DISK_1: specifying block(s) to restore from backup set
restoring blocks of datafile 00008
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/ORCL/backupset/2022_12_25/o1_mf_nnndf_HEALTH_ktj3mxby_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/ORCL/backupset/2022_12_25/o1_mf_nnndf_HEALTH_ktj3mxby_.bkp tag=HEALTH
channel ORA_DISK_1: restored block(s) from backup piece 1
channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01

starting media recovery
media recovery complete, elapsed time: 00:00:07

Finished recover at 25-DEC-22
2.1.3、数据验证
--再次查询表scott.tb_tmp.
SYS@orcl 20:07:27> select count(*) from scott.tb_tmp;

COUNT(*)
----------
91336
说明:表scott.tb_tmp成功恢复.
2.2、多块数据块损坏
2.2.1、模拟多块损坏
--下面使用linux dd命令对不连续的多个数据块进行损坏.
[oracle@leo-11g-ogg ~]$ dd of=/u01/app/oracle/oradata/orcl/tbs_tmp01.dbf bs=8192 conv=notrunc seek=133 <<EOF
> New corrupted block!
> EOF
0+1 records in
0+1 records out
21 bytes (21 B) copied, 9.4e-05 s, 223 kB/s
[oracle@leo-11g-ogg ~]$ dd of=/u01/app/oracle/oradata/orcl/tbs_tmp01.dbf bs=8192 conv=notrunc seek=143 <<EOF
> New corrupted block!
> EOF
0+1 records in
0+1 records out
21 bytes (21 B) copied, 6.9671e-05 s, 301 kB/s
[oracle@leo-11g-ogg ~]$ dd of=/u01/app/oracle/oradata/orcl/tbs_tmp01.dbf bs=8192 conv=notrunc seek=153 <<EOF
> New corrupted block!
> EOF
0+1 records in
0+1 records out
21 bytes (21 B) copied, 8.2766e-05 s, 254 kB/s
SYS@orcl 20:27:01> alter system flush buffer_cache;

System altered.
--查询时提示块133被损坏,注意此前损坏过多块数据块,但查询时从块号最小的开始提示,若133号数据块被修复后则提示133之后的坏块.
SYS@orcl 20:27:13> select count(*) from scott.tb_tmp;
select count(*) from scott.tb_tmp
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 8, block # 133)
ORA-01110: data file 8: '/u01/app/oracle/oradata/orcl/tbs_tmp01.dbf'
--查视图v$database_block_corruption无任何记录.
SYS@orcl 20:27:38> select * from v$database_block_corruption;

no rows selected
--使用backup validate来校验数据文件
RMAN> backup validate datafile 8;

Starting backup at 25-DEC-22
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=8 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00008 name=/u01/app/oracle/oradata/orcl/tbs_tmp01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
8 FAILED 0 199 1664 3363562 --字段status为failed
File Name: /u01/app/oracle/oradata/orcl/tbs_tmp01.dbf
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 1305
Index 0 0
Other 3 160 --有3个blocks failing

validate found one or more corrupt blocks
See trace file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2718.trc for details
Finished backup at 25-DEC-22
--再次查询v$database_block_corruption显示3个损坏的块.
SYS@orcl 20:33:09> select * from v$database_block_corruption;

FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
---------- ---------- ---------- ------------------ ---------
8 133 1 0 CORRUPT
8 143 1 0 CORRUPT
8 153 1 0 CORRUPT
2.2.2、多块损坏恢复
--下面使用blockrecover corruption list来恢复,如下所示此前被校验的坏块一并被恢复.
RMAN> blockrecover corruption list;

Starting recover at 25-DEC-22
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=8 device type=DISK

channel ORA_DISK_1: restoring block(s)
channel ORA_DISK_1: specifying block(s) to restore from backup set
restoring blocks of datafile 00008
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/ORCL/backupset/2022_12_25/o1_mf_nnndf_HEALTH_ktj3mxby_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/ORCL/backupset/2022_12_25/o1_mf_nnndf_HEALTH_ktj3mxby_.bkp tag=HEALTH
channel ORA_DISK_1: restored block(s) from backup piece 1
channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01

starting media recovery
media recovery complete, elapsed time: 00:00:03

Finished recover at 25-DEC-22
2.2.3、数据验证
SYS@orcl 20:34:56> select count(*) from scott.tb_tmp;

COUNT(*)
----------
91336
3、坏块对象的定位与影响
3.1、查块对应数据库对象
--查询块号为163上的对象.
SYS@orcl 20:41:00> col object_name for a20
SYS@orcl 20:41:08> set line 200
SYS@orcl 20:41:13>
SYS@orcl 20:41:13> r
1 select dbms_rowid.rowid_object(rowid) object_id,
2 dbms_rowid.rowid_relative_fno(rowid) file_id,
3 dbms_rowid.rowid_block_number(rowid) block_id,
4 owner,
5 object_name,
6 object_id
7 from scott.tb_tmp
8 where dbms_rowid.rowid_block_number(rowid) = 163
9* and rownum <= 2

OBJECT_ID FILE_ID BLOCK_ID OWNER OBJECT_NAME OBJECT_ID
---------- ---------- ---------- ------------------------------ -------------------- ----------
96100 8 163 PUBLIC GV$BACKUP_SET 2364
96100 8 163 SYS GV_$BACKUP_PIECE 2365
3.2、损坏数据块
--使用此前的方法损坏163、173数据块.
[oracle@leo-11g-ogg ~]$ dd of=/u01/app/oracle/oradata/orcl/tbs_tmp01.dbf bs=8192 conv=notrunc seek=163 <<EOF
> New corrupted block!
> EOF
0+1 records in
0+1 records out
21 bytes (21 B) copied, 0.000145756 s, 144 kB/s
[oracle@leo-11g-ogg ~]$ dd of=/u01/app/oracle/oradata/orcl/tbs_tmp01.dbf bs=8192 conv=notrunc seek=173 <<EOF
> New corrupted block!
> EOF
0+1 records in
0+1 records out
21 bytes (21 B) copied, 0.000191471 s, 110 kB/s
SYS@orcl 20:44:40> alter system flush buffer_cache;

System altered.
3.3、查询影响
3.3.1、坏块查询
--对于坏块对象无法进行聚合汇总等操作
SYS@orcl 20:46:26> select count(*) from scott.tb_tmp;
select count(*) from scott.tb_tmp
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 8, block # 163)
ORA-01110: data file 8: '/u01/app/oracle/oradata/orcl/tbs_tmp01.dbf'
--使用基于之前查询到的object_id来查询
SYS@orcl 20:48:32> select owner,object_name,object_id from scott.tb_tmp where object_id in(2364,2365);
select owner,object_name,object_id from scott.tb_tmp where object_id in(2364,2365)
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 8, block # 163)
ORA-01110: data file 8: '/u01/app/oracle/oradata/orcl/tbs_tmp01.dbf'
3.3.2、未损坏块查询
--如下面的查询,位于损坏块上的数据无法被查询到,但对于未损坏的依旧可以查询,下面查询块161上的对象
SYS@orcl 20:50:40> select owner,object_name,object_id from scott.tb_tmp where dbms_rowid.rowid_block_number(rowid)=161 and rownum<3;

OWNER OBJECT_NAME OBJECT_ID
------------------------------ -------------------- ----------
SYS GV_$LATCHNAME 2203
PUBLIC GV$LATCHNAME 2204
--定位受损块所含的对象.
SYS@orcl 21:11:03> col segment_name for a15
SYS@orcl 21:11:16> col TABLESPACE_NAME for a18
SYS@orcl 21:11:32> col PARTITION_NAME for a15
SYS@orcl 21:11:44> col owner for a15
SYS@orcl 21:11:55> r
1 SELECT tablespace_name, segment_type, owner, segment_name, partition_name
2 FROM dba_extents
3 WHERE file_id = &file_id
4* AND &block_id BETWEEN block_id AND block_id + blocks - 1
Enter value for file_id: 8
old 3: WHERE file_id = &file_id
new 3: WHERE file_id = 8
Enter value for block_id: 163
old 4: AND &block_id BETWEEN block_id AND block_id + blocks - 1
new 4: AND 163 BETWEEN block_id AND block_id + blocks - 1

TABLESPACE_NAME SEGMENT_TYPE OWNER SEGMENT_NAME PARTITION_NAME
------------------ ------------------ --------------- --------------- ---------------
TBS_TMP TABLE SCOTT TB_TMP
3.4、备份影响
3.4.1、验证无法备份
对于损坏的数据文件,缺省情况下,不能对其进行备份,如下:
RMAN> backup datafile 8 tag='corruption';

Starting backup at 25-DEC-22
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=2284 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00008 name=/u01/app/oracle/oradata/orcl/tbs_tmp01.dbf
channel ORA_DISK_1: starting piece 1 at 25-DEC-22
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 12/25/2022 21:13:10
ORA-19566: exceeded limit of 0 corrupt blocks for file /u01/app/oracle/oradata/orcl/tbs_tmp01.dbf
3.4.2、解决备份Issue
--需要设定允许损坏块的数量之后才能进行备份
RMAN> run{
2> set maxcorrupt for datafile 8 to 2;
3> backup datafile 8 tag='corruption';
4> }

executing command: SET MAX CORRUPT

Starting backup at 25-DEC-22
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00008 name=/u01/app/oracle/oradata/orcl/tbs_tmp01.dbf
channel ORA_DISK_1: starting piece 1 at 25-DEC-22
channel ORA_DISK_1: finished piece 1 at 25-DEC-22
piece handle=/u01/app/oracle/fast_recovery_area/ORCL/backupset/2022_12_25/o1_mf_nnndf_CORRUPTION_ktjm614t_.bkp tag=CORRUPTION comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 25-DEC-22
--查看备份信息如下,应在修复坏块后重新备份以避免由于保留策略导致先前可用的备份被aged out.
RMAN> list backup summary;


List of Backups
===============
Key TY LV S Device Type Completion Time #Pieces #Copies Compressed Tag
------- -- -- - ----------- --------------- ------- ------- ---------- ---
14 B F A DISK 24-DEC-22 1 1 NO TAG20221224T094143
15 B F A DISK 24-DEC-22 1 1 NO TAG20221224T094143
16 B F A DISK 25-DEC-22 1 1 NO HEALTH
17 B F A DISK 25-DEC-22 1 1 NO CORRUPTION
4、总结
a、对于受损的数据块,仅仅坏块上的数据无法被查询或读取,其余正常块的数据依旧可以使用.
b、对于受损的表对象进行聚合等相关运算时收到错误提示,因为坏块上的数据无法被统计.如果聚合的是索引列,索引未损坏的情形则可正常返回.
c、可以基于RMAN可用的备份文件实现块介质恢复,其数据文件无需offline,开销最小,影响最小.
d、对于多个数据块的损坏,先执行backup validate校验数据库或相应的数据文件以便标记受损的坏块后,填充v$database_block_corruption以及后续恢复.
e、对于使用backup validate 校验后的情形,坏块恢复时可以直接使用blockrecover corruption list一次性恢复所有的坏块.
f、缺省情况下,存在坏块的数据文件无法成功备份,也会导致自动备份脚本失败.

参考文档:https://blog.51cto.com/lhrbest/3291035