Handling an ASM Disk Failure

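Li Shaopeng (李少鹏) works at 北京海天起点技术服务股份有限公司. He has many years of Oracle database maintenance experience, holds the Oracle 11g OCP certification, and focuses on helping customers solve problems in their production environments.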

 

Environment

The customer runs a two-node RAC cluster on Linux x86 64-bit. Both nodes run Red Hat Enterprise Linux Server release 6.8 (Santiago), and the database version is Oracle 11.2.0.4.0.

Fault Description

During the customer's November health check, we found one ASM disk offline and started troubleshooting immediately.

Troubleshooting

Check the ASM disk status

November ASM Disk Info

DISKGROUP  DISK_NAME           DISK_NUMBER  PATH                    HEADER_STATUS  MODE_STATUS  TOTAL_MB  FREE_MB  FAILGROUP
ARCH       _DROPPED_0001_ARCH  1            (empty)                 UNKNOWN        OFFLINE      204800    202664   ARCH_0001
ARCH       ARCH_0000           0            /dev/mapper/st1_disk01  MEMBER         ONLINE       204800    202664   ARCH_0000
ASMCRS     ASMCRS_0002         2            /dev/mapper/st2_disk06  MEMBER         ONLINE       10240     9930     ASMCRS_0002
ASMCRS     ASMCRS_0001         1            /dev/mapper/st1_disk07  MEMBER         ONLINE       10240     9932     ASMCRS_0001
ASMCRS     ASMCRS_0000         0            /dev/mapper/st1_disk06  MEMBER         ONLINE       10240     9932     ASMCRS_0000
DATA       DATA_0002           2            /dev/mapper/st1_disk04  MEMBER         ONLINE       204800    161655   DATA_0002
DATA       DATA_0003           3            /dev/mapper/st1_disk05  MEMBER         ONLINE       204800    161650   DATA_0003
DATA       DATA_0004           4            /dev/mapper/st2_disk02  MEMBER         ONLINE       204800    161649   DATA_0004
DATA       DATA_0007           7            /dev/mapper/st2_disk05  MEMBER         ONLINE       204800    161646   DATA_0007
DATA       DATA_0005           5            /dev/mapper/st2_disk03  MEMBER         ONLINE       204800    161654   DATA_0005
DATA       DATA_0000           0            /dev/mapper/st1_disk02  MEMBER         ONLINE       204800    161652   DATA_0000
DATA       DATA_0001           1            /dev/mapper/st1_disk03  MEMBER         ONLINE       204800    161651   DATA_0001

As shown above, one disk in the ARCH disk group is reported as OFFLINE.

October ASM Disk Info

DISKGROUP  DISK_NAME           DISK_NUMBER  PATH                    HEADER_STATUS  MODE_STATUS  TOTAL_MB  FREE_MB  FAILGROUP
ARCH       ARCH_0000           0            /dev/mapper/st1_disk01  MEMBER         ONLINE       204800    202664   ARCH_0000
ARCH       ARCH_0001           1            /dev/mapper/st2_disk01  MEMBER         ONLINE       204800    202664   ARCH_0001
ASMCRS     ASMCRS_0000         0            /dev/mapper/st1_disk06  MEMBER         ONLINE       10240     9932     ASMCRS_0000
ASMCRS     ASMCRS_0002         2            /dev/mapper/st2_disk06  MEMBER         ONLINE       10240     9930     ASMCRS_0002
ASMCRS     ASMCRS_0001         1            /dev/mapper/st1_disk07  MEMBER         ONLINE       10240     9932     ASMCRS_0001
DATA       DATA_0000           0            /dev/mapper/st1_disk02  MEMBER         ONLINE       204800    167161   DATA_0000
DATA       DATA_0004           4            /dev/mapper/st2_disk02  MEMBER         ONLINE       204800    167158   DATA_0004
DATA       DATA_0005           5            /dev/mapper/st2_disk03  MEMBER         ONLINE       204800    167156   DATA_0005
DATA       DATA_0006           6            /dev/mapper/st2_disk04  MEMBER         ONLINE       204800    167152   DATA_0006
DATA       DATA_0007           7            /dev/mapper/st2_disk05  MEMBER         ONLINE       204800    167150   DATA_0007
DATA       DATA_0001           1            /dev/mapper/st1_disk03  MEMBER         ONLINE       204800    167161   DATA_0001
DATA       DATA_0002           2            /dev/mapper/st1_disk04  MEMBER         ONLINE       204800    167151   DATA_0002
DATA       DATA_0003           3            /dev/mapper/st1_disk05  MEMBER         ONLINE       204800    167165   DATA_0003

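Comparing the October and November ASM disk information shows that two disks present in October are missing in November. We asked the customer whether anyone had changed the ASM disks, and the answer was no. The only notable event was that, a few days before the November check, a UPS failure in the customer's machine room had caused the database to restart.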
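The month-over-month comparison can also be done mechanically. The following is only an illustrative sketch: it copies the PATH values from the two listings above into Python sets and reports which paths disappeared from each disk group. The names `october`, `november`, and `missing_disks` are introduced here for illustration and are not part of the original procedure.

```python
# Disk paths per disk group, copied from the October and November listings.
october = {
    "ARCH": {"/dev/mapper/st1_disk01", "/dev/mapper/st2_disk01"},
    "ASMCRS": {"/dev/mapper/st1_disk06", "/dev/mapper/st1_disk07",
               "/dev/mapper/st2_disk06"},
    "DATA": {"/dev/mapper/st1_disk02", "/dev/mapper/st1_disk03",
             "/dev/mapper/st1_disk04", "/dev/mapper/st1_disk05",
             "/dev/mapper/st2_disk02", "/dev/mapper/st2_disk03",
             "/dev/mapper/st2_disk04", "/dev/mapper/st2_disk05"},
}
november = {
    # The OFFLINE entry _DROPPED_0001_ARCH has an empty PATH, so it is not listed.
    "ARCH": {"/dev/mapper/st1_disk01"},
    "ASMCRS": {"/dev/mapper/st1_disk06", "/dev/mapper/st1_disk07",
               "/dev/mapper/st2_disk06"},
    "DATA": {"/dev/mapper/st1_disk02", "/dev/mapper/st1_disk03",
             "/dev/mapper/st1_disk04", "/dev/mapper/st1_disk05",
             "/dev/mapper/st2_disk02", "/dev/mapper/st2_disk03",
             "/dev/mapper/st2_disk05"},
}


def missing_disks(before, after):
    """Return {diskgroup: sorted paths} for paths in `before` but not in `after`."""
    result = {}
    for group, paths in before.items():
        gone = sorted(paths - after.get(group, set()))
        if gone:
            result[group] = gone
    return result


print(missing_disks(october, november))
# {'ARCH': ['/dev/mapper/st2_disk01'], 'DATA': ['/dev/mapper/st2_disk04']}
```

The result matches the two listings: ARCH lost /dev/mapper/st2_disk01 and DATA lost /dev/mapper/st2_disk04, while ASMCRS is unchanged.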

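Check the ASM log

The ASM log shows that on the day of the UPS failure reported by the customer, one ASM disk in the ARCH disk group and one in the DATA disk group were dropped from their disk groups, one after the other.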

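Check the current ASM disk status

Query the current ASM disk status

The query output shows that /dev/mapper/st2_disk01 and /dev/mapper/st2_disk04 are in the CLOSED state. CLOSED (the mount status) means the disk is not being used by the current ASM instance, while MEMBER (the header status) means the disk header still marks it as belonging to a disk group. Comparing with the October ASM disk information confirms that these are the two missing disks.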
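To make the two status columns concrete, here is a small sketch that filters disk rows for the combination described above: header status MEMBER together with mount status CLOSED. The rows are illustrative and are not the customer's actual query output, which is not reproduced in this article. The two CLOSED rows follow the text, and the CACHED row is an assumed example of a disk in normal use. `stale_members` is a name introduced here.

```python
# Illustrative rows shaped like V$ASM_DISK (PATH, HEADER_STATUS, MOUNT_STATUS).
# Assumption: these are NOT the customer's real output; values follow the text.
rows = [
    {"path": "/dev/mapper/st1_disk01", "header_status": "MEMBER", "mount_status": "CACHED"},
    {"path": "/dev/mapper/st2_disk01", "header_status": "MEMBER", "mount_status": "CLOSED"},
    {"path": "/dev/mapper/st2_disk04", "header_status": "MEMBER", "mount_status": "CLOSED"},
]


def stale_members(disks):
    """Paths whose header says MEMBER but which the instance is not using (CLOSED)."""
    return [
        d["path"]
        for d in disks
        if d["header_status"] == "MEMBER" and d["mount_status"] == "CLOSED"
    ]


print(stale_members(rows))
# ['/dev/mapper/st2_disk01', '/dev/mapper/st2_disk04']
```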

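Fault Resolution

Because the header_status of both disks is MEMBER, adding them back directly with alter diskgroup add disk would fail with an error.

So the first step is to clear the disks, which changes their header_status to CANDIDATE.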

Clear the disks
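The exact command used for the cleanup is not reproduced in this article. One widely used approach, given here purely as an assumption, is to overwrite the beginning of the device with zeros using `dd`, which erases the ASM disk header. The sketch below only builds the command line and does not execute it. `build_clear_command` and the 100 MB default are hypothetical choices for illustration. Overwriting a header is destructive, so the device path should be verified against the missing disks before anything is run.

```python
import shlex


def build_clear_command(device, megabytes=100):
    """Build (but do not run) a dd command that zeroes the start of `device`.

    Assumption: zeroing the first `megabytes` MB wipes the ASM header; the
    article does not state which command or size was actually used.
    """
    if not device.startswith("/dev/"):
        raise ValueError("expected a block device path under /dev/")
    argv = ["dd", "if=/dev/zero", f"of={device}", "bs=1M", f"count={megabytes}"]
    # Quote each argument so the printed line is safe to paste into a shell.
    return " ".join(shlex.quote(arg) for arg in argv)


for dev in ("/dev/mapper/st2_disk01", "/dev/mapper/st2_disk04"):
    print(build_clear_command(dev))
# dd if=/dev/zero of=/dev/mapper/st2_disk01 bs=1M count=100
# dd if=/dev/zero of=/dev/mapper/st2_disk04 bs=1M count=100
```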

Query the ASM disk status again

A header_status of CANDIDATE means the disk can now be added to a disk group.
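The rule behind this check can be written as a tiny helper. To the best of our knowledge, Oracle's ASM documentation lists CANDIDATE, PROVISIONED, and FORMER as the header statuses that can be added without the FORCE option, while MEMBER cannot. Treat that list as an assumption to verify for your release. `can_add_without_force` is a name introduced here for illustration.

```python
# Header statuses assumed (per Oracle's ASM documentation; verify for your release)
# to allow ALTER DISKGROUP ... ADD DISK without the FORCE option.
ADDABLE_WITHOUT_FORCE = {"CANDIDATE", "PROVISIONED", "FORMER"}


def can_add_without_force(header_status):
    """True if a disk with this header status can be added as-is."""
    return header_status.strip().upper() in ADDABLE_WITHOUT_FORCE


print(can_add_without_force("MEMBER"))     # False: the state before the cleanup
print(can_add_without_force("candidate"))  # True: the state after the cleanup
```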

Add the disks

After the disks have been added, simply wait for the automatic rebalance to complete.
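The statements used to add the disks are not reproduced in this article. Based on the disk groups and paths identified during troubleshooting, they would presumably resemble the output of the sketch below, which builds `ALTER DISKGROUP ... ADD DISK` statements as strings. The helper name `build_add_disk_sql` is hypothetical, and the statements leave out optional clauses such as disk names, failure groups, and rebalance power. Rebalance progress can then be followed in the V$ASM_OPERATION view.

```python
import re


def build_add_disk_sql(diskgroup, path):
    """Return an ALTER DISKGROUP ... ADD DISK statement for one disk path."""
    # Accept only simple identifiers and quote-free paths, since the values
    # are embedded directly into SQL text.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_$#]*", diskgroup):
        raise ValueError(f"unexpected disk group name: {diskgroup!r}")
    if "'" in path:
        raise ValueError("path must not contain a single quote")
    return f"ALTER DISKGROUP {diskgroup} ADD DISK '{path}';"


# Disk group / path pairs taken from the comparison of the two monthly listings.
missing = [("ARCH", "/dev/mapper/st2_disk01"), ("DATA", "/dev/mapper/st2_disk04")]
for group, disk_path in missing:
    print(build_add_disk_sql(group, disk_path))
# ALTER DISKGROUP ARCH ADD DISK '/dev/mapper/st2_disk01';
# ALTER DISKGROUP DATA ADD DISK '/dev/mapper/st2_disk04';
```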

Do not reproduce without permission: Oracle一体机用户组 » Handling an ASM Disk Failure
