Exadata BBU Replacement Manual

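About the author: Wang Fugui (王福贵) works at 北京海天起点技术服务股份有限公司 as a senior Oracle engineered systems engineer. He has more than 12 years of system operations experience in the telecom, government, petroleum, and automotive industries, and holds vendor certifications from SUN, IBM, and Oracle, among others. He specializes in operating, tuning, and troubleshooting Oracle Exadata, Oracle Exalogic, and SUN products.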

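Depending on the hardware model, the BBU is replaced either online (X3-2 and later) or with the server powered off (X2-2 and earlier). To check the battery state and capacity, run: MegaCli64 -AdpBbuCmd -a0 | grep 'Capacity'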

Replacing the BBU on a Storage Node

Power-Off Replacement

Log in to a DB node and check the disk repair time (the ASM disk_repair_time attribute). The default is 3.6 hours.

Raise it to 8.5 hours so that ASM does not drop the disks while the storage node is down.
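The check and the adjustment above can be sketched as a small shell helper. This is only a sketch under assumptions: it is meant to run as the Grid Infrastructure owner on a DB node with the ASM environment set, the disk group name DATA used in the usage note is a placeholder, and the function names are hypothetical.

```shell
# Print the SQL that shows disk_repair_time for every ASM disk group.
repair_time_query() {
    cat <<'EOF'
select dg.name, a.value
  from v$asm_diskgroup dg, v$asm_attribute a
 where dg.group_number = a.group_number
   and a.name = 'disk_repair_time';
EOF
}

# Print the SQL that sets disk_repair_time for one disk group.
# $1 = disk group name (placeholder), $2 = new value, for example 8.5h
repair_time_sql() {
    printf "alter diskgroup %s set attribute 'disk_repair_time'='%s';\n" "$1" "$2"
}

# Feed SQL from stdin to the local ASM instance.
# Assumes sqlplus is on PATH and ORACLE_SID points at the ASM instance.
run_asm_sql() {
    sqlplus -s / as sysasm
}
```

Typical use would be `repair_time_query | run_asm_sql` to check the value, then `repair_time_sql DATA 8.5h | run_asm_sql` for each disk group. Remember to set the value back after the maintenance window.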


Deactivating the Grid Disks

Check the grid disk status: asmmodestatus should be ONLINE and asmdeactivationoutcome should be Yes for every grid disk.

After deactivating the grid disks, list them. Each one should be shown as inactive.
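The pre-check and deactivation above might be scripted as follows on the storage node. This is a sketch, not the author's original commands: the function names are hypothetical, and it assumes cellcli is on PATH and prints one grid disk per line with whitespace-separated attributes.

```shell
# Return 0 only if every grid disk can be taken offline safely, that is,
# asmmodestatus is ONLINE and asmdeactivationoutcome is Yes.
# Grid disks that fail the check are printed.
griddisks_safe_to_deactivate() {
    cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome |
        awk 'NF { n++ }
             NF && (toupper($2) != "ONLINE" || toupper($3) != "YES") { bad = 1; print }
             END { if (bad || n == 0) exit 1 }'
}

# Deactivate all grid disks on this cell, but only if the check passes,
# then list them so the inactive state can be confirmed.
deactivate_griddisks() {
    griddisks_safe_to_deactivate || return 1
    cellcli -e alter griddisk all inactive &&
        cellcli -e list griddisk
}
```

Refusing to continue when any disk fails the check is a deliberate choice: deactivating a grid disk whose asmdeactivationoutcome is not Yes can take an ASM disk group offline.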

Shut down the system.

Disconnect the power cords and replace the battery.

Power On, Verify the BBU, and Activate the Grid Disks

Confirm that the disk controller BBU08 battery state is Operational.
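A minimal sketch of this check, built on the same MegaCli64 -AdpBbuCmd -a0 command used earlier. The install path, the function name, and the exact "Battery State" label are assumptions; the wording of the status line can vary with controller firmware.

```shell
# Path to MegaCli64; this default is an assumption, adjust to your install.
MEGACLI=${MEGACLI:-/opt/MegaRAID/MegaCli/MegaCli64}

# Return 0 only if adapter 0 reports a "Battery State" of Operational.
bbu_is_operational() {
    "$MEGACLI" -AdpBbuCmd -a0 |
        awk -F: 'tolower($1) ~ /battery state/ {
                     v = $2
                     gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim surrounding blanks
                     if (tolower(v) == "operational") ok = 1
                 }
                 END { exit !ok }'
}
```

Calling `bbu_is_operational && echo "BBU OK"` gives a quick pass or fail without reading the full battery report.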

Set all logical drives to WriteBack mode.

In some cases the BBU may have entered a learn cycle, and the output looks like this:

The learn cycle completes after roughly one hour, after which the logical drives switch to WriteBack mode.

Confirm that the current cache policy of the logical drives is WriteBack.
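Setting and verifying the cache policy could look like the sketch below. It uses the MegaCli64 -LDSetProp and -LDInfo commands; the install path and function names are assumptions, and the check relies on the "Current Cache Policy" line that -LDInfo prints for each logical drive.

```shell
# Path to MegaCli64; this default is an assumption, adjust to your install.
MEGACLI=${MEGACLI:-/opt/MegaRAID/MegaCli/MegaCli64}

# Switch every logical drive on adapter 0 to WriteBack.
set_write_back() {
    "$MEGACLI" -LDSetProp WB -LAll -a0
}

# Print the "Current Cache Policy" line of every logical drive.
ld_cache_policies() {
    "$MEGACLI" -LDInfo -LAll -a0 | grep -i 'Current Cache Policy'
}

# Return 0 only if at least one policy line exists and all of them
# report WriteBack.
all_ld_write_back() {
    policies=$(ld_cache_policies) || return 1
    ! printf '%s\n' "$policies" | grep -qiv 'WriteBack'
}
```

While a learn cycle is running, `all_ld_write_back` is expected to fail because the drives temporarily report WriteThrough; rerun it once the cycle has finished.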

Check for BBU alerts. The output looks like this:

Activating the Grid Disks

Confirm that all grid disks are active.

Confirm that all grid disks are online.
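Reactivating the grid disks and waiting for ASM to bring them back online might be scripted like this. It is a sketch with hypothetical function names; the 30-second polling interval is an arbitrary choice, and it assumes cellcli prints one grid disk per line.

```shell
# Seconds between polls; an arbitrary default.
POLL_SECONDS=${POLL_SECONDS:-30}

# Return 0 when every grid disk reports asmmodestatus ONLINE.
griddisks_all_online() {
    cellcli -e list griddisk attributes name,asmmodestatus |
        awk 'NF { n++ }
             NF && toupper($2) != "ONLINE" { bad = 1 }
             END { if (bad || n == 0) exit 1 }'
}

# Activate all grid disks, then poll until none is still resynchronizing.
activate_griddisks() {
    cellcli -e alter griddisk all active || return 1
    until griddisks_all_online; do
        sleep "$POLL_SECONDS"
    done
}
```

Grid disks usually pass through a SYNCING state before reaching ONLINE, so the loop can run for a while on a busy system. Do not start work on the next storage node until it returns.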

Online Replacement on a Storage Node

Drop the BBU with the following command:

Confirm that the BBU has been dropped.
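On storage nodes that support online replacement, the drop and its confirmation are normally done through CellCLI. The sketch below assumes the ALTER CELL BBU DROP FOR REPLACEMENT command and the bbustatus cell attribute are available in your Exadata software release; the function names are hypothetical.

```shell
# Ask the cell to prepare the BBU for online replacement.
drop_cell_bbu() {
    cellcli -e alter cell bbu drop for replacement
}

# Print the cell's current BBU status.
cell_bbu_status() {
    cellcli -e list cell attributes bbustatus
}

# Return 0 once the reported status says the BBU has been dropped.
cell_bbu_is_dropped() {
    cell_bbu_status | grep -qi 'dropped'
}
```

Run `drop_cell_bbu` first and pull the battery only after `cell_bbu_is_dropped` succeeds.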

Physically replace the BBU in slot 7 while the server stays powered on.

Confirm that the BBU state is Operational.

Check the BBU status in more detail.
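After the new battery is seated, the cell can take a few minutes to recognize it. One way to wait for that is sketched below, assuming the bbustatus cell attribute returns to "normal" once the BBU is healthy. The status wording, the polling values, and the function names are all assumptions.

```shell
# Polling interval in seconds and maximum number of polls; arbitrary defaults.
POLL_SECONDS=${POLL_SECONDS:-60}
MAX_POLLS=${MAX_POLLS:-10}

# Return 0 if the cell reports a BBU status of "normal".
cell_bbu_is_normal() {
    cellcli -e list cell attributes bbustatus | grep -qiw 'normal'
}

# Poll until the BBU status is normal; give up after MAX_POLLS attempts.
wait_cell_bbu_normal() {
    i=0
    while [ "$i" -lt "$MAX_POLLS" ]; do
        cell_bbu_is_normal && return 0
        i=$((i + 1))
        sleep "$POLL_SECONDS"
    done
    return 1
}
```

If `wait_cell_bbu_normal` times out, inspect the full battery report with MegaCli64 -AdpBbuCmd -a0 before taking further action.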

Replacing the BBU on a Compute Node

Online Replacement on a Compute Node

Drop the BBU with the following command:

Confirm that the BBU has been dropped.
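On compute nodes of this generation, the drop is usually done with the hardware monitoring script shipped with the Exadata image rather than with CellCLI. The sketch below assumes that script lives at the path shown and accepts the -drop_bbu_for_replace and -list_bbu_status options; verify both against the documentation for your image version. The function names are hypothetical.

```shell
# Location of the hardware monitoring script; an assumption, verify locally.
HW_ASR=${HW_ASR:-/opt/oracle.cellos/compmon/exadata_mon_hw_asr.pl}

# Prepare the compute node's BBU for online replacement.
drop_dbnode_bbu() {
    "$HW_ASR" -drop_bbu_for_replace
}

# Print the BBU status as reported by the monitoring script.
dbnode_bbu_status() {
    "$HW_ASR" -list_bbu_status
}

# Return 0 once the reported status says the BBU has been dropped.
dbnode_bbu_is_dropped() {
    dbnode_bbu_status | grep -qi 'dropped'
}
```

As on the storage node, replace the battery only after `dbnode_bbu_is_dropped` succeeds.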

Physically replace the BBU in slot 7 while the server stays powered on.

Confirm that the BBU state is Operational.

Use the following command to confirm that the current cache policy of the logical drives is WriteBack. If it is not WriteBack, continue with the next step; if it is, skip ahead to the final BBU status check.

Use the following command to confirm that the BBU state is Operational. This step is only necessary when the cache policy reported in the previous step is not WriteBack.

Run the BBU status check.

Power-Off Replacement on a Compute Node

Check whether AUTOSTART is enabled:

If the output of the command above shows that autostart is enabled, disable it.
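The autostart check and the conditional disable can be sketched with crsctl as follows. It assumes crsctl from the Grid Infrastructure home is on PATH (otherwise set CRSCTL to its full path) and that the commands are run as root; the function names are hypothetical.

```shell
# crsctl binary; set CRSCTL to <GRID_HOME>/bin/crsctl if it is not on PATH.
CRSCTL=${CRSCTL:-crsctl}

# Return 0 if "crsctl config crs" reports that autostart is enabled.
crs_autostart_enabled() {
    "$CRSCTL" config crs | grep -qi 'autostart is enabled'
}

# Disable autostart only when it is currently enabled.
disable_crs_autostart_if_enabled() {
    if crs_autostart_enabled; then
        "$CRSCTL" disable crs
    fi
}
```

After the maintenance is finished, `crsctl enable crs` turns autostart back on.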

Note: This step is optional. It may be required during maintenance operations, such as firmware patching, that reboot the compute node several times.

Stop the Grid Infrastructure stack on the first database server locally:

Verify that the Grid Infrastructure stack has shut down successfully on the database server.

The following command should show no output if the GI stack has shut down:
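Stopping the stack and checking that it is really down might look like the sketch below. Looking for the diskmon daemon is an assumption about which process the original check inspected; the CRSCTL default and the function names are likewise hypothetical.

```shell
# crsctl binary; set CRSCTL to <GRID_HOME>/bin/crsctl if it is not on PATH.
CRSCTL=${CRSCTL:-crsctl}

# Return 0 when no diskmon process is left running.
# The [d] bracket keeps grep from matching its own command line.
gi_stack_is_down() {
    ! ps -ef | grep -q '[d]iskmon'
}

# Stop the local Grid Infrastructure stack (run as root), then verify.
stop_gi_stack() {
    "$CRSCTL" stop crs && gi_stack_is_down
}
```

A non-zero return from `stop_gi_stack` means either the stop command failed or a Grid Infrastructure process is still running, so the node should not be powered off yet.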

Confirm that the clusterware resources are still up and running on the other nodes:

Before shutting down the DB node, check /etc/fstab for any NFS mounts that should be unmounted.
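A small helper can list the NFS entries in /etc/fstab so that nothing is missed. This is a sketch: the function names are hypothetical, and it treats any uncommented entry whose file system type starts with "nfs" (nfs, nfs4) as an NFS mount.

```shell
# fstab file to inspect; overridable for testing.
FSTAB=${FSTAB:-/etc/fstab}

# Print the mount point of every uncommented NFS entry in fstab.
nfs_mount_points() {
    awk '$1 !~ /^#/ && $3 ~ /^nfs/ { print $2 }' "$FSTAB"
}

# Unmount each listed NFS file system that is currently mounted.
unmount_nfs() {
    nfs_mount_points | while read -r mp; do
        if mountpoint -q "$mp"; then
            umount "$mp"
        fi
    done
}
```

Run `nfs_mount_points` first to review the list, and call `unmount_nfs` as root only once nothing is using those file systems.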

Shut the DB node down so you can perform maintenance.

Perform the scheduled maintenance on the DB node.

You can proceed now with hardware replacement or maintenance.

After the hardware has been replaced or the maintenance performed and you're ready to bring the server back up, power on the DB node by using the power button on its front panel.

Start the Grid Infrastructure stack on the database server once it comes up:

Wait until the Grid Infrastructure stack has successfully started. To check the status of the Grid Infrastructure stack, run the following command and verify that the “ora.asm” instance is started. Note that the command below will continue to report that it is unable to communicate with the Grid Infrastructure software for several minutes after issuing the “crsctl start crs” command above:
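Because the stack takes several minutes to come up, polling is more convenient than rerunning the check by hand. The sketch below waits on `crsctl check crs`, which is a substitution for the original check rather than a copy of it: it covers only the local Cluster Ready Services, so ora.asm should still be verified afterwards with `crsctl status resource -t`. The polling values and function names are assumptions.

```shell
# crsctl binary; set CRSCTL to <GRID_HOME>/bin/crsctl if it is not on PATH.
CRSCTL=${CRSCTL:-crsctl}
# Polling interval in seconds and maximum number of polls; arbitrary defaults.
POLL_SECONDS=${POLL_SECONDS:-30}
MAX_POLLS=${MAX_POLLS:-40}

# Return 0 when "crsctl check crs" reports Cluster Ready Services online.
crs_is_online() {
    "$CRSCTL" check crs 2>/dev/null | grep -q 'Cluster Ready Services is online'
}

# Poll until CRS is online; give up after MAX_POLLS attempts.
wait_for_crs() {
    i=0
    while [ "$i" -lt "$MAX_POLLS" ]; do
        crs_is_online && return 0
        i=$((i + 1))
        sleep "$POLL_SECONDS"
    done
    return 1
}
```

With the defaults the helper waits up to 20 minutes, which leaves room for the communication errors that are expected right after startup.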

Re-enable Grid Infrastructure autostart if you disabled it earlier:

Confirm that the clusterware resources are up and running on all nodes:

After completing the steps above, repeat the same procedure on the remaining nodes (one at a time) until the hardware has been replaced on every DB node.

Reproduction without permission is prohibited. Source: Oracle一体机用户组, Exadata BBU Replacement Manual
