Posts Tagged by High Availability
|14-Jan-2011||Posted by Sonia Hamilton under High Availability, Solaris, SQL|
The latest edition of the venerable UNIX and Linux System Administration Handbook (Nemeth et al.) has a good section discussing the “RAID5 Write Hole”:
Finally, RAID 5 is vulnerable to corruption in certain circumstances. Its incremental updating of parity data is more efficient than reading the entire stripe and recalculating the stripe’s parity based on the original data. On the other hand, it means that at no point is parity data ever validated or recalculated. If any block in a stripe should fall out of sync with the parity block, that fact will never become evident in normal use; reads of the data blocks will still return the correct data.
Only when a disk fails does the problem become apparent. The parity block will likely have been rewritten many times since the occurrence of the original desynchronization. Therefore, the reconstructed data block on the replacement disk will consist of essentially random data.
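The effect described in the quote can be reproduced with a toy model. The following is a minimal sketch, not how any real RAID implementation is written: it assumes a stripe of three 4-byte data blocks plus one XOR parity block, and the names xor_blocks and incremental_parity are made up for illustration. One data block is changed without its parity update (the "hole"), a later incremental update carries the error forward unnoticed, and the block rebuilt after a disk failure no longer matches what was stored.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def incremental_parity(old_parity: bytes, old_block: bytes, new_block: bytes) -> bytes:
    """RAID 5 style update: new parity = old parity XOR old data XOR new data.

    Only the block being written and the parity block are read; the rest
    of the stripe is never consulted, so existing errors go unnoticed.
    """
    return xor_blocks(old_parity, old_block, new_block)


# A consistent stripe: three data blocks and their parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(*data)

# The write hole: block 0 changes on disk, but the matching parity
# update never happens (for example, power is lost between the two writes).
data[0] = b"XXXX"

# A later, perfectly normal write to block 1 uses the incremental update.
# It builds on the stale parity and does not detect the mismatch.
new_block_1 = b"YYYY"
parity = incremental_parity(parity, data[1], new_block_1)
data[1] = new_block_1

# Normal reads are served from the data blocks, so nothing looks wrong.
expected_block_2 = data[2]

# The disk holding block 2 fails; rebuild it from the survivors and parity.
rebuilt_block_2 = xor_blocks(data[0], data[1], parity)

print("stored before failure:", expected_block_2)
print("rebuilt after failure:", rebuilt_block_2)
print("matches:", rebuilt_block_2 == expected_block_2)
```

In this model a full recalculation of parity from the data blocks (a scrub) would have exposed the mismatch before the disk failed, which is the validation step the quote says RAID 5 never performs in normal use.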
|25-Feb-2009||Posted by Sonia Hamilton under High Availability|
When Linux HA (High Availability) is set up, each machine has its own physical address, and exactly one machine should also hold the virtual address. This can be checked by running ip addr on each machine:
machine 1
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:09:3d:12:af:77 brd ff:ff:ff:ff:ff:ff
    inet 918.104.22.168/23 brd 22.214.171.124 scope global eth0

machine 2
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:09:3d:12:ba:ef brd ff:ff:ff:ff:ff:ff
    inet 9126.96.36.199/23 brd 188.8.131.52 scope global eth0
    inet 9184.108.40.206/23 brd 220.127.116.11 scope global secondary eth0:1
If this isn’t the case, run hb_takeover on the appropriate machine (which one depends on the status of the underlying application). For example: /usr/lib64/heartbeat/hb_takeover
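The check described above can also be scripted. The following is a rough sketch that parses the text printed by ip addr and reports whether a given virtual address is present on an interface or one of its aliases (such as eth0:1). The function names, the sample output, and the 192.0.2.x addresses are all made up for illustration; substitute the real virtual address of the cluster. The sketch only reports the state and does not call hb_takeover, since that decision depends on the application.

```python
import subprocess


def inet_addresses(ip_addr_output: str, interface: str = "eth0") -> list[tuple[str, bool]]:
    """Return (address, is_secondary) pairs for IPv4 lines on the interface.

    Lines look like:
        inet 192.0.2.10/23 brd 192.0.3.255 scope global secondary eth0:1
    The last token is the interface label, possibly an alias like eth0:1.
    """
    found = []
    for line in ip_addr_output.splitlines():
        tokens = line.split()
        if len(tokens) < 2 or tokens[0] != "inet":
            continue
        label = tokens[-1]
        if label == interface or label.startswith(interface + ":"):
            address = tokens[1].split("/")[0]  # drop the prefix length
            found.append((address, "secondary" in tokens))
    return found


def holds_virtual_address(ip_addr_output: str, virtual_ip: str, interface: str = "eth0") -> bool:
    """True if the virtual address appears on the interface or an alias of it."""
    return any(addr == virtual_ip for addr, _ in inet_addresses(ip_addr_output, interface))


def read_ip_addr() -> str:
    """Run `ip addr` on the local machine and return its output."""
    return subprocess.run(["ip", "addr"], capture_output=True, text=True, check=True).stdout


# Hypothetical sample output; the addresses are documentation placeholders.
SAMPLE_MACHINE_1 = """\
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:09:3d:12:af:77 brd ff:ff:ff:ff:ff:ff
    inet 192.0.2.1/23 brd 192.0.3.255 scope global eth0
"""

SAMPLE_MACHINE_2 = """\
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:09:3d:12:ba:ef brd ff:ff:ff:ff:ff:ff
    inet 192.0.2.2/23 brd 192.0.3.255 scope global eth0
    inet 192.0.2.10/23 brd 192.0.3.255 scope global secondary eth0:1
"""

VIRTUAL_IP = "192.0.2.10"  # assumed virtual address, for illustration only

for name, output in (("machine 1", SAMPLE_MACHINE_1), ("machine 2", SAMPLE_MACHINE_2)):
    print(name, "holds virtual address:", holds_virtual_address(output, VIRTUAL_IP))
```

On a live node, passing read_ip_addr() instead of the sample strings gives the same check against the real interface state.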