The PE1950 server has an integrated LSI RAID controller, and the standard MPT modules shipped with RHEL 4.x/5.x recognize this hardware, so installation is trivial (a non-issue).
However, actually monitoring the health of the RAID array isn't quite so trivial. Dell's Linux tools (the OMSA suite) do poll the integrated RAID card for health, but in some circumstances you may not want to install OMSA just to monitor internal RAID health. (In my case, I didn't want to install OMSA on all 25 compute nodes in a compute cluster just to poll the health of the OS hardware RAID mirrors.)
Some digging with Google located a third-party open source tool, "mpt-status", which uses the mptctl kernel module to generate an easily human-readable report on the status of the internal RAID.
This output can then be used by a basic script, called by cron (say, every night), to send notification of any non-optimal conditions.
- Get yourself a copy of the mpt-status RPM (or compile from source). It is available at http://www.drugphish.ch/~ratz/mpt-status/ or, for the RPM specifically, http://www.drugphish.ch/~ratz/mpt-status/RPMS/1.2.0_RC7/mpt-status-1.2.0_RC7-3.i386.rpm
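For example, the RPM can be fetched straight from the command line (assuming wget is available on your system):

wget http://www.drugphish.ch/~ratz/mpt-status/RPMS/1.2.0_RC7/mpt-status-1.2.0_RC7-3.i386.rpm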
- Install the RPM on your system: "rpm --install mpt-status-1.2.0_RC7-3.i386.rpm"
- Try calling it to see what happens. If it complains that /dev/mptctl is missing, you may need to "mknod" - follow the suggestion it prints. If it complains that the mptctl module is not loaded, then load the module.
A capture of an example of this sequence is shown below:
[root@box nov-13-08-mptsas-status]# ls -la
total 204
drwxr-xr-x 3 root root   4096 Nov 13 08:40 .
drwxr-xr-x 9 root root   4096 Nov 13 09:08 ..
drwxr-xr-x 6  501  501   4096 Nov 13 08:37 mpt-status-1.2.0
-rw-r--r-- 1 root root  27986 Jun 30  2006 mpt-status-1.2.0_RC7-3.i386.rpm
-rw-r--r-- 1 root root 153600 Nov  5  2006 mpt-status-1.2.0.tar
-rw-r--r-- 1 root root     82 Nov 13 08:40 README.txt
-rw-r--r-- 1 root root     65 Nov 13 08:33 src-url
[root@box nov-13-08-mptsas-status]# rpm --install mpt-status-1.2.0_RC7-3.i386.rpm
[root@box nov-13-08-mptsas-status]# mpt-status
open /dev/mptctl: No such file or directory
  Try: mknod /dev/mptctl c 10 220
  Make sure mptctl is loaded into the kernel
[root@box nov-13-08-mptsas-status]# mknod /dev/mptctl c 10 220
[root@box nov-13-08-mptsas-status]# mpt-status
open /dev/mptctl: No such device
  Are you sure your controller is supported by mptlinux?
  Make sure mptctl is loaded into the kernel
[root@box nov-13-08-mptsas-status]# modprobe mptctl
[root@box nov-13-08-mptsas-status]# mpt-status
ioc0 vol_id 0 type IM, 2 phy, 465 GB, state OPTIMAL, flags ENABLED
ioc0 phy 1 scsi_id 9 ATA ST3500320NS MA07, 465 GB, state ONLINE, flags NONE
ioc0 phy 0 scsi_id 1 ATA ST3500320NS MA07, 465 GB, state ONLINE, flags NONE
- Now that it works, ensure the module is loaded each time your system reboots by adding a line reading "modprobe mptctl" to /etc/rc.modules.
- Note that /etc/rc.modules needs the execute bit set (i.e., chmod 700 on that file) in order for it to run at boot; otherwise this modprobe does not actually happen when the system boots. For example:
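A minimal sketch of these two steps (note that /etc/rc.modules may not exist yet, in which case the first command creates it):

echo "modprobe mptctl" >> /etc/rc.modules
chmod 700 /etc/rc.modules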
- Provided below is a trivial script you could call from cron at regular intervals (nightly, say) to poll the health of the disks and notify you in case things need attention.
#!/bin/bash
#
# Small script called by cron nightly to poll the health
# of the local OS mirror RAID arrays, and notify if things are amiss.
#
# TDC Nov-13-08
#
################################################################
HOSTNAME=`hostname`
mpt-status > /tmp/head-node-raid-health-check-temporary-file
#
# First confirm we have all drives reporting back some kind of status, else throw an error.
#
# Note there is 1 node with 2 x ST3500 HDDs, for 2 drives in total expected to report.
DRIVES=`grep -c ST3500 /tmp/head-node-raid-health-check-temporary-file`
#
# We also expect 1 count of "OPTIMAL" returned, one per RAID set / one per system.
OPTIMAL=`grep -c OPTIMAL /tmp/head-node-raid-health-check-temporary-file`
#
# Remove the temp file, so that it is not present for next time.
rm /tmp/head-node-raid-health-check-temporary-file
#
#
# Now, we do some logic to test that all is well in Denmark.
# (Check both the drive count and the OPTIMAL count, as promised above.)
if [[ $DRIVES == "2" && $OPTIMAL == "1" ]]
then
    EXIT_PAINLESSLY="true"
else
    echo "$HOSTNAME reports $DRIVES of 2 expected HDDs reporting on raid health, with $OPTIMAL of 1 raid sets reporting optimal health - PLEASE VERIFY IMMEDIATELY" | mail -s "Possible RAID Errors on $HOSTNAME" systems
fi
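To hook the script into cron, an entry along the following lines would do; the script path /usr/local/sbin/check-raid-health.sh and the 02:30 run time here are just assumptions - adjust to suit your setup:

# Hypothetical entry for /etc/cron.d/raid-check: run the health check nightly at 02:30
30 2 * * * root /usr/local/sbin/check-raid-health.sh

(Files in /etc/cron.d take a user field - "root" above - unlike a personal crontab edited with "crontab -e".)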