New member trolling posts and figured I'd chime in on this one. As far as PBX health checks go, you can draft a routine doc that covers the more critical portions of the switch.
When doing these checks, and depending on what connection type you're using (ProComm, PuTTY, EM, etc.), I'd suggest capturing the data and performing them daily.
History file from the previous day to current. The log will show all the burps the switch went through.
ld 22
prt
ahst
<cr>
ld 96
stat dch
This retrieves the current status of the D-channels.
ld 60
lcnt
Lists the current error counts on your connected circuits. These counters should reset during midnight routines. Any error counter, such as slips, that reaches 20 or more should be looked at.
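If you capture the counts every day, the "20 and over" check is easy to script. This is only a rough sketch: it assumes you have already copied the counts out of your capture into a plain dictionary, and the circuit names, counter names and the flag_high_counts helper are made up for illustration, not real switch output.

```python
# Threshold taken from the rule of thumb above: 20 and over gets a look.
ERROR_THRESHOLD = 20


def flag_high_counts(counts, threshold=ERROR_THRESHOLD):
    """Return (circuit, counter, value) tuples at or above the threshold.

    counts is a hypothetical, already-parsed structure:
    {circuit_name: {counter_name: integer_count}}
    """
    flagged = []
    for circuit, counters in sorted(counts.items()):
        for name, value in sorted(counters.items()):
            if value >= threshold:
                flagged.append((circuit, name, value))
    return flagged


# Hand-entered sample data for illustration only.
sample_counts = {
    "loop 12": {"slips": 3, "bpv": 0},
    "loop 13": {"slips": 27, "bpv": 1},
}

for circuit, name, value in flag_high_counts(sample_counts):
    print(f"{circuit}: {name} = {value}")
```

Since the counters should reset at midnight, running this against each day's capture gives you a per-day view rather than a running total.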
ld 60
stat
Retrieves current circuit status
ld 48
stat msdl
Stats the MSDL cards and associated ports.
ld 135
stat mem
stat cpu
stat cni
These check the status of your CPUs, memory and CNI cards. In a dual-core switch, one core should be active and the other in redundant mode.
ld 137
stat
Stats the ELAN link and CMDUs.
ld 37
stat tty
Stats the configured TTY ports.
One thing to consider, if your switch has it in your midnights, is to look for codes with an NWS and associated TN. The switch performs signalling tests out to the phones; if phones are unplugged or have issues, it will automatically disable those sets. Keep the lists from a few days, compare them, and if the same phones come up, out them. Garbage in, garbage out is a good rule of thumb.
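Comparing the lists from a few days can be done by hand, but here is a minimal sketch of the idea in Python. It assumes you have already pulled the TNs out of each night's capture yourself; the day labels, the TN strings and the repeat_offenders helper are all hypothetical and only there to show the comparison.

```python
from collections import Counter


def repeat_offenders(daily_tns, min_days=2):
    """Return TNs that show up on at least min_days separate days.

    daily_tns is a hypothetical structure: {day_label: [tn, tn, ...]}
    """
    days_seen = Counter()
    for tns in daily_tns.values():
        # set() so a TN reported twice in one night still counts as one day
        days_seen.update(set(tns))
    return sorted(tn for tn, days in days_seen.items() if days >= min_days)


# Hand-entered sample data for illustration only.
sample_lists = {
    "mon": ["004 0 03 12", "008 1 05 02"],
    "tue": ["004 0 03 12"],
    "wed": ["004 0 03 12", "012 0 01 07", "012 0 01 07"],
}

print(repeat_offenders(sample_lists, min_days=3))
```

Anything that comes back from this is a candidate to check on the floor before you out it; a set that only shows up once may just have been unplugged for a night.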