07-12-2010, 01:30 AM
We've had some random reboots since I moved us to the new server. I've put off fixing them until now, but I've bought us two replacement power supplies, since the hardware supports redundant supplies.
The majority of these reboots were caused by random power supply failures.
[root]@[teletran1][02:28:58][~]$ last reboot
reboot system boot 2.6.18-194.3.1.e Fri Jul 9 13:20 (2+13:08)
reboot system boot 2.6.18-194.3.1.e Wed Jun 16 14:27 (25+12:01)
reboot system boot 2.6.18-194.3.1.e Mon Jun 7 19:31 (34+06:57)
reboot system boot 2.6.18-194.3.1.e Fri May 28 05:38 (44+20:50)
reboot system boot 2.6.18-194.3.1.e Fri May 28 05:31 (44+20:57)
reboot system boot 2.6.18-194.3.1.e Mon May 24 02:23 (49+00:05)
reboot system boot 2.6.18-194.3.1.e Tue May 18 04:20 (5+22:00)
reboot system boot 2.6.18-164.9.1.e Wed Jan 27 13:38 (116+11:43)
reboot system boot 2.6.18-164.9.1.e Wed Jan 20 14:47 (123+10:33)
reboot system boot 2.6.18-164.9.1.e Tue Dec 29 20:35 (145+04:45)
reboot system boot 2.6.18-164.9.1.e Tue Dec 29 16:39 (03:53)
reboot system boot 2.6.18-164.9.1.e Tue Dec 29 16:31 (00:06)
reboot system boot 2.6.18-164.9.1.e Tue Dec 29 16:17 (00:11)
reboot system boot 2.6.18-164.9.1.e Tue Dec 29 08:06 (08:08)
reboot system boot 2.6.18-164.9.1.e Sun Dec 20 13:12 (00:15)
reboot system boot 2.6.18-164.9.1.e Sun Dec 20 13:01 (00:09)
reboot system boot 2.6.18-164.9.1.e Sat Dec 19 06:10 (1+06:48)
reboot system boot 2.6.18-164.9.1.e Sat Dec 19 05:47 (00:21)
reboot system boot 2.6.18-164.9.1.e Thu Dec 17 19:27 (1+10:17)
reboot system boot 2.6.18-164.el5 Thu Dec 17 19:12 (00:12)
wtmp begins Thu Dec 17 19:12:52 2009
And here is the data from the HP system monitor:
[root]@[teletran1][13:19:50][~]$ hpasmcli
HP management CLI for Linux (v1.0)
Copyright 2004 Hewlett-Packard Development Group, L.P.
--------------------------------------------------------------------------
NOTE: Some hpasmcli commands may not be supported on all Proliant servers.
Type 'help' to get a list of all top level commands.
--------------------------------------------------------------------------
hpasmcli> iml show
Invalid Command
hpasmcli> show iml
Event: 0 Added: 07/06/2010 22:50
REPAIRED: Network Adapter - Network Adapter Link Down (Slot 0, Port 1).
Event: 1 Added: 07/09/2010 13:21
CAUTION: Power Subsystem - System Power Supply: General Failure (Power Supply 2).
Event: 2 Added: 07/09/2010 13:21
CAUTION: Power Subsystem - System Power Supplies Not Redundant.
hpasmcli> exit
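For anyone who wants to line the two logs up, here is a rough sketch in Python that pairs boot times from `last reboot` with power-related IML events logged close to them. The function names, the 10-minute window, and the year passed in are my own assumptions for illustration (`last` does not print the year), and the sample lines are copied from the output above.

```python
import re
from datetime import datetime, timedelta

# Sample lines copied from the output above.
LAST_LINES = [
    "reboot system boot 2.6.18-194.3.1.e Fri Jul 9 13:20 (2+13:08)",
    "reboot system boot 2.6.18-194.3.1.e Wed Jun 16 14:27 (25+12:01)",
]
IML_LINES = [
    "Event: 0 Added: 07/06/2010 22:50",
    "REPAIRED: Network Adapter - Network Adapter Link Down (Slot 0, Port 1).",
    "Event: 1 Added: 07/09/2010 13:21",
    "CAUTION: Power Subsystem - System Power Supply: General Failure (Power Supply 2).",
    "Event: 2 Added: 07/09/2010 13:21",
    "CAUTION: Power Subsystem - System Power Supplies Not Redundant.",
]


def parse_boots(lines, year):
    """Return boot datetimes. `last` omits the year, so the caller supplies it."""
    boots = []
    for line in lines:
        # Matches e.g. "Jul 9 13:20 (" near the end of each line.
        m = re.search(r"(\w{3}) +(\d{1,2}) +(\d{2}:\d{2}) \(", line)
        if m:
            stamp = f"{m.group(1)} {m.group(2)} {year} {m.group(3)}"
            boots.append(datetime.strptime(stamp, "%b %d %Y %H:%M"))
    return boots


def parse_iml(lines):
    """Pair each 'Event: N Added: ...' header with the description line after it."""
    events = []
    when = None
    for line in lines:
        m = re.match(r"Event: \d+ Added: (\d{2}/\d{2}/\d{4} \d{2}:\d{2})", line)
        if m:
            when = datetime.strptime(m.group(1), "%m/%d/%Y %H:%M")
        elif when is not None:
            events.append((when, line.strip()))
            when = None
    return events


def power_events_near_boots(boots, events, window=timedelta(minutes=10)):
    """Return (boot_time, event_text) pairs for power events within `window` of a boot."""
    hits = []
    for boot in boots:
        for when, text in events:
            if "Power" in text and abs(when - boot) <= window:
                hits.append((boot, text))
    return hits


if __name__ == "__main__":
    boots = parse_boots(LAST_LINES, year=2010)
    for boot, text in power_events_near_boots(boots, parse_iml(IML_LINES)):
        print(boot, text)
```

With the sample data, only the Jul 9 boot has power events logged next to it. The IML shown here holds just three entries, so this kind of matching can only speak to the most recent reboot, not the older ones.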
The new power supplies should arrive this week or next.