I encountered the following errors, as reported by the AIX errpt utility:
# errpt | more
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
BD797922 1108110011 P H enclosure1 SUBSYSTEM FAILURE
BD797922 1108110011 P H enclosure0 SUBSYSTEM FAILURE
BD797922 1108100011 P H enclosure1 SUBSYSTEM FAILURE
BD797922 1108100011 P H enclosure0 SUBSYSTEM FAILURE
BD797922 1108093511 P H enclosure1 SUBSYSTEM FAILURE
BD797922 1108093511 P H enclosure0 SUBSYSTEM FAILURE
BD797922 1108070011 P H enclosure1 SUBSYSTEM FAILURE
BD797922 1108070011 P H enclosure0 SUBSYSTEM FAILURE
BD797922 1108060011 P H enclosure1 SUBSYSTEM FAILURE
BD797922 1108060011 P H enclosure0 SUBSYSTEM FAILURE
AA8AB241 1108050111 T O OPERATOR OPERATOR NOTIFICATION
AA8AB241 1108050111 T O OPERATOR OPERATOR NOTIFICATION
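With a listing this repetitive, a per-resource tally shows at a glance which resources keep logging the same error. The following is a minimal sketch, assuming the summary layout shown above (identifier in the first column, resource name in the fifth). The name errpt_tally is made up for illustration; it is not an AIX command.

```shell
#!/bin/sh
# errpt_tally: hypothetical helper, not part of AIX.
# Reads `errpt` summary output on stdin and prints
# "<count> <identifier> <resource>", most frequent pair first.
errpt_tally() {
    awk '$1 != "IDENTIFIER" && NF >= 5 { count[$1 " " $5]++ }
         END { for (key in count) print count[key], key }' |
        sort -k1,1nr -k2
}

# Usage on the affected host:
#   errpt | errpt_tally
```

Keying on the identifier and the resource name together keeps the two enclosures separate even though they share the same error identifier.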
Dig out more information on the error on enclosure0
# errpt -a -N enclosure0
---------------------------------------------------------------------------
LABEL: SSA_ENCL_ERR1
IDENTIFIER: BD797922
Date/Time: Tue Nov 8 11:00:30 2011
Sequence Number: 5643
Machine Id: 0055617A4C00
Node Id: riju26
Class: H
Type: PERM
Resource Name: enclosure0
Resource Class: container
Resource Type: ses
Location: USSA33C8
VPD:
Part Number.................9L1850
Serial Number...............AC1433C8
EC Level....................000000R000
Manufacturer................IBM053
ROS Level and ID............0020
Device Specific.(Z0)........DISPLAY=33C8
Device Specific.(Z1)........BYPASS1_16= 09L5510
Device Specific.(Z2)........BYPASS4_5= 09L5510
Device Specific.(Z3)........BYPASS8_9= 09L5510
Device Specific.(Z4)........BYPASS12_13= 09L5510
Device Specific.(Z5)........FAN1=09L2794
Device Specific.(Z6)........FAN2=09L2794
Device Specific.(Z7)........FAN3=09L2794
Device Specific.(Z8)........PSU1=
Device Specific.(Z9)........PSU2=09L4299
Device Specific.(ZA)........CTRL= 34L3820
Device Specific.(ZB)........OPERATOR= 08L7924
Description
SUBSYSTEM FAILURE
Probable Causes
SUBSYSTEM
Failure Causes
SUBSYSTEM
Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES
Detail Data
SENSE DATA
0802 2100 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
---------------------------------------------------------------------------
And on enclosure1
---------------------------------------------------------------------------
LABEL: SSA_ENCL_ERR1
IDENTIFIER: BD797922
Date/Time: Tue Nov 8 11:00:30 2011
Sequence Number: 5644
Machine Id: 0055617A4C00
Node Id: riju26
Class: H
Type: PERM
Resource Name: enclosure1
Resource Class: container
Resource Type: ses
Location: USSA56E7
VPD:
Part Number.................9L1850
Serial Number...............292C56E7
EC Level....................000000R000
Manufacturer................IBM053
ROS Level and ID............0020
Device Specific.(Z0)........DISPLAY=56E7
Device Specific.(Z1)........BYPASS1_16= 09L5580
Device Specific.(Z2)........BYPASS4_5= 09L5580
Device Specific.(Z3)........BYPASS8_9= 09L5580
Device Specific.(Z4)........BYPASS12_13= 09L5580
Device Specific.(Z5)........FAN1=09L2794
Device Specific.(Z6)........FAN2=09L2794
Device Specific.(Z7)........FAN3=09L2794
Device Specific.(Z8)........PSU1=09L4299
Device Specific.(Z9)........PSU2=
Device Specific.(ZA)........CTRL= 27H0708
Device Specific.(ZB)........OPERATOR= 08L7924
Description
SUBSYSTEM FAILURE
Probable Causes
SUBSYSTEM
Failure Causes
SUBSYSTEM
Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES
Detail Data
SENSE DATA
0802 2200 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
---------------------------------------------------------------------------
Notice the following line for enclosure0:
Device Specific.(Z8)........PSU1=
and, for enclosure1:
Device Specific.(Z9)........PSU2=
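Blank fields like these are easy to miss in a long report, so a small filter can pick them out. This is only a sketch, assuming the detailed report layout shown above ("Resource Name:" lines followed by "Device Specific" lines with a dot leader). The name vpd_blank_fields is hypothetical.

```shell
#!/bin/sh
# vpd_blank_fields: hypothetical helper, not part of AIX.
# Reads detailed errpt output on stdin and prints
# "<resource> <field>" for every Device Specific entry whose
# value after "=" is empty.
vpd_blank_fields() {
    awk '/^Resource Name:/ { res = $3 }
         /^Device Specific/ && /=[ \t]*$/ {
             name = $0
             sub(/^.*\.\./, "", name)    # drop everything up to the dot leader
             sub(/=[ \t]*$/, "", name)   # drop the trailing "="
             print res, name
         }'
}

# Usage on the affected host:
#   errpt -a | vpd_blank_fields
```

A blank field only says that no part number was reported for that slot. It does not say why, so it is a hint to follow up on rather than a diagnosis.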
Those blank PSU1 and PSU2 fields looked rather suspicious. With my obvious lack of experience with AIX and the IBM RS/6000 (7026-6H1), that prompted me to look up the 7133 Models D40 and T40 Serial Disk Systems Service Guide. It took me a while to get there, as I wasn't sure which documentation to refer to: that of the IBM RS/6000 7026-6H1 or that of the IBM SSA 160 SerialRAID adapter. At last, I found what I was looking for based on the SRNs generated for enclosure0 and enclosure1.
# ssa_ela
enclosure0 SRN 80221
enclosure1 SRN 80222
Oh, and the enclosure0 and enclosure1 mentioned here are the disk enclosures.
Earlier on, I mentioned the suspicious PSU1 and PSU2 fields on enclosure0 and enclosure1 respectively. Apparently, those power supplies are missing. It takes a lot of guesswork to perform hardware diagnostics remotely on a server that is halfway around the world.
# ssaencl -l enclosure0 -p
enclosure enclosure0
component PSU_1
present FALSE
enclosure enclosure0
component PSU_2
present TRUE
fault FALSE
exchanged FALSE
# ssaencl -l enclosure1 -p
enclosure enclosure1
component PSU_1
present TRUE
fault FALSE
exchanged FALSE
enclosure enclosure1
component PSU_2
present FALSE
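To check several enclosures without reading every block by eye, the same output can be reduced to just the components that are reported as absent. This is a rough sketch, assuming the "enclosure / component / present" layout printed above. The name ssaencl_missing is hypothetical.

```shell
#!/bin/sh
# ssaencl_missing: hypothetical helper, not part of AIX.
# Reads `ssaencl -l <enclosure> -p` output on stdin and prints
# "<enclosure> <component>" for each component with "present FALSE".
ssaencl_missing() {
    awk '$1 == "enclosure" { encl = $2 }
         $1 == "component" { comp = $2 }
         $1 == "present" && $2 == "FALSE" { print encl, comp }'
}

# Usage on the affected host:
#   for e in enclosure0 enclosure1; do
#       ssaencl -l "$e" -p | ssaencl_missing
#   done
```

For the output above, this would report PSU_1 on enclosure0 and PSU_2 on enclosure1, matching the blank PSU fields in the error reports.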