Saturday, November 26, 2011
Solaris: Useful ok prompt (OBP) commands
The boot process can be aborted if you are at the terminal. The abort sequence is Stop-A, L1-A or Break, depending on the keyboard type, and it can be emulated over remote consoles (RSC, ALOM, ILOM, XSCF).
Each of the following commands takes effect as soon as it is entered at the ok prompt:
boot - boots to multi user mode
boot cdrom - boots from cdrom
boot -r - reconfiguration boot
boot -a - interactive boot
boot -s - single user mode
Solaris: init runlevels
A run level is the operating mode that the Solaris OS boots into, or is switched to with init.
Run levels that are available on the Solaris OS:
init 0 - power down, drops to the OBP (OpenBoot PROM, a.k.a. the ok prompt)
init S/s - single user mode, used for maintenance or troubleshooting
init 1 - administrative state, all local file systems are mounted and user logins are disabled
init 2 - multi user, all daemons running except the NFS server daemons, so NFS exports are not available
init 3 - multi user with NFS resources shared, this is where the Solaris OS goes live
init 4 - alternative multi user, unused unless defined
init 5 - power down, shuts down the OS and automatically powers off if the hardware supports it
init 6 - reboot
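As a quick reference, the list above can be wrapped in a small shell lookup. This is only a sketch: describe_runlevel is a name I made up for illustration, and the descriptions are shortened from the list. On a live system, who -r reports the current run level.

```shell
#!/bin/sh
# describe_runlevel is a hypothetical helper name, not a Solaris command.
# It maps a run level argument to a short description taken from the list above.
describe_runlevel() {
    case "$1" in
        0)   echo "power down to the OBP (ok prompt)" ;;
        s|S) echo "single user mode" ;;
        1)   echo "administrative state" ;;
        2)   echo "multi user, no NFS server" ;;
        3)   echo "multi user with NFS" ;;
        4)   echo "alternative multi user (unused unless defined)" ;;
        5)   echo "power down and power off if supported" ;;
        6)   echo "reboot" ;;
        *)   echo "unknown run level: $1" >&2; return 1 ;;
    esac
}

describe_runlevel 3
```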
Wednesday, November 9, 2011
Solaris: Checking which process is bound to which TCP/UDP port number
On Solaris you'll need lsof, which does not seem to be part of the default Solaris installation. It should be available from Sunfreeware.
# /usr/local/bin/lsof -i:123
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
xntpd 335 root 19u IPv4 0x600282a2c40 0t0 UDP *:ntp
xntpd 335 root 20u IPv4 0x600282a2840 0t0 UDP localhost:ntp
xntpd 335 root 21u IPv4 0x6002cb81ac0 0t0 UDP scglob20:ntp
xntpd 448 root 19u IPv4 0x30016a2c800 0t0 UDP *:ntp
xntpd 448 root 20u IPv4 0x6002f088500 0t0 UDP localhost:ntp
xntpd 448 root 21u IPv4 0x30016a2ca00 0t0 UDP 161.19.6.146:ntp
Otherwise, you can loop over /proc with pfiles as follows. The loop can return more than one pid.
# port=25
# for i in `ls /proc`; do
> # silence pfiles errors (e.g. processes that exit mid-loop) and match the port number exactly
> pfiles $i 2>/dev/null | grep AF_INET | grep "port: $port\$" >/dev/null 2>&1; portfound=$?
> if [ $portfound -eq 0 ]; then
> echo "pid $i found"
> fi
> done
pid 17175 found
pid 2874 found
pid 29348 found
pid 4482 found
pid 463 found
pid 6696 found
Follow this up with ps, using a pid from the earlier output:
# ps -eo user,pid,comm | grep <pid>
e.g.
# ps -eo user,pid,comm | grep 17175
root 17175 /usr/lib/sendmail
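The grep chain in the loop can be tightened into a single awk filter that only looks at the local socket name. Treat this as a rough sketch: binds_port is a made-up helper name, the sample text is illustrative rather than captured from a real system, and it assumes pfiles prints socket lines in the form "sockname: AF_INET address port: N".

```shell
#!/bin/sh
# binds_port PORT: read pfiles-style text on stdin and succeed if a local
# AF_INET socket uses PORT. Hypothetical helper; the "sockname: ... port: N"
# layout is an assumption about pfiles output.
binds_port() {
    awk -v port="$1" '
        $1 == "sockname:" && $2 == "AF_INET" && $(NF-1) == "port:" && $NF == port { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Illustrative input, not captured from a real system.
sample='   4: S_IFSOCK mode:0666 dev:332,0 ino:4060 uid:0 gid:0 size:0
        sockname: AF_INET 0.0.0.0  port: 25
        peername: AF_INET 10.0.0.5  port: 2525'

if echo "$sample" | binds_port 25; then
    echo "port 25 is bound"
fi
```

Inside the loop this would be used as pfiles $i 2>/dev/null | binds_port $port. On Solaris the legacy /usr/bin/awk does not understand -v, so nawk or /usr/xpg4/bin/awk would be needed in its place.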
AIX: Checking which process is bound to which TCP/UDP port number
I had no idea at all how to do this on AIX at first, whereas Linux has a very convenient way of doing it. Say you want to find out which process is bound to port 8080.
# netstat -Aan | grep 8080
f1000200031b6b98 tcp4 0 0 *.8080 *.* LISTEN
# rmsock f1000200031b6b98 tcpcb
The socket 0x31b6808 is being held by proccess 1306834 (httpd).
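A plain grep 8080 also matches ports such as 18080 and connections to remote port 8080, so it can help to pick out the listening socket exactly. Here is a small sketch of that; listen_pcb is a name I made up, and the column positions are assumed from the netstat -Aan output shown above.

```shell
#!/bin/sh
# listen_pcb PORT: read `netstat -Aan` style lines on stdin and print the
# PCB address (first column) of sockets listening on exactly PORT.
# Hypothetical helper name; column positions follow the output shown above.
listen_pcb() {
    awk -v port="$1" '
        $NF == "LISTEN" {
            n = split($5, parts, ".")      # local address, e.g. *.8080
            if (parts[n] == port) print $1
        }
    '
}

line='f1000200031b6b98 tcp4 0 0 *.8080 *.* LISTEN'
echo "$line" | listen_pcb 8080
```

The address it prints is what gets passed to rmsock together with tcpcb. If more than one socket is listening on the port, it prints one address per line, and each one needs its own rmsock call.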
Solaris: Calculating swap utilization
If you have system monitoring in place, you may need to find out how much swap space is in use. The percentage of swap space utilized on a Solaris system is calculated as:
% swap utilized
= (swap used / swap total) * 100
The swap total should be:
swap total = swap used + swap available
Substituting the values from swap -s:
# swap -s
total: 10527072k bytes allocated + 1647056k reserved = 12174128k used, 3572696k available
actual swap total
= 12174128k + 3572696k
= 15746824k
% swap utilized
= (12174128k/15746824k) * 100
= 77.31 % (rounded to two decimal places)
You can conveniently script this in the Bourne or Bourne-again shell.
swapinfo=`swap -s`
swapused=`echo $swapinfo | awk '{print $9}' | sed 's/k//'`
swapavail=`echo $swapinfo | awk '{print $11}' | sed 's/k//'`
swaptotal=`echo $swapused+$swapavail | bc`
swapusedpercent=`echo "scale=5; ($swapused/$swaptotal)*100" | bc -l`
echo "Swap utilization is at $swapusedpercent %"
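The same calculation can also be done in a single awk call, which saves the sed and bc steps and rounds to two decimal places. This is just a sketch of an alternative, not a replacement for the script above; swap_pct is a made-up name and it relies on the same field positions ($9 for used, $11 for available).

```shell
#!/bin/sh
# swap_pct: read one `swap -s` summary line on stdin and print percent used.
# Hypothetical helper; field positions ($9 used, $11 available) match the
# script above.
swap_pct() {
    awk '{
        used  = $9 + 0      # "12174128k" converts to the number 12174128
        avail = $11 + 0
        printf "%.2f\n", used / (used + avail) * 100
    }'
}

echo 'total: 10527072k bytes allocated + 1647056k reserved = 12174128k used, 3572696k available' | swap_pct
```

On a live system this would be fed with swap -s | swap_pct. For the sample line it prints 77.31, matching the manual calculation.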
AIX: enclosure0 and enclosure1 error
I encountered the following errors, as reported by the AIX errpt utility:
# errpt | more
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
BD797922 1108110011 P H enclosure1 SUBSYSTEM FAILURE
BD797922 1108110011 P H enclosure0 SUBSYSTEM FAILURE
BD797922 1108100011 P H enclosure1 SUBSYSTEM FAILURE
BD797922 1108100011 P H enclosure0 SUBSYSTEM FAILURE
BD797922 1108093511 P H enclosure1 SUBSYSTEM FAILURE
BD797922 1108093511 P H enclosure0 SUBSYSTEM FAILURE
BD797922 1108070011 P H enclosure1 SUBSYSTEM FAILURE
BD797922 1108070011 P H enclosure0 SUBSYSTEM FAILURE
BD797922 1108060011 P H enclosure1 SUBSYSTEM FAILURE
BD797922 1108060011 P H enclosure0 SUBSYSTEM FAILURE
AA8AB241 1108050111 T O OPERATOR OPERATOR NOTIFICATION
AA8AB241 1108050111 T O OPERATOR OPERATOR NOTIFICATION
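The TIMESTAMP column in the summary is packed as MMDDhhmmYY, which is easy to misread. A small sketch to unpack it; errpt_ts is a made-up helper name and it assumes the year falls in 2000-2099.

```shell
#!/bin/sh
# errpt_ts STAMP: turn an errpt summary timestamp (MMDDhhmmYY) into
# "20YY-MM-DD hh:mm". Hypothetical helper; assumes years 2000-2099.
errpt_ts() {
    echo "$1" | awk '{
        printf "20%s-%s-%s %s:%s\n",
            substr($0, 9, 2), substr($0, 1, 2), substr($0, 3, 2),
            substr($0, 5, 2), substr($0, 7, 2)
    }'
}

errpt_ts 1108110011
```

For 1108110011 this gives 2011-11-08 11:00, which lines up with the Date/Time field in the detailed report for the same entry.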
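Dig out more information on the error on enclosure0. Note that errpt selects by resource name with -N; the -j flag expects an error identifier such as BD797922.
# errpt -a -N enclosure0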
---------------------------------------------------------------------------
LABEL: SSA_ENCL_ERR1
IDENTIFIER: BD797922
Date/Time: Tue Nov 8 11:00:30 2011
Sequence Number: 5643
Machine Id: 0055617A4C00
Node Id: riju26
Class: H
Type: PERM
Resource Name: enclosure0
Resource Class: container
Resource Type: ses
Location: USSA33C8
VPD:
Part Number.................9L1850
Serial Number...............AC1433C8
EC Level....................000000R000
Manufacturer................IBM053
ROS Level and ID............0020
Device Specific.(Z0)........DISPLAY=33C8
Device Specific.(Z1)........BYPASS1_16= 09L5510
Device Specific.(Z2)........BYPASS4_5= 09L5510
Device Specific.(Z3)........BYPASS8_9= 09L5510
Device Specific.(Z4)........BYPASS12_13= 09L5510
Device Specific.(Z5)........FAN1=09L2794
Device Specific.(Z6)........FAN2=09L2794
Device Specific.(Z7)........FAN3=09L2794
Device Specific.(Z8)........PSU1=
Device Specific.(Z9)........PSU2=09L4299
Device Specific.(ZA)........CTRL= 34L3820
Device Specific.(ZB)........OPERATOR= 08L7924
Description
SUBSYSTEM FAILURE
Probable Causes
SUBSYSTEM
Failure Causes
SUBSYSTEM
Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES
Detail Data
SENSE DATA
0802 2100 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
---------------------------------------------------------------------------
And on enclosure1
---------------------------------------------------------------------------
LABEL: SSA_ENCL_ERR1
IDENTIFIER: BD797922
Date/Time: Tue Nov 8 11:00:30 2011
Sequence Number: 5644
Machine Id: 0055617A4C00
Node Id: riju26
Class: H
Type: PERM
Resource Name: enclosure1
Resource Class: container
Resource Type: ses
Location: USSA56E7
VPD:
Part Number.................9L1850
Serial Number...............292C56E7
EC Level....................000000R000
Manufacturer................IBM053
ROS Level and ID............0020
Device Specific.(Z0)........DISPLAY=56E7
Device Specific.(Z1)........BYPASS1_16= 09L5580
Device Specific.(Z2)........BYPASS4_5= 09L5580
Device Specific.(Z3)........BYPASS8_9= 09L5580
Device Specific.(Z4)........BYPASS12_13= 09L5580
Device Specific.(Z5)........FAN1=09L2794
Device Specific.(Z6)........FAN2=09L2794
Device Specific.(Z7)........FAN3=09L2794
Device Specific.(Z8)........PSU1=09L4299
Device Specific.(Z9)........PSU2=
Device Specific.(ZA)........CTRL= 27H0708
Device Specific.(ZB)........OPERATOR= 08L7924
Description
SUBSYSTEM FAILURE
Probable Causes
SUBSYSTEM
Failure Causes
SUBSYSTEM
Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES
Detail Data
SENSE DATA
0802 2200 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
---------------------------------------------------------------------------
Notice this line for enclosure0:
Device Specific.(Z8)........PSU1=
and this one for enclosure1:
Device Specific.(Z9)........PSU2=
That looks rather suspicious, doesn't it? With my obvious lack of experience with AIX and the IBM RS/6000 (7026-6H1), it prompted me to look up the 7133 Models D40 and T40 Serial Disk Systems Service Guide. It took me a while to figure out, as I wasn't sure which documentation to refer to: the one for the IBM RS/6000 7026-6H1 or the one for the IBM SSA 160 SerialRAID adapter. At last I found what I was looking for, based on the SRNs generated for enclosure0 and enclosure1.
# ssa_ela
enclosure0 SRN 80221
enclosure1 SRN 80222
Oh, and enclosure0 and enclosure1 are the disk enclosures in this case.
Earlier I mentioned the suspicious PSU1 on enclosure0 and PSU2 on enclosure1. Apparently they are missing. It takes a lot of guesswork to perform hardware diagnostics remotely on a server that is halfway around the world.
# ssaencl -l enclosure0 -p
enclosure enclosure0
component PSU_1
present FALSE
enclosure enclosure0
component PSU_2
present TRUE
fault FALSE
exchanged FALSE
# ssaencl -l enclosure1 -p
enclosure enclosure1
component PSU_1
present TRUE
fault FALSE
exchanged FALSE
enclosure enclosure1
component PSU_2
present FALSE
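With more enclosures than two, scanning this output by eye gets tedious. Here is a rough sketch that lists only the components reported as not present; missing_parts is a made-up name, and it relies only on the keyword/value layout of the ssaencl output shown above.

```shell
#!/bin/sh
# missing_parts: read `ssaencl -p` style output on stdin and print
# "enclosure component" for every component reported as not present.
# Hypothetical helper; it relies only on the keyword/value layout shown above.
missing_parts() {
    awk '
        $1 == "enclosure" { encl = $2 }
        $1 == "component" { comp = $2 }
        $1 == "present" && $2 == "FALSE" { print encl, comp }
    '
}

missing_parts <<'EOF'
enclosure enclosure0
component PSU_1
present FALSE
enclosure enclosure0
component PSU_2
present TRUE
fault FALSE
exchanged FALSE
EOF
```

On the server itself the input would come from ssaencl -l enclosure0 -p | missing_parts instead of the here-document.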