Archive for October, 2008

Dynamically discovering Clariion LUNs with Linux



One of the Red Hat Enterprise Linux 4 Update 3 servers I support had several LUNs added to it this week. The server was using QLogic 2340 HBAs, which allowed me to use the ql-dynamic-tgt-lun-disc.sh script from the QLogic support site to dynamically discover the new LUNs:

$ ql-dynamic-tgt-lun-disc.sh
Scanning HOST: host1....
Scanning HOST: host2.............
Scanning HOST: host3....

Found

1:0:0:6
1:0:0:8
1:0:1:6
1:0:1:8
3:0:0:6
3:0:0:8
3:0:1:6
3:0:1:8
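
On hosts without the QLogic script, a generic alternative on 2.6 kernels is to ask each SCSI host to rescan through sysfs. The sketch below is an assumption about what a manual rescan could look like, not a description of what ql-dynamic-tgt-lun-disc.sh does internally. The rescan_scsi_hosts name and its optional directory argument are made up for illustration, and writing to the real scan files requires root:

```shell
# Hypothetical helper: ask every SCSI host under a sysfs directory to rescan.
# The directory defaults to /sys/class/scsi_host; it is a parameter only so
# the function can be tried against a scratch directory first.
rescan_scsi_hosts() {
    local base=${1:-/sys/class/scsi_host}
    local host
    for host in "$base"/host*; do
        # Skip entries that have no writable scan attribute.
        [ -w "$host/scan" ] || continue
        echo "Scanning HOST: ${host##*/}"
        # "- - -" is a wildcard for channel, target and LUN.
        echo "- - -" > "$host/scan"
    done
}
```

Calling rescan_scsi_hosts with no argument walks the real sysfs tree. Any LUNs that turn up still need the multipathing step described next.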

After the script completed device discovery, two devices were visible for each LUN in the output of fdisk (there are multiple paths to the disk storage). To allow us to take advantage of both paths, we needed to create an EMC power device (we are using EMC PowerPath instead of dm-multipath). This was accomplished by running powermt with the “config” option:

$ powermt config

Once the config operation completed, the power device was visible:

$ powermt display dev=emcpoweri
Pseudo name=emcpoweri
CLARiiON ID=APM [Foo]
Logical device ID=0987 [LUN 80 - DS Foo]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP A, current=SP B

< ..... >

At that point the device was available for general-purpose use. I have bumped into numerous kernel bugs in the past that prevented me from dynamically discovering storage, so this was a welcome change. Having used both QLogic and Emulex adaptors on Solaris and Linux hosts, I think I still prefer Emulex over QLogic.

Red Hat recovery modes

If you have ever had to deal with a sick Red Hat server, you may be familiar with the rescue, emergency and single-user modes of operation. I have heard people refer to these modes incorrectly, which can sometimes lead to some interesting stories (there are several subtle differences between them). To clear up any confusion surrounding these terms, here are the official descriptions from the Red Hat administration guide:

Rescue mode:

Rescue mode provides the ability to boot a small Red Hat Enterprise Linux environment entirely from CD-ROM, or some other boot method, instead of the system’s hard drive. As the name implies, rescue mode is provided to rescue you from something. During normal operation, your Red Hat Enterprise Linux system uses files located on your system’s hard drive to do everything — run programs, store your files, and more.

Emergency Mode:

In emergency mode, you are booted into the most minimal environment possible. The root file system is mounted read-only and almost nothing is set up. The main advantage of emergency mode over single-user mode is that the init files are not loaded. If init is corrupted or not working, you can still mount file systems to recover data that could be lost during a re-installation.

Single-User mode:

In single-user mode, your computer boots to runlevel 1. Your local file systems are mounted, but your network is not activated. You have a usable system maintenance shell. Unlike rescue mode, single-user mode automatically tries to mount your file system. Do not use single-user mode if your file system cannot be mounted successfully. You cannot use single-user mode if the runlevel 1 configuration on your system is corrupted.


Linux kernel debugging

I came across OOPS! An Introduction to Linux Kernel Debugging while surfing the web, and found the presentation interesting. The information on sysinfo and sysrq was especially useful, since these modules can be valuable tools for determining why a specific version of the Linux kernel decided to bite the dust!

x86 / linux boot process

There is quite a bit of documentation around the internet on the Linux boot process, but I think Gustavo Duarte did an excellent job of describing it in a clear and concise way. He also has several links to the Linux kernel source code and describes what is occurring step by step, from the bootstrap phase all the way to the execution of /sbin/init.

His first entry lays the foundation by covering the basics of the x86 Intel chipset, the memory map, and the logical motherboard layout. This provides a basic understanding of traditional motherboard implementations.

Next, he describes BIOS initialization and the loading of the MBR. This briefly touches on the boot loader, which starts the Linux bootstrap phase.

Finally, the kernel boot process is detailed with links to C and assembly source code, with a brief narrative of exactly what is happening.

This was an awesome description of the early startup and initialization phases of the hardware and the bootstrapping of the O/S. Gustavo also provides a great description of the real-mode and protected-mode CPU states.


Generating core files from gdb

While debugging a problem a few weeks back, I needed to generate a core file from a hung process. I typically use the gcore utility to generate core files from running processes, but in this case I was already attached to the process with gdb, so gcore failed:

$ gcore 2575
ptrace: Operation not permitted.
You can’t do that without a process to debug.

Gak! I remembered reading about a gdb option that would dump core, so I wandered off to read through my gdb notes. Sure enough, gdb has a “generate-core-file” command to create a core file:

$ gdb -q - 2575
Attaching to process 2575
Reading symbols from /usr/sbin/gpm...(no debugging symbols found)...done.
Using host libthread_db library "/lib/tls/libthread_db.so.1".
Reading symbols from /lib/tls/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /lib/tls/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2

0x0046e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2

(gdb) generate-core-file
Saved corefile core.2575

(gdb) detach
Detaching from program: /usr/sbin/gpm, process 2575

(gdb) quit

$ ls -al core.*
-rw-r--r--  1 root root 2468288 Dec 11 13:49 core.2575

Nifty! I am starting to wonder if there is anything gdb can’t do.
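
The same steps can be scripted when you are not already sitting in a gdb session. The sketch below assumes a gdb recent enough to support the -batch, -p and -ex options (older releases may not), and dump_core is a hypothetical wrapper name. generate-core-file accepts an optional file name, which is used here to keep the core.PID naming shown above:

```shell
# Hypothetical wrapper: attach to a PID, write core.PID, then detach.
dump_core() {
    local pid=$1
    # Refuse anything that is not a plain decimal PID.
    case $pid in
        ''|*[!0-9]*) echo "usage: dump_core PID" >&2; return 2 ;;
    esac
    gdb -q -batch -p "$pid" \
        -ex "generate-core-file core.$pid" \
        -ex detach
}
```

Like gcore, this fails with a ptrace error if another debugger is already attached to the process, since only one tracer can attach at a time.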

Debugging a PAM LDAP password expiration problem



While testing out LDAP authentication on a CentOS 4.4 Linux host, I noticed that the “password” statements I added to /etc/pam.d/sshd weren’t taking effect:
password    requisite     /lib/security/$ISA/pam_cracklib.so retry=3
password    sufficient    /lib/security/$ISA/pam_unix.so nullok use_authtok md5 shadow
password    sufficient    /lib/security/$ISA/pam_ldap.so use_authtok
password    required      /lib/security/$ISA/pam_deny.so

After pondering the issue for a while, I eventually started to wonder if the “passwd” utility was called by sshd to change user passwords. To see if this was the case, I decided to expire a user’s password, and then strace sshd while I logged in as that user:

$ strace -f -e trace=execve -p 2616
Process 2616 attached - interrupt to quit
--- SIGCHLD (Child exited) @ 0 (0) ---
Process 26638 attached
[pid 26638] execve("/usr/sbin/sshd", ["/usr/sbin/sshd", "-R"], [/* 14 vars */]) = 0
Process 26639 attached
Process 26639 detached
[pid 26638] --- SIGCHLD (Child exited) @ 0 (0) ---
Process 26640 attached
Process 26641 attached
[pid 26641] execve("/usr/bin/passwd", ["passwd"], [/* 14 vars */]) = 0
Process 26641 detached
[pid 26640] --- SIGCHLD (Child exited) @ 0 (0) ---
Process 26640 detached
[pid 26638] --- SIGCHLD (Child exited) @ 0 (0) ---
Process 26638 detached
--- SIGCHLD (Child exited) @ 0 (0) ---
Process 2616 detached

Sure enough, /usr/bin/passwd is called to change an expired password. To verify that the sshd daemon was the entity invoking /usr/bin/passwd, I used the strings utility to see if the string “/usr/bin/passwd” resided in the data segment of the sshd executable:

$ strings sshd | grep passwd
kerberosorlocalpasswd
/usr/bin/passwd
auth2-passwd.c
%s: struct passwd size mismatch
sshpam_passwd_conv
sshpam_auth_passwd

Once I knew that sshd called /usr/bin/passwd, I added my changes to /etc/pam.d/system-auth (which is “stacked” by pam_stack.so in /etc/pam.d/passwd), and everything worked as expected. I kinda dig the stacking capabilities that come out of the box with CentOS 4.4, since you can make a change in one location (/etc/pam.d/system-auth), and its effects are propagated to every service definition in /etc/pam.d that stacks it.
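
To see which services would pick up a change to system-auth, you can look for pam_stack.so lines that name it. This is a rough sketch that assumes the pam_stack.so service=NAME syntax used on CentOS 4. The pam_stack_users function name and its directory argument are illustrative, not part of PAM:

```shell
# Hypothetical helper: list the service files in a PAM directory that stack
# the named service through pam_stack.so.
pam_stack_users() {
    local dir=${1:-/etc/pam.d}
    local service=${2:-system-auth}
    local f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Match only lines whose first non-blank character is not '#',
        # so commented-out entries are ignored.
        if grep -Eq "^[[:space:]]*[^#[:space:]].*pam_stack\.so.*service=$service([[:space:]]|\$)" "$f"; then
            echo "${f##*/}"
        fi
    done
}
```

Running pam_stack_users with no arguments checks /etc/pam.d for system-auth. A service that does not appear in the output (or that never calls PAM for the operation in question) will not see the change.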

Red Hat Linux FTP client annoyance

The netkit-ftp client that ships with Red Hat Enterprise Linux comes with a verbose option, which will, among other things, instruct the client to print the number of bytes transferred after each file is successfully sent. These messages look similar to the following:

85811076 bytes sent in 1.3e+02 seconds (6.7e+02 Kbytes/s)

I had several enormous files (each > 2GB) I needed to move to another server, and noticed that the netkit-ftp client wasn’t printing status messages after the files were transferred. To see what was causing the issue, I started reading through the netkit-ftp source code. After a few minutes of poking around ftp.c, I came across this gem:

void
sendrequest(const char *cmd, char *local, char *remote, int printnames)
{
  volatile long bytes = 0;   /* a long is 32 bits on 32-bit platforms */

  while ((c = read(fileno(fin), buf, sizeof (buf))) > 0) {
     printf("Bytes (%ld) is incremented by %d\n", bytes, c);
     bytes += c;
     for (bufp = buf; c > 0; c -= d, bufp += d)
        if ((d = write(fileno(dout), bufp, c)) <= 0)
           break;
   ......
}

I reckon the folks who developed this code never transferred files larger than 2^31 bytes (2 GB) on 32-bit platforms, where a long is only 32 bits wide. After changing bytes (and the code that uses bytes) to use the unsigned long long data type, everything worked as expected.
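
The arithmetic behind the bug can be sketched in the shell. Shell arithmetic is normally 64 bits wide, so the hypothetical wrap32 function below emulates what a signed 32-bit counter holds once a byte count runs past 2^31 - 1 (this is the usual two's-complement behaviour; signed overflow is formally undefined in C). It is an illustration only, not code from netkit-ftp:

```shell
# Hypothetical helper: reduce a non-negative byte count to the value a signed
# 32-bit two's-complement counter would hold after wrapping.
wrap32() {
    local v=$(( $1 % 4294967296 ))      # keep the low 32 bits
    if [ "$v" -ge 2147483648 ]; then    # top bit set: value reads as negative
        v=$(( v - 4294967296 ))
    fi
    echo "$v"
}

wrap32 2147483647    # largest value a 32-bit long can hold
wrap32 2147483648    # one byte more wraps to -2147483648
wrap32 2684354560    # a 2.5 GB transfer shows up as -1610612736
```

A 64-bit counter such as unsigned long long does not run out until roughly 1.8e19 bytes, so the wrap never happens in practice.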

Finding free space in Veritas diskgroups

The Veritas volume manager (VxVM) provides logical volume management capabilities across a variety of platforms. As you create new volumes, it is often helpful to know how much free space is available. You can find free space using two methods. The first method utilizes vxdg’s “free” option:

$ vxdg -g oradg free
GROUP        DISK         DEVICE       TAG          OFFSET    LENGTH    FLAGS
oradg        c3t20d1      c3t20d1s2    c3t20d1      104848640 1536      -
oradg        c3t20d3      c3t20d3s2    c3t20d3      104848640 1536      -
oradg        c3t20d5      c3t20d5s2    c3t20d5      104848640 1536      -
oradg        c3t20d7      c3t20d7s2    c3t20d7      104848640 1536      -
oradg        c3t20d9      c3t20d9s2    c3t20d9      104848640 1536      -

The “LENGTH” column displays the number of 512-byte blocks available on each disk drive in the disk group “oradg”. If you don’t feel like using bc(1) to turn blocks into kilobytes, you can use vxassist’s “maxsize” option to print the number of blocks and megabytes available:

$ vxassist -g oradg maxsize layout=concat
Maximum volume size: 6144 (3Mb)

Now to find out what to do with 3 MB of disk storage.
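
If you do want the conversion without bc(1), shell arithmetic is enough. This is a small sketch (blocks_to_kb and blocks_to_mb are made-up helper names) that assumes the 512-byte block size described above:

```shell
# Convert a count of 512-byte blocks to kilobytes (2 blocks per KB).
blocks_to_kb() {
    echo $(( $1 * 512 / 1024 ))
}

# Convert a count of 512-byte blocks to whole megabytes (rounds down).
blocks_to_mb() {
    echo $(( $1 * 512 / 1024 / 1024 ))
}

blocks_to_kb 1536    # free space on one disk from the vxdg output: 768 KB
blocks_to_mb 6144    # vxassist's maximum volume size: 3 MB
```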

Printing VxVM DMP path information



In addition to providing volume management capabilities, the Veritas volume manager can manage multiple paths to a disk device. This allows I/O to be load-balanced across multiple paths, and ensures that I/O is transparently routed around failed paths. To print path information for a specific disk, you can use the “vxdisk” or “vxdmpadm” utilities:

$ vxdisk list c2t21d36
[ ... ]

Multipathing information:
numpaths:   4
c2t21d36s2      state=enabled
c2t23d36s2      state=enabled
c3t20d36s2      state=disabled
c3t22d36s2      state=disabled

$ vxdmpadm getdmpnode nodename=c2t21d36s2
NAME                 STATE     ENCLR-TYPE   PATHS  ENBL  DSBL  ENCLR-NAME
=========================================================================
c2t21d36s2           ENABLED   EMC          4      2     2     EMC0

$ vxdmpadm getsubpaths dmpnodename=c2t21d36
NAME         STATE         PATH-TYPE  CTLR-NAME  ENCLR-TYPE   ENCLR-NAME
====================================================================
c2t21d36s2   ENABLED        -        c2         EMC          EMC0
c2t23d36s2   ENABLED        -        c2         EMC          EMC0
c3t20d36s2   DISABLED       -        c3         EMC          EMC0
c3t22d36s2   DISABLED       -        c3         EMC          EMC0

The vxdisk(1m) and vxdmpadm(1m) output shows the number of paths to a disk device, and the current state of each path (e.g., enabled or disabled).
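
When checking many disks, it can be handy to reduce the getsubpaths output to a count per state. This is a minimal sketch that assumes the column layout shown above (two header lines, with the state in the second column). count_path_states is a hypothetical helper name:

```shell
# Hypothetical helper: read "vxdmpadm getsubpaths" output on stdin and print
# how many paths are enabled and how many are disabled.
count_path_states() {
    awk 'NR > 2 && NF > 1 { n[$2]++ }
         END { printf "enabled=%d disabled=%d\n", n["ENABLED"], n["DISABLED"] }'
}
```

Piping the getsubpaths output for c2t21d36 shown above into count_path_states prints enabled=2 disabled=2.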

Default Disk Groups

Veritas Volume Manager comes with a wide variety of command line utilities, which can be used to create, delete and maintain Veritas objects. When operations are performed with the CLI and no disk group is passed to the “-g” (disk group to use) option, the command will default to using the value assigned to defaultdg. The value of defaultdg is stored in the /etc/vx/volboot file:

$ grep defaultdg /etc/vx/volboot
defaultdg oradg

If you would like to change the default disk group, you can use vxdctl(1m)’s “defaultdg” option:

$ vxdctl defaultdg oof

To verify that the value was changed, you can run vxdg(1m) with the “defaultdg” option:

$ vxdg defaultdg
oof

This can save a lot of typing when creating new Veritas objects!
