Version 45 (modified by shuang, 9 years ago)


Setting up xCAT 2.4.3 on DELL PE 860


  1. Red Hat or a similar distribution is preferred.
  2. Only eth0 is needed; eth1 is left for ORCA to configure. The head node's eth0 ( is connected to the Internet. Compute nodes do not need Internet access, since everything can be managed from the head node.

xCAT Installation

xCAT installation on CentOS 5.5 is simple:

  1. # cd /etc/yum.repos.d
  2. # wget
  3. # wget
  4. # yum clean metadata
  5. # yum install xCAT.x86_64

xCAT Configuration

The configuration of xCAT is all about managing the tables. Run

tabdump -d

for a description of all tables (the tables themselves are stored as /etc/xcat/*.sqlite). Run

tabdump -d $table

for a description of the columns in $table. First, run

tabdump site

to verify that xCAT is correctly installed. You should see the default values for the site table. Next, the site table needs some modifications. There are two ways to modify a table: chtab or tabedit. Our site table looks like this:


Note: Both "master" and "nameservers" point to the IP address of the head node's eth0. This interface also provides the dhcpd service to all compute nodes ("dhcpinterfaces"=>"eth0"). "forwarders" points to our local name server.
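
As a minimal sketch of the chtab route, the function below only prints the chtab commands for the three site keys mentioned in the note, so they can be reviewed before being run as root. The function name site_chtab_cmds and the address 192.0.2.1 used in the usage note are placeholders for illustration, not values from our setup.

```shell
#!/bin/sh
# Print (do not run) chtab commands for the site keys discussed above.
# $1: IP address of the head node's eth0 (placeholder in the usage note)
# $2: interface that serves DHCP to the compute nodes
site_chtab_cmds() {
    head_ip=$1
    dhcp_if=$2
    printf 'chtab key=master site.value=%s\n' "$head_ip"
    printf 'chtab key=nameservers site.value=%s\n' "$head_ip"
    printf 'chtab key=dhcpinterfaces site.value=%s\n' "$dhcp_if"
}
```

For example, site_chtab_cmds 192.0.2.1 eth0 prints three chtab lines. Once they look right, pipe the output to sh and check the result with tabdump site.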

Network Configuration

We need to provide DNS and DHCP service for compute nodes. They can be run on a different machine. In our example, both are run on the head node (master).

  1. First, modify the networks table. We need to make some changes to the defaults that xCAT set up:
    1. A dynamic range for MAC address discovery.
    2. Name servers, DHCP servers and TFTP servers: all of them are the master node.

Mine looks like this:

 #tabdump networks
  1. Configure DNS service. Edit /etc/hosts file:
     # Do not remove the following line, or various programs
     # that require network functionality will fail.
                     localhost.localdomain localhost
     ::1             localhost6.localdomain6 localhost6
                     mgt.renci.ben mgt
     #n01 ip  n01 n01.renci.ben
     #n01 ipmi ip  n01-ipmi n01-ipmi.renci.ben
     #switch ip  linksys-mgt.renci.ben
    Note: you don't have to edit this file manually. Instead, run
     #tabedit hosts
    to fill in all the node information, then run
    to populate node info to /etc/hosts. Next, set up DNS:
     #service named start
    Note that makedns adds to DNS only those /etc/hosts entries that belong to a network in the networks table. We also need to configure /etc/resolv.conf for name lookups; mine looks like this:
     search renci.ben
    To test that it is set up correctly, run
     #host n01
    and make sure it resolves to the right IP address ( in our case).
  1. Next, run the following commands to create /etc/dhcpd.conf for dhcpd service
     #makedhcp -n
     #service dhcpd restart
  1. To enable node discovery, we need to configure the switch so the following command works:
     $snmpwalk -v 1 -c public switch_ip
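
To show the shape of the /etc/hosts lines that should end up there for a node and its IPMI interface, here is a small sketch. It is only an illustration: the normal route is still tabedit hosts, the function name hosts_entries is made up, and the addresses in the usage note come from the documentation range rather than from our cluster.

```shell
#!/bin/sh
# Print /etc/hosts lines for a node and its BMC ("-ipmi") interface.
# $1: node name, $2: node IP, $3: BMC IP, $4: DNS domain
hosts_entries() {
    node=$1
    node_ip=$2
    bmc_ip=$3
    domain=$4
    # Long name first, then the short alias, as in the mgt line above.
    printf '%s\t%s.%s %s\n' "$node_ip" "$node" "$domain" "$node"
    printf '%s\t%s-ipmi.%s %s-ipmi\n' "$bmc_ip" "$node" "$domain" "$node"
}
```

For example, hosts_entries n01 192.0.2.11 192.0.2.111 renci.ben prints one line for n01 and one for n01-ipmi. Compare them with what /etc/hosts actually contains before starting named.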

Node Configuration

  1. To fill in node information, we need to edit nodelist table. Mine looks like:
     #tabdump nodelist
     "n01","compute,all","netbooting","09-03-2010 15:18:13",,,,,
    Note that you only need to specify the node and its groups; status and statustime are populated automatically. The status will not reflect the real state if you change it outside of xCAT (e.g. by pressing the power button to turn the node off instead of using xCAT's IPMI commands). A group in xCAT is somewhat similar to the concept in Linux: it is convenient because you can run commands on a group without naming each individual member. Run the following to verify that nodelist works:
     #nodels compute
  1. Next, we set up the node hardware management table, i.e., nodehm. Mine looks like this:
    Note that we will configure IPMI to allow LAN access. Some people may need console access over LAN -- since we only have IPMI 1.5, this is not covered here.
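
Since the status column in nodelist can go stale, it is sometimes handy to print just the node and status pairs. A minimal sketch, assuming the node, groups and status columns are double-quoted whenever they are set, as in the sample line above; the function name node_status is made up for this example.

```shell
#!/bin/sh
# Read `tabdump nodelist` output on stdin and print "node status" pairs.
# Splitting on double quotes puts the node in field 2 and the status in
# field 6; an unset status is reported as "unknown".
node_status() {
    awk -F'"' '
        /^#/ || NF < 3 { next }    # skip the header line and blank lines
        {
            status = ($6 == "" ? "unknown" : $6)
            print $2, status
        }'
}
```

Run it as tabdump nodelist | node_status. Keep in mind that it only reports what xCAT last recorded, not the real power state.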

BMC/IPMI Configuration

  1. First, installing xCAT also installs an ipmitool package. Make sure you are not using ipmitool from another source. You don't need to run ipmitool directly in most cases, but it is useful to know when troubleshooting. Dell's IPMI implementation differs from IBM's. You can configure IPMI either at run time using ipmitool or at boot time (Ctrl-E on the Dell PE 860). Read [] for details. Mine looks like this:
     # ipmitool -I lan -H -U root -a lan print 1
     Set in Progress         : Set Complete
     Auth Type Support       : NONE MD2 MD5 PASSWORD 
     Auth Type Enable        : Callback : MD2 MD5 PASSWORD 
                            : User     : MD2 MD5 PASSWORD 
                            : Operator : MD2 MD5 PASSWORD 
                            : Admin    : MD2 MD5 PASSWORD 
                            : OEM      : MD2 MD5 
     IP Address Source       : Static Address
     IP Address              :
     Subnet Mask             :
     MAC Address             : 00:18:8b:f8:e3:58
     SNMP Community String   : public
     IP Header               : TTL=0x40 Flags=0x40 Precedence=0x00 TOS=0x10
     Default Gateway IP      :
     Default Gateway MAC     : 00:00:00:00:00:00
     Backup Gateway IP       :
     Backup Gateway MAC      : 00:00:00:00:00:00
     802.1q VLAN ID          : Disabled
     802.1q VLAN Priority    : 0
     Cipher Suite Priv Max   : Not Available

Dell's IPMI implementation seems to be different from IBM's in the sense that once a session is established, it does not return an auth type in the response (Thanks Jarrod Johnson for help!). In /opt/xcat/lib/perl/xCAT/, this response packet is dropped by the following code:

 if ($rsp[4] != $self->{authtype}) {
     return 2; # not thinking about packets that do not match our preferred auth type
 }

To work around this problem we commented out the second line (the return statement).

  1. Next, we need to set up the ipmi table like this:
     # tabdump ipmi
    Note that the regular expression appends -ipmi to the host name (e.g. n01-ipmi). Make sure your DNS is set up correctly to resolve that name. Also, the username/password here overrides what is in the passwd table.
  1. Important for PE 860: when you do a netboot or network installation, you will probably have to press F12 manually so that the machine enters PXE boot mode first. Neither nodeset netboot nor ipmitool chassis bootdev pxe does the work it is supposed to do here! I think Dell did not enable this feature until the PE 1950.
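
The naming rule used in the ipmi table (node name plus -ipmi) can be mirrored in shell when checking that every BMC name resolves. This sketch uses plain sed, not xCAT's table regular expression syntax, and the function names bmc_name and bmc_names are made up for illustration.

```shell
#!/bin/sh
# Map a node name to its BMC host name by appending "-ipmi",
# mirroring the naming rule described for the ipmi table.
bmc_name() {
    printf '%s\n' "$1" | sed 's/$/-ipmi/'
}

# Print the BMC host name for every node name given as an argument.
bmc_names() {
    for node in "$@"; do
        bmc_name "$node"
    done
}
```

For example, bmc_names n01 n02 prints n01-ipmi and n02-ipmi, one per line. Feeding each name to host is a quick way to confirm that DNS knows about the BMCs.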

PXE configuration

  1. The node resources table (noderes) specifies the resources and settings used when installing compute nodes. We use PXE, so the table looks like:
     # tabdump noderes
  1. To setup the TFTP server for PXE, we prepare the source by running:
     #copycds /dev/dvd 
    Note: this assumes that the Linux DVD is in /dev/dvd; the command copies everything to /install. Alternatively, if you downloaded the DVD ISO, say CentOS-5.5-bin-DVD.iso, you can run
     #copycds CentOS-5.5-bin-DVD.iso
    Then, run:
     #mknb x86_64
    to set up the TFTP server (the default directory is /tftpboot). To install compute nodes, run:
     #rinstall compute
  1. To create a net boot image, run:
     #./genimage -i eth0 -n tg3,bnx2 -o centos5.5 -p compute
     #cd /install/netboot/centos5.5/x86_64/compute/rootimg/etc/
     #cp fstab fstab.old
     #echo "compute_x86_64 / tmpfs rw 0 1
            none /tmp tmpfs defaults,size=10m 0 2
            none /var/tmp tmpfs defaults,size=10m 0 2" >> fstab
     #packimage -o centos5.5 -p compute -a x86_64
    Test it by running:
     #nodeset n01 netboot
     #rpower n01 boot
  1. The genimage command invokes geninitrd automatically to generate the initrd for netboot. However, when the compute nodes boot up, the following error is returned:
     Kernel panic: no init found. Try passing init= option to kernel
    To modify the initrd, do:
    1. sudo -s
    2. mkdir temp
    3. cd temp
    4. cat /tftpboot/xcat/netboot/centos5.5/x86_64/compute/initrd.gz| gzip -d | cpio -i
    5. less init; the first line is
       #!/sbin/busybox.anaconda sh
    6. ldd sbin/busybox.anaconda
    7. copy all the missing libs to lib64 or lib, taking care of symbolic links separately
    8. find ./ | cpio -H newc -o > ../initrd
    9. cd .. && gzip initrd
    10. cp initrd.gz /install/netboot/centos5.5/x86_64/compute/initrd.gz
    11. nodeset compute netboot
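
Steps 6 and 7 can be partly automated. A minimal sketch, assuming it is run from inside the unpacked temp directory and that ldd on the head node resolves the same libraries the initrd needs; the helper names ldd_paths and missing_in_root are made up for this illustration.

```shell
#!/bin/sh
# Print the absolute library paths found in ldd output read from stdin.
# Lines without a resolved path (e.g. linux-vdso) are skipped.
ldd_paths() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\//) { print $i; break } }'
}

# Read library paths from stdin and print those that do not exist under
# the unpacked initrd root given as $1 (dangling symlinks count as missing).
missing_in_root() {
    root=$1
    while IFS= read -r lib; do
        [ -e "$root$lib" ] || printf '%s\n' "$lib"
    done
}
```

From the temp directory, ldd sbin/busybox.anaconda | ldd_paths | missing_in_root . lists the libraries that still have to be copied into lib or lib64 before repacking the initrd.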