Version 25 (modified by shuang, 9 years ago)

--

Setting up xCAT 2.4.3 on DELL PE 860

Note: only eth0 is needed; eth1 is left for ORCA to configure. The head node's eth0 (192.168.201.49) is connected to the Internet. Compute nodes do not need Internet access, since everything can be managed from the head node.

xCAT Installation

xCAT installation on CentOS 5.5 is simple:

  1. # cd /etc/yum.repos.d
  2. # wget http://xcat.sourceforge.net/yum/xcat-core/xCAT-core.repo
  3. # wget http://xcat.sourceforge.net/yum/xcat-dep/rh5/x86_64/xCAT-dep.repo
  4. # yum clean metadata
  5. # yum install xCAT.x86_64

xCAT Configuration

The configuration of xCAT is all about managing the tables. Run

tabdump -d

for a description of every table (the tables themselves are stored as /etc/xcat/*.sqlite). Run

tabdump -d $table

for a description of each column in $table. First, run

tabdump site

to verify that xCAT is installed correctly. You should see the default values for the site table. Next, the site table needs a few modifications. There are two ways to modify a table: chtab changes individual values from the command line, and tabedit opens the whole table in an editor. Our site table looks like this:

#key,value,comments,disable
"blademaxp","64",,
"domain","renci.ben",,
"fsptimeout","0",,
"installdir","/install",,
"ipmimaxp","64",,
"ipmiretries","3",,
"ipmitimeout","2",,
"consoleondemand","no",,
"master","192.168.201.49",,
"maxssh","8",,
"ppcmaxp","64",,
"ppcretry","3",,
"ppctimeout","0",,
"sharedtftp","1",,
"SNsyncfiledir","/var/xcat/syncfiles",,
"tftpdir","/tftpboot",,
"xcatdport","3001",,
"xcatiport","3002",,
"xcatconfdir","/etc/xcat",,
"timezone","America/New_York",,
"useNmapfromMN","no",,
"nameservers","192.168.201.49",,
"ntpservers","mgt",,
"forwarders","192.168.201.254",,
"dhcpinterfaces","eth0",,

Note: Both "master" and "nameservers" point to 192.168.201.49, the IP address of the head node's eth0. This interface also provides DHCP service to all compute nodes ("dhcpinterfaces"=>"eth0"). "forwarders" points to our local name server.
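As an illustration of the chtab route, here is a minimal sketch that prints one chtab command for each site key we changed. The helper name site_chtab_cmds is made up for this page and is not part of xCAT. The script only prints the commands, so they can be reviewed before being run on the head node.

```shell
# Hypothetical helper (not part of xCAT): read "key value" pairs on stdin
# and print the matching chtab command for the site table.
site_chtab_cmds() {
    while read -r key value; do
        [ -n "$key" ] || continue   # skip blank lines
        printf 'chtab key=%s site.value=%s\n' "$key" "$value"
    done
}

# Values taken from the site table shown above.
site_chtab_cmds <<'EOF'
master 192.168.201.49
nameservers 192.168.201.49
forwarders 192.168.201.254
dhcpinterfaces eth0
EOF
```

After reviewing the output, it can be piped to sh as root, and tabdump site should then show the new values.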

Network Configuration

First, edit the /etc/hosts file:

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
192.168.201.49  mgt.renci.ben mgt
#n01 ip
192.168.201.12  n01 n01.renci.ben
#n01 ipmi ip
192.168.201.77  n01-ipmi n01-ipmi.renci.ben
...
#switch ip
192.168.201.76  linksys-mgt.renci.ben

Note: you don't have to edit this file manually. Instead, run

# tabedit hosts

to fill in all the node information, then run

# makehosts

to write the node entries to /etc/hosts. Next, set up DNS:

# makedns
# service named start

We also need to configure /etc/resolv.conf so that lookups go to the head node. Ours looks like this:

search renci.ben
nameserver 192.168.201.49

To test that DNS is set up correctly, run

host n01

and make sure the name resolves to the right IP address (192.168.201.12 in our case).
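To cross-check the DNS answer against what makehosts wrote, a small helper can pull the address for a name out of a hosts-format file. This is only a sketch; hosts_ip is a hypothetical name used here, and it assumes the usual /etc/hosts layout (IP address first, then names, with # starting a comment).

```shell
# Hypothetical helper: print the IP address that a hosts-format file
# maps to the given name. The file defaults to /etc/hosts.
hosts_ip() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }   # strip comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }
    ' "$file"
}
```

For example, hosts_ip n01 should print the same address that host n01 reports. If the two differ, rerun makehosts and makedns.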

IPMI Configuration

Netboot image

The genimage command invokes geninitrd automatically to generate the initrd for netboot. However, when the compute nodes booted from that initrd, they stopped with the following error:

Kernel panic: no init found. Try passing init= option to kernel

The fix is to unpack the initrd, add the libraries that busybox.anaconda needs, and repack it:

  1. sudo -s
  2. mkdir temp
  3. cd temp
  4. cat /tftpboot/xcat/netboot/centos5.5/x86_64/compute/initrd.gz | gzip -d | cpio -i
  5. less init; the first line is
    #!/sbin/busybox.anaconda sh
    
  6. ldd sbin/busybox.anaconda
  7. Copy all the missing libraries to lib64 or lib; handle symbolic links separately
  8. find ./ | cpio -H newc -o > ../initrd
  9. gzip ../initrd
  10. cp ../initrd.gz /install/netboot/centos5.5/x86_64/compute/initrd.gz
  11. nodeset compute netboot
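Steps 6 and 7 are the tedious part, so here is a rough sketch of how the missing libraries could be listed automatically. The function name missing_in_tree is hypothetical. It reads ldd output on stdin and prints every library path that is absent from the unpacked tree. A symbolic link that is present counts as found even if its target is missing, so links still need a manual check.

```shell
# Hypothetical helper: read `ldd` output on stdin and print each library
# path that does not exist under the unpacked initrd tree given as $1.
missing_in_tree() {
    root=$1
    # ldd lines look like "libc.so.6 => /lib64/libc.so.6 (0x...)" or
    # "/lib64/ld-linux-x86-64.so.2 (0x...)"; keep only the absolute path.
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\//) { print $i; break } }' |
    while read -r lib; do
        # A dangling symlink fails -e, so test -L as well.
        [ -e "$root$lib" ] || [ -L "$root$lib" ] || echo "$lib"
    done
}
```

From inside the temp directory it would be used as: ldd sbin/busybox.anaconda | missing_in_tree . Each path it prints is a library to copy from the head node into the same location under the tree.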
