Virtual Machine Images

Overview

ORCA-based installations require separate filesystem, kernel, and ramdisk images rather than a single bootable filesystem image. Adding the NEuca guest extensions to your image is highly recommended. ORCA-based installations rely on the KVM hypervisor, so images must be compatible with it. The most straightforward way to build an image is to start with a precursor image from OpenStack, Eucalyptus, or EC2 and modify it to suit your needs.

Workflow

  1. Start with an existing image and modify it, for example by following the OpenStack instructions on modifying images or the resizing procedure below.
  2. Install the NEuca tools into the image.
  3. Make the image sparse (see below).
  4. Test that it boots.
  5. Create the ImageProxy image descriptor file and calculate its hash (see below).

Resizing an existing image filesystem

A stock image is often an adequate starting point for an experimenter but lacks enough free space to be useful. An experimenter can, however, enlarge the filesystem component of an image by first creating a zero-filled file of the desired size and copying the old image over the beginning of it:

$ dd if=/dev/zero of=$NEW_IMAGE_FILE bs=1M count=$DESIRED_SIZE_OF_NEW_IMAGE_IN_MB
$ dd if=$OLD_IMAGE_FILE of=$NEW_IMAGE_FILE conv=notrunc

and then running the appropriate filesystem resize command on the new file. For the Linux ext2, ext3, or ext4 filesystems, the command is:

$ resize2fs $NEW_IMAGE_FILE

It is recommended, but not required, that you run fsck on the image after completing this process:

$ e2fsck -f $NEW_IMAGE_FILE

Note: the above procedure works well for increasing the size of an image. Shrinking an image is slightly more complex, because the filesystem must first be shrunk with the filesystem resize command before the image is moved into a smaller file.
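
The growing step can also be sketched in Python. This is a minimal sketch under stated assumptions, not part of the ORCA tooling: the function name grow_image is hypothetical, and it extends the file in place with os.truncate instead of copying it with dd, which leaves the added tail as a hole on filesystems that support sparse files. The filesystem resize command (for example resize2fs) must still be run on the file afterwards.

```python
import os


def grow_image(path, new_size_mb):
    """Extend the image file at `path` to `new_size_mb` MiB, keeping its data.

    Hypothetical helper: it refuses to shrink the file, because shrinking
    requires resizing the filesystem inside the image first.
    """
    new_size = new_size_mb * 1024 * 1024
    old_size = os.path.getsize(path)
    if new_size < old_size:
        raise ValueError(
            "refusing to shrink %s from %d to %d bytes" % (path, old_size, new_size)
        )
    # Extending a file with truncate appends zero bytes; existing contents
    # are left untouched.
    os.truncate(path, new_size)
    return new_size
```

For example, grow_image("my_image.img", 4096) would extend a smaller image file to 4 GiB; the file name here is only an illustration.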

Sparse Images

The astute reader might point out that tarring an individual file is unnecessary and an odd format for ORCA to accept. However, when used appropriately, this functionality can reduce the time and space necessary to stage and boot a virtual machine to a small fraction of the original. The key is to store images as sparse files. Sparse files take up less space when uncompressed and can take less time to uncompress.

It is not too difficult to create sparse files, but it can be tricky to keep the files sparse while handling them. Many common tools cannot process a sparse file without eliminating the sparse property (e.g. gzip, bzip2). Other tools can maintain sparsity but do not by default (e.g. cp, tar). Further, some operating systems and filesystems cannot handle sparse files (e.g. NFS).
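
One way to see whether a file is still sparse after handling it is to compare its apparent size with the space actually allocated on disk. The following Python sketch illustrates that comparison; the function names are hypothetical, it assumes a Linux-style os.stat result where st_blocks counts 512-byte units, and it is only a heuristic (a filesystem that compresses data transparently can also report fewer allocated bytes).

```python
import os


def is_sparse(size_bytes, allocated_blocks, block_unit=512):
    """Return True when less space is allocated on disk than the apparent size."""
    return allocated_blocks * block_unit < size_bytes


def describe(path):
    """Return (apparent size, allocated bytes, looks sparse?) for a file."""
    st = os.stat(path)
    # st_blocks is reported in 512-byte units on Linux.
    allocated = st.st_blocks * 512
    return st.st_size, allocated, is_sparse(st.st_size, st.st_blocks)
```

Calling describe on an image before and after a copy or transfer step shows whether that step preserved the holes.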

Create a Sparse Image

This section assumes that you have a working non-sparse image and want to convert it to a sparse image.

Copy the image to a sparse file (make sure your operating system and filesystem support sparse files; don't use NFS):

$ cp --sparse=always my_image.img my_sparse_image.img

You now have a sparse image called my_sparse_image.img. Don't mess it up by using any tools that remove the sparsity.
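
To illustrate what a sparse copy does, here is a minimal Python sketch of the same idea: runs of zero bytes are skipped with a seek instead of being written, so the destination ends up with holes where the filesystem supports them. The function name sparse_copy and the chunk size are assumptions made for this sketch; it is not how cp is implemented, only an instance of the technique.

```python
import os

CHUNK = 1024 * 1024  # copy granularity; an arbitrary choice for this sketch


def sparse_copy(src, dst, chunk=CHUNK):
    """Copy src to dst, seeking over all-zero chunks instead of writing them."""
    zeros = bytes(chunk)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            block = fin.read(chunk)
            if not block:
                break
            if block == zeros[:len(block)]:
                # Skip the write; the gap becomes a hole if the filesystem allows it.
                fout.seek(len(block), os.SEEK_CUR)
            else:
                fout.write(block)
        # A trailing hole only exists once the file length is set explicitly.
        fout.truncate(fin.tell())
    return os.path.getsize(dst)
```

The copy has the same apparent size and contents as the source; only the on-disk allocation differs.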

Tar the file with -S to preserve sparsity:

$ tar -S -zcvf my_sparse_image.img.tgz my_sparse_image.img

Note: to use the image with ORCA, you must name the image file "filesystem" before tarring:

$ cp --sparse=always my_image.img filesystem
$ tar -S -zcvf my_sparse_image.img.tgz filesystem

ImageProxy image metafiles

Virtual machine images can be made available to ORCA by placing them (and their metadata) at well-known URLs (e.g. on your web server). An ORCA service called ImageProxy manages staging and caching of all images at the appropriate cloud site(s). ImageProxy identifies each image by its URL and the SHA-1 hash of the image, and requires an XML metadata file with the identifying information.

Sample metadata file (element names and the keywords representing the image type are case-sensitive):

<images>
	<image>
		<type>FILESYSTEM</type>
		<signature>FS_IMAGE_HASH</signature>
		<url>http://url_filesystem_image</url>
	</image>
	<image>
		<type>KERNEL</type>
		<signature>KERNEL_IMAGE_HASH</signature>
		<url>http://url_kernel_image</url>
	</image>
	<image>
		<type>RAMDISK</type>
		<signature>RAMDISK_IMAGE_HASH</signature>
		<url>http://url_ramdisk_image</url>
	</image>
</images>

The metadata file itself also needs to be hosted. Given the URL of the metadata file and its signature (SHA-1 hash), the ImageProxy associated with a cloud site can download and install the required images and provide ORCA with site-specific image IDs (AMI/EMI, AKI/EKI, ARI/ERI) that can be used to create guest VMs on that site.

The documentation for the latest metafile format is maintained on the ImageProxy wiki.
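
Before publishing a metadata file, it can help to check locally that it is well-formed and uses the expected element names. The sketch below does this with Python's standard xml.etree.ElementTree module. It is an illustrative check only and does not reflect how ImageProxy itself parses the file; the function name read_metadata is hypothetical, and the set of accepted type keywords is limited to the ones that appear on this page.

```python
import xml.etree.ElementTree as ET

# Type keywords that appear on this page; matching is case-sensitive.
KNOWN_TYPES = {"FILESYSTEM", "ZFILESYSTEM", "KERNEL", "RAMDISK"}


def read_metadata(xml_text):
    """Return a list of (type, signature, url) tuples from a metadata document."""
    root = ET.fromstring(xml_text)
    if root.tag != "images":
        raise ValueError("root element must be <images>, got <%s>" % root.tag)
    entries = []
    for image in root.findall("image"):
        fields = {}
        for name in ("type", "signature", "url"):
            node = image.find(name)
            if node is None or not (node.text or "").strip():
                raise ValueError("<image> is missing <%s>" % name)
            fields[name] = node.text.strip()
        if fields["type"] not in KNOWN_TYPES:
            raise ValueError("unknown image type %r" % fields["type"])
        entries.append((fields["type"], fields["signature"], fields["url"]))
    return entries
```

A file that uses a lower-case keyword such as "kernel" raises ValueError here, which mirrors the case-sensitivity noted above.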

Metafiles for Compressed Images

The image provider can reduce the amount of data transferred across the network by compressing the virtual machine images and modifying the metadata file to indicate that the image is compressed. The type of the filesystem image should be set to "ZFILESYSTEM".

<images>
	<image>
		<type>ZFILESYSTEM</type>
		<signature>FS_IMAGE_HASH</signature>
		<url>http://url_to_compressed_filesystem_image</url>
	</image>
	<image>
		<type>KERNEL</type>
		<signature>KERNEL_IMAGE_HASH</signature>
		<url>http://url_kernel_image</url>
	</image>
	<image>
		<type>RAMDISK</type>
		<signature>RAMDISK_IMAGE_HASH</signature>
		<url>http://url_ramdisk_image</url>
	</image>
</images>

Currently, ZFILESYSTEM can handle .gz, .bz2, and .xz files. Images compressed with these tools can have any name.

In addition, ZFILESYSTEM can handle images that are tarred before they are compressed with the tools above, extending the allowed formats to include .tgz, .tbz2, and .txz. However, images that are tarred and gzipped must conform to the expected naming convention: the image file inside the tarball must be named "filesystem". The tarball itself can have any name.
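
The naming convention can be checked locally before uploading a tarball. The following Python sketch uses the standard tarfile module to look for a member named exactly "filesystem". The function name has_filesystem_member is hypothetical, and the exact-match rule is an assumption of this sketch (a member stored as "./filesystem" would not match).

```python
import tarfile


def has_filesystem_member(tarball_path):
    """Return True if the archive holds a regular file named exactly 'filesystem'."""
    # "r:*" lets tarfile detect gzip, bzip2, or xz compression on its own.
    with tarfile.open(tarball_path, "r:*") as archive:
        for member in archive:
            if member.isfile() and member.name == "filesystem":
                return True
    return False
```

For example, has_filesystem_member("my_sparse_image.img.tgz") should return True for a tarball built from a file named "filesystem".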

Create a metadata file

This section assumes you have a working image, kernel, and ramdisk.

Get the SHA-1 hash of the image, kernel, and ramdisk:

$ sha1sum my_sparse_image.img.tgz
6731f64297c725276758406927b803086e946cbf  my_sparse_image.img.tgz
$ sha1sum my_kernel
fe919c7cb7478d0e36a923dfc65895d8c90db426  my_kernel
$ sha1sum my_ramdisk
5835285746eca5373c53231522851e52d62e2332  my_ramdisk

Put the files on a web server and create the metadata file:

<images>
	<image>
		<type>ZFILESYSTEM</type>
		<signature>6731f64297c725276758406927b803086e946cbf</signature>
		<url>http://url_to_compressed_filesystem_image/my_sparse_image.img.tgz</url>
	</image>
	<image>
		<type>KERNEL</type>
		<signature>fe919c7cb7478d0e36a923dfc65895d8c90db426</signature>
		<url>http://url_kernel_image/my_kernel</url>
	</image>
	<image>
		<type>RAMDISK</type>
		<signature>5835285746eca5373c53231522851e52d62e2332</signature>
		<url>http://url_ramdisk_image/my_ramdisk</url>
	</image>
</images>

Get the SHA-1 hash of the metadata file:

$ sha1sum my_metadata_file
1956f27f43ac1d02aa9947aca690ff2b329e6730  my_metadata_file

Put the metadata file on a web server. Use the URL and hash of the metadata file in a tool like Flukes to create a virtual machine on ORCA.

The build_image_file.sh script can generate the metadata file and all the signatures. Run it on the web server that hosts the images, from the directory where they are located. Here is a typical invocation:

[user@host]$ ~/build_image_file.sh -z debian-squeeze-amd64-neuca-2g-sparse.img.tgz -k kvm-kernel/vmlinuz-2.6.28-11-generic -r kvm-kernel/initrd.img-2.6.28-11-generic -u http://geni-images.renci.org/images/standard/debian -n autofile.xml
Testing file presence of debian-squeeze-amd64-neuca-2g-sparse.img.tgz [OK]
Testing reachability over HTTP of debian-squeeze-amd64-neuca-2g-sparse.img.tgz [OK]
Testing file presence of kvm-kernel/vmlinuz-2.6.28-11-generic [OK]
Testing reachability over HTTP of kvm-kernel/vmlinuz-2.6.28-11-generic [OK]
Testing file presence of kvm-kernel/initrd.img-2.6.28-11-generic [OK]
Testing reachability over HTTP of kvm-kernel/initrd.img-2.6.28-11-generic [OK]
Creating XML image descriptor file autofile.xml
Testing file presence of autofile.xml [OK]
Testing reachability over HTTP of autofile.xml [OK]


XML image descriptor file SHA1 hash is: 61b6740057e22cd64586becbb4de69b402199d81
XML image descriptor file URL is: http://geni-images.renci.org/images/standard/debian/autofile.xml

You can execute the script with the '-h' option to find out more about it.
