Tuesday, December 27, 2016

Minimal illumos zones

Zones, meet MVI. MVI, meet Zones.

In Tribblix, zones can be the traditional Solaris sparse-root or whole-root style, or variations such as partial-root or alien-root. There's also the option to boot a blank zone - one in which nothing (or as close to nothing as possible) is running.

In parallel development, minimal viable illumos allows you to boot illumos in 48M of RAM, or to build single purpose bootable images.

So what happens if you combine these strands of thought? Minimal illumos zones, that's what.

The general idea here is that you can use the (new) zmvix.sh script in mvi to build a tarball containing a filesystem image. This image is designed for use in zones, so contains none of the kernel components. And there's no point building an ISO image, as it never needs to be bootable of itself.

The alien-root brand in Tribblix was originally designed to build a zone from an installation ISO. The minimal zone is a similar concept, although quite a bit simpler. Unpacking a tarball is far more direct than dissecting a bootable ISO. Furthermore, it's not necessary to undo the live media customizations present on an installation image. So the zone installer just has a simple branch to a tarball unpacker or the ISO unpacker depending on filename.
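As a rough illustration of that branch, here is a minimal sketch of choosing an unpacker from the image filename. The function name pick_unpacker and the list of suffixes are assumptions made for illustration; this is not the actual installer code.

```shell
#!/bin/sh
# Hypothetical sketch: decide which unpacker to use from the filename.
# pick_unpacker and the suffix list are illustrative assumptions.
pick_unpacker() {
    case "$1" in
        *.tar.gz|*.tgz|*.tar) echo "tarball" ;;
        *.iso)                echo "iso" ;;
        *)                    echo "unknown" ; return 1 ;;
    esac
}

# Example: the image used later in this post would go to the tarball unpacker.
pick_unpacker /var/tmp/zmvi.tar.gz
```

Keying on the filename keeps the installer simple: the same -I argument can point at either kind of image.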

The whole premise of mvi is that it's minimal. However, what counts as the bare minimum depends on context.

For example, a zone whose networking is provided via a shared-ip stack has no need for networking tools, as all the networking is configured for it by the global zone. So that's a major potential simplification.

On the other hand, getting zlogin working was a bit of a challenge. The first problem is that you need getent to be present in the zone. This is defined by the user_cmd element of the zone brand's config.xml file. So my zmvix.sh script explicitly adds /usr/bin/getent to the image. That's enough to get zlogin -S to work.

A full zlogin is a bit more work. That calls /usr/bin/login, which has a bunch more dependencies, including a number of pam modules. Working out the list of files needed took a bit of trial and error. So you can make a full zlogin work, but you don't need to.

While I was doing this I had a look through the zlogin source, and to say it's a massive kludge is a bit of an understatement. And when I read comments like:
"It's truly amazing that there is no library function in OpenSolaris to do this for us."
then I get alarmed. There is truly weird stuff going on here, and I'm clearly not supposed to understand it.

The result of all this is that if you create an image from mvi with the command

./zmvix.sh nonet node

then you end up with an 11M image file, which I can use to create a zone with

zap create-zone -z zmvi -t alien \
-I /var/tmp/zmvi.tar.gz \
-i 192.168.1.234

and if you point a browser at the zone's IP address, port 8000, you get back the page from the node server.

You can do this yourself if you check out mvi and are running a fully updated Tribblix m18.

In all, there are 4 processes associated with the zone. There's zsched, init, and the console shell, plus node. That's it.

Of course, this isn't the only way to do it. Another option would be to use the partial-root zone installer and get it to construct the zone's filesystem image the same way that mvi does, bypassing the tarball creation and unpacking.

Sunday, December 04, 2016

Tribblix and the new illumos loader

Recently, a new boot loader was added to illumos, which will in time replace the old and venerable grub that we've been using for about a decade.

I've been looking at how this will impact Tribblix.

The boot loader's arrival was heralded long in advance. I actually released Tribblix milestone 18 when I did to ensure I didn't have to deal with any loader issues. Not that I was expecting any issues, but just in case.

The first step in looking at the impact of the new loader was to build a current copy of illumos. I had a couple of issues due to recent illumos changes. The first was that the transition to Python 2.7 didn't work with my copy of Python (I need to build a dual 32/64-bit installation), so I used the old copy of Python 2.6. The second was that the loader wants /usr/sfw/bin/gstrip, which I've never had, but a quick symlink set that straight.

The loader is a new package. The first thing I tried was to build an ISO exactly as I always have. This ISO knows nothing about the new loader, doesn't have the loader present, and uses grub just as it always has. If you pretend the new loader doesn't exist, everything just works the way it did before. That's encouraging as a fallback position.

The next step was to add the package for the new loader, and persuade the ISO to boot from it. This was very easy: you just need to change the path to the boot image when calling mkisofs. For grub, it was

-b boot/grub/stage2_eltorito

and for the new loader it becomes

-b boot/cdboot
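A build script that has to support both boot loaders could select that argument with a small helper. This is only a sketch: the function name boot_image and the labels grub and loader are assumptions made for illustration, and the rest of the mkisofs command line is deliberately left out.

```shell
#!/bin/sh
# Hypothetical helper: return the boot image path to pass to mkisofs -b.
# The two paths are the ones given above; the names are illustrative.
boot_image() {
    case "$1" in
        grub)   echo "boot/grub/stage2_eltorito" ;;
        loader) echo "boot/cdboot" ;;
        *)      echo "unknown boot loader: $1" >&2 ; return 1 ;;
    esac
}

# Only the -b argument changes; other mkisofs options stay as they were.
printf '%s\n' "-b $(boot_image loader)"
```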

That should be it, but it then tripped up on a Tribblix customization. The loader needs to know where the kernel and the boot archive are. The defaults are reasonable, but use $ISADIR to pick up a 32 or 64-bit image as required. On live media, Tribblix has a single merged boot archive, so I need to override the boot_archive_name to not use $ISADIR. So I create a file /boot/loader.conf.local that contains



boot_archive_load="YES"
boot_archive_type="rootfs"
boot_archive_name="/platform/i86pc/boot_archive"

boot_archive.hash_load="NO"
boot_archive.hash_type="hash"
boot_archive.hash_name="/platform/i86pc/${ISADIR}/boot_archive.hash"


and then make sure that I delete that file on the installed image, where things will look like a regular system again.
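The two halves of that (create the override on the media, remove it from the installed image) can be sketched as a pair of shell functions. The function names and the idea of passing in a root directory are assumptions made for illustration; only the file path and its contents come from the description above.

```shell
#!/bin/sh
# Sketch only: function names and the root-directory argument are assumptions.

# Write the live-media override into <root>/boot/loader.conf.local.
# The quoted 'EOF' keeps ${ISADIR} literal so the loader expands it, not the shell.
write_loader_local() {
    mkdir -p "$1/boot" || return 1
    cat > "$1/boot/loader.conf.local" <<'EOF'
boot_archive_load="YES"
boot_archive_type="rootfs"
boot_archive_name="/platform/i86pc/boot_archive"

boot_archive.hash_load="NO"
boot_archive.hash_type="hash"
boot_archive.hash_name="/platform/i86pc/${ISADIR}/boot_archive.hash"
EOF
}

# Remove the override from an installed image rooted at <root>.
remove_loader_local() {
    rm -f "$1/boot/loader.conf.local"
}

# Example run against a scratch directory standing in for the media root.
MEDIA_ROOT=$(mktemp -d)
write_loader_local "$MEDIA_ROOT"
cat "$MEDIA_ROOT/boot/loader.conf.local"
```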

Thinking about this, it would have been more sensible to drop a file into /boot/conf.d, which is another location that the loader uses for customization. I already use this for something else: I create a file /boot/conf.d/chaindisk containing

chain_disk="disk0:"

and the loader menu will have a "boot from hard disk" entry, which I think you do need on media. Again, this gets deleted from the installed system where it doesn't make any sense.

Something else you can do is tweak the branding. I've played with replacing the illumos name on the boot screen with Tribblix (look at the ASCII art in /boot/forth/brand-illumos.4th for example).

Making the installed system bootable used to involve messing with installgrub; now bootadm can manage it for you. That's just

/sbin/bootadm install-bootloader -M -P rpool

and it should handle pools with multiple drives correctly.

The only other thing the installer needs to do, as far as I can tell, is initialize the list of boot environments. This is similar to grub, and involves putting 2 lines into /rpool/boot/menu.lst, for example

title Tribblix 0.19
bootfs rpool/ROOT/tribblix

and there you are. Some relatively simple changes and Tribblix is ready to use the new loader.

Well, almost. This needs to be packaged up and polished, and I still need to change and test the UFS installer, SPARC builds, and installation into an existing pool.