Wednesday, May 18, 2016

Installing Tribblix into an existing pool

The normal installation method for Tribblix is the live_install.sh script.

This creates a ZFS pool for installation into, creates file systems, copies the OS, adds packages, makes a few customizations, installs the bootloader, and not much else. It's designed to be simple.

(There's an alternative, which is to install to a UFS file system. It's not much used, but it's kept just to ensure no insidious dependencies creep into the regular installer, and it's useful for people with older, underpowered systems.)

However, if you've already got an illumos distro installed, then you might already have a ZFS pool, and it might hold some useful data that you would rather not wipe out. Is there a way to create a brand new boot environment in the existing pool and install Tribblix to that, preserving all your data?

(Remember, also, that ZFS encourages the separation of OS and data. So you should be able to replace the OS without disturbing the data.)

As of Tribblix Milestone 17, this will work. Booting from the ISO and logging in, you'll find a script called over_install.sh in root's home directory. You can use that instead of live_install.sh, like so:

./over_install.sh -B rpool kitchen-sink

You have to give it the name of the existing bootable pool, usually rpool. It will do a couple of sanity checks to be sure this pool is suitable, but will then create a new BE there and install to that.

Arguments after the pool name are overlays, specifying what software to install, just like the regular install.

It will update grub for you, so that you have a grub on the pool that is compatible with the version of illumos you've just added. With -B, it will update the MBR as well.

It copies some files, the minimum that define the system's identity, from the existing bootable system into the new BE. This basically copies across user accounts (group, passwd, shadow files) and the system's ssh keys, but nothing else.

Any existing zfs file systems are untouched, and will be present in the new system. You'll have to import any additional zfs pools, though.

When you boot up after this, the grub menu will contain only the new BE you just created. However, any old boot environments are still present, so you can still see them, and manipulate them, using beadm. In particular, you can mount an old BE (in case there are important files you need to get back), or activate an old BE so you can boot into the old system if so desired.
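As a rough sketch of that recovery workflow, the beadm steps might be wrapped up like this. The BE name, file path, and mount point in the usage comments are hypothetical, and the BEADM and BE_MNT variables are my own additions so the functions can be dry-run without touching a real system.

```shell
#!/bin/sh
# Sketch only: wraps the standard beadm subcommands (list, mount,
# unmount, activate). BEADM and BE_MNT are illustrative knobs, not
# anything that Tribblix itself defines.
BEADM="${BEADM:-beadm}"

# Mount an old BE, copy one file out of it, and unmount it again.
# usage: rescue_file <be-name> <path-inside-be> <destination>
rescue_file() {
    be="$1"; path="$2"; dest="$3"
    mnt="${BE_MNT:-/tmp/oldbe.$$}"   # hypothetical mount point
    mkdir -p "$mnt" || return 1
    $BEADM mount "$be" "$mnt" || return 1
    cp -p "$mnt/$path" "$dest"
    rc=$?
    $BEADM unmount "$be"
    return $rc
}

# Show the available BEs, then make an old one the default for next boot.
# usage: boot_old_be <be-name>
boot_old_be() {
    $BEADM list || return 1
    $BEADM activate "$1"
}

# Example (hypothetical BE name):
#   rescue_file old-distro etc/motd /tmp/motd.saved
#   boot_old_be old-distro
```

Setting BEADM="echo beadm" prints the commands that would be run instead of running them, which is a cheap way to check the BE name before activating anything.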

Amongst other things, you can use this as a recovery tool, when your existing system has a functioning root pool but won't boot.

I've also used this to "upgrade" older Tribblix systems. While there is an upgrade mechanism, this really only works (a) for very recent releases, and (b) to update one release at a time. With this new mechanism, I can simply stick a new copy of Milestone 17 on an old box, and enjoy the new version while having all my data intact.

Tuesday, May 17, 2016

Updating Tribblix for SPARC

Having just released an updated version of Tribblix for x86, a little commentary on the status of the SPARC version is probably in order.

After a rather extended delay, there is an updated version of Tribblix for SPARC available for download.

This is Milestone 16, so it's at the same level as the prior x86 release. More precisely, it's built from exactly the same illumos source as the x86 Milestone 16 release was. Yes, this means that it's a bit dated, but it's consistent, and there have been a number of breaking changes for SPARC builds introduced (and fixed) in the meantime.

I did want to get a release out of the door, before bumping the version to 0m17 ready for the next x86 build. Part of this was simply to ensure that I could actually still build a release - it gets tested on x86 all the time, but as it happens I had never built a SPARC release on my current infrastructure.

In terms of additional packages, the selection is still rather sparse. I haven't had the time to build up the full list. This isn't helped by the fact that my SPARC kit is rather slow, so building anything for SPARC simply takes longer. Much longer. (Not to mention the fact that the machines are noisy and power-hungry.)

It's not just time that's the problem. I've had some difficulty building certain packages. I'm not talking the likes of Go and Node, which I can simply ignore as they're not ported to SPARC at all, but some reasonably common packages would fail with obscure (and unexpected) build errors. If there's a problem with one component, that blocks anything dependent on it too.

Other than expanding the breadth of available packages, the key next steps are (i) to make Tribblix on SPARC self-hosting, like the x86 version has been for a very long time, and (ii) to try and keep the SPARC release more closely aligned so it doesn't drift away from the x86 release and require additional effort to bring back in sync.

Signing Packages in Tribblix

On any computer system you want to know exactly what software is installed and running.

Tribblix uses SVR4 packaging, so you can easily see what's installed. In addition, there are mechanisms - pkgchk - to compare what's on the disk with what the packaging system thinks should be there. But that's just a consistency check; it doesn't verify that the package installed is actually the one you wanted.

Tribblix has had simple integrity checking for a while. The catalog for a package repository includes both the expected size and the md5 checksum of a package. This is largely aimed at dealing with download errors - network drops, application errors, or errant intrusion detection systems mangling the data. In practice, the chances of a faulty download getting installed are remote: the downloaded packages are actually zip files, which have inbuilt consistency checking and their own catalog at the end of the file, and SVR4 packaging has its own consistency checks on package contents. The checking is there so that the layer above can make smart decisions in the case of failure.

But you want to be sure that, not only has the package you downloaded made it across the network intact, but that the source package is legitimate. So the packages are signed using gnupg, and will be verified upon download in upcoming releases. Initially this is just a warning check while the mechanisms get sorted out.

The actual signing and verification part is the easy bit; it's all the framework around it that takes the time to write and test.

One possibility would have been to sign the package catalogs, and use that to prove that the checksum is correct. That's not enough, for a couple of reasons. First, the catalog only includes current package versions, so there would be no way to verify prior versions. Second, there's no reason somebody (or me) couldn't take a subset of packages and create a new repo using them; the modified catalog couldn't be verified. In either case, you need to be able to verify individual packages. (But the package catalog should also be signed, of course.)

It turns out there's not much of a performance hit. Downloads are a little slower, because there's an extra request to get the detached signature, but it's a tiny change overall.
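For illustration, checking a detached signature in warning-only mode could look something like the following. This is a sketch rather than the real implementation: the .sig suffix, the function name, and the GPG and ENFORCE variables are all assumptions of mine. The only part taken from gnupg itself is the "gpg --verify signature file" form for detached signatures.

```shell
#!/bin/sh
# Sketch only: verify a package against its detached signature.
# GPG and ENFORCE are illustrative knobs, not Tribblix settings.
GPG="${GPG:-gpg}"
ENFORCE="${ENFORCE:-0}"

# usage: verify_pkg <package-file> [signature-file]
# The signature is assumed to sit next to the package as <file>.sig.
verify_pkg() {
    pkg="$1"; sig="${2:-$1.sig}"
    if $GPG --verify "$sig" "$pkg" >/dev/null 2>&1; then
        return 0
    fi
    echo "WARNING: signature check failed for $pkg" >&2
    # Warning-only unless enforcement has been switched on.
    if [ "$ENFORCE" = "1" ]; then
        return 1
    fi
    return 0
}
```

With ENFORCE left at 0 a bad signature only produces a warning, and with ENFORCE=1 the same failure stops the install.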

With this in place, you can be sure that whatever you install on Tribblix is legitimate. But all you're doing is verifying the packages at download time. This leaves open the problem of being able to go to a system and ask whether the installed files are legitimate. Yes, there's pkgchk, but there's no validated source of information for it to use as a reference - the contents file is updated with every packaging operation, so it clearly can't be signed by me each time.

This is likely to require the additional creation of a signed manifest for each package. This partially exists already, as the pkgmap fragments for each package are saved (in the global zone, anyway), and those could be signed (as they don't change) and used as the input to pkgchk. However, the checksums in the pkgmap and contents files aren't particularly strong (to put it mildly), so that file will need to be replaced by something with much stronger checksums.
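As a very rough sketch of what a stronger manifest might look like, the following builds and checks a list of SHA-256 checksums. The manifest format and function names are purely hypothetical, and nothing like this exists in Tribblix yet. The manifest file itself is what would then be signed.

```shell
#!/bin/sh
# Sketch only: a per-package manifest of SHA-256 checksums.
# The format (checksum, two spaces, path) is an assumption.

# Print the sha256 of a file, using sha256sum where it exists and
# falling back to the illumos digest command otherwise.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{print $1}'
    else
        digest -a sha256 "$1"
    fi
}

# usage: make_manifest <file>... > manifest
make_manifest() {
    for f in "$@"; do
        printf '%s  %s\n' "$(sha256_of "$f")" "$f"
    done
}

# usage: check_manifest <manifest>
# returns non-zero if any listed file no longer matches
check_manifest() {
    rc=0
    while read -r sum path; do
        if [ "$(sha256_of "$path")" != "$sum" ]; then
            echo "MISMATCH: $path" >&2
            rc=1
        fi
    done < "$1"
    return $rc
}
```

Because the manifest for a given package version never changes, it could be signed once at build time and checked on the installed system at any later point.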

Initial support for signed packages is available starting with the Tribblix Milestone 17 release. At this point, it will check the package signatures, but not act on them. Enforcement will probably come in the next release, when I can be reasonably sure that everything is actually working correctly.