Monday, November 06, 2017

Selecting relay smarthosts and using SMTP AUTH on illumos

A problem I looked at recently involved configuring a system to send (relay) email via a customer's own SMTP servers. There are 2 parts to this:

  • Select the relay host depending on some condition
  • Authenticate with the remote relay using SMTP AUTH

Search for SMTP AUTH with sendmail on illumos or Solaris, and you invariably end up with advice on how to build Cyrus SASL and sendmail from scratch.

For example, Andrew has some good instructions.

However, if you look at the sendmail we ship on illumos you'll find that it's already been built with SASLv2 support:

# /usr/lib/sendmail -bt -d0.1 < /dev/null
Version 8.14.4+Sun
 Compiled with: DNSMAP LDAPMAP LOG MAP_REGEX MATCHGECOS MILTER MIME7TO8
        MIME8TO7 NAMED_BIND NDBM NETINET NETINET6 NETUNIX NEWDB NIS
        PIPELINING SASLv2 SCANF STARTTLS TCPWRAPPERS USERDB
        USE_LDAP_INIT XDEBUG

And, if you telnet to port 25 and look at the EHLO response it includes:

250-AUTH GSSAPI DIGEST-MD5 CRAM-MD5

However, that's not actually the part we want here (but I'll come back to that later). I don't want to authenticate against my own server, I need my system to authenticate against a remote server.
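As an aside, the AUTH line can be pulled out of a saved EHLO response with a small filter rather than read by eye, and the same check works for any server, local or remote. This is only a sketch: the function name auth_mechs is made up for this example, and it assumes the response text is fed in on standard input.

```shell
# Print the AUTH mechanisms advertised in an SMTP EHLO response read from stdin.
# auth_mechs is a hypothetical helper name, not part of sendmail.
auth_mechs() {
    # SMTP lines end in CRLF, so drop the carriage returns first
    tr -d '\r' | sed -n 's/^250[- ]AUTH //p'
}
```

Fed the response shown above, this prints GSSAPI DIGEST-MD5 CRAM-MD5; if the server advertises no AUTH line it prints nothing.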

Back to the problem at hand.

The first part - selecting the right smarthost - can be achieved using smarttable. All you need is the smarttable.m4 file, and then build a configuration using it by enabling the smarttable feature.

The second part, SMTP AUTH, should also be very simple. Again, it's all documented, and just involves enabling the authinfo feature. But wait - on illumos, there is no authinfo.m4 file, so that won't work.

In fact, it does work, as long as you supply the missing file. Download the sendmail source, unpack it, and there in the cf/feature directory you'll find the authinfo.m4 file.

OK, so copy both files - smarttable.m4 and authinfo.m4 - into the /etc/mail/cf/feature directory on the server. Copy and edit the sendmail.mc file (I'm going to copy it to /tmp and edit it there) to add the 2 feature lines, like this fragment of the file here:

...
define(`confFALLBACK_SMARTHOST', `mailhost$?m.$m$.')dnl
FEATURE(`authinfo')dnl
FEATURE(`smarttable')dnl
MAILER(`local')dnl
...

Basically, just add the features above the MAILER line. Then compile that:

cd /etc/mail/cf/cf
m4 ../m4/cf.m4 /tmp/sendmail.mc > /tmp/sendmail.cf

That's your new sendmail.cf ready. It uses 2 databases in /etc/mail. To create these (initially empty):

cd /etc/mail
touch smarttable
touch authinfo
makemap hash smarttable < smarttable
makemap hash authinfo < authinfo

Then copy your new sendmail.cf into /etc/mail and restart sendmail:

cp /tmp/sendmail.cf /etc/mail
svcadm restart sendmail

So far so good, but what should those files look like?

First the smarttable file, which is just a map of sender to relay host. For example, it might just have:

my.name@gmail.com smtp.gmail.com

Which means that if I want my home system to send out mail with my address on it, it should route it through gmail's servers rather than trying to deliver it direct (and likely getting marked as spam).
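To make the sender-to-relay mapping concrete, here is a rough sketch of the lookup done against the plain-text smarttable source. It is an illustration only, not how sendmail itself consults the compiled map: smart_relay is a hypothetical helper name, and it only handles an exact match on the full sender address.

```shell
# Print the relay host listed for a sender in a smarttable source file.
# Usage: smart_relay SENDER FILE
# Exact-address match only; prints nothing when there is no entry.
smart_relay() {
    awk -v sender="$1" '$1 == sender { print $2; exit }' "$2"
}

# Example (assumes the file from this post):
#   smart_relay my.name@gmail.com /etc/mail/smarttable
```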

Then the authinfo file, which looks like

Authinfo:smtp.gmail.com "U:root" "I:my.name@gmail.com" "P:mypassword" "M:LOGIN PLAIN"
Authinfo:smtp.gmail.com:587 "U:root" "I:my.name@gmail.com" "P:mypassword" "M:LOGIN PLAIN"
(There are just 2 lines there, starting with Authinfo:, even if the blog shows it wrapped.)

Basically, for gmail, you need to supply your email address as the identifier and your password as, well, the password. (Note: if you've got two-factor authentication set up, you'll need to set up an app key.)

Of course, the authinfo files ought to be readable only by root, otherwise anyone on your system can read your password in the clear.

There are a couple of non-standard tweaks you'll need for gmail to work. First, you need to go to your gmail account settings and allow less secure apps. Second, you will need the "M:LOGIN PLAIN" entry in the authinfo file, else you'll get an "available mechanisms do not fulfill requirements" error back.

Redo the two makemap commands above and you're good to go.

That's SMTP AUTH in one direction. At which point you're probably thinking, can we authenticate against an illumos sendmail using SMTP AUTH?

The answer, sadly, is no. At least as far as I can tell. While our sendmail is built correctly against SASLv2, illumos doesn't seem to ship enough supporting bits of the SASL infrastructure to make this work. You should be able to create the file /etc/sasl/Sendmail.conf to configure it. Unfortunately the only pwcheck_method available is auxprop (the shadow method, which would allow you to authenticate against local system accounts, isn't available; neither is the saslauthd method, and there's no saslauthd daemon anyway). Worse, illumos has no auxprop plugins, so the whole thing is rather useless. Note that rebuilding sendmail alone won't fix this, as the problem is in the underlying sasl implementation.

The above notes were developed on Tribblix, but ought to apply to any illumos distribution using the vanilla illumos sendmail+sasl combination.

Wednesday, November 01, 2017

Building illumos-gate on AWS

Having talked about running Tribblix on AWS, one of the things that would be quite neat would be to be able to build illumos-gate.

This is interesting because it's a relatively involved process, and might require proper resources - it's not really possible to build illumos inside VirtualBox, for instance, and many laptops don't run illumos terribly well. So it's hard for the average user to put together a decent - most likely dedicated - rig capable of building or developing illumos, which is clearly a barrier to contribution.

Here's how anyone can build illumos, using Tribblix.

Build yourself an EC2 instance as documented here, with 2 changes:

  1. The instance type should be m4.large or bigger - m4.xlarge or c4.xlarge would be better. The bigger the instance, the quicker the build, but m4.large is pretty much the minimum size.
  2. Attach an EBS volume to the instance, at least 8G in size. If you want to do multiple builds, or do lint or debug builds, then it has to be larger. I attach the volume as /dev/sdf, which is assumed below. (You could keep the volume around to persist the data, of course.)

Once booted, log in as root. You then need to set up the zfs pool (the disk showing up as c2t5d0 below matches the /dev/sdf attachment point) and create a couple of file systems that can be used to host the build zone and store the build.

zpool create storage c2t5d0
zfs set compression=lz4 storage
zfs destroy rpool/export/home
zfs create -o mountpoint=/export/home storage/home
zfs create -o mountpoint=/export/zones storage/zones

You should then do an update to ensure packages are up to date, and install the develop overlay to get you some useful tools.

zap refresh
zap update-overlay -a
zap install-overlay develop

Then create a user, which you're going to use to do the build. For me, that is:

groupadd -g 10000 it
useradd -g it -u 11730 -c "Peter Tribble" -s /bin/tcsh \
  -d /export/home/ptribble ptribble
mkdir -p /export/home/ptribble
chown -hR ptribble:it /export/home/ptribble
passwd ptribble

Then create a build zone. It needs an IP address; just pick any unused private address (I simply use the address above that of the global zone, which you can get with ifconfig or from the AWS console - note that it's the private address, not the public IP that you ssh to).

zap create-zone -z illumos-build -t whole \
  -i 172.xxx.xxx.xxx -o develop \
  -O java -O illumos-build -U ptribble

What does this do? It creates a new zone, called illumos-build. It's a whole root zone, with its own exclusive set of file systems. The IP address is 172.xxx.xxx.xxx. The develop overlay is installed (in this case, copied from the global zone); the java and illumos-build overlays are added to this new zone (note the upper-case -O here). Finally, the user account ptribble is shared with the zone.
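Picking "the address above that of the global zone" is just a matter of adding one to the last octet. A minimal sketch of that arithmetic follows; next_ip is a made-up helper name, the address in the example is a placeholder, and the function neither checks that the result is unused nor handles a last octet of 255.

```shell
# Print the IPv4 address one above the one given (last octet + 1).
# Hypothetical helper: no overflow handling, no check that the address is free.
next_ip() {
    echo "$1" | awk -F. '{ printf "%d.%d.%d.%d\n", $1, $2, $3, $4 + 1 }'
}

# Example with a placeholder private address:
#   next_ip 172.31.5.10
```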

Give that a few seconds to boot and log in to it, then make a couple of tweaks that are necessary for illumos to build without errors.

zlogin illumos-build
rm /usr/bin/cpp
cd /usr/bin ; ln -s ../gnu/bin/xgettext gxgettext

Now log out and log back in to the instance as your new user. We're going to create somewhere to store the files, and check out the source code.

mkdir Illumos
cd Illumos
git clone git://github.com/illumos/illumos-gate.git
wget -c \
  https://download.joyent.com/pub/build/illumos/on-closed-bins.i386.tar.bz2 \
  https://download.joyent.com/pub/build/illumos/on-closed-bins-nd.i386.tar.bz2

Now we set up the build.

cd illumos-gate
bzcat ../on-closed-bins.i386.tar.bz2 | tar xf -
bzcat ../on-closed-bins-nd.i386.tar.bz2 | tar xf -
cp usr/src/tools/scripts/nightly.sh .
chmod +x nightly.sh

There are two more files we need. Go to the tribblix-build repo and look in the illumos directory there. Grab one of the illumos.sh files from there and put it into your illumos-gate directory with the name illumos.sh. If you need to change how the build is done, this is the file to edit (but start from one of those files so you get one appropriate for Tribblix as the host). Also, grab Makefile.auditrecord and use it to replace usr/src/cmd/auditrecord/Makefile.

Now log in to the zone and start the build.

pfexec zlogin -l ptribble illumos-build
cd Illumos/illumos-gate
time ./nightly.sh illumos.sh

On an m4.xlarge instance, this took me just under 75 minutes. Look in the log directory and check that the mail_msg looks clean without errors, and you'll have the built files in the proto directory and an IPS repo under packages.
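If you do several builds, the log directory fills up, so a small helper to find the most recent mail_msg can save some hunting. This is a sketch under an assumption about layout: it expects each build to leave its mail_msg in its own subdirectory of the log directory, and latest_mail_msg is a name invented here.

```shell
# Print the path of the most recently modified mail_msg under a log directory.
# Assumes one subdirectory per build, each containing a mail_msg file.
latest_mail_msg() {
    ls -t "$1"/*/mail_msg 2>/dev/null | head -n 1
}

# Example, from the illumos-gate directory:
#   less "$(latest_mail_msg log)"
```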

For more behind the scenes details on the illumos build process itself, look at the how to build illumos page.

Tuesday, October 31, 2017

Public Tribblix AMI now available

There's now a public Tribblix AMI available to run on AWS.

This was built according to the notes I gave earlier, and is part of making Tribblix the illumos for everyone.

This is to be considered slightly experimental, and there are a couple of constraints:

First, the AMI is only available in the London region for now (I'm in the UK, so that's where I'm running things). I could make it available elsewhere, but there are costs associated with doing so and, as everything related to Tribblix comes out of my own pocket, I'm not going to incur costs unless there's a demonstrable need. If you want to run in a different region, then you can always copy the AMI.

Second, the size of the image is quite small. Again, there's a constraint on cost. But the idea here is that you wouldn't store any non-trivial data in the image itself - you would create an appropriately sized EBS volume, attach that and create a zfs pool for your data. The Tribblix repo server does just that - the package repo lives on the second pool.

So, how to use this? I'm going to assume some level of AWS familiarity, that you have an account and know basically how to use AWS, and that your account is set up with things like an ssh key pair.

Go to the AWS console, and navigate to the EC2 dashboard. Unless you've copied the AMI to the region of your choice, make sure you're working in London - the dropdown is in the top right:


Then hit the launch instance button:


Now you get to choose an Amazon Machine Image (AMI). Click on "Community AMIs" and enter "Tribblix" or "illumos" into the "Search community AMIs" search box. At the time of writing, you'll only get one result, but more may appear in future:



OK, go and select that one. Then you can Choose an Instance Type. A great thing about Tribblix is that it's pretty lightweight, so the t2.micro - available on the free tier - is a good choice.



Click on "Review and Launch". On the next screen you can edit the storage to add an additional volume, but the one thing you must do is edit the security group.



If you leave it like that, you'll have no way to access it. So Edit it, and the simplest thing to do at this point is to create a new security group that allows ssh only, with the source being your own IP address, which you can get by selecting "My IP" from the source dropdown.



(I've got a saved security group that does just that, to let me straight in from home.)

Click on "Review and Launch" to go back to the main screen, and then "Launch". This is when you get to choose which key pair to use to log in to your instance.



It will take a little while to start (although it's usually ready before the status checks say so), and you should then be able to ssh in to it (as root, with the key pair you set up).

ssh -i peter1-london.pem \
root@ec2-35-176-237-204.eu-west-2.compute.amazonaws.com

And you're good to go. What you do then is up to you; I'll cover some scenarios in upcoming posts. Be aware that the base AMI has a pretty minimalist set of packages installed, so you probably want to add some more packages or overlays to do anything useful.

Wednesday, September 20, 2017

The commoditization of IT?

IT, so the story goes, is now a boring commodity. But is this true?

Let's first define what a commodity is. There are a range of definitions we could use, but I'm going to think of a commodity as something that is functionally undifferentiated and available from multiple sources. The key aspect here is that of interchangeability (aka fungibility).

As an example, most computer components fall into the commodity category. Memory DIMMs, disk drives, network interfaces - you can (in principle) use any vendor's disk drives or memory and your computer will still work. You can use a mouse, keyboard, or monitor from any vendor and things will work just fine. Vendors have to differentiate in other ways - performance, cost, reliability, service.

What about smartphones? I would say that the phone piece is a commodity. Whether for a mobile or a land line, you can switch your telephone for another make or model, and you can switch from one telephony provider to another.

But the smart part of smartphones isn't properly interchangeable. You can't simply swap an Apple handset for an Android and carry on as you were; you have to switch everything to a different domain. And the suppliers here are keen to enforce differentiation and prevent interchangeability. We live in a world of proprietary walled gardens.

In most non-trivial cases, databases aren't commodities. Big database companies rely on the fact that you couldn't migrate to another database vendor even if you wanted to.

Operating systems are clearly differentiated. You can't swap Solaris for Windows, or either for Linux or BSD. You can't even treat different distributions as commodities if you restrict yourself to the Linux domain.

The operating system landscape is changing a little, though, in that Docker and containerization offer the prospect of interchangeability - you could, in theory, run a Docker image anywhere and on anything.

Cloud computing definitely isn't a commodity. (Thinking of it as a utility might be slightly more accurate.) Heck, there are sufficient differences over what's available that migrating between different AWS regions isn't smooth, let alone migrating between cloud providers.

Vendor lock-in is the big thing, and it's diametrically opposed to being a commodity - what vendor wants to make it easy for its customers to leave? (Despite that being one of the key attractions of any vendor in practice.)

One of the requirements for interchangeability is standardization, and there's a tension here between standardizing things (thereby making things the same) and innovation, which necessarily implies change. I could (and probably will at some point) go on at length about innovation, but I see precious little innovation in practice, more constant reinvention of the square wheel. Meanwhile the standards we have are either efforts like POSIX, which is largely codifying accidental implementations from the 1970s, or ad-hoc emergence of initial implementations that were cobbled together with little or no thought for actual suitability.

Rather than commoditization being a standard base, with a rising tide lifting all boats, any commoditization chips away the good stuff to leave the lowest common denominator, while everyone deliberately introduces incompatibilities in the name of differentiation.

So it seems to me that, far from being commoditized, IT has been monopolized and mediocritized.

Sunday, September 10, 2017

Tribblix - illumos for everyone?

When I was doing a bit of a branding exercise for Tribblix, part of which generated the rather amateurish logo I now have - something I needed to make some business cards and stickers to take to FOSDEM this year - one of the things I wondered about was a good tagline.

In the end, I ended up with "the retro illumos distribution". Of course, it was pointed out that illumos was retro enough on its own, so the idea of a retro variant was a bit unnecessary.

The other tagline I came up with was "illumos for everyone". I rejected it in the end because it was a bit preposterous - I'm not really building something for everyone.

Yet the underlying idea here was simple - that I would actively seek to build a distribution that was inclusive, not exclusive. That's why:

  • I have a SPARC version as well as x86
  • On x86, I support 32-bit as well as 64-bit systems
  • Tribblix is suitable for both desktop and server use
  • Tribblix is a flexible system, not an appliance or hypervisor
  • I work on ensuring Tribblix will work successfully on systems with more minimal resources than other distributions
  • A variety of installation methods are supported - media, network, iPXE
  • I've worked on installation in the cloud, both KVM-based and AWS, in addition to bare metal or other hypervisors
  • I've tried to make key features such as zones easier to configure and use

Generally, the idea is to reduce the barriers and limitations for installing and using Tribblix.

This summer, I came across a much better way of putting it. Rather than "illumos for everyone", a variation of the UK Labour Party's slogan expresses the idea much more elegantly. Tribblix would be "illumos for the many, not the few". It's a shame that the slogan is already taken, as it expresses the philosophical aim rather neatly.

Thursday, August 03, 2017

Creating a decent Tribblix AMI

Previously, I've described how I created my first Tribblix AMI, then how to do it properly in hvm mode so you can run on modern instances in all regions.

That creates something that will work, but is it actually in a state that's useful?

The first thing is to add an EC2 credential service. That's the thing that will query for metadata and install the keys on the system so you can log in after the instance is created. I tried the ec2-credential service from OmniOS, but for some reason it didn't work right on Tribblix. I've tweaked mine a little, forcing it to run after the network comes up, adding retries in case there's a problem, and also disabling it in non-global zones.
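To give a flavour of what such a service does, here is a minimal sketch of the fetch-with-retries part. It is not the actual Tribblix or OmniOS script: the retry and fetch_key names are invented for this example, it assumes curl is installed, and it uses the standard EC2 instance metadata address for the first public key.

```shell
# Run a command until it succeeds, up to ATTEMPTS times, DELAY seconds apart.
# Usage: retry ATTEMPTS DELAY command [args...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        # Give up once the allowed number of attempts has been used
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Fetch the first ssh public key from the EC2 instance metadata service.
# Assumes curl is available; -f makes HTTP errors count as failures.
fetch_key() {
    curl -sf http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key
}

# Example, on an instance (the destination path is an assumption):
#   retry 5 2 fetch_key >> /root/.ssh/authorized_keys
```

Keeping the retry logic separate from the fetch means the same wrapper can be reused if you later want other pieces of instance metadata.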

Of course, there's more instance metadata that I could query and use, but I haven't yet had a need for anything other than the initial key.

The other thing I've been wondering about is the configuration of Tribblix itself - specifically what the storage should look like and what the default software installation should look like.

My image is built on an 8G "disk" or EBS volume. That might seem a little small, but remember that Tribblix is pretty lean and mean. For a typical server configuration you'll probably be looking at about 1G or so, and that's without any special work. The most annoying thing here is that by default you lose 2G to each of dump and swap, so that's effectively half the disk gone. There's opportunity to modify those, especially as I'm typically using t2.micro instances on the free tier that only have 1G of memory. You might not even want dump at all. As for swap, you do want some (so that anonymous reservations don't eat into actual RAM) but you could cut that down a bit.

As I'm writing this I do wonder whether I could pull some of the instance metadata and shrink the dump and swap volumes appropriately.

The assumption I'm making here, though, is that if you're storing any reasonable amount of data, you're going to attach a separate EBS volume, and you can then size that appropriately to the need at hand. (And you can then move that data around independent of your running instance.) So I think that keeping the root volume fairly small is reasonable. It also keeps my AWS bill down, an important consideration as any charges here come out of my own pocket.

Then, what should the baseline software install look like? Tribblix uses overlays, and there's an assumption that you always start from the base overlay. I'm currently using a dedicated overlay that pulls in cli-tools - essentially you get basic shells, compression tools, basic utilities, but not much else. Many of the normal server utilities don't apply to running in the cloud, as they're aimed at monitoring or managing hardware.

The base set of packages is that installed on the ISO. That includes most storage and network drivers, which are irrelevant - on EC2 you know exactly which drivers you need, so almost all the drivers that are installed are unnecessary. What I need here is a better way of handling installation variants, so it knows the drivers aren't supposed to be there - at the moment I could remove them, but updates and upgrades would simply put them back. In the same vein, I could only ship a 64-bit kernel, as we know there are no 32-bit instance types available.

At the moment I have an LX variant, which is a bit of a hack in terms of the way I've packaged it together, but as the number of interesting variants grows I'm going to have to come up with a better way of handling it, especially as you might want multiple variants together - for instance a 64-bit LX-enabled cloud-optimised image.

Monday, July 31, 2017

Building a Tribblix AMI - hvm mode

After having created a Tribblix AMI to prove that Tribblix basically works on EC2, I then moved on to the next issue - how to create an AMI that will run in hvm mode?

As a reminder, pv mode AMIs are deprecated, aren't supported by all instance types, and don't work in all regions. So you really need something that runs in hvm mode.

The first thought might be to convert the existing pv image to a hvm image. I've tried that and, while you can do the conversion, the image doesn't actually work. The problem here is that ZFS has the physical paths of the devices it's installed on embedded in the pool metadata. Changing from pv to hvm mode changes the emulated hardware, in particular the disk paths, so the ZFS pool isn't where it thought it was and the system panics. If you have a mismatch between the disk layout where the pool was created and where you're running you'll get a panic something like this:



If you had console access and could boot from media you could fix this, but AWS doesn't provide that. (And if you could boot from media you could just do a regular install without all the shenanigans involved in producing an AMI.)

So, you have to create the image on a system that looks like EC2. Which means using xen.

Fortunately, this road has been travelled before. These instructions are exactly what you need. They're for OpenIndiana, but will apply to any illumos distribution. And they're the process used by the OpenZFS project to do their testing. (I'll also mention that the OpenZFS folks have put a number of fixes back into illumos that improve the EC2 experience for us.)

I'm not going to repeat those instructions; that would be boring. Instead I'll talk about what I had to do or change to make those instructions work for me.

I got one of my spare desktop PCs out and installed Ubuntu 16.04 on it. (I must be spoilt by Tribblix, the Ubuntu install was horrendously slow and very high maintenance.) And then installed xen, rebooted as dom0, and set up the bridge networking.

That was my first pothole. There's this thing called systemd that's come along, and it changes the way network configuration is done. Much cussing and googling, but I got it right first time.

Then I discovered that there's a new toolstack here. It's all xl, not xm, but otherwise seems the same.

I then tried to start a VM, only to be given a completely meaningless and unhelpful error message. Why tell the user what's wrong when you can just vomit a stack trace?

After a bit of head-scratching I worked out that the system didn't actually support hvm mode. If you run xl info and look for virt_caps, it should mention hvm. That's a bit odd, the sticker on the front of the box looks right.
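That check is easy to script. The sketch below reads the output of xl info on standard input and succeeds only if the virt_caps line lists hvm as a capability in its own right; has_hvm is a name made up for this example.

```shell
# Succeed if the virt_caps line of "xl info" output (on stdin) lists hvm.
# -w matches hvm as a whole word, so hvm_directio alone does not count.
has_hvm() {
    grep '^virt_caps' | grep -qw hvm
}

# Example, on the xen dom0:
#   xl info | has_hvm && echo "hvm supported" || echo "no hvm - check the BIOS"
```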

Manufacturers ship hardware with VT-x disabled in the BIOS, it appears. Into the BIOS we go, to find that the relevant settings are greyed out and you need a BIOS password to get into them. Open the box and start looking for jumpers. Fortunately I found a helpful article - the key here was the bit about the jumper being blue, little details like that make all the difference.

OK, so having wiped the BIOS password, gone into the BIOS and enabled VT-x, I go back to xen. Looking at virt_caps now shows hvm, as it should, and my domain starts.

The idea here is that you connect to the console with VNC. Easy enough, but by the time I had got my ssh tunnel set up and started up my VNC client, my VM had gone. I started it again, it starts booting just fine but then issues a few warnings and then a kernel panic. It's all over pretty quick.

In order to catch what it said, I then used vnc2flv. Someone asked me about screen recorders a while back, and I suggested they did what they wanted to do in a vnc session and use vnc2flv to record it. It's the same idea here. Once I had the session recorded I could watch the movie and pause it to see what errors it was spitting out.



This, I think, is related to illumos bug 7186. It looks like we can't handle the network presented by newer versions of xen.

To get round this I simply disabled the network interface in the VM definition. Then the VM boots just fine and can be installed. You're a little bit limited in that you can't do updates but, as long as nwam is enabled, it will get itself on the network when you do run it on something that does have a compatible network.

For OmniOS, this means you have to manually enable nwam, as they have networking switched off by default. And remember that you must have networking enabled if you're running on EC2 as there's no other way to access your system.

What you'll also need to ensure at this point is that you have a functional user account you can get in to via ssh. With Tribblix and OpenIndiana you have jack, other distros might need to create a user. You wouldn't want that on a production AMI, of course, but you need to be able to log in to the system the first time in order to complete any configuration and add the various bits of AWS integration that you'll need.

Having got my image installed I followed the instructions through and got an AMI that works just fine.

The configuration file I used is:

builder='hvm'
name='ami-template'
vcpus=1
memory=1024
disk=[  'file:/var/tmp/tribblix-0m20.1.iso,hdb:cdrom,r',
        'file:/root/ami-template.img,xvda,w' ]
boot='d'
vnc=1
vnclisten='0.0.0.0'
vncconsole=1
on_crash='preserve'
xen_platform_pci=1
serial='pty'
on_reboot='destroy'


The one crucial thing here, apart from not having a vif line to create a network, is that you must use xvda for the disk. That's what EC2 will present to you; if you use something else you'll get the same panic on boot that I saw when attempting to convert a pv image.
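Both of those conditions can be checked mechanically before starting the domain. The following is a rough sketch, not part of the xen toolstack: check_cfg is an invented name, and it does a simple text match on the configuration file rather than parsing it properly.

```shell
# Sanity-check a xen domain configuration file for building an EC2 image:
# the disk must be attached as xvda, and there must be no vif line.
# Hypothetical helper using plain text matching, not a real config parser.
check_cfg() {
    if ! grep -q ',xvda,' "$1"; then
        echo "no disk attached as xvda" >&2
        return 1
    fi
    if grep -q '^[[:space:]]*vif' "$1"; then
        echo "vif line present" >&2
        return 1
    fi
    return 0
}

# Example (the file name is a placeholder):
#   check_cfg ami-template.cfg && xl create ami-template.cfg
```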

We're almost done. Next time I'll talk about how to go from something that minimally boots up to something that's done well.

Running illumos on AWS - the first Tribblix AMI

I've run Tribblix on all sorts of hardware - desktops, servers, even the occasional laptop. I've had success running it on some of the smaller cloud providers that allow you to install from a custom ISO, or iPXE, such as my adventures with Vultr.

However, running on AWS has eluded me. You might wonder why you would want to, but the reality is that AWS is a huge player, with many people turning to it as their default (and often only) option. So giving everyone who uses AWS access to Tribblix would be a good thing, and would also offer an easy route for people who might want to play with Tribblix to do so.

The first thing to realize is that AWS is not so much a single cloud as a set of independent clouds. Each region is independent, and has a different set of capabilities. For example, EFS is only available in a few regions. These differences can affect us.

On AWS, there are 2 different types of guest. We have pv (the older, paravirtualized) and hvm (the newer, hardware assisted). Any given AMI (Amazon Machine Image) will only run as either a pv or a hvm guest. And some EC2 instance types are pv, others hvm. Newer regions (such as London) are exclusively hvm, so pv isn't an option.

Building an AMI from scratch looked a little daunting, so I looked to see what other illumos distributions might have made AMIs available. If you go to the community AMI page when launching an instance, the only one you'll find is OmniOS. They even have a page explaining how it was done. The snag is that all their images are pv. For my first set of experiments then, I was operating in the Dublin region.

The OmniOS AMI boots up just fine and works pretty much as you would expect. No problems there. How to get Tribblix running though?

The answer lies in the beauty of ZFS and Boot Environments. The basic approach here is to take a running OmniOS image, create a new Boot Environment, install Tribblix into that Boot Environment, and make the Tribblix Boot Environment the one to boot from next time. Once I've successfully booted the Tribblix image, I can clean up and delete the original OmniOS files.

One of the advantages of Tribblix is that I have my own installer. It's quite a bit simpler than some of the other distros, and thus much easier to mangle to do things in new environments. I decided to use the iPXE image as used in my Vultr experiment, because it was easy and I had it to hand. I then wrote a modified installer script (source here) called img_install that was based on my over_install script used to drop Tribblix into an existing ZFS pool. The difference is that the old over_install was run in the context of a Tribblix Live CD; the new img_install is run in the context of an alternative distro. The other thing in that script is that I don't do any boot loader fiddling - the pv instances have a special pv-grub, which I'm careful not to touch.

(By the way, the same trick will work for other illumos distributions. You just need a source archive of some sort and a script to unpack it. For example, I have a script to unpack some of the ISO images in the tribblix-zones repo, which I use to create alien-root zones. It's the same idea of installing an image in an alternate path.)

So all that was involved was to:

  • Start up an OmniOS instance (a micro instance on the free tier works fine)
  • Run the img_install script to create the alternate BE
  • Reboot, so you boot into Tribblix
  • Delete the old OmniOS BE
  • Finish off the install and apply updates

Then you can do the normal create an image trick on the AWS console, and you have a nice shiny Tribblix AMI.

That all worked out just beautifully. Tribblix runs on EC2 just fine.

In the next article, I'll describe how to create a hvm AMI.