Sunday, October 09, 2016

zfs receive oddity

Every so often, even a system as good as zfs will throw you a curveball. This one threw me for a while, and here's a simplified example.

All I'm trying to do here is replicate one file system. So I create it, and touch a file so I can tell that it's made it across.

zfs create rpool/t1
touch /rpool/t1/1

OK, snapshot it and send it.

zfs snapshot rpool/t1@t1s1
zfs send rpool/t1@t1s1 | zfs recv rpool/t2

Create another file, and create a snapshot at both source and destination.

touch /rpool/t1/2
zfs snapshot rpool/t1@t1s2
zfs snapshot rpool/t2@t2s2

And now send an incremental stream from the original.

zfs send -i rpool/t1@t1s1 rpool/t1@t1s2 | zfs recv -F rpool/t2

That works; the whole point of the -F flag is to discard any subsequent changes at the receiver. (You'll usually need this if the file system is mounted at the receiver, because even access time updates count as changes that need to be discarded.) It will roll back rpool/t2 to the original @t1s1 snapshot, discarding the local @t2s2 snapshot, then update the rpool/t2 file system to the @t1s2 snapshot.

So far so good.

Now a minor variation.

Starting again from scratch, I create the file system and touch a file so I know it's made it.

zfs create rpool/t1
touch /rpool/t1/1

OK, snapshot it and send it.

zfs snapshot rpool/t1@s1
zfs send rpool/t1@s1 | zfs recv rpool/t2

Create another file, and create a snapshot at both source and destination.

touch /rpool/t1/2
zfs snapshot rpool/t1@s2
zfs snapshot rpool/t2@s2

And now send the incremental stream just like last time.

zfs send -i rpool/t1@s1 rpool/t1@s2 | zfs recv -F rpool/t2

Kaboom. This fails, reporting:

cannot restore to rpool/t2@s2: destination already exists

What? The problem is hinted at in the zfs man page, where the description of -F says:

If receiving an incremental replication stream (for example,
one generated by zfs send -R [-i|-I]), destroy snapshots and
file systems that do not exist on the sending side.

The problem, then, is that zfs won't destroy the @s2 snapshot that exists at the receiver, because a snapshot of the same name exists in the source. It's not the same snapshot, of course, but it has the same name. This prevents the rollback, and the receive fails.
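One way to see that the two @s2 snapshots really are different objects is to compare their guid property, which differs even when the names match. The following is only a minimal sketch: the function name colliding_snapshots is made up for illustration, and it assumes your zfs exposes the guid property on snapshots and accepts the -d option to zfs list.

```shell
# Hypothetical helper: print the snapshot names (the part after @) that
# exist on both file systems but refer to different snapshots.
# Usage: colliding_snapshots SRC_FS DST_FS
colliding_snapshots() {
    cs_src=$1
    cs_dst=$2
    # -H gives tab-separated output without headers; -d 1 limits the
    # listing to snapshots of SRC_FS itself, not of its descendants.
    zfs list -H -d 1 -t snapshot -o name,guid "$cs_src" |
    while read -r cs_name cs_guid; do
        cs_snap=${cs_name#*@}
        # Skip names that do not exist on the destination at all.
        cs_dguid=$(zfs get -H -o value guid "${cs_dst}@${cs_snap}" 2>/dev/null) || continue
        # Same name but different guid: a locally created look-alike.
        if [ "$cs_dguid" != "$cs_guid" ]; then
            echo "$cs_snap"
        fi
    done
}
```

In the failing example, colliding_snapshots rpool/t1 rpool/t2 should print s2 and nothing else, because @s1 was received from the source and so shares its guid.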

Snapshot name collisions are pretty common. We have an automatic snapshot regime, so pretty much every file system we have has a daily snapshot whose name embeds the date, and because they're created automatically, the names are the same on every file system.

What this means in practice is that if you have snapshots created on the receiving side, you'll have to explicitly roll the file system back to the last snapshot you received from the source before receiving the next one, to avoid hitting name collisions.
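That workaround can be sketched as a small shell function. The name resync is hypothetical, and the sketch assumes the older snapshot exists on both sides and that nothing newer on the destination is worth keeping, because zfs rollback -r destroys every snapshot more recent than the one you roll back to.

```shell
# Hypothetical helper: roll the destination back to the last snapshot
# received from the source, then receive the next incremental stream.
# Usage: resync SRC_FS DST_FS OLD_SNAP NEW_SNAP
resync() {
    rs_src=$1
    rs_dst=$2
    rs_old=$3
    rs_new=$4
    # Destroy everything on the destination newer than @OLD_SNAP,
    # including local snapshots whose names collide with the source.
    zfs rollback -r "${rs_dst}@${rs_old}" || return 1
    # The destination is now at @OLD_SNAP, so the incremental applies.
    # The function's exit status is that of zfs recv.
    zfs send -i "${rs_src}@${rs_old}" "${rs_src}@${rs_new}" |
        zfs recv -F "$rs_dst"
}
```

For the failing example, resync rpool/t1 rpool/t2 s1 s2 would first remove the local rpool/t2@s2, after which the incremental receive has nothing to collide with.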

I think this behaviour is wrong, although I'm not quite confident enough to call it a bug. The point is that on the receiving side, any snapshots created after the one that was sent are irrelevant - it shouldn't matter what their names are, and I'm not at all sure why zfs even bothers checking the names of snapshots that ought to be deleted.



Wednesday, October 05, 2016

Cats versus Petals

It's become common to talk about Pets versus Cattle as the "new way" of thinking about servers.

Of course, "the new way" isn't really new - many IT shops in the mid 1990s had fully automated, reproducible, and disposable infrastructure. It's just the term that has recently become trendy, and I don't think the analogy is necessarily right.

In the original analogy, the claim was that a Pet is precious, so you care for it and feed it specially. If it's sick, you nurse it back to health. Whereas if one of your herd of Cattle gets sick, you take it out back and shoot it. This is based purely on emotional attachment, and makes little business sense. The truth is more that most Pets have little financial value, whereas Cattle are intrinsically valuable. Whether sick Cattle are nursed back to health should be a pure business decision based on the value of a healthy animal compared to the cost of treating it.

Currently, I think a more appropriate analogy would be Cats versus Petals.

Let me explain.

A Cat system has a mind of its own. In fact, it isn't at all clear whether you own the system or the system owns you. Cat systems tend to be solitary and not integrate or interoperate well with others. If you have many Cat systems, they will tend to wish to go their own ways.

In contrast, Petals will be small, simple systems. You will have many, and they will be the same. While a Petal may have some value of its own, their true beauty is only visible when they are put together into larger units - flowers, for example. Different flowers are made up of different types of petals.

One point here is that if you're thinking about Pets and Cattle, you're still thinking of individual animals. With Petals, the role of holistic thinking and orchestration in producing a larger object (the flower, or even the garden) becomes clear.

To put it in these terms: your business is a garden; the services you provide are flowers; they are constructed from containers as the petals via an orchestration service that provides the stems and branches. Your job is to ensure good soil, water and light, prune, remove pests and weeds - not to create each individual Petal by hand.

If you're still herding Cats, it's time to stop and tend gardens instead.