Sunday, April 14, 2013

Zip Archive Packaging

Under the hood, Tribblix uses the traditional SVR4 packaging utilities. There are a number of reasons for this - compatibility, simplicity, and a low footprint are among them. They're also good enough to get the job done. (And my strong belief is that the underlying package tools should become invisible and thus their implementation irrelevant, so the simpler and smaller the better.)

While SVR4 packaging does support installation of packages from networked locations over http, the support isn't great. The native support was almost never used in practice and its implementation is pretty poor (so much so that I would much rather just rip it out to simplify the code).

Allowing package installation from network repositories is expected of any modern system. However, the packaging system itself doesn't need to do so natively. There are any number of utilities and toolkits to do the network retrieval part - curl, wget, and essentially every modern scripting language will do the job.

Which leaves only the question as to what format to use in putting the data on your networked repository. The requirements here are:
  • A package is packed up into a single file, to allow easy and efficient transfer using any medium
  • The package should be compressed
  • The contents of the package should be easily accessible on any platform without special tools
  • A file should be able to contain multiple packages
If you look at SVR4 packaging, it has two native formats - filesystem and datastream. The former is simply all the files in the package laid out in a directory hierarchy, the latter is a single-file format. However, package datastream isn't generally suitable - it isn't natively compressed, and it's a private format that can't be easily accessed without the SVR4 tools.

The alternative solution I'm using is to simply zip up the filesystem format into a zip file. Hence, Zip Archive Packaging or zap for short.

This has the following advantages:
  • Single file, can contain multiple packages
  • Natively compressed
  • Widespread support to unpack the archives
  • Efficient random access
  • Efficient extraction of list of contents
  • Widely used in other contexts (eg. jar, war files)
  • Some level of data integrity checking
  • No need for any additional tools
  • Supports extensibility for additional functionality later
Now, the standard widely used versions of zip don't support much compression beyond DEFLATE. Newer versions do, but availability isn't universal. So I limit myself to basic DEFLATE - although you can compress better than regular zip.

So installing a package from a network repo in Tribblix is down to a very simple shell script that runs curl + unzip + pkgadd.

No comments: