Sunday, March 14, 2010

Beyond sar

The old standby for recording historical system activity is sar - system activity reporter. There are many alternatives, both free and commercial, but sar has the advantage that it comes with the OS, and pretty much any version of any (unix-like) OS.

Because it's there, we use sar, saving it's output into a big archive and using tools like sar2rrd to produce charts. (It's not the only thing we use, of course.)

The problem is that, particularly on Solaris, sar is terrible. The data it collects is woefully incomplete - network data is the worst, being completely absent, but there's much more missing. Some of what is present is aggregated away so that much of the details is lost. And the list of what's present is fixed, so the whole framework is completely non-extensible.

So, I'm fed up with that, and need to do better. Note that most tools out there don't help with capturing all the data, as they have their own preconceived notions of what data might be useful (although they are generally far more complete than sar).

Enter kar - the kstat activity reporter. This is really amazingly simple. Given that (almost) all the performance data you want is obtained from kstats, simply save all the kstats on a regular basis. The implementation I have here is to save kstat -p output into files inside zip archives. Now, that's not ideal, but it has some advantages: it's almost zero effort, it gives complete coverage, and it's naturally extensible. If it works out and is found to be useful, more optimal mechanisms could be defined.

I've said it a couple of times above, but I'm going to say it again: the key advantage here is that the data is complete and thereby naturally extensible. I don't want to enhance sar by trying to cherry-pick interesting statistics (and we could all argue for months about what might go on the list). By saving everything you automatically pick up anything new that's added. And you let consumers decide which of the statistics are interesting when you get to the post-processing phase. Say I wanted to look at the historical behaviour of the zfs ARC - no problem, it's all there in the kstats.

Using kstat -p is a convenient shortcut, but does have other advantages. Because the output is textual, all your favourite analysis tools - awk, sed, perl, grep, python, whatever - can munge the data with no effort. And you can chuck the data into your graphing application of choice.

If that wasn't enough, jkstat 0.35 has support for reading in the output of kar in both the browser and chart builder.
./jkstat browser -z /var/adm/ka/ka-2010-03-01.zip

or
./jkstat chartbuilder -z /var/adm/ka/ka-2010-03-01.zip

will do the trick.

No comments: