Every now and then, spurious peaks show up on Munin graphs. These peaks are orders of magnitude higher than the expected range of the data. This happens particularly with DERIVE plugins, which are notably used for network interfaces.

One way to fix this, as suggested by Steve Schnepp (and in the FAQ), is to set the maximum directly in the RRD database, and then let it reprocess the data to honour this maximum.
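As a rough sketch of what that could look like, the helper below only builds the `rrdtool` command lines and does not run them. It assumes the usual `tune --maximum`, `dump` and `restore --range-check` subcommands, and that the data source in a Munin RRD file is named `42`; the file path and the maximum value are hypothetical and need adapting to the graph being fixed.

```python
def spike_fix_commands(rrd_path, maximum, ds_name="42"):
    """Return the rrdtool invocations as argv lists, without running them.

    Assumptions: ds_name "42" is the data source name Munin gives its RRD
    files; rrd_path and maximum are placeholders supplied by the caller.
    """
    xml_path = f"{rrd_path}.xml"      # temporary XML dump
    fixed_path = f"{rrd_path}.fixed"  # restored copy, to be moved into place
    return [
        # 1. Record the maximum in the RRD header for this data source.
        ["rrdtool", "tune", rrd_path, "--maximum", f"{ds_name}:{maximum}"],
        # 2. Dump the database, including its stored values, to XML.
        ["rrdtool", "dump", rrd_path, xml_path],
        # 3. Restore with range checking, so stored values outside the
        #    limits are dropped instead of being kept as spikes.
        ["rrdtool", "restore", "--range-check", xml_path, fixed_path],
    ]


if __name__ == "__main__":
    # Hypothetical file name and limit, for illustration only.
    for argv in spike_fix_commands("/tmp/example-if_eth0-down-d.rrd", 125000000):
        print(" ".join(argv))
```

The restore step writes to a new file rather than overwriting the original, so the old database is still there if the result does not look right. It is probably wise to run this while munin-update is not writing to the file.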


I use Munin to monitor a few machines, and to bubble up alerts when issues show up. It’s pretty good, easy to set up, and has a large number of contributed plugins to monitor pretty much everything. If you are still out of luck, it’s easy enough to write your own.

To ease the task of viewing the data, each machine runs munin-node, but only a couple of masters do the data collection with munin-update. This works reasonably well, except that machines monitored by more than one master have to do the same work several times to provide the same data to each of them.

Fortunately, Munin 2.0 introduced a proxy mode, which decouples running the plugins to collect fresh data (with munin-asyncd) from handing that data over to the collection servers (via munin-async).
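To make the idea concrete, here is a toy model of that decoupling. It is not how Munin implements its spool: the `Spool` class and its method names are made up for illustration. One side records samples once (loosely the role of munin-asyncd), and any number of masters read back what they have not seen yet (loosely the role of munin-async), without triggering a new plugin run.

```python
from dataclasses import dataclass, field


@dataclass
class Spool:
    """Hypothetical in-memory spool of (timestamp, value) samples."""

    samples: list = field(default_factory=list)

    def collect(self, timestamp, value):
        # Called once per polling interval, however many masters exist.
        self.samples.append((timestamp, value))

    def fetch_since(self, last_seen):
        # Each master remembers its own last_seen timestamp, so serving a
        # second master costs a read, not another plugin run.
        return [s for s in self.samples if s[0] > last_seen]


if __name__ == "__main__":
    spool = Spool()
    spool.collect(300, 1.0)
    spool.collect(600, 2.0)
    print(spool.fetch_since(0))    # a master that has seen nothing yet
    print(spool.fetch_since(300))  # a master that already has the first sample
```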

Setting this up is relatively easy, and the benefits show quickly, in the form of a reduced collection time, and fewer gaps in the data.

Reduced update time for the master, and no more gaps in the data.

Surprisingly, it also substantially reduced the load on low-power machines. But beware of the --fork parameter to munin-asyncd.
