At the time of this writing, this blog runs on a Bitnami WordPress image, but I have changed the configuration to run multiple sites (WP_ALLOW_MULTISITE and MULTISITE in the wp-config.php). I realised I had issues running scheduled events using DISABLE_WP_CRON when the ActivityPub plugin failed to send new posts to subscribers. This was confirmed by the site health dashboard, indicating that scheduled events were late.

As it turns out, manually running the script with sudo -u daemon /opt/bitnami/php/bin/php /opt/bitnami/wordpress/wp-cron.php (with WP_DEBUG enabled) complains of an undeclared HTTP_HOST, and terminates quickly. As soon as I set that variable in the environment and reran the script, the warning was gone, and the script took longer to run. All my recent posts also made it to the fediverse!
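
For a scheduled job, the same fix can be wrapped in a small script. This is only a sketch: the two paths come from the command above, while the host name in the usage note is a placeholder, and the script assumes it is already running as the daemon user rather than going through sudo.

```python
import os
import subprocess

# Paths taken from the command quoted above (Bitnami WordPress image).
PHP_BIN = "/opt/bitnami/php/bin/php"
WP_CRON = "/opt/bitnami/wordpress/wp-cron.php"


def wp_cron_invocation(host, base_env=None):
    """Return the (command, environment) pair for one wp-cron.php run."""
    env = dict(os.environ if base_env is None else base_env)
    env["HTTP_HOST"] = host  # the variable wp-cron.php complained about
    return [PHP_BIN, WP_CRON], env


def run_wp_cron(host):
    """Run wp-cron.php once; assumes we are already the daemon user."""
    command, env = wp_cron_invocation(host)
    return subprocess.run(command, env=env, check=False).returncode
```

Calling run_wp_cron("blog.example.net") (a hypothetical host name) from the system crontab would then stand in for the built-in WP-Cron trigger. On a multisite install, it is presumably worth one call per site host.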

Screenshot of a terminal showing oft-used commands

As an idle musing, and a way to show off my mastery of shell pipelines, I was wondering what my most-used shell commands are. It’s an easy few commands to pipe.

history | sed 's/^ *//;s/ \+/ /g' | cut -d' ' -f 2 | sort | uniq -c | sort -n | tail -n 20

The outcome is rather expected. I feel validated (by my shell) in my own self-perception!


When migrating a database from MySQL to PostgreSQL, I bumped into a slight issue with timestamp formatting. PostgreSQL supports many date/time formats, but has no native support for outputting the ISO 8601 UTC date and time format (e.g., 2023-08-05T13:54:22Z), favouring consistency with RFC 3339 instead.

ISO 8601 specifies the use of uppercase letter T to separate the date and time. PostgreSQL accepts that format on input, but on output it uses a space rather than T, as shown above. This is for readability and for consistency with RFC 3339 as well as some other database systems. https://tools.ietf.org/html/rfc3339

Fortunately, StackOverflow had a solution, including some notes about how to handle timestamps with timezones.

SELECT to_char(now() AT TIME ZONE 'Etc/Zulu', 'yyyy-mm-dd"T"hh24:mi:ss"Z"');
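
If the formatting can happen in application code instead of in SQL, the same output is easy to produce there. The following Python sketch is an alternative to the query above, not a replacement for it; the function name and the sample timestamp are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def to_iso8601_utc(moment):
    """Format a timezone-aware datetime as e.g. 2023-08-05T13:54:22Z."""
    if moment.tzinfo is None:
        # A naive datetime carries no offset, so converting it would be a guess.
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Sample value at UTC+10, chosen to match the example timestamp above.
local_evening = datetime(2023, 8, 5, 23, 54, 22, tzinfo=timezone(timedelta(hours=10)))
print(to_iso8601_utc(local_evening))  # 2023-08-05T13:54:22Z
```

Like the AT TIME ZONE clause in the query, the conversion to UTC happens before formatting, so the trailing Z stays truthful.
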
A few HomeAssistant cards showing SNMP monitoring of speed and quota of an upstream ISP.

I finally fell for the smart-home mania when I needed to read a few Zigbee climate sensors, and started using Home Assistant. There was no turning back, and I gradually grew the number of sensors and automations. This is all the easier thanks to a very active community site, offering many a recipe and troubleshooting advice. This is where I found a bandwidth monitor based on SNMP metrics that has been functional for a while.

My ISP, Internode (no longer the awesome service it used to be 10 years ago), has become increasingly flaky, silently dropping support for their Customer Tools API. This API was useful to track quota usage in a number of tools, including my own Munin plugin. Because of this, I unwittingly, and without warning, went beyond my monthly quota this month. I had to double my monthly bill to buy additional data blocks to tide me over.

It became obvious that I needed a new way to track my usage. What could be better than HomeAssistant, which was already ingesting SNMP data from the router? I posted my updated solution in the original thread, but thought that it might be worth duplicating here.


I have been talking about restoring backups a lot recently. This is because the disk on my trusty bare-metal server died. This gave me the opportunity to reassess my hosting choices, and do the groundwork to move from where it was to where I want it to be.

One of those changes is moving static website hosting away from an Apache HTTPd, running on an OS I administrate (read: “frequently broke”), to a more focused and hands-off system in the cloud, AWS S3 with a CloudFront CDN (more on this in a later post).

Unfortunately, decades of running Apache have left me with a number of static sites using some on-the-fly templating by relying on Server-side Includes (SSI). Headers, footers, geeky IPv6 and last-modified tags, … none of those work with a truly static host. I needed a solution to render those snippets into full pages.

At first, I thought I’d just write a simple parser in Python. I quickly gave up on the idea, however, when I realised I used included templates with parameters. Pretty nifty stuff, but also not trivial to write a parser for.

Then I realised I already had the perfect parser: Apache. All I needed was to let it render all the pages one last time, and publish those instead! This was quickly packaged into a relatively simple Docker container, with the help of the trusty wget. The busy person can find a Gist of the Dockerfile here.


We’ve been having some fun with Click and Python decorators at work.

We had a situation where we wanted to

  1. transform any Exception to a click.ClickException, so they would be rendered nicely, and
  2. catch one particular exception, and retry the function that raised it with a different parameter value as a fallback.

We got the first behaviour quickly into a decorator. We then realised that the second could be done nicely with a decorator, too.
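
To give an idea of the shape, here is a minimal sketch of both decorators. It is not the code we ended up with: the names as_click_exception, retry_with, StaleTokenError and fetch are invented for the example, and the import fallback is only there so the snippet runs without Click installed.

```python
import functools

try:
    import click
    ClickException = click.ClickException
except ImportError:  # fallback so the sketch runs without Click installed
    class ClickException(Exception):
        """Stand-in for click.ClickException."""


class StaleTokenError(Exception):
    """Hypothetical exception that triggers the retry."""


def as_click_exception(func):
    """Re-raise any other Exception as a ClickException."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except ClickException:
            raise  # already the right type, leave it alone
        except Exception as exc:
            raise ClickException(str(exc)) from exc
    return wrapper


def retry_with(exception, **fallback):
    """On `exception`, call the function once more with `fallback` kwargs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except exception:
                # Override the failing parameter(s) and try exactly once more.
                return func(*args, **{**kwargs, **fallback})
        return wrapper
    return decorator


@as_click_exception
@retry_with(StaleTokenError, use_cache=False)
def fetch(name, use_cache=True):
    """Toy function: fails with the cache on, succeeds with it off."""
    if use_cache:
        raise StaleTokenError("cached token expired")
    if not name:
        raise ValueError("name must not be empty")
    return f"fetched {name}"
```

The order matters: retry_with sits closest to the function so it sees the original exception, and as_click_exception wraps the lot so that whatever still escapes after the retry is converted.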


GitHub now lets you expand/collapse all files in a PR diff at once (pressing Alt while clicking one of the toggles). Unfortunately, there is no similar feature to mark all files as viewed. This would be handy once the meaningful changes have been reviewed, and the remaining automatically modified/generated files can be ignored.

So here goes a one-liner for the JS console.

Array.from(document.getElementsByClassName('js-reviewed-toggle')).forEach(c => c.getElementsByTagName('input')[0].checked || c.click())
Target metrics

I wrote this article for the Learnosity blog, where it originally appeared. I repost it here, with permission, for archival. With thanks, again, to Micheál Heffernan for countless editing passes.

In this series, I look at how we load test our platform to ensure platform stability during periods of heavy user traffic. For Learnosity, that’s typically during the back-to-school period. This year was different though, as COVID caused a dramatic global pivot to online assessment in education. Here is what the result of that looked like in terms of traffic.

Weekly Learnosity users comparison 2012--2020

We expect major growth every year but that kind of hockey stick curve is not something you can easily predict. But, because scalability is one of the cornerstones of our product offering, we were well-equipped to handle it.

This article series reveals how we prepared for that.

In part one (which was, incidentally, pre-COVID), I detailed how we actually created the load by writing a script using Locust. In this post, I’ll go through the process of running the load test. I’ll also look at some system bottlenecks it helped us find.

Let’s kick things off by looking at some important things a good load-testing method should do. Namely, it should

  1. Apply a realistic load, starting from known-supported levels.
  2. Determine whether the behaviour under load matches the requirements.
    • If the behaviour is not as desired, you need to identify errors and fix them. These could be in
      • the load-test code (not realistic enough)
      • the load-test environment (unable to generate enough load)
      • the system parameters
      • the application code
    • If the behaviour is as desired, then ramp up the load exponentially.

We used two separate tools for step 1 above (as described in the first part of this series) and tracked the outcomes of step 2 in a spreadsheet.
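
The loop described by the two steps can be sketched in a few lines of Python. This is an illustration of the method rather than our tooling: ramp_up, behaves_as_required and every number in the usage example are hypothetical.

```python
def ramp_up(start_users, max_users, behaves_as_required, factor=2):
    """Multiply the load by `factor` until a level fails or `max_users` is passed.

    Returns (highest passing level or None, first failing level or None).
    """
    last_good = None
    users = start_users
    while users <= max_users:
        if not behaves_as_required(users):
            # Stop here: find and fix the cause, then rerun from a known-good level.
            return last_good, users
        last_good = users
        users *= factor
    return last_good, None


# Pretend the system copes with anything below 5000 concurrent users.
print(ramp_up(500, 100_000, lambda users: users < 5000))  # (4000, 8000)
```

In practice, behaves_as_required would start a load-test run at the given level and compare the measurements with the requirements; returning the failing level gives the next debugging session its starting point.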

TL;DR

  • We used Locust to create the load, and a custom application to verify correct behaviour.
  • We found a number of configuration-level issues, mainly around limits on file descriptors and open connections.
  • Stuff we learned along the way:
    • Record all parameters and their values, change one at a time;
    • Be conscious of system limits, particularly on the allowed number of open files and sockets.