We’ve been having some fun with Click and Python decorators at work.

We had a situation where we wanted to

  1. transform any Exception to a click.ClickException, so they would be rendered nicely, and
  2. catch one particular exception, and retry the function that raised it with a different parameter value as a fallback.

We quickly captured the first behaviour in a decorator, and then realised that the second could be handled nicely with a decorator too.
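To give a flavour of the approach before the full write-up, here is a minimal sketch of what such a pair of decorators could look like. The exception type, option name, and fallback value are placeholders, not the ones from our actual code.

```python
import functools

import click


def wrap_exceptions(func):
    """Re-raise any unexpected exception as a click.ClickException,
    so Click renders a clean error message instead of a traceback."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except click.ClickException:
            raise  # already rendered nicely by Click
        except Exception as exc:
            raise click.ClickException(str(exc)) from exc
    return wrapper


def retry_with_fallback(exc_type, param, fallback):
    """If the wrapped function raises `exc_type`, call it once more
    with `param` replaced by `fallback`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except exc_type:
                kwargs[param] = fallback
                return func(*args, **kwargs)
        return wrapper
    return decorator


# Placeholder command: the exception type and parameter are made up.
@click.command()
@click.option("--endpoint", default="primary")
@wrap_exceptions
@retry_with_fallback(ConnectionError, param="endpoint", fallback="backup")
def fetch(endpoint):
    click.echo(f"Fetching from {endpoint} ...")
```

Stacked this way, the retry decorator gets the first chance at the exception it knows about, and anything that still escapes is converted into a clean Click error.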

Continue reading

I wrote this article for the Learnosity blog, where it originally appeared. I repost it here, with permission, for archival purposes. With thanks to Micheál Heffernan for countless editing passes.

The dramatic increase in Learnosity users during the back-to-school period each year challenges our engineering teams to find new approaches to ensuring rock-solid reliability at all times.

Stability is a core part of Learnosity’s offering. Prior to back-to-school (known internally as “BTS”), we load-test our system to handle a 5x to 10x increase over current usage. That might sound excessive, but it accounts for the surge of first-time users that new customers bring into the fold, as well as the additional users that existing customers bring.

Since the BTS traffic spike occurs from mid-August to mid-October, we start preparing in March. We test our infrastructure and apps to find and remove any bottlenecks.

Last year, one of our larger clients ramped up their testing, which created a 3x increase in usage of our Events API. In the process, several of our monitoring thresholds were breached and message delivery latency rose to an unacceptable level.

As a result, we poured resources into testing and ensuring our system was stable even under exceptional stress. To detail the process, I’ve broken the post into two parts:

  1. Creating the load with Locust (this piece)
  2. Running the load test (in part two, coming soon).

TL;DR

Here’s a snapshot of what I cover in this post:

  • Our target metrics.
  • How we wrote a Locust script to generate load for a Publish/Subscribe system (a minimal sketch follows this list).
  • Our observations that:
    • The load test must reflect real user behaviours and interactions.
    • Load testing alone doesn’t validate system behaviour against target metrics. It’s better to measure this separately while the system is under load.
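To give an idea of what such a script can look like, here is a minimal, hypothetical Locust user for a publish/subscribe workload. The host, endpoints, payloads, and task weights are placeholders, not our actual Events API.

```python
import uuid

from locust import HttpUser, between, task


class PubSubUser(HttpUser):
    """Simulates one publish/subscribe client.

    The host, endpoints, and payloads below are placeholders,
    not the real Events API.
    """

    host = "https://example.com"  # placeholder target
    wait_time = between(1, 5)     # think time between tasks, in seconds

    def on_start(self):
        # Each simulated user subscribes to its own channel first.
        self.channel = str(uuid.uuid4())
        self.client.post("/subscribe", json={"channel": self.channel})

    @task(3)
    def publish(self):
        # Publishing is weighted more heavily than polling.
        self.client.post(
            "/publish",
            json={"channel": self.channel, "message": "hello"},
        )

    @task(1)
    def poll(self):
        self.client.get(f"/poll?channel={self.channel}")
```

Pointing `locust -f <file>` at a test environment then lets you dial the number of simulated users up and down from Locust’s web UI.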
Continue reading