How to Tame your Locusts or Performance Testing with Locust.io
The Locust performance testing platform has been around for 9 years now, and it is a little surprising that it is not more widely used within the IT community. It is also surprising that, while there is plenty of information available on the internet to get you quickly up and running with the platform, published case studies describing how to implement complex performance tests with Locust are relatively rare.
The ambition of this article is to expand that list of case studies and provide some inspiration for implementing complex performance tests with Locust.
The task was clear: test the performance of a website generated by a CMS system. To verify the optimal settings of the shared infrastructure, 3 test scenarios were defined:
- Basic: simulating the usual type of load generated by website visitors
- Extended Basic: combining the usual type of load generated by website visitors with a situation where the CMS system compiles the website after the content has been edited
- Editing: simulating the load generated by editors when changing the website content and the CMS system prepares a preview of how the page will look with the edits.
Technologies and Tools Used
Locust was evaluated and selected as the foundation of the performance test framework, due to the following key characteristics:
- High performance: ability to simulate from 1,000 to 10,000 virtual users on a regular PC
- The flexibility of the Python language and its community support, whether for utilizing existing libraries or developing new ones to implement less standard or specialist tasks in test scenarios
- Readiness for integration into CI/CD pipelines
Some of the other advantages of the Locust platform can be found directly in the documentation on Locust.io, which also explains its basic principles and provides additional basic use cases. A number of tutorials can also be found on the internet. If you are coming across Locust for the first time, I recommend looking at these resources before continuing with the rest of the article.
The following diagram shows the architecture of our framework with Locust at its center:
Architecture of the Locust-based performance test framework
Configuration of Tests and Test Data
Because one of our goals was to enable the developers to run the tests without the assistance of a specialized performance test engineer, the test run parameter settings and test data were pulled into a configuration file:
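Such a configuration file might look something like this. This is a sketch: the variable names `LOAD_SCENARIO`, `LOAD_STAGES` and `SITES` come from the framework described in this article, but the concrete values and record fields are illustrative assumptions.

```python
# config.py -- test run parameters and test data in one place, editable
# by developers without help from a performance test engineer.

## which scenario to run
## 1 .. web users visiting sites (test case: WebTest)
## 2 .. web users + content server processing (WebTest + ContentServerTest)
## 3 .. editors requesting page previews (test case: PreviewTest)
LOAD_SCENARIO = 1

## load profile: ramp the virtual users up in stages
LOAD_STAGES = [
    {"duration": 60,  "users": 100,  "spawn_rate": 10},   # warm-up
    {"duration": 300, "users": 500,  "spawn_rate": 50},   # main load
    {"duration": 60,  "users": 1000, "spawn_rate": 100},  # peak
]

## tested sites and their data files
SITES = {
    "site-a": {"host": "https://site-a.example.com",
               "endpoints_csv": "data/site_a_urls.csv"},
    "site-b": {"host": "https://site-b.example.com",
               "endpoints_csv": "data/site_b_urls.csv"},
}
```

Because this is plain Python, the load profile can be reshaped by editing a few dictionary values, with no framework code touched.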
This is a standard Python file, which provides flexibility by defining the test parameters in data structures. For example, LOAD_STAGES and SITES are variables with native Python data types, but they are still very readable and understandable for testers who are not deeply technical (of course, it also helps to provide relevant comments).
We also recommend encapsulating the run-time data (the data utilized during the test run) in dedicated classes:
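A minimal sketch of such a class follows. The method names `getRecord()`, `setReady()` and `testRecord()`, the `rowId` field and the module name `dataReader.py` are taken from the usage shown later in this article; the CSV layout and the locking strategy are assumptions.

```python
# dataReader.py -- run-time data encapsulated in a dedicated class.
import csv
import random


class Dataset:
    """Test data records loaded from a CSV file.

    With exclusive=False, getRecord() returns a random record and the same
    record may be served to several virtual users at once. With
    exclusive=True, getRecord() returns a free record and locks it until
    setReady() releases it again.
    """

    def __init__(self, records, exclusive=False):
        self.records = list(records)
        self.exclusive = exclusive
        self.locked = set()

    @classmethod
    def from_csv(cls, path, exclusive=False):
        with open(path, newline="") as f:
            return cls(csv.DictReader(f), exclusive=exclusive)

    def testRecord(self):
        # True if at least one record is free for exclusive processing
        return any(r["rowId"] not in self.locked for r in self.records)

    def getRecord(self):
        if not self.exclusive:
            return random.choice(self.records)
        free = [r for r in self.records if r["rowId"] not in self.locked]
        if not free:
            return None
        record = random.choice(free)
        self.locked.add(record["rowId"])  # lock against concurrent use
        return record

    def setReady(self, row_id):
        # release the record for further utilization
        self.locked.discard(row_id)
```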
The code sample above demonstrates how easily data objects can carry simple or complex data-handling logic. This is useful, for example, when data records are used repeatedly but a single record must not be used by multiple virtual users running in parallel. Some examples are given in the section Test Scenario Implementation, below.
Elastic — Storing Test Results
Although Locust's web UI enables you to monitor a number of statistics sufficient to provide a basic overview of the performance test run, Locust does not have its own robust data analytics or reporting layer. At first glance this might seem a deficiency. However, these days there are dedicated tools for this, and Locust can be directly integrated with any BI solution. Regardless of whether it is integrated, as in our case, with Elastic or with some other solution, we will always need to store two categories of data:
- Monitored performance metrics of the loaded application, typically response time, job execution time, utilization of infrastructure resources (CPU load, memory utilization, number of active nodes, etc.), and so on.
- Environmental parameters for the current test run, for example the current number of virtual users and the number of requests processed in parallel.
If you have already familiarized yourself with the basics of Locust, you will know that a basic set of statistics for each request and subsequent response is visualized on the web UI. You may also have noticed that Locust visualizes the number of virtual users in a graph, which is also useful information for us. So how do we send the data on which statistics are calculated to an external database? The following code adds listeners to the selected Locust event:
The first two listeners, additional_success_handler and additional_failure_handler, basically just forward the natively captured statistics for successful and failed requests "somewhere" using forwarder.add(message).
The implementation of the forwarder object, which encapsulates the integration with Elastic, will not be discussed here. Everything you need to know can be found in Karol Brejna's article Locust.io experiments — Emitting results to external DB.
The second two listeners, on_test_start and on_test_stop, which run at the start and at the end of the test, are examples of custom loggers (log_VUs and log_memory) implemented on top of Locust's native logging. Note that the loggers are started with the gevent.spawn() command. This causes each logger to run in a so-called greenlet, which allows the loggers to run in parallel with the running test without blocking an entire CPU process for themselves.
Greenlet architecture is in itself quite interesting. It has been implemented in Locust through the gevent library. It is no secret that it is thanks to this feature that Locust's performance beats some of the traditional tools, such as JMeter. The main idea behind greenlets is that larger tasks can always be divided into smaller sub-tasks, and execution can "jump" between these sub-tasks (so-called context switching). Several jobs can then be executed concurrently within one CPU process, unlike the thread-per-job pattern, where each job occupies its own OS thread. You can learn more about this on the homepage of the gevent project or in the nice tutorial gevent For the Working Python Developer.
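As a toy illustration of this idea (independent of Locust), two greenlets sharing a single OS thread interleave whenever one of them yields:

```python
# Two jobs interleave within a single OS thread: gevent.sleep(0) yields
# control back to the scheduler, which switches to the other greenlet.
import gevent

order = []


def job(name, steps):
    for i in range(steps):
        order.append((name, i))
        gevent.sleep(0)  # yield: let the other greenlet run a step


gevent.joinall([gevent.spawn(job, "a", 3),
                gevent.spawn(job, "b", 3)])
print(order)  # the two jobs take turns, step by step
```

Nothing here runs in parallel in the OS sense; the scheduler simply switches between the two sub-tasks at every yield point, which is exactly what happens between Locust's virtual users.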
The log_VUs and log_memory loggers are class instances of the VUs and Memory classes from the loggers.py module:
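The module might look roughly like this. It is a sketch: the `runner.user_count` attribute, the one-record-per-second cadence and the way the memory endpoint is queried are assumptions, and the `Forwarder` stub again stands in for the Elastic integration so the example is self-contained.

```python
# loggers.py -- greenlet-based custom loggers.
import gevent


class Forwarder:
    """Stand-in: collects log records instead of sending them to Elastic."""
    def __init__(self):
        self.buffer = []

    def add(self, message):
        self.buffer.append(message)


forwarder = Forwarder()


class GreenletLogger:
    """Log one record per second until stopped."""

    def __init__(self):
        self.running = False

    def stop(self):
        self.running = False

    def __call__(self):
        self.running = True
        while self.running:
            forwarder.add(self.record())
            # yield this greenlet and wake up again in 1 s; other
            # greenlets (i.e. the running test) are processed meanwhile
            gevent.sleep(1)


class VUs(GreenletLogger):
    """Logs the current number of virtual users."""

    def __init__(self, runner):
        super().__init__()
        self.runner = runner  # the Locust runner instance

    def record(self):
        return {"metric": "virtual_users", "value": self.runner.user_count}


class Memory(GreenletLogger):
    """Logs memory utilization reported by a monitored service endpoint."""

    def __init__(self, fetch):
        super().__init__()
        # fetch is a callable hitting the exposed endpoint, e.g.
        # lambda: requests.get(MEMORY_URL).json()["used_mb"]
        self.fetch = fetch

    def record(self):
        return {"metric": "memory_mb", "value": self.fetch()}
```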
In the code we can again see a forwarder object that handles the routing of log records to the external database. With regard to the greenlet architecture, the key command is gevent.sleep(1), which says: put the current greenlet (task) to sleep for 1 s and release its resources (so-called yielding) to allow another greenlet to be processed (for more details, see the links about the gevent library above). As a result, the logging data is sent to the database every second.
Note that in the Memory class, a service call returns the current state of memory utilization on a specific Azure box (server). Having such a monitored endpoint exposed allows you to orchestrate the logging directly from Locust. If we did not have such an endpoint available, we would have no option but to send the data to our logging database directly from some infrastructure monitoring tool. That approach, of course, has the disadvantage we often face during performance test execution: having to cooperate with the infrastructure management staff.
You might already be wondering how the data sent to Elastic is visualized. However, how to work with Kibana and properly structuring source data from performance test execution requires a separate article, which will follow shortly.
Although Python can do almost anything, sometimes it is better to reach for an alternative tool. In our case this meant using PowerShell jobs to simulate changes to page content in the Azure environment. Integration with the external module was fairly trivial: it was enough to run the selected PS job and parse its output (to check that there was no error and, if needed, extract some required information).
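A minimal sketch of that integration follows: run the PS job via subprocess and parse what it prints. The script name `process_file.ps1`, its parameters and the "ERR" output convention are illustrative assumptions.

```python
import subprocess


def parse_job_output(returncode, stdout):
    """Check that the job did not fail and return its (status, output)."""
    output = stdout.strip()
    if returncode != 0 or output.startswith("ERR"):
        return "ERR", output
    return "OK", output


def process_file(site_name, preview_name, type="Preview"):
    """Run the hypothetical PowerShell job and parse what it prints."""
    result = subprocess.run(
        ["pwsh", "-File", "process_file.ps1",
         "-SiteName", site_name, "-PreviewName", preview_name,
         "-Type", type],
        capture_output=True, text=True)
    return parse_job_output(result.returncode, result.stdout)
```

Keeping the output parsing in its own function makes it easy to unit-test the Python side without PowerShell being present at all.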
Test Scenario Implementation
Locust, with just a little programming skill, can handle virtually any communication protocol. However, for classic communication over HTTP, it can be used right out of the box. The test case that calls a selected URL and performs a basic check of the correctness of the response (return code 200) is as follows:
The endpoints object holds a data set loaded into memory, specifically a set of URLs from a CSV file. The getRecord() method returns a randomly selected URL from the dataset. It is also worth noting that you can give a name to each call. Calls with the same name appear in the statistics as instances of a single request, even though a different URL may be targeted each time. This allows you to aggregate data while the test is still running.
Extended Basic and Editing Scenario
The basic scenario’s simulation of the web site users was combined with running processes in the CMS Azure application layer that were triggered by running PowerShell scripts. The script simulated an event that triggered a process (job) whose result was seen on the presentation layer. The length of the process was one of the monitored variables. For this purpose, a test case was implemented that ran the Azure job, captured whether on the given URL the expected change had occurred and measured the elapsed time. The code of this time test is very similar to the Editing scenario that we will now describe:
This test case uses data stored in the previews object. The data records can be utilized repeatedly, but a specific data record cannot be processed by more than one job at a time. This is handled by the following pair of methods:

preview = previews.getRecord()
previews.setReady( preview['rowId'] )
The first method returns an unlocked data record and locks it against use elsewhere. The second method then releases the record for further utilization (see module dataReader.py above).
We then execute the script that starts the job, job = utils.process_file( site_name, preview_name, type="Preview"), and log the result together with the script's runtime.
Then, after checking that the execution of the script did not end with an error, if ( retVal == "ERR" ):..., we start the loop while proceeding:, which checks whether the content of the pages has changed. When the change is detected, we log it, again including the job's run time.
The code repeatedly contains the command self.interrupt(), which interrupts the test's execution in the case of an error that would prevent the test from continuing correctly. The virtual user under which the test instance was running is then returned to the pool.
Managing a Mix of Test Cases within a Scenario
At the beginning of the article, we presented the configuration file that can be used to select, among other things, the scenario for a given performance test run:
LOAD_SCENARIO = 2
## 1 .. web users visiting sites (test case: WebTest)
## 2 .. web users visiting sites + content server processing workers output (test cases: WebTest + ContentServerTest)
## 3 .. editors requesting page previews (test case: PreviewTest). Note: the number of virtual users should match the number of data records.
When scenario 2 is selected, the test runs two test cases that differ fundamentally in the way the data is handled:
- WebTest randomly selects one record (URL) from the dataset. It does not matter if the same URL is selected for multiple simultaneously running virtual users,
- ContentServerTest selects a record that is not being processed by another virtual user.
Before adding another virtual user to the test run, some logic is required that determines based on the state of the data set for ContentServerTest whether the virtual user can choose from both test cases or only from the first one. This is demonstrated in the following code:
The testRecord() method determines whether there is a free record in the given data object for processing (it will be locked only when a virtual user is actually activated using getRecord(), see above).
The variable self.tasks is then filled with the list of test cases that are considered for the running virtual user. Because we do not specify any weighting (see the tasks attribute), Locust selects randomly from the pair [WebTest, ContentServerTest].
In the Locust code repository you can see how the project is developing. A number of key features have been added in the last year alone, which indicates that the platform is increasingly being used on complex performance testing projects.
Locust has proven itself to us many times. Let's highlight a few key qualities that our customers value:
- Very fast delivery of an initial set of performance tests covering the most critical scenarios (just a few days)
- Painless adoption of the performance test framework by the customer's development team
- Easy maintenance of existing test cases and implementation of new ones
We believe Locust will continue to help us create very effective solutions for our clients on future projects.