Achieving High-Performance Load Testing with JMeter Clusters

Engineering @ PhysicsWallah
Jun 16, 2023

By Shishir Khandelwal, DevOps Team

At Physics Wallah, we’re on a mission to revolutionize education. As a growing edtech startup, we’re focused on delivering quality education to students at an affordable price. Our goal is to remove any financial barriers that may prevent students from accessing the education they need to succeed.

But with this mission come some exciting technical challenges. Just like the teams behind Flipkart’s Big Billion Days and Alibaba’s Singles Day, we strive to deliver reliable, highly scalable systems that can handle 6x our usual traffic during big launches and sale events.

The graph above shows how spiky our traffic becomes during big launches and free classes; this number has crossed 1.3 million! This is why load testing is an important part of our development cycle.

As we geared up for the next phase of growth, we knew we needed to revamp our load-testing approach. We needed to be able to simulate 6x our usual website traffic to ensure we’re prepared for the future. So we decided to implement distributed load testing with JMeter, the open-source load-testing tool, running inside Kubernetes.

We’ve revamped our load testing approach and we’re excited to share it with the world.

Old approach

In the past, when we wanted to do load testing, we would spin up multiple Windows virtual machines (VMs). Our QA team would then use SSH to access each VM and start a JMeter slave process. This process was manual, and the data needed for the load test had to be uploaded to each slave individually. To make sure that each slave was testing with different data, we also had to manually split the data files into smaller parts, one for each slave node. It was a time-consuming and tedious process.

Doing a single load test was very difficult and time-consuming. At Physics Wallah, we needed to do many load tests every week because we have a lot of different products.

This was a big challenge for us because it was also not easy to scale up the tests we wanted to run. For example, if we wanted to test our system with 10,000 users, we had to figure out how many virtual machines we needed. This was made even harder because different tests needed different amounts of memory and CPU. These difficulties pointed to a necessary shift away from virtual machines, which start slowly, toward containers, which come online much more quickly.

If we wanted to test with more users the next day, we would have to start the process all over again, and we were unlikely to find the right numbers until we had failed two or three times.

Possible solutions

We tinkered with a lot of ways to make the load-testing process easier. We thought about using tools like Terraform to automatically set up virtual machines, Ansible to start the JMeter processes, or shell scripts to divide the data. We also considered using a paid solution.

We discussed different approaches to make the load-testing process easier, but they all had limitations in terms of scaling and maintenance. They seemed like temporary solutions that would still require a lot of time and effort from our DevOps and QA teams, especially when considering the cost. We needed a more automated and scalable solution. Most importantly, we wanted a solution that would keep working with minimal maintenance.

Enter: Kubernetes!

New solution

We implemented a highly scalable approach using Kubernetes. We set up a cluster that could automatically scale up when needed.

Our approach was as follows:

  • We uploaded our JMeter test plans and required data files to an S3 bucket.
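  • We used a StatefulSet to start multiple JMeter slaves. Each slave would download the test plan and data files from S3, divide the data file into multiple parts, and start using one.
# Number of JMeter slaves, taken from the SLAVE_COUNT environment variable.
n=$SLAVE_COUNT
echo "$n"
# Total number of lines in the data file.
l=$(wc -l < data.csv)
# Lines per part (integer division; any remainder rows land in one extra file).
s=$(( l/n ))
echo "$s"
# Produces file_0000.csv, file_0001.csv, ... (requires GNU split).
split -d -l "$s" data.csv file_ --additional-suffix=.csv --suffix-length=4
ls | grep '\.csv$'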
  • We used a Kubernetes Job to start a JMeter master. This master would also download the test plan and data files from S3. It would connect to all the slaves using their internal Kubernetes addresses and start the load test.
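The slave and master steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not our exact scripts: it assumes the slaves run in a StatefulSet behind a headless Service, so that each pod's hostname ends in its ordinal (for example `jmeter-slave-3`). The names `jmeter-slave`, `part_for_pod`, and `remote_hosts` are hypothetical.

```shell
# Sketch only: the StatefulSet/Service name "jmeter-slave" and both function
# names are assumptions made for illustration.

# Map a StatefulSet pod hostname (e.g. jmeter-slave-3) to the part written by
# `split -d --suffix-length=4 --additional-suffix=.csv` (e.g. file_0003.csv).
part_for_pod() (
  pod_name=$1
  ordinal=${pod_name##*-}            # text after the last "-" is the ordinal
  printf 'file_%04d.csv\n' "$ordinal"
)

# Build the comma-separated host list that the master could pass to `jmeter -R`.
# Pods of a StatefulSet behind a headless Service resolve as <pod>.<service>
# from within the same namespace.
remote_hosts() (
  sts_name=$1 service=$2 count=$3
  hosts="" i=0
  while [ "$i" -lt "$count" ]; do
    hosts="${hosts:+$hosts,}${sts_name}-${i}.${service}"
    i=$((i + 1))
  done
  printf '%s\n' "$hosts"
)

part_for_pod "jmeter-slave-3"
remote_hosts "jmeter-slave" "jmeter-slave" 3
```

On a slave, `part_for_pod "$(hostname)"` would pick that pod's own slice of the data; on the master, the output of `remote_hosts` could be passed as the argument to `-R`.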

The beauty of this solution is that we had more control over the resources used by each slave. For example, we could tell a slave to test with a low number of threads, using fewer resources and finishing faster. We could also increase the total number of slaves to test with a high number of total threads. For example, if each slave runs 1,000 threads, we could provision 100 pods to achieve a load-testing scenario of 1,000 × 100 = 100,000 threads.
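The same arithmetic can be run in reverse to size a test. The helper below is a small sketch (the function name `pods_needed` is hypothetical) that rounds up the number of slave pods required for a target thread count.

```shell
# Sketch only: pods_needed is a hypothetical helper, not part of our tooling.
# Prints how many slave pods are needed to reach a target number of threads,
# rounding up so the target is always met.
pods_needed() (
  target_threads=$1 threads_per_slave=$2
  echo $(( (target_threads + threads_per_slave - 1) / threads_per_slave ))
)

pods_needed 100000 1000   # prints 100
pods_needed 2500 1000     # prints 3
```

The result could then be used as the replica count for the slave StatefulSet.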

  • We also automated the report creation process on the master. All results were uploaded to S3 for future reference.
echo "Load Test Completed."
echo "Uploading Results CSV to S3."
# Date-based prefix (YYYY/MM/DD) used in the S3 path for this run.
date=$( date '+%Y/%m/%d' )
echo "Generating HTML report now."
# -g reads an existing results file; -o sets the output folder for the HTML report.
/apache-jmeter-5.5/bin/jmeter -g <RESULTS FILE>.csv -o report
echo "Uploading HTML report now."
# Reports are stored per application, date, and build number.
aws s3 cp /apache-jmeter-5.5/bin/report s3://<PW BUCKET>/APPNAME/$date/$buildNumber/ --recursive
echo "All Operations Completed. Download report using: 'aws s3 cp --recursive s3://pw-load-test-results/APPNAME/$date/$buildNumber/ .'"
echo "You can also view the report by going to 'https://<PW INTERNAL URL>/APPNAME/$date/$buildNumber/index.html'"

Overall, this Kubernetes-based approach allowed us to perform load testing in a highly scalable and automated manner. It gave us more control over the resources used and made it easier to increase the number of tests we could run.

Making it developer and QA friendly

We made it easy for anyone to use our Kubernetes-based approach by abstracting away the complexities of Kubernetes. We did this by using a Helm chart to control each aspect of the Kubernetes YAML files. Now, anyone can make changes to the values file and start testing without needing any prior Kubernetes experience or help from a DevOps team. We drove this change in approach using our central CI/CD on Jenkins.
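To give a feel for what such a pipeline step might look like, here is a hedged sketch of a wrapper a CI job could call. The chart path `./jmeter-chart`, the value key `slave.replicas`, the release name, and the function name `deploy_load_test` are all hypothetical; a real chart defines its own keys. Only the standard `helm upgrade --install`, `-f`, and `--set` options are assumed.

```shell
# Sketch only: chart path, value key, and function name are assumptions.
# HELM_BIN lets the caller substitute another command for helm (e.g. echo).
deploy_load_test() {
  release=$1
  values_file=$2
  slave_count=$3
  "${HELM_BIN:-helm}" upgrade --install "$release" ./jmeter-chart \
    -f "$values_file" \
    --set "slave.replicas=$slave_count"
}

# Dry run: use echo in place of helm to print the arguments instead of deploying.
HELM_BIN=echo
deploy_load_test checkout-load-test values/checkout.yaml 100
```

With the dry run above, the call prints the Helm arguments it would use; leaving `HELM_BIN` unset would run the real `helm` binary.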

This tool has made life easier for the QA team. They can now launch massive load tests quickly, with no manual effort.

Next steps

In terms of future enhancements, there are several areas that could benefit from further refinement. One potential improvement involves parameterizing additional elements and allowing them to be controlled through Helm values files. For instance, we could add an option to the Helm chart that gives direct control over the number of threads used. Additionally, letting users allocate more heap memory to the JMeter master would be useful. These adjustments would make load testing more flexible and customizable.
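As a rough idea of how these two knobs could be wired up, the sketch below prints the command a master might run. It relies on two JMeter behaviours: the `bin/jmeter` startup script honours a `HEAP` environment variable, and `-G` defines a global property that is sent to the remote servers. The variable names `THREADS` and `MASTER_HEAP`, the property name `threads`, the defaults, and the function name are hypothetical, and the test plan would need to read the property, for example with `${__P(threads,1)}` in the Thread Group.

```shell
# Sketch only: THREADS, MASTER_HEAP, the "threads" property, and the defaults
# (100 threads, 1g heap) are assumptions, e.g. values injected from a Helm chart.
# Prints the command line rather than running it.
jmeter_master_cmd() (
  plan=$1 hosts=$2
  threads=${THREADS:-100}
  heap=${MASTER_HEAP:-1g}
  printf 'HEAP="-Xms%s -Xmx%s" jmeter -n -t %s -R %s -Gthreads=%s\n' \
    "$heap" "$heap" "$plan" "$hosts" "$threads"
)

THREADS=1000
MASTER_HEAP=4g
jmeter_master_cmd plan.jmx jmeter-slave-0.jmeter-slave
```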
