Our client is a French fashion store selling clothing and accessories that is in the middle of migrating from PrestaShop to Magento 2.3 Commerce Edition. When choosing a hosting environment, they settle on Amazon Web Services (AWS) and ask us for help with the DevOps part.
The client also expects heavy traffic during sales periods. Based on forecasts for their advertising campaigns, up to 10,000 users may visit the online store at the same time, so our task is to create a server infrastructure that can handle the load during sales and discounts.
We always begin working on a project by analyzing the environment, planning our next steps, and meeting with our key specialists to outline future architecture.
During the analysis of this project, it becomes clear that peak loads occur only occasionally during the year, which means we need a scalable environment that adapts to the load. It consists of the following tools:
- Fastly CDN significantly accelerates the delivery of both cacheable and uncacheable content (product prices, other dynamic and event-driven content), filters out low-quality traffic, and provides robust protection against DDoS attacks, which is essential for high-traffic businesses.
- Elastic Load Balancer distributes and manages the load across the multi-server architecture, as Magento is hosted on several servers.
- Deployment consists of the client’s Git repository and AWS CodeDeploy with Bitbucket Pipelines, which automatically builds, tests, and deploys the code based on a configuration file in the repository. As a result, commands run on a fresh system inside containers in the cloud.
- Autoscaling group is an Amazon solution for adding and removing servers when necessary: the more servers in the group, the higher the load they can carry.
- Databases: we work with the Amazon Aurora database managed by Amazon Relational Database Service (RDS), which automates administration tasks. We select two powerful instances, Master and Slave, with a synchronization mechanism between them. The Master database server handles writes, and the Slave serves reads. As writing is a more time-consuming operation than reading, we plan to separate the two to achieve the necessary performance.
- Caching: a full-page cache (FPC) and a Magento object cache (OMC) are implemented to improve the response time and reduce the load on the server, as a fully generated page can be read directly from fast cache memory. The next step is user sessions: logged-in users put a significant load on the infrastructure, so we decide to store and handle sessions separately from the Magento servers. We select Redis, a caching service that can insert and retrieve massive amounts of data within a short period, helped by mass insertion, a feature Redis supports.
- Elasticsearch offers a fast and personalized search experience and lets users find relevant data quickly thanks to its proven performance and direct access to the APIs. We use it to upgrade the default Magento search, which doesn’t always return equally relevant results.
- Bastion host acts as a hardened entry point and minimizes the chance of intrusion from the outside. It also has multi-factor authentication and an extra security layer to prevent unauthorized administrative access to systems.
- Load testing is performed to understand how the infrastructure will behave and whether it meets the requirements. Our QA Engineer runs the tests, verifies the maximum capacity, and reports back with the metrics. We choose JMeter for load testing, which examines the server layer and finds the maximum load a website can handle by simulating users sending requests to a target server. We usually combine JMeter’s output with New Relic, which analyzes how the code performs, points out problems on the Magento side, and offers in-depth reporting.
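The planned Master/Slave separation from the Databases item can be illustrated with a minimal routing sketch. It is not the project's actual configuration: the `Endpoint` and `ReadWriteRouter` classes, the table names, and the rule "statements starting with SELECT or SHOW are reads" are all assumptions made for illustration, and no real database is contacted.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """Hypothetical stand-in for a database connection; it only records SQL."""
    name: str
    executed: list = field(default_factory=list)

    def execute(self, sql: str) -> str:
        self.executed.append(sql)
        return self.name  # report which server handled the statement


class ReadWriteRouter:
    """Send read-only statements to the Slave and everything else to the Master."""

    # Assumption: a statement is a read if it starts with one of these keywords.
    READ_PREFIXES = ("SELECT", "SHOW")

    def __init__(self, master: Endpoint, slave: Endpoint):
        self.master = master
        self.slave = slave

    def route(self, sql: str) -> Endpoint:
        statement = sql.lstrip().upper()
        is_read = statement.startswith(self.READ_PREFIXES)
        return self.slave if is_read else self.master

    def execute(self, sql: str) -> str:
        return self.route(sql).execute(sql)


router = ReadWriteRouter(Endpoint("master"), Endpoint("slave"))
# Table names below are made up for the example.
print(router.execute("SELECT sku, price FROM products WHERE id = 1"))  # slave
print(router.execute("UPDATE cart SET qty = 2 WHERE id = 7"))          # master
```

A prefix check like this is deliberately naive: reads that must see the latest write, or that take locks, would still need to go to the Master in a real setup.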
We visualize the architecture in the diagram below and prepare calculations for all of the specified tools for the client.
Optimization & Execution Stage
After getting the client’s approval, we move to the project execution stage and start with Fastly CDN, Elastic Load Balancer, the deployment process with Bitbucket Pipelines, Elasticsearch, and Bastion. We also introduce a range of changes and improvements to the architecture:
- Autoscaling group is implemented with the following rule: once promotions start, whenever the average load on the servers reaches 70%, the system provisions a new server to balance the overall performance. When the load drops below 70%, servers are removed one by one. We decide to start with 1-2 Magento servers in the autoscaling group to test the minimum necessary number of servers and optimize the client’s costs.
- Databases: testing shows that one database server is enough instead of two, so the client doesn’t have to pay for a Slave server and can save money.
- Storage space is another highlight, as we find a way to optimize the infrastructure and speed up content loading. We implement a Network File System (NFS) media library for storing content, so that it doesn’t need to be duplicated on each Magento server.
- Caching: when we start implementing FPC and OMC, which were meant to run on separate servers to speed things up, we face a challenge. On the plan chosen for the client, the internal Amazon network bandwidth doesn’t allow transferring data from the cache servers to the Magento servers quickly enough. After a quick brainstorm with the team, we settle on keeping one cache server for the sessions (Redis) and moving FPC and OMC onto each Magento server in the autoscaling group. As a side effect, we save the client’s budget by reducing the number of Amazon services without loss of efficiency.
- Load testing is an iterative process of putting demand on a system and measuring its response. It covers two parts, the server and application layers. After running the tests and collecting the metrics, we make recommendations for the server and application parts and fine-tune the infrastructure. The intermediate results, broken down by number of requests and time period, can be found in the New Relic graph below. We run the test on 10 Magento servers in the autoscaling group and reach 7,635 users, which shows us the maximum capacity the system can carry at that moment.
During load testing, we follow a common user scenario and then analyze the JMeter summary reports, which show the data for each stage. There we can find the bottlenecks and fix them in specific places:
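The 70% rule from the Autoscaling item can be simulated in a few lines. This is only a sketch of the decision logic: in practice the rule lives in the AWS Auto Scaling configuration, not in application code. The function names, the lower bound of 1 server, the upper bound of 10 servers (borrowed from the load test), and the sample load readings are assumptions.

```python
def desired_servers(current: int, avg_load_pct: float,
                    threshold: float = 70.0,
                    min_servers: int = 1, max_servers: int = 10) -> int:
    """Return the server count after one evaluation of the 70% rule."""
    if avg_load_pct >= threshold:
        # Load at or above the threshold: add one server, up to the cap.
        return min(current + 1, max_servers)
    # Load below the threshold: remove one server, down to the floor.
    return max(current - 1, min_servers)


def simulate(load_samples, start: int = 1):
    """Apply the rule to a series of average-load readings, one step each."""
    servers = start
    history = []
    for load in load_samples:
        servers = desired_servers(servers, load)
        history.append(servers)
    return history


# Hypothetical average-load readings (%) during a promotion.
print(simulate([50, 72, 85, 90, 71, 60, 40, 30]))
# [1, 2, 3, 4, 5, 4, 3, 2]
```

The readings here are fixed inputs, so the sketch does not model how adding a server lowers the average load. A single shared threshold is also prone to rapid add/remove cycles, which is why separate scale-out and scale-in thresholds or cooldown periods are commonly used.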
Eventually, the system turns out to be even better optimized than planned. This is how it looks:
As a result, we:
- made it possible for the online store to work well when it’s loaded with 10,000 simultaneous users;
- completed the project from start to finish in a month;
- provided the client’s team with extra development hands to work on the AWS configuration and meet their deadline;
- described all the processes, sent instructions, and trained the client’s team so that they can smoothly run and support the project without our involvement in the future;
- improved the project infrastructure while working on it;
- found a way to optimize the project pricing for the client and decrease the costs.