EC2 maintenance costs cut by 40% monthly with dynamic scaling and immutable infrastructure

One of the most influential players in the US advertising market faced a significant cost challenge with its AdTech platform, which supports billions of requests per second. Its infrastructure was serverless and hosted in a cloud environment, and handling such high traffic volumes required so much computing capacity that monthly cloud bills reached the million-dollar range.
To address these challenges and reduce monthly costs, our team implemented several solutions. First, we used AWS Auto Scaling to dynamically adjust the number of EC2 instances based on the most recent traffic statistics and long-term weekly observations. By analyzing traffic patterns, our engineers created a dynamic scaling policy that scaled the EC2 fleet out and in according to the most recent average request volume and the time of day. This kept the fleet at the smallest number of hosts that could still handle incoming traffic reliably.
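The scaling rule described above can be illustrated with a minimal sketch. It is not the client's actual policy: the per-host capacity, the headroom factor, the fleet bounds, and the hour-of-day multipliers are all hypothetical values chosen for illustration, and the function only computes a target host count rather than calling any AWS API.

```python
import math
from datetime import datetime

# All constants below are hypothetical, for illustration only.
REQUESTS_PER_HOST = 50_000   # assumed sustainable requests/second per host
HEADROOM = 1.2               # assumed safety margin on top of expected load
MIN_HOSTS = 4                # assumed lower bound for the fleet
MAX_HOSTS = 400              # assumed upper bound for the fleet


def hourly_multiplier(hour: int) -> float:
    """Assumed time-of-day factor, standing in for long-term weekly observations."""
    return 1.3 if 8 <= hour < 22 else 1.0


def desired_hosts(recent_avg_rps: float, now: datetime) -> int:
    """Target host count from recent average request volume and time of day."""
    expected_rps = recent_avg_rps * hourly_multiplier(now.hour) * HEADROOM
    hosts = math.ceil(expected_rps / REQUESTS_PER_HOST)
    # Clamp to the allowed fleet size so the policy never scales to zero
    # or beyond the configured ceiling.
    return max(MIN_HOSTS, min(MAX_HOSTS, hosts))


if __name__ == "__main__":
    print(desired_hosts(1_000_000, datetime(2024, 1, 1, 12, 0)))
```

In a real deployment, a value like this would typically be fed to the Auto Scaling group as its desired capacity on a schedule; the sketch only shows how the two inputs named above combine into one number.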
In addition, the infrastructure was designed to be auto-healing: unhealthy hosts were replaced with healthy ones without any manual intervention. Under our immutable approach, misbehaving hosts were never repaired in place. They were taken down, and automation handled everything else. Our engineers also used ongoing metrics analysis with Datadog to tailor the scaling policy. By leveraging automation tools such as CloudFormation, Ansible, and Jenkins, we designed and developed a simpler and more efficient release process.
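The auto-healing, immutable behavior can be sketched as a small reconciliation step. This is a simplified simulation under stated assumptions, not the automation used on the project: the `Host` type, the `launch` helper, and the image name are hypothetical, and no cloud API is called.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Host:
    host_id: str
    image: str      # machine image the host was built from
    healthy: bool   # result of the most recent health check


_next_id = count(1)


def launch(image: str) -> Host:
    """Hypothetical stand-in for launching a fresh host from an image."""
    return Host(host_id=f"host-{next(_next_id)}", image=image, healthy=True)


def reconcile(fleet: list[Host], desired: int, image: str) -> list[Host]:
    """Return a fleet of `desired` healthy hosts built from `image`.

    Unhealthy hosts, and hosts built from another image, are dropped rather
    than repaired in place; fresh hosts are launched to fill the gap.
    """
    survivors = [h for h in fleet if h.healthy and h.image == image]
    survivors = survivors[:desired]          # scale in if there are too many
    while len(survivors) < desired:          # replace what was taken down
        survivors.append(launch(image))
    return survivors


if __name__ == "__main__":
    fleet = [launch("image-v1"), Host("host-bad", "image-v1", False)]
    print(reconcile(fleet, desired=2, image="image-v1"))
```

Because hosts are only ever replaced, a release can reuse the same step: changing the `image` argument causes every host built from the old image to be dropped and relaunched from the new one.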
These solutions delivered significant benefits for the client. The dynamic scaling policy and immutable infrastructure reduced monthly EC2 infrastructure maintenance costs by approximately 40%, substantially lowering the client's overall cloud expenditure. The auto-healing infrastructure kept the system highly reliable and resilient, with minimal downtime, maintenance, or manual intervention. The automation tools streamlined the release process and reduced the time required for deployments and updates. The tailored scaling policy allowed the AdTech platform to handle fluctuating traffic volumes efficiently while maintaining consistent performance.


