The client sought to transition to a more cost-effective monitoring provider while also unifying metrics and reporting platforms across their organisation. The migration had to be completed by a strict deadline, adding significant time pressure to the project. Furthermore, the client faced inconsistency in how their teams approached telemetry, leading to fragmented monitoring practices. For this reason, the client needed to establish standardised telemetry guidelines that all teams would be required to adopt.
We developed a comprehensive migration plan to meet the client’s deadline and unify their monitoring metrics. By understanding the client’s requirements, we not only facilitated a smooth transition but also enhanced the system for improved troubleshooting. Our approach involved refining the metrics during migration to ensure they were more relevant and actionable across the client’s services.
Our process began with identifying the migration scope, including dashboards, alerts, and compliance needs. We then used a single service as a testbed, where we reviewed and optimised metrics, implemented standardised dashboards, and created reusable templates. This approach was replicated across all services, supported by custom scripts to simplify metric exports. Finally, we integrated Datadog Agents for system monitoring and successfully decommissioned the old monitoring solution.
The project resulted in a unified metric solution with accurate alerting and increased visibility into service performance, even outside target teams. By using standardised metrics and Datadog’s Universal Service Metrics, the client achieved better correlation between metrics across services through tag standardisation. The cross-team adoption of standardised PowerPacks and dashboards enhanced collaboration and efficiency. The new solution proved three times more cost-effective than the previous one – reducing monthly costs by approximately $100k for log optimization alone – offering a significantly improved experience. Additionally, the migration paved the way for an SRE initiative, and led to the implementation of standardised telemetry guidelines that all teams were required to adopt.


