Posted By: TerraAlto – August 9th 2024
About the Customer
Irish Times DAC is one of Ireland’s most widely read daily newspapers, delivering trusted news coverage since its founding in 1859. Based in Dublin, the publication provides comprehensive reporting on national and international news, politics, business, culture, and sports. It has a strong digital presence, offering an online edition and mobile app to serve a growing audience.
Customer Challenge
The Irish Times AWS environment comprises the AWS-based digital experience platform Arc-XP (www.arcxp.com) and an AWS serverless integration platform built using AWS API Gateway, AWS Kinesis, AWS Elastic Container Service (ECS), and AWS DynamoDB, which seamlessly distributes Arc-XP mastered content across print and digital subscriber content channels.
Prior to engaging TerraAlto as its AWS managed service provider, the company struggled with:
- Excessive Noise Alerts: The existing monitoring system generated over 2,000 alerts a month. The flood of alerts overwhelmed the IT operations team, making it difficult to distinguish between real issues and false positives. This constant noise led to alert fatigue and raised the risk of delayed responses to critical incidents.
- Lack of Precise Problem Identification: With the noise came difficulty in identifying the root causes of incidents. The system lacked the precision required to pinpoint business-impacting events, making troubleshooting slow and inefficient.
- Limited Predictive Capabilities: The monitoring tools in place provided only reactive insights, which meant the team could respond only after an issue had occurred. This reactive approach often led to operational disruptions that impacted both users and business operations.
Partner Solution
To solve Irish Times DAC’s monitoring challenges, TerraAlto introduced a number of monitoring optimisations using the following AWS observability services and features:
- AWS CloudWatch Metrics and Alarms: TerraAlto redefined and fine-tuned CloudWatch metrics and alarms across all critical services, including API Gateway, Kinesis, Elastic Container Service (ECS), and DynamoDB. They created custom metrics for key performance indicators such as API request latency, ECS task failures, Kinesis stream throughput, and DynamoDB read/write capacity. By aligning alarms with these metrics, TerraAlto ensured that only actionable and critical alerts were generated, minimizing unnecessary noise and improving response times.
- AWS Application Insights: TerraAlto implemented AWS Application Insights to provide deeper visibility into the application layer, focusing on monitoring the health of Irish Times DAC’s ECS containerized microservices and API Gateway. Application Insights automatically detected application performance problems, such as high API response times or ECS container crashes, and provided contextualized insights into root causes. By integrating Application Insights with CloudWatch, Irish Times DAC gained automatic anomaly detection at the application level, significantly reducing manual monitoring efforts.
- AWS DevOps Guru: TerraAlto integrated AWS DevOps Guru, a machine-learning-based service that automatically detects operational issues and suggests remediation actions. By analyzing patterns in the Kinesis data streams, DynamoDB performance, and the interactions between ECS tasks, DevOps Guru identified potential bottlenecks before they turned into critical failures.
- AWS CloudWatch Anomaly Detection: One of the key enhancements was the use of CloudWatch Anomaly Detection to automatically identify abnormal behavior in API Gateway requests and Kinesis stream data throughput. By training models on historical performance data, Anomaly Detection set dynamic thresholds rather than static ones, enabling more precise and intelligent alerting. This reduced false positives and ensured that the system only triggered alarms for truly anomalous events, improving the relevance and accuracy of notifications.
- AWS CloudWatch Application Signals: Lastly, CloudWatch Application Signals was used to further enrich alerts with actionable information. By collecting performance data from various layers (such as infrastructure, applications, and services), Application Signals allowed the team to correlate disparate Service Level Indicators (SLIs) into business-relevant Service Level Objectives (SLOs).
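To make the dynamic-threshold idea described above more concrete, the following is a minimal sketch of the parameters an anomaly detection alarm on API Gateway latency could take. It is not TerraAlto's actual configuration. It only builds the request dictionary, in the shape accepted by boto3's CloudWatch `put_metric_alarm` call, and does not contact AWS. The API name `content-distribution-api`, the alarm name, the 5-minute period, the three evaluation periods, and the band width of 2 standard deviations are all assumptions chosen for illustration.

```python
def anomaly_alarm_params(api_name: str, band_width: float = 2.0) -> dict:
    """Build put_metric_alarm parameters for an anomaly detection alarm.

    Every name and value here is illustrative, not taken from the project.
    """
    return {
        "AlarmName": f"{api_name}-latency-anomaly",  # hypothetical alarm name
        # Alarm only when the metric rises above the upper edge of the band.
        "ComparisonOperator": "GreaterThanUpperThreshold",
        "EvaluationPeriods": 3,  # assumed: three consecutive periods
        "TreatMissingData": "notBreaching",
        # Points at the band expression below instead of a static Threshold.
        "ThresholdMetricId": "band",
        "Metrics": [
            {
                "Id": "latency",
                "ReturnData": True,
                "MetricStat": {
                    "Metric": {
                        "Namespace": "AWS/ApiGateway",
                        "MetricName": "Latency",
                        "Dimensions": [{"Name": "ApiName", "Value": api_name}],
                    },
                    "Period": 300,  # assumed: 5-minute datapoints
                    "Stat": "Average",
                },
            },
            {
                "Id": "band",
                # Band learned from the metric's history; the second argument
                # is the width in standard deviations.
                "Expression": f"ANOMALY_DETECTION_BAND(latency, {band_width:g})",
                "Label": "Expected latency range",
                "ReturnData": True,
            },
        ],
    }


params = anomaly_alarm_params("content-distribution-api")
```

Passing `params` to `boto3.client("cloudwatch").put_metric_alarm(**params)` would create the alarm. Note that there is no static `Threshold` key: the band expression takes its place, which is what lets the alarm follow normal daily and weekly traffic patterns.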
Benefits
Upon completion of the project, Irish Times DAC experienced significant improvements in how they monitored and managed their AWS environment. The following benefits were realized through the use of the enhanced AWS monitoring tools:
- Significant Reduction in Noise Alerts: One of the most immediate benefits was the drastic reduction in alert volume. TerraAlto’s optimisation of CloudWatch metrics and alarms reduced noise alerts from over 2,000 per month to an average of fewer than one alert per day. CloudWatch Anomaly Detection further contributed to this by preventing unnecessary alarms for minor deviations, focusing only on genuinely anomalous events.
- More Precise Identification of Business-Impacting Events: The combined power of CloudWatch Anomaly Detection, AWS DevOps Guru, and Application Insights allowed Irish Times DAC to identify and isolate critical, business-impacting events with greater precision.
- Ability to Predict and Prevent Issues Before Business Impact: One of the transformative benefits of the project was the introduction of predictive monitoring capabilities through AWS DevOps Guru and CloudWatch Anomaly Detection. These services identified patterns of unusual behavior, such as a gradual increase in DynamoDB write latency or an ECS container using an unexpected amount of CPU, providing early warnings to the Irish Times team. Predictive alerts helped prevent issues before they could impact critical services.
- Improved Operational Efficiency: With fewer noise alerts and more accurate insights, the IT operations team at Irish Times DAC was able to focus on high-priority tasks, improving overall operational efficiency.
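As a rough check on the alert-volume figures quoted above, the sketch below converts "fewer than one alert per day" into a monthly figure and compares it with the earlier 2,000 alerts a month. The 30-day month is an assumption, and because the figures are given only as bounds ("over 2,000" and "fewer than one"), the result is a lower bound on the reduction rather than a measured value.

```python
def min_reduction_percent(before_per_month: float,
                          after_per_day: float,
                          days_in_month: int = 30) -> float:
    """Percentage drop in monthly alert volume (30-day month assumed)."""
    after_per_month = after_per_day * days_in_month
    return 100.0 * (before_per_month - after_per_month) / before_per_month


# 2,000 alerts a month before, at most 1 alert a day after.
print(f"{min_reduction_percent(2000, 1):.1f}%")  # prints "98.5%"
```

Under these assumptions, the alert volume fell by at least 98.5%.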
Conclusion
The monitoring enhancements implemented by TerraAlto transformed the way Irish Times DAC managed its AWS infrastructure. The project not only solved their immediate issues related to alert fatigue and poor visibility but also introduced predictive and automated insights that enabled the company to operate more efficiently and with greater confidence.
About The Partner
TerraAlto, a RICOH Group company, is an AWS Advanced Consulting and MSP Partner with an AWS DevOps Competency, who help organisations build and manage advanced solutions utilising AWS services for big data, IoT and enterprise data platforms. With extensive experience delivering greenfield implementations, migrations and application innovations, TerraAlto have worked with a wide range of client organisations, from tech start-ups to large global organizations.
As an approved member of the AWS Managed Service Provider Partner Program, TerraAlto provides 24/7 managed services for AWS environments in Europe and in Asia (including China). As an active participant in the AWS Well-Architected Framework Program, TerraAlto helps its clients ensure secure and compliant usage of AWS services, including the building of automated compliance frameworks for enterprise-wide exception alerting and reporting.