Work, Featured, Fidelity

SLO Tracker

Centralized observability platform reducing Mean Time to Detection by 40% across 50+ engineering teams.

Overview

Challenge

Fidelity's 500+ microservices lacked unified reliability metrics. Teams couldn't track SLOs consistently, leading to slow incident detection and alert fatigue.

Solution

Built an end-to-end SLO platform with alerts-as-code, error budget tracking, and automated breach notifications. Integrated OpenTelemetry across all services and added a React dashboard for visibility.
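
To make the error budget idea concrete, here is a minimal Python sketch of how an SLO defined in code might be turned into a remaining-budget figure and a breach check. It is an illustration under assumptions, not the platform's actual implementation: the `SLO` class, its field names, and the functions `error_budget_remaining` and `is_breached` are all hypothetical, and the request counts are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    """Hypothetical alerts-as-code definition; field names are illustrative."""
    service: str
    target: float          # e.g. 0.999 means 99.9% of requests must succeed
    window_days: int = 30  # rolling window the counts are taken over


def error_budget_remaining(slo: SLO, total: int, failed: int) -> float:
    """Return the unspent fraction of the error budget (negative if overspent)."""
    if total == 0:
        return 1.0  # no traffic in the window, so nothing has been spent
    allowed_failures = (1.0 - slo.target) * total
    if allowed_failures == 0:
        # A 100% target leaves no budget: any failure overspends it.
        return 1.0 if failed == 0 else float("-inf")
    return 1.0 - failed / allowed_failures


def is_breached(slo: SLO, total: int, failed: int) -> bool:
    """A breach here means the window's error budget is overspent."""
    return error_budget_remaining(slo, total, failed) < 0.0


# Example with made-up numbers: 250 failures out of 1,000,000 requests
# against a 99.9% target spends about a quarter of the budget.
checkout = SLO(service="checkout", target=0.999)
remaining = error_budget_remaining(checkout, total=1_000_000, failed=250)
print(f"{checkout.service}: {remaining:.0%} of error budget remaining")
```

In a setup like this, a breach notification would typically fire when `is_breached` returns true, or earlier, when the remaining budget drops below a chosen threshold.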

Impact

Reduced Mean Time to Detection by 40% across 50+ teams. A high-cardinality metrics proof of concept (POC) cut alert noise by 50%, saving 20 engineering hours per week.

Tech Stack

Python, FastAPI, AWS Lambda, PostgreSQL, React, TypeScript, Terraform, OpenTelemetry

Key Metrics

  • 40% reduction in Mean Time to Detection
  • 50+ engineering teams onboarded
  • 20 engineering hours/week saved in alert triage
