End-to-End Tracing
Built a framework linking user device tracing with backend service tracing for complete request visibility
Overview
Distributed tracing framework connecting client-side tracing with Zipkin-based backend service tracing, enabling end-to-end request debugging
Problem
Tracing was fragmented between client and backend; engineers couldn't follow a request from user action through all backend services
Constraints
- Must integrate with existing Zipkin infrastructure
- Must work across mobile, web, and TV platforms
- Must maintain low overhead on client devices
Approach
Built a trace correlation layer that propagates trace context from device through all backend services, storing traces in a unified format
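As a minimal sketch of what propagating trace context across hops can look like, the Java below carries a trace ID, span ID, parent span ID, and sampling flag in request headers. The header names follow the B3 convention commonly used with Zipkin; whether this framework used B3 specifically is an assumption, and the TraceContext class and its methods are hypothetical names chosen for illustration.

```java
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical value object for illustration. The B3 header names are the
// convention commonly used with Zipkin; their use here is an assumption.
final class TraceContext {
    static final String TRACE_ID = "X-B3-TraceId";
    static final String SPAN_ID = "X-B3-SpanId";
    static final String PARENT_SPAN_ID = "X-B3-ParentSpanId";
    static final String SAMPLED = "X-B3-Sampled";

    final String traceId;
    final String spanId;
    final String parentSpanId; // null for the root span created on the device
    final boolean sampled;

    TraceContext(String traceId, String spanId, String parentSpanId, boolean sampled) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentSpanId = parentSpanId;
        this.sampled = sampled;
    }

    // Starts a new trace, as a client device would at the user action.
    static TraceContext newRoot(boolean sampled) {
        return new TraceContext(randomId(), randomId(), null, sampled);
    }

    // Creates the context for the next hop: same trace, new span,
    // parent set to the current span, sampling decision inherited.
    TraceContext child() {
        return new TraceContext(traceId, randomId(), spanId, sampled);
    }

    // Writes the context into outgoing request headers.
    void inject(Map<String, String> headers) {
        headers.put(TRACE_ID, traceId);
        headers.put(SPAN_ID, spanId);
        if (parentSpanId != null) {
            headers.put(PARENT_SPAN_ID, parentSpanId);
        }
        headers.put(SAMPLED, sampled ? "1" : "0");
    }

    // Reads the context from incoming headers; returns null if none is present.
    // Assumes header keys match exactly (real HTTP headers are case-insensitive).
    static TraceContext extract(Map<String, String> headers) {
        String traceId = headers.get(TRACE_ID);
        String spanId = headers.get(SPAN_ID);
        if (traceId == null || spanId == null) {
            return null;
        }
        return new TraceContext(traceId, spanId, headers.get(PARENT_SPAN_ID),
                "1".equals(headers.get(SAMPLED)));
    }

    // 64-bit random identifier rendered as 16 lowercase hex characters.
    private static String randomId() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }
}
```

In this sketch, the device would call newRoot and inject the result into its outgoing request; each backend service would extract the context, call child for its own span, and inject that into any downstream calls, so every span shares the original trace ID.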
Key Decisions
Extend Zipkin rather than build custom tracing
Zipkin was already deployed; extending it minimized migration cost and leveraged existing expertise
Use probabilistic sampling on clients
Tracing every request would overwhelm storage; sampling keeps costs manageable while still capturing a representative share of traffic
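One way to implement such a sampling decision, sketched here under assumptions, is to derive it from the trace ID so that the same trace always gets the same answer. The ProbabilisticSampler class, the bucket count, and the rate used in the usage note are hypothetical; the original system's actual sampler and rates are not described.

```java
// Hypothetical sampler for illustration: keeps roughly `rate` of all traces,
// deciding deterministically from the trace ID.
final class ProbabilisticSampler {
    private static final long BUCKETS = 10_000L; // resolution of 0.01%
    private final long threshold;

    ProbabilisticSampler(double rate) {
        // The negated form also rejects NaN.
        if (!(rate >= 0.0 && rate <= 1.0)) {
            throw new IllegalArgumentException("rate must be between 0 and 1");
        }
        this.threshold = Math.round(rate * BUCKETS);
    }

    // Maps the trace ID to a bucket in [0, BUCKETS) and samples the trace
    // if the bucket falls below the threshold.
    boolean isSampled(long traceId) {
        return Long.remainderUnsigned(traceId, BUCKETS) < threshold;
    }
}
```

With randomly generated trace IDs, new ProbabilisticSampler(0.01) would keep about 1% of traces. Because the decision is made once on the client and can be carried in the propagated context, backend services do not need to re-decide, which avoids storing partial traces.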
Tech Stack
- Java
- Zipkin
- Cassandra
- AWS
Result & Impact
Enabled debugging of cross-system issues that were previously impossible to trace
Learnings
- End-to-end visibility requires client participation
- Probabilistic sampling is essential for cost control
- Trace context propagation must be standardized
Architecture
End-to-End Tracing consists of:
- Client SDKs (iOS, Android, Web, TV)
- Trace context propagation middleware
- Backend service instrumentation
- Unified trace storage (Zipkin + extensions)
- Correlation UI linking device and backend traces
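To illustrate how the components above fit together, the following sketch groups spans reported by devices and by backend services under their shared trace ID, which is the join a correlation view would need. The Span and TraceCorrelator classes, their fields, and the sample span names are hypothetical; the real storage schema and UI are not described here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical minimal span record; real Zipkin spans carry more fields
// (timestamps, durations, tags, parent IDs).
final class Span {
    final String traceId;
    final String spanId;
    final String origin; // "device" or "backend" in this sketch
    final String name;

    Span(String traceId, String spanId, String origin, String name) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.origin = origin;
        this.name = name;
    }
}

// Hypothetical correlator: joins device and backend spans on trace ID.
final class TraceCorrelator {
    static Map<String, List<Span>> byTraceId(List<Span> deviceSpans, List<Span> backendSpans) {
        // LinkedHashMap keeps traces in first-seen order; device spans are
        // added first so they lead each trace's span list.
        Map<String, List<Span>> traces = new LinkedHashMap<>();
        for (Span span : deviceSpans) {
            traces.computeIfAbsent(span.traceId, id -> new ArrayList<>()).add(span);
        }
        for (Span span : backendSpans) {
            traces.computeIfAbsent(span.traceId, id -> new ArrayList<>()).add(span);
        }
        return traces;
    }
}
```

The join only works if every component propagates the same trace ID unchanged, which is why the context format has to be consistent across all client SDKs and backend services.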