Real-Time Analytics Dashboard
Built a real-time analytics platform that processes 100K+ events per second with sub-second query latency.
Overview
Designed and implemented a real-time analytics dashboard for monitoring user behavior and system health across multiple products.
Problem
Product managers and support teams had no visibility into real-time user behavior. Existing analytics were batch-processed overnight, making it impossible to respond quickly to issues or opportunities.
Constraints
- Must handle 100K+ events per second during peak hours
- Query latency must be under 1 second for dashboard updates
- Budget limited to existing infrastructure
- No dedicated data engineering team
Approach
Built a streaming data pipeline using Kafka for ingestion, ClickHouse for storage and querying, and React for the frontend. Focused on pre-aggregating data at write time to enable fast queries.
Key Decisions
Choose ClickHouse over traditional OLAP databases
ClickHouse offers strong compression and query performance for time-series data. Its columnar storage suits analytics workloads, where a query typically reads a few columns across a large number of rows.
Alternatives considered:
- PostgreSQL with TimescaleDB
- Elasticsearch
- Druid
Pre-aggregate metrics at multiple time granularities
Pre-aggregating at 1-minute, 1-hour, and 1-day granularities keeps queries fast regardless of the time range selected, because the dashboard reads a rollup sized to the range instead of scanning raw events.
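One way a dashboard could choose between the three granularities is to pick the finest rollup that keeps the number of returned buckets under a chart-friendly limit. The sketch below is illustrative only: the table names (events_1m, events_1h, events_1d), the pickRollup function, and the 500-point limit are assumptions, not details from the project.

```typescript
// The three granularities come from the text; table names are hypothetical.
interface Rollup {
  granularity: "1m" | "1h" | "1d";
  table: string;
  bucketSeconds: number;
}

const ROLLUPS: Rollup[] = [
  { granularity: "1m", table: "events_1m", bucketSeconds: 60 },
  { granularity: "1h", table: "events_1h", bucketSeconds: 3600 },
  { granularity: "1d", table: "events_1d", bucketSeconds: 86400 },
];

// Pick the finest rollup whose bucket count stays within maxPoints
// (an assumed limit), falling back to the coarsest rollup for very long ranges.
function pickRollup(rangeSeconds: number, maxPoints: number = 500): Rollup {
  for (const rollup of ROLLUPS) {
    if (Math.ceil(rangeSeconds / rollup.bucketSeconds) <= maxPoints) {
      return rollup;
    }
  }
  return ROLLUPS[ROLLUPS.length - 1];
}
```

Under these assumptions a one-hour range reads the 1-minute rollup (60 buckets), a seven-day range reads the 1-hour rollup (168 buckets), and a 90-day range reads the 1-day rollup (90 buckets).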
Tech Stack
- React
- TypeScript
- Node.js
- Kafka
- ClickHouse
- Redis
- Docker
Result & Impact
- Query Latency: < 500 ms at the 95th percentile
- Event Throughput: 120K events/second sustained
- Data Retention: 90 days at full granularity
Product teams now catch and respond to issues within minutes instead of days. The dashboard has become the primary tool for understanding user behavior and making data-driven decisions.
Learnings
- Pre-aggregation is essential for real-time analytics at scale
- ClickHouse's materialized views are powerful but require careful schema design
- WebSocket connections for live updates need proper backpressure handling
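As an illustration of the backpressure point, one common pattern is to coalesce updates per metric and send only while the socket's outgoing buffer is below a threshold, so a slow client receives the newest values rather than a growing queue of stale ones. This is a sketch under assumptions, not the project's implementation: CoalescingSender, SocketLike, and the 64 KiB high-water mark are hypothetical. The bufferedAmount property itself is part of the standard WebSocket interface and is also exposed by the Node ws package.

```typescript
// Minimal view of a socket; the standard WebSocket and the Node "ws" package
// both expose bufferedAmount (bytes queued but not yet transmitted).
interface SocketLike {
  bufferedAmount: number;
  send(data: string): void;
}

class CoalescingSender {
  // Latest pending payload per metric key; older unsent values are overwritten.
  private pending = new Map<string, string>();
  private socket: SocketLike;
  private highWaterMark: number;

  constructor(socket: SocketLike, highWaterMark: number = 64 * 1024) {
    this.socket = socket;
    this.highWaterMark = highWaterMark;
  }

  // Record an update without sending it yet.
  push(key: string, payload: string): void {
    this.pending.set(key, payload);
  }

  // Send pending updates until the socket buffer reaches the high-water mark.
  // Returns the number of messages sent; unsent updates stay queued.
  flush(): number {
    let sent = 0;
    const keys = Array.from(this.pending.keys());
    for (let i = 0; i < keys.length; i++) {
      if (this.socket.bufferedAmount >= this.highWaterMark) break;
      this.socket.send(this.pending.get(keys[i]) as string);
      this.pending.delete(keys[i]);
      sent++;
    }
    return sent;
  }

  get backlog(): number {
    return this.pending.size;
  }
}
```

A server would call push for every new metric value and flush on a timer or after each push. Because the backlog holds at most one entry per metric key, memory stays bounded no matter how slowly a client reads.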
Technical Deep Dive
The key innovation was using ClickHouse’s materialized views to automatically pre-aggregate data as it arrives. This eliminated the need for separate aggregation jobs and kept query latency consistently low.
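To make the materialized-view idea concrete, the sketch below generates ClickHouse DDL for one rollup level. The schema is an assumption made for illustration: a source table named events with columns ts (DateTime) and event_type (String), and rollup tables named events_1m, events_1h, and events_1d. The project's actual schema is not described in this write-up. The function only builds SQL strings; executing them would need a ClickHouse client, which is not shown.

```typescript
// ClickHouse date-truncation functions for each rollup interval.
const BUCKET_FN = {
  "1m": "toStartOfMinute",
  "1h": "toStartOfHour",
  "1d": "toStartOfDay",
} as const;

type RollupInterval = keyof typeof BUCKET_FN;

// Returns two statements: the rollup table, then the materialized view that
// fills it as rows are inserted into the (hypothetical) events table.
function rollupDdl(interval: RollupInterval): string[] {
  const table = `events_${interval}`;
  const bucketFn = BUCKET_FN[interval];
  return [
    // SummingMergeTree sums the numeric events column for rows that share
    // the same sorting key when parts are merged in the background.
    `CREATE TABLE IF NOT EXISTS ${table} (
       bucket DateTime,
       event_type String,
       events UInt64
     ) ENGINE = SummingMergeTree
     ORDER BY (event_type, bucket)`,
    // The view runs on every insert block and writes partial counts to the table.
    `CREATE MATERIALIZED VIEW IF NOT EXISTS ${table}_mv TO ${table} AS
     SELECT ${bucketFn}(ts) AS bucket, event_type, count() AS events
     FROM events
     GROUP BY bucket, event_type`,
  ];
}
```

Because SummingMergeTree merges rows asynchronously, queries against a rollup table like this should still use sum(events) with GROUP BY rather than assume one row per bucket.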
We also added a Redis caching layer that serves repeated queries for popular dashboards from cache, further reducing load on ClickHouse.
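A caching layer of this kind is often written as a cache-aside wrapper: check the cache, run the query on a miss, then store the result with a short expiry. The sketch below assumes that pattern; the write-up does not say how the project's cache was actually structured. CacheStore, MemoryStore, and cachedQuery are hypothetical names, and the in-memory store stands in for Redis so the example runs without a server.

```typescript
// Minimal async key-value interface; a Redis client would sit behind it.
interface CacheStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// In-memory stand-in for Redis, with the same expiry semantics.
class MemoryStore implements CacheStore {
  private entries = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= Date.now()) return null;
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.entries.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
}

// Cache-aside: return the cached result if present; otherwise run the query
// and store its JSON-serialized result for ttlSeconds.
async function cachedQuery<T>(
  store: CacheStore,
  key: string,
  ttlSeconds: number,
  runQuery: () => Promise<T>
): Promise<T> {
  const hit = await store.get(key);
  if (hit !== null) return JSON.parse(hit) as T;
  const result = await runQuery();
  await store.set(key, JSON.stringify(result), ttlSeconds);
  return result;
}
```

For a live dashboard the expiry would normally be short, on the order of the refresh interval, so that many viewers of the same dashboard share one ClickHouse query per refresh without seeing stale data for long.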