Building Event-Driven Data Warehouses: Lessons from Scaling 9-Figure eCommerce
How we built a real-time data pipeline using Pub/Sub and PostgreSQL that handles millions of events daily for Aroma360 and Hotel Collection.
When I joined z9 Ventures in 2023, both Aroma360 and Hotel Collection were generating massive amounts of data but struggling to turn it into actionable insights. Customer events were scattered across platforms, attribution was broken, and our executive team was making million-dollar decisions based on day-old reports.
Fast forward 18 months: we now process over 2 million events daily through a unified data warehouse that powers real-time dashboards, enables precise attribution modeling, and has directly contributed to scaling daily revenue from $100K to $400K.
The Challenge: Data Chaos at Scale
Running two 9-figure eCommerce brands means dealing with data from everywhere: Shopify, Google Analytics, Facebook Ads, call tracking systems, email platforms, SMS providers, and dozens of other tools. Each platform had its own data format, update frequency, and API limitations.
The existing setup was a patchwork of scheduled ETL jobs, CSV exports, and manual data pulls. Critical business metrics like customer lifetime value and marketing attribution were calculated differently across teams, leading to conflicting reports and delayed decision-making.
The Architecture: Universal Webhook Handler
The breakthrough came when I designed what we call our "Universal Webhook Handler": a cloud function-based system that standardizes how we ingest events from any platform in real time. It's designed primarily to work with any webhook subscription, but it's flexible enough to support event-style processing for conventional APIs as well.
How It Works
The architecture centers around parameterized webhook URLs that encode metadata directly in the endpoint. When setting up integrations with platforms like Shopify, Facebook, or our call tracking system, we generate URLs like:
https://webhooks.z9data.com?platform=shopify&event=order_created&business_unit=aroma360
https://webhooks.z9data.com?platform=klaviyo&event=sms_sent&business_unit=hotel_collection
Each URL contains four key parameters:
- Platform: The source system (shopify, facebook, callrail, etc.)
- Event Type: The specific action (order_created, purchase, call_completed)
- Business Unit: Which brand the event belongs to (aroma360, hotel_collection)
- Additional Context: Campaign IDs, user segments, or other metadata
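To make this concrete, here's a minimal sketch of how those URLs can be generated when setting up a new integration. The helper function and the extra campaign_id parameter are illustrative, not part of our actual codebase:

```python
from urllib.parse import urlencode

WEBHOOK_BASE = "https://webhooks.z9data.com"  # base endpoint from the examples above

def build_webhook_url(platform: str, event: str, business_unit: str, **extra) -> str:
    """Encode routing metadata directly in the webhook URL's query string."""
    params = {"platform": platform, "event": event, "business_unit": business_unit, **extra}
    return f"{WEBHOOK_BASE}?{urlencode(params)}"

# Example: the URL we hand to Shopify when subscribing to order webhooks.
print(build_webhook_url("shopify", "order_created", "aroma360", campaign_id="summer_sale"))
```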
The Ingestion Flow
When an event fires, our cloud function extracts the metadata from the URL's query parameters, validates the payload, and enriches it with additional context like timestamps, IP geolocation, and parsed user-agent details. The enriched event is then published to a Google Pub/Sub topic.
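Here's a simplified sketch of what that handler looks like as an HTTP-triggered Cloud Function publishing to Pub/Sub. The project and topic names are placeholders, and the real function does considerably more validation and enrichment than shown:

```python
import json
from datetime import datetime, timezone

import functions_framework
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
TOPIC = publisher.topic_path("my-gcp-project", "raw-events")  # placeholder project/topic names

@functions_framework.http
def handle_webhook(request):
    """Tag the incoming payload with routing metadata from the query string and publish it."""
    event = {
        "platform": request.args.get("platform"),
        "event_type": request.args.get("event"),
        "business_unit": request.args.get("business_unit"),
        "received_at": datetime.now(timezone.utc).isoformat(),
        "user_agent": request.headers.get("User-Agent"),
        "payload": request.get_json(silent=True) or {},
    }
    publisher.publish(TOPIC, json.dumps(event).encode("utf-8"))
    return ("", 204)
```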
This approach has several key advantages:
- Zero Configuration: New integrations require no code changes, just URL generation
- Automatic Routing: Events are automatically tagged and routed based on URL structure
- Failure Resilience: Pub/Sub handles retries automatically, and dead-letter topics catch events that repeatedly fail processing
- Scalability: Cloud functions scale to zero when idle and absorb traffic spikes seamlessly
Real-Time Processing with Local Subscribers
While cloud-native solutions are great for scalability, we needed the flexibility and cost control of on-premises processing for our analytics workloads. Our solution: a locally hosted Python service that subscribes to the Pub/Sub topics and processes events in real-time.
This subscriber service runs on a dedicated server in our office and handles:
- Data Validation: Schema validation and data quality checks
- Transformation: Converting platform-specific formats to our unified schema
- Enrichment: Adding calculated fields, customer segments, and attribution data
- Database Insertion: Writing clean, structured data to PostgreSQL
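A stripped-down sketch of the subscriber loop, using Pub/Sub's streaming pull and psycopg2. The subscription name, target table, and transformation logic are placeholders rather than our production code:

```python
import json

import psycopg2
from google.cloud import pubsub_v1

SUBSCRIPTION = "projects/my-gcp-project/subscriptions/raw-events-local"  # placeholder name
conn = psycopg2.connect("dbname=analytics user=etl")  # local PostgreSQL; shared for simplicity

def handle_message(message: pubsub_v1.subscriber.message.Message) -> None:
    """Validate, transform, and insert one event; ack only after the commit succeeds."""
    event = json.loads(message.data)
    if not event.get("platform") or not event.get("event_type"):
        message.ack()  # drop malformed events (in practice these go to a quarantine table)
        return
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO raw.events (platform, event_type, business_unit, payload, received_at) "
            "VALUES (%s, %s, %s, %s, %s)",
            (event["platform"], event["event_type"], event.get("business_unit"),
             json.dumps(event.get("payload", {})), event.get("received_at")),
        )
    conn.commit()
    message.ack()

subscriber = pubsub_v1.SubscriberClient()
streaming_pull = subscriber.subscribe(SUBSCRIPTION, callback=handle_message)
streaming_pull.result()  # block forever, processing events as they arrive
```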
PostgreSQL as the Foundation
All processed events land in a carefully designed PostgreSQL database with separate schemas for raw events, transformed data, and analytics views. We use materialized views extensively to pre-compute common aggregations and ensure sub-second query performance.
Key design decisions:
- Event Sourcing: We never delete or modify raw events, only append new ones
- Partitioning: Tables are partitioned by date and business unit for query performance
- Materialized Views: Pre-computed aggregations for common metrics (daily revenue, customer cohorts, attribution models)
- Indexing Strategy: Composite indexes on common query patterns (customer_id + timestamp, campaign_id + event_type)
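To illustrate those decisions, here's a simplified sketch of the kind of DDL involved, run from Python to match the other examples. The table, view, and column names are illustrative, and the production schema also sub-partitions by business unit:

```python
import psycopg2

DDL = """
CREATE SCHEMA IF NOT EXISTS raw;
CREATE SCHEMA IF NOT EXISTS analytics;

-- Raw events are append-only and partitioned by date for query performance.
CREATE TABLE IF NOT EXISTS raw.events (
    event_id      BIGSERIAL,
    platform      TEXT NOT NULL,
    event_type    TEXT NOT NULL,
    business_unit TEXT NOT NULL,
    payload       JSONB NOT NULL,
    received_at   TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (received_at);

-- Composite index matching a common query pattern.
CREATE INDEX IF NOT EXISTS idx_events_bu_time
    ON raw.events (business_unit, received_at);

-- Pre-computed daily revenue per business unit and channel.
CREATE MATERIALIZED VIEW IF NOT EXISTS analytics.daily_revenue AS
SELECT business_unit,
       payload ->> 'channel'                AS channel,
       date_trunc('day', received_at)::date AS day,
       SUM((payload ->> 'total')::numeric)  AS revenue
FROM raw.events
WHERE event_type = 'order_created'
GROUP BY 1, 2, 3;
"""

with psycopg2.connect("dbname=analytics user=etl") as conn, conn.cursor() as cur:
    cur.execute(DDL)
```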
Analytics and Visualization
With clean, real-time data flowing into PostgreSQL, we connected multiple BI platforms to serve different stakeholders:
- Grafana: Real-time operational dashboards for the engineering team
- Tableau: Executive reporting and deep-dive analysis for leadership
- Custom APIs: Direct database connections for our internal tools and applications
The materialized views are refreshed every 5 minutes, giving us near real-time visibility into key metrics like:
- Revenue by channel, campaign, and product
- Customer acquisition costs and lifetime value
- Attribution modeling across the entire customer journey
- Inventory levels and demand forecasting
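The refresh job itself is straightforward. Here's a sketch of the loop, assuming each view has the unique index that CONCURRENTLY requires; the view names are illustrative:

```python
import time

import psycopg2

# View names are illustrative; CONCURRENTLY requires each view to have a unique index.
VIEWS = ["analytics.daily_revenue", "analytics.customer_cohorts", "analytics.attribution"]

conn = psycopg2.connect("dbname=analytics user=etl")
conn.autocommit = True  # CONCURRENTLY refreshes cannot run inside a transaction block

while True:
    with conn.cursor() as cur:
        for view in VIEWS:
            # CONCURRENTLY keeps the view readable by dashboards during the refresh.
            cur.execute(f"REFRESH MATERIALIZED VIEW CONCURRENTLY {view};")
    time.sleep(300)  # every 5 minutes, matching the cadence described above
```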
Results and Impact
The impact of this architecture has been transformative:
- Revenue Growth: Enabled scaling from $100K to $400K daily revenue through better attribution and optimization
- Decision Speed: Executive team now has real-time visibility into business metrics
- Data Quality: Unified schema eliminated discrepancies between teams and reports
- Engineering Velocity: New integrations take hours instead of weeks to implement
- Cost Efficiency: Hybrid cloud/on-premises approach reduced data processing costs by 80%
Lessons Learned
Building this system taught me several important lessons about data architecture at scale:
1. Design for Change
The most valuable aspect of our webhook handler isn't the technology - it's the flexibility. By encoding metadata in URLs rather than hardcoding it in application logic, we can adapt to new platforms and requirements without code changes.
2. Embrace Hybrid Architectures
Pure cloud solutions aren't always optimal. Our hybrid approach gives us the scalability of cloud functions for ingestion with the cost control and flexibility of on-premises processing.
3. Materialized Views Are Your Friend
Pre-computing common aggregations through materialized views dramatically improved query performance and user experience. The trade-off in storage cost is minimal compared to the value of fast analytics.
4. Event Sourcing Pays Dividends
Maintaining immutable event logs has saved us countless times when debugging attribution issues or implementing new analytics models. The ability to replay events and recalculate metrics is invaluable.
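As a concrete example, replaying events to recompute a metric is just a query over the immutable raw table. A sketch with illustrative names:

```python
import psycopg2

# Recompute last-touch attribution from scratch by replaying the immutable raw events.
REPLAY_SQL = """
SELECT payload ->> 'customer_id'                            AS customer_id,
       (ARRAY_AGG(payload ->> 'utm_source'
                  ORDER BY received_at DESC))[1]            AS last_touch_source,
       COUNT(*) FILTER (WHERE event_type = 'order_created') AS orders
FROM raw.events
WHERE business_unit = %s
GROUP BY 1;
"""

with psycopg2.connect("dbname=analytics user=etl") as conn, conn.cursor() as cur:
    cur.execute(REPLAY_SQL, ("aroma360",))
    for customer_id, last_touch, orders in cur.fetchall():
        ...  # feed into a revised attribution model without touching the raw data
```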
What's Next
We're currently working on several enhancements to this architecture:
- Machine Learning Integration: Adding real-time ML models for fraud detection and personalization
- Data Mesh: Expanding the architecture to support additional business units and use cases
The foundation we've built has proven robust and scalable, handling everything from Black Friday traffic spikes to new product launches without missing a beat.
Want to discuss data architecture challenges or share your own experiences with event-driven systems? Drop me a line - I'd love to hear from you.