Creating a Live Counter on a Product Page
The goal is to design a fault-tolerant and resilient system that shows each customer a live count of all the customers currently viewing a particular product page.
Functional Requirements
- When a logged-in user views a product page, the live counter for that page should increment for all viewers
- When the user leaves the product page, whether by closing the tab, logging out, or because the system goes down, the live counter for that product should decrement
- Counters should be refreshed every 5 seconds, or at any configurable interval
- The system should count only logged-in users
Non-Functional Requirements
- Counters should be eventually consistent, i.e. it is acceptable for the counters to take a short time to update
- The system should be resilient and scalable
- Since the live counter is a good-to-have feature, it should not block any critical path
API Design
- getLiveCount(productId) -- Fetches the live count of users currently present on the given product page
High Level Design
Flow Explanation :
- Client makes a request to the API gateway to fetch the product page
- Gateway forwards the request to Product Service, which returns the product details
- Product Service pushes a Product Viewed event to the queue every time it receives a GET request for the live count
- Counter Updater pulls the event from the queue and updates the counter in Redis
- Client makes another call at regular intervals to fetch the updated live counter
- Gateway forwards the call to Product Service
- Product Service retrieves the updated counter from Redis
Structure of the Queue Payload
The event pushed into the queue is structured as follows:
{
productId : <productId>,
userId : <userId>,
timestamp : <timestamp>
}
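As a rough illustration, the payload above could be modelled as a small typed record that is serialised to JSON before being pushed to the queue. This is only a sketch: the class name `ProductViewedEvent`, the helper names, the string types for the ids, and the choice of Unix epoch seconds for the timestamp are all assumptions, since the text does not specify them.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ProductViewedEvent:
    """Hypothetical model of the queue payload (field names from the text)."""
    productId: str
    userId: str
    timestamp: int  # assumption: Unix epoch seconds


def encode_event(event: ProductViewedEvent) -> str:
    """Serialise the event to the JSON string that would be put on the queue."""
    return json.dumps(asdict(event))


def decode_event(raw: str) -> ProductViewedEvent:
    """Rebuild the event on the consumer side from the JSON string."""
    return ProductViewedEvent(**json.loads(raw))
```

Keeping the payload to these three fields means the consumer needs nothing else to decide which bucket to update.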
Structure of Key Value stored in Redis
Redis needs to store the data in a way that lets us fetch the live count for a product easily. At the same time, if a user stops polling for the product's live counter for a certain time, that userId must automatically stop contributing to the live count. To achieve this, we create buckets of x seconds, each holding all the users who polled for the live counter during that window. Because the active bucket keeps changing with time, a user who has not asked for the live counter for more than x seconds is automatically absent from the count.
Key : <product_id>_<timestamp to nearest 5 secs>
Value : Sorted Set of all the users who retrieved live count for the product
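The key format above can be sketched as a small helper. The text says "nearest 5 secs"; this sketch assumes that means rounding the timestamp down to the start of its bucket and that timestamps are Unix epoch seconds. The names `bucket_start` and `bucket_key` are made up for illustration.

```python
BUCKET_SECONDS = 5  # the 5-second interval from the requirements; configurable


def bucket_start(timestamp: int, bucket_seconds: int = BUCKET_SECONDS) -> int:
    """Round a timestamp (assumed epoch seconds) down to the start of its bucket."""
    return timestamp - (timestamp % bucket_seconds)


def bucket_key(product_id: str, timestamp: int,
               bucket_seconds: int = BUCKET_SECONDS) -> str:
    """Build the key in the <product_id>_<bucket timestamp> format."""
    return f"{product_id}_{bucket_start(timestamp, bucket_seconds)}"
```

With this helper, every poll for the same product within one 5-second window maps to the same key, which is what lets a single sorted set represent that window's viewers.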
Counter Updater
When the Counter Updater consumes a message from the queue, it does the following:
- Reads the timestamp from the message
- Finds the nearest bucket for the timestamp and adds the user to the Redis sorted set for that bucket and for the next bucket
- The next bucket is updated to pre-warm its entries, so that the count is already populated when the switch between buckets happens
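The steps above can be sketched roughly as follows. To keep the example runnable without a Redis server, a small in-memory class stands in for Redis sorted sets; with a real Redis these operations would presumably map to the `ZADD` and `ZCARD` commands. The names `InMemorySortedSets`, `handle_event`, and `get_live_count`, the floor rounding of buckets, and the use of the timestamp as the score are all assumptions. Expiring old bucket keys is not covered by the text and is left out here.

```python
from collections import defaultdict

BUCKET_SECONDS = 5  # assumed bucket width, matching the 5-second refresh


class InMemorySortedSets:
    """Stand-in for Redis sorted sets: key -> {member: score}."""

    def __init__(self):
        self._data = defaultdict(dict)

    def zadd(self, key, member, score):
        # Re-adding an existing member only updates its score.
        self._data[key][member] = score

    def zcard(self, key):
        return len(self._data.get(key, {}))


def bucket_key(product_id, timestamp):
    # Assumption: "nearest bucket" means rounding down to the bucket start.
    start = timestamp - (timestamp % BUCKET_SECONDS)
    return f"{product_id}_{start}"


def handle_event(store, event):
    """Add the user to the current bucket and pre-warm the next one."""
    ts = event["timestamp"]
    for bucket_ts in (ts, ts + BUCKET_SECONDS):
        key = bucket_key(event["productId"], bucket_ts)
        store.zadd(key, event["userId"], ts)


def get_live_count(store, product_id, now):
    """Read path: count the members of the bucket that contains `now`."""
    return store.zcard(bucket_key(product_id, now))
```

Because each event also lands in the next bucket, a reader who arrives just after a bucket boundary still sees the users who polled shortly before it, instead of a count that briefly drops to zero.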
Possible Gotchas/Problems :
- This system is highly sensitive to latency. If the consumer does not consume messages quickly enough, the displayed count may be stale or incorrect. For this reason the consumer should be auto-scaled
- Since the system relies on events pushed to the queue, idempotency is an important requirement. The value stored in Redis is a sorted set, so adding the same user more than once does not change the count, which takes care of idempotency
- Since there is a lag between the moment a user leaves a page and the moment they are removed from its live count, the same user can be counted on more than one page for a short time
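The last two points can be illustrated with a toy simulation. Here a plain dict of dicts stands in for the Redis sorted sets, and the helper `apply_event` is hypothetical; it writes only to the current bucket to keep the example short, and it assumes floor rounding to 5-second buckets.

```python
def apply_event(buckets, event, bucket_seconds=5):
    """Add the event's user to the bucket for its timestamp; return the key."""
    ts = event["timestamp"]
    key = f'{event["productId"]}_{ts - ts % bucket_seconds}'
    # Keyed by userId, so a redelivered event overwrites rather than duplicates.
    buckets.setdefault(key, {})[event["userId"]] = ts
    return key


buckets = {}

# Idempotency: the same event delivered twice is counted once.
event = {"productId": "p1", "userId": "u1", "timestamp": 1001}
p1_key = apply_event(buckets, event)
apply_event(buckets, event)  # simulated redelivery from the queue
count_after_redelivery = len(buckets[p1_key])

# Overlap: the same user opens another product within the same window,
# so for a short time they appear in the counts of both pages.
p2_key = apply_event(buckets, {"productId": "p2", "userId": "u1", "timestamp": 1003})
counted_on_both = "u1" in buckets[p1_key] and "u1" in buckets[p2_key]
```

In this sketch the redelivered event leaves the bucket with a single member, while the user shows up under both products until the older bucket stops being read.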