How to Paginate Large Records Efficiently in Your Backend System?
Imagine scrolling through a Slack channel or a hot Reddit discussion. New messages and comments don’t load all at once — they appear as you scroll. Behind the scenes, this seamless experience is powered by efficient backend techniques like pagination
As backend engineers the problem that we try to solve here is “give me the N most recent messages that occurred before a certain timestamp”. To implement this we introduce pagination which is a common technique used to break up a large set of results into smaller chunks.
Pagination is often implemented using 2 approaches:
- Offset Pagination
- Cursor Pagination
Offset Pagination
This is the simplest approach to pagination where we specify the starting point for fetching the results. For example consider this slack api request
Example request: GET /messages?channelId=id-1&offset=0&pagesize=10
In this request we are asking for messages from 0 offset with a page size of 10. This will fetch the first 10 messages.
SELECT * FROM channel WHERE channel_id = %id-1 ORDER BY id DESC LIMIT 10 OFFSET 0;
However this becomes a bottle neck when we have huge amount of data.
In the image above, you can see how offset pagination selects rows from a database table using an offset and limit
The database must count through all rows preceding the offset for each query, leading to slower response times with larger comment volumes. Most importantly, offset pagination is not stable. If a message is added or deleted while the user is scrolling, the offset will be incorrect and the user will see duplicate or missing messages.
Cursor Pagination
Cursor pagination is a technique that uses a cursor to specify the starting point for fetching a set of result. A cursor is a unique identifier that points to a specific item in the list.
In the initial API call, the frontend only provides the limit (the cursor is not included). The database returns a cursor pointing to the last item within the dataset in its response. In subsequent API calls, the frontend includes both the limit and the cursor value (pointing to the last records retrieved from the previous request).
Example request: GET /messages?channelId=id-1&messageId=1&pagesize=10
In SQL we will do something like this
SELECT * FROM books WHERE messageId > 1 ORDER BY messageId LIMIT 10;
In DynamoDb we have LastEvaluatedKey feature which we can use.
When to use what?
The choice between offset pagination and cursor pagination depends on the use case and the trade-offs you are willing to accept. Below are some guidelines:
Offset Pagination
- Best for Small Data Sets: When the dataset is small and performance is not a critical concern.
- Simpler to Implement: Offset pagination is easier to implement and understand, making it ideal for quick prototypes or simple applications.
- Supports Random Access: Since you can directly jump to any offset, this approach is useful when users might want to navigate to a specific page (e.g., page 5 of search results).
Limitations:
- Slower performance with larger datasets due to the need to scan and skip rows up to the offset.
- Instability when the underlying data changes (e.g., new messages or deleted items).
Cursor Pagination
- Best for Large and Dynamic Data Sets: When dealing with massive datasets or applications where new data is frequently added, such as chat applications or social media feeds.
- Stable Results: Cursor pagination avoids duplicate or missing entries, even when data is inserted or deleted during navigation.
- Better Performance: It eliminates the need to count rows up to the offset, resulting in faster query times, especially for large datasets.
Limitations:
- Sequential Access: Cursor-based pagination does not support random access to pages; users can only navigate sequentially.
- Complexity: Implementing cursor pagination is slightly more complex as it requires careful handling of cursors and their encoding for security.
Real-World Examples:
Offset pagination is often used in search engines like Google, while cursor pagination is commonly seen in chat apps like WhatsApp or Slack.