ScaleArc

Time-to-live (TTL)-based invalidation

A cache expiration method that works like most NoSQL caches, yet is transparent to the application.

Time-to-live (TTL)-based invalidation is the most commonly used cache expiration method because it is functionally similar to how most NoSQL caches work, yet is transparent to the application. ScaleArc lets you pick the read queries you wish to cache and define how long the cache should remain valid, from 1 second to 999 days. The flexibility to pick a unique TTL for each query type or pattern makes it possible to use the cache for a wide array of query types. Once a query has been cached using TTL-based invalidation, it is served entirely by ScaleArc for as long as the TTL remains valid. Queries cached using the TTL method can also be expired manually via the API or query hints.
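
Conceptually, TTL-based invalidation follows the decision logic in the sketch below. This is only an illustration, not ScaleArc's implementation: ScaleArc applies the logic transparently at the proxy layer, so application queries do not change and cache rules are configured in the product rather than in code. All names and values here are hypothetical.

    import time

    CACHE = {}          # query text -> (result, expiry timestamp)
    DEFAULT_TTL = 300   # e.g., cache this query pattern for 5 minutes

    def run_on_database(sql):
        # Stand-in for the real database round trip.
        return [("row-1",), ("row-2",)]

    def cached_read(sql, ttl=DEFAULT_TTL):
        now = time.time()
        hit = CACHE.get(sql)
        if hit is not None and hit[1] > now:
            return hit[0]                      # TTL still valid: serve from cache
        result = run_on_database(sql)          # miss or expired: go to the database
        CACHE[sql] = (result, now + ttl)       # store the result with a new expiry
        return result

    def expire(sql):
        # Comparable in effect to expiring a cached query early via the API
        # or a query hint.
        CACHE.pop(sql, None)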

The following are simple examples of use cases that TTL-based invalidation handles well:

Range / Sort / Multi-Variable / Multi-Row Queries

Such queries are significantly more computationally expensive than simple point selects. They also fetch many more records at a time and are therefore extremely I/O intensive. Many web applications use them to fetch the top records for a listing or home page, such as “Top Selling Items” on an eCommerce website or “Latest Headlines” on a news website. Because this data does not change frequently and is not transactional in nature, yet tends to be queried quite regularly, it’s a good idea to cache such queries for 1 to 10 minutes, depending on your business’s “freshness” requirement.
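
As a sketch of the pattern, a “Top Selling Items” query might look like the following. The schema, column names, and TTL are hypothetical and only illustrate the shape of a range/sort/multi-row query worth caching.

    # Hypothetical "Top Selling Items" query: it aggregates and sorts many rows,
    # but its result changes slowly, so a short TTL offloads it well.
    TOP_SELLERS_SQL = """
        SELECT p.sku, p.title, SUM(oi.quantity) AS units_sold
        FROM order_items AS oi
        JOIN products AS p ON p.sku = oi.sku
        WHERE oi.ordered_at >= CURRENT_DATE - INTERVAL '7' DAY
        GROUP BY p.sku, p.title
        ORDER BY units_sold DESC
        LIMIT 10
    """
    TOP_SELLERS_TTL_SECONDS = 5 * 60   # refresh the listing at most every 5 minutes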

For example, a massively multiplayer gaming customer cached the query that fetches the “Top 50 Players” from a table of more than 20 million records with a TTL of 5 minutes. The customer achieved a 99% cache hit rate on that query and reduced server I/O by more than 20% with one simple cache rule. The 5-minute TTL was chosen because that was the average length of a game session, and the “Top 50” list is shown on screen when a session ends.

Point Selects for Infrequently Changing Data / Metadata

Many applications use point queries to fetch details on items in a table that either never change or change very infrequently. Much of the metadata that applications use to form correlations between different pieces of data falls into this category. Some examples are listed below, with a few sketched in code after the list:

  • Queries that fetch the city name for a given zip code could be cached for 24 hours or more and provide over 99.9% offload
  • Queries that fetch the full name of a company from a ticker symbol, such as MSFT resolving to Microsoft Corporation, can also be cached for 12 hours or more, since ticker symbols don’t change within a day, and again achieve a 99.9% or higher hit rate
  • Queries that fetch the details of a particular SKU from a products table on an eCommerce site could be cached for as little as 1-5 minutes where item information changes frequently and still achieve 80% or greater offload in most cases; sites where the information changes less frequently can use a longer TTL and gain even more offload
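
For illustration, a few of these metadata lookups might look like the following. Table names, columns, and TTLs are hypothetical.

    # Hypothetical point-select metadata lookups: the underlying data changes
    # rarely (or never), so long TTLs are safe and hit rates are very high.
    ZIP_TO_CITY_SQL = "SELECT city FROM zip_codes WHERE zip = ?"
    ZIP_TO_CITY_TTL = 24 * 60 * 60        # 24 hours: zip-to-city mappings rarely change

    TICKER_TO_NAME_SQL = "SELECT company_name FROM tickers WHERE symbol = ?"
    TICKER_TO_NAME_TTL = 12 * 60 * 60     # 12 hours: ticker symbols don't change intraday

    SKU_DETAILS_SQL = "SELECT title, price, description FROM products WHERE sku = ?"
    SKU_DETAILS_TTL = 5 * 60              # shorter TTL where item details change often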

Large Aggregation / Reporting Queries on Old Data

Many applications run historical reports that refer to data older than the current date, and that data never changes once it has been written. Such query patterns are typically resource intensive for the database to execute and generate heavy disk I/O as well.

Examples of this model include queries that fetch historical chart data for stock quotes or that fetch details on all orders processed on a previous date. Managers, admins, and analysts run these kinds of queries frequently, and many people access the same reports during the same day. By caching such queries for 24 hours or more, you can offload a massive amount of burst database I/O and speed up viewing of this information for users.
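
A hypothetical historical-report query of this kind might look like the following; the schema and TTL are illustrative only. Because the date range lies entirely in the past, the result never changes and a long TTL is safe.

    # Hypothetical daily-orders report over past dates: the underlying rows never
    # change once written, so a 24-hour (or longer) TTL is safe.
    ORDERS_BY_DAY_SQL = """
        SELECT order_date, COUNT(*) AS orders, SUM(total) AS revenue
        FROM orders
        WHERE order_date BETWEEN ? AND ?   -- a date range entirely in the past
        GROUP BY order_date
        ORDER BY order_date
    """
    ORDERS_BY_DAY_TTL = 24 * 60 * 60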