Soren

Strategies for Selective Cache Expiration

Production applications quickly rack up gigabytes of cached data spanning hundreds of thousands of cache keys. Most of that data is fresh, but every once in a while you'll find that a particular family of entries is holding stale data. All of that data was cached for a reason: it is expensive to compute. It would be wasteful to blow away the entire cache just to recompute a fraction of it.

There are at least three distinct strategies for expiring targeted data. Which one to use depends on how the keys have been composed, how the data was generated, and exactly what has become stale.

Expiration via Touching

When cached data references a database record, and its key is based on the record's timestamp, you can touch the record to bust the cache. The next time the data is fetched the key will have changed, the lookup will miss, and you'll get fresh data. For more in-depth information on key composition see essentials of cache expiration.

With a narrow collection of records, touching is easy and targeted. Large collections, hybrid caches of multiple models, or unbounded ranges (like every record in the database) are not well suited to purging via touching. When the situation is right you can use methods like touch or update_all in Rails to bump the timestamp on one or more records:

child.touch # touch a single record

parent.children.update_all(updated_at: Time.current) # touch all of the parent's children
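To see why touching works, here is a minimal plain-Ruby sketch of a timestamp-based key. The Record struct, its cache_key format, and the TinyCache class are hypothetical stand-ins for illustration; they are not ActiveRecord or ActiveSupport::Cache.

```ruby
# Hypothetical stand-in for a database record; not ActiveRecord.
Record = Struct.new(:id, :updated_at) do
  # The key embeds the timestamp, so a new timestamp yields a new key.
  def cache_key
    "records/#{id}-#{updated_at.strftime('%Y%m%d%H%M%S')}"
  end

  # Bump the timestamp; the next read will look under a different key.
  def touch(now = Time.now)
    self.updated_at = now
  end
end

# Hypothetical read-through cache backed by a Hash.
class TinyCache
  def initialize
    @data = {}
  end

  # Return the stored value, or compute and store it on a miss.
  def fetch(key)
    @data.fetch(key) { @data[key] = yield }
  end

  def keys
    @data.keys
  end
end

cache  = TinyCache.new
record = Record.new(1, Time.utc(2015, 1, 1))

old_key = record.cache_key
first   = cache.fetch(record.cache_key) { "rendered at version 1" }

record.touch(Time.utc(2015, 1, 2))
second  = cache.fetch(record.cache_key) { "rendered at version 2" }
```

Note that the old entry is never deleted in this sketch. It simply stops being read, and in a real store it lingers until it is evicted or its TTL runs out.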

Expiration via Targeted Versioning

Occasionally the data in the cache is fresh, but you need a different view of it. Maybe an API client requires a new field, fewer fields, or more associated records. In this case there isn't any point in touching the records. Views and serializers need to be updated, which is your opportunity to bundle the expiration. This is a job for targeted versioning.

Note the word targeted. It is possible to uniformly version the entire cache by updating the namespace. Much like changing an API from v1 to v2, you prefix every cache key with a version. Targeted versioning is similar, but the version change is scoped to the view or serializer in question. For caching within a Rails view this is as simple as composing the key from an array rather than just the model. For example:

cache [model, 'v2'] do
  # fragment to cache
end

cache [model, 'v3'] do
  # new fragment to cache
end
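The same idea carries over to serializers. The following is a rough sketch in plain Ruby; PostSerializer, its VERSION constant, and the key layout are all assumptions made for illustration, not the API of any serializer library.

```ruby
# Hypothetical serializer; the VERSION constant and key layout are assumptions.
class PostSerializer
  VERSION = "v2"

  def initialize(post)
    @post = post
  end

  # The version is part of the key, so bumping VERSION only causes misses
  # for entries written by this serializer.
  def cache_key
    [@post.fetch(:cache_key), "post_serializer", VERSION].join("/")
  end

  def as_json
    { id: @post.fetch(:id), title: @post.fetch(:title) }
  end
end

# A Hash stands in for both the record and the cache store.
post  = { id: 7, title: "Hello", cache_key: "posts/7-20150101000000" }
cache = {}

serializer = PostSerializer.new(post)
key        = serializer.cache_key
payload    = (cache[key] ||= serializer.as_json)
```

Changing VERSION to "v3" when the serialized fields change leaves every other family of keys untouched, which is the whole point of scoping the version.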

Expiration via Selective Purging

Recently, on a client project, a situation arose where a large section of the cache had to be purged, but neither touching nor versioning would work.

Their application caches large trees of API data, many parts of which contain embedded user data. The embedded user data includes a few avatar URLs, all of which are securely checked out from Amazon S3 and have an expiration. A background job keeps the URLs refreshed, but that hadn't been accounted for in the cache. The end result was a lot of 403 Forbidden responses when the browser tried to load the expired avatars embedded in the cached data.

Touching won't help here, because the user records aren't cached directly and they aren't part of the cache key for the parent record. Versioning isn't well suited either: the fields don't need to change, the underlying data is simply out of sync. That's a lot of windup for what I'm about to suggest: delete the exact keys that have gone stale.

Avoid purging the entire cache by using a targeted tool like delete_matched, as provided by ActiveSupport::Cache. Most of the available caches support matching using regular expressions, though some only support globbing.

Rails.cache.delete_matched("posts/9[0-1]*")
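It is worth checking what a pattern like this actually matches before running it against production. As a rough illustration, the sketch below applies the same glob to a plain Hash with made-up keys. File.fnmatch is used here only to approximate glob matching; it is not what any cache store uses internally.

```ruby
# Plain Hash standing in for the cache store; the keys are made up.
cache = {
  "posts/89"  => "a",
  "posts/90"  => "b",
  "posts/912" => "c",
  "posts/92"  => "d",
  "users/90"  => "e"
}

pattern = "posts/9[0-1]*"

# Select every key the glob matches, then delete only those entries.
matched = cache.keys.select { |key| File.fnmatch(pattern, key) }
matched.each { |key| cache.delete(key) }
```

Here matched contains "posts/90" and "posts/912", while "posts/89", "posts/92" and "users/90" survive. The trailing wildcard means the pattern covers any key that starts with posts/90 or posts/91, so it reaches ids such as 900 through 919 as well as 90 and 91.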

Note: Until very recently my Readthis cache for Redis didn't support delete_matched due to concerns about performance and the evil KEYS command. The eventual implementation uses SCAN and is entirely safe to use with gigantic databases. The aforementioned client was using Readthis for caching, driving the need for delete_matched to be implemented.
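The difference between KEYS and SCAN is that SCAN walks the keyspace in small batches, handing back a cursor after each one, so no single call has to visit every key. The sketch below mimics that contract with an in-memory FakeServer. Everything in it is hypothetical and simplified: it is not the Readthis implementation, and a real Redis cursor is an opaque value rather than an array index.

```ruby
# Hypothetical in-memory stand-in for a Redis-like server. It only mimics
# the cursor contract of SCAN: each call returns [next_cursor, batch], and
# a next_cursor of 0 means the iteration is complete.
class FakeServer
  def initialize(keys)
    @slots = keys.dup # fixed slots; deleted keys become nil so cursors stay valid
  end

  def scan(cursor, match:, count: 2)
    batch = (@slots[cursor, count] || []).compact
    batch = batch.select { |key| File.fnmatch(match, key) }
    next_cursor = cursor + count
    next_cursor = 0 if next_cursor >= @slots.size
    [next_cursor, batch]
  end

  def del(*keys)
    keys.each do |key|
      index = @slots.index(key)
      @slots[index] = nil if index
    end
    keys.size
  end

  def keys
    @slots.compact
  end
end

# Delete matching keys one batch at a time instead of listing them all up front.
def delete_matched(server, pattern)
  cursor  = 0
  deleted = 0

  loop do
    cursor, batch = server.scan(cursor, match: pattern)
    deleted += server.del(*batch) unless batch.empty?
    break if cursor.zero?
  end

  deleted
end

server  = FakeServer.new(%w[posts/89 posts/90 posts/912 posts/92 users/90])
deleted = delete_matched(server, "posts/9[0-1]*")
```

Each pass touches at most count keys, which is why this style of iteration stays polite to a large database.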

The Right Expiration Strategy for the Job

Caching is key to a highly performant application, but stale data can be insidious. Without targeted expiration we start to reach for blunt tools and expire too broadly. All of the expiration strategies presented here are simple, and they come up often in a production system. Recognize the situation and choose the right strategy for the job.