Mon 10 Mar 2008
Dog-pile Effect and How to Avoid it with Ruby on Rails memcache-client Patch
Posted by Scoundrel under Databases, Development
We had been using memcache in our application for a long time, and it helped a lot to reduce the load that some huge queries put on our DB servers. But there was a problem (sometimes called the “dog-pile effect”): when a cached value expired while we had heavy traffic, too many threads in our application would try to calculate the new value at the same time in order to cache it.
For example, if you have some simple but really bad query like
which could be really slow on huge tables, and your cache expires, then ALL your clients calling a page with this counter will end up waiting for the counter to be updated. There could be tens or even hundreds of such queries running on your DB at once, killing your server and breaking the entire application (the number of application instances is constant, but more and more of them are locked waiting for the counter).
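To get a feel for the numbers, here is a rough, deterministic sketch of a plain "miss, then recompute" cache. The traffic rate and query time are made-up values for illustration, and `queries_started` is a hypothetical helper, not part of memcache-client.

```ruby
# Counts how many requests start the slow query when the cache is cold.
# Every request that arrives before the first query finishes also misses.
def queries_started(query_seconds:, requests_per_second:, duration:)
  cache_filled_at = nil
  started = 0
  total_requests = (duration * requests_per_second).to_i

  total_requests.times do |i|
    now = i.to_f / requests_per_second
    next if cache_filled_at && now >= cache_filled_at # cache hit

    started += 1                             # cache miss: run the query
    cache_filled_at ||= now + query_seconds  # first query fills the cache
  end

  started
end

# Assumed load: a 5-second query, 20 requests per second, 10 seconds of traffic.
pile = queries_started(query_seconds: 5.0, requests_per_second: 20, duration: 10)
puts "queries started while the cache was cold: #{pile}"
```

Under these assumed numbers, every request arriving during the first five seconds starts its own copy of the query, which is the pile-up described above.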
So, how could we avoid this problem? The first thing that came to my mind was: “What if we marked the old counter as ‘expired’ and then let only one thread re-calculate it while all other clients kept using the old value?” The idea looks great, but when we cache something in memcached, it is hard to tell when a value was saved to the cache and when it is going to expire. After a little research I found a much more elegant solution: create two keys in memcached, a MAIN key with an expiration time a bit longer than normal plus a STALE key which expires earlier. When we read a value from memcached, we try to read the STALE key too. If it has expired, it is time to start re-calculation (and to set the STALE key again with a short TTL, so that other clients keep using the old value in the meantime).
The final solution we ended up using is the following (a monkey patch for the ActiveRecord::Cache class from RobotCoop’s memcache-client library):
module ActiveRecord
  class << Cache
    STALE_REFRESH = 1
    STALE_CREATED = 2

    # Caches data received from a block
    #
    # The difference between this method and usual Cache.get
    # is following: this method caches data and allows user
    # to re-generate data when it is expired w/o running
    # data generation code more than once so dog-pile effect
    # won't bring our servers down
    #
    def smart_get(key, ttl = nil, generation_time = 30.seconds)
      # Fallback to default caching approach if no ttl given
      return get(key) { yield } unless ttl

      # Create window for data refresh
      real_ttl = ttl + generation_time * 2
      stale_key = "#{key}.stale"

      # Try to get data from memcache
      value = get(key)
      stale = get(stale_key)

      # If stale key has expired, it is time to re-generate our data
      unless stale
        put(stale_key, STALE_REFRESH, generation_time) # lock
        value = nil # force data re-generation
      end

      # If no data retrieved or data re-generation forced, re-generate data and reset stale key
      unless value
        value = yield
        put(key, value, real_ttl)
        put(stale_key, STALE_CREATED, ttl) # unlock
      end

      return value
    end
  end
end
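To watch the two-key logic work without a memcached server, here is a minimal sketch that repeats the same steps over an in-memory store with a manually advanced clock. `TinyCache`, its `now` attribute and the `:refresh`/`:created` markers are hypothetical stand-ins invented for this illustration; they are not part of memcache-client.

```ruby
# In-memory stand-in for memcached with a fake clock (illustration only).
class TinyCache
  attr_accessor :now

  def initialize
    @now = 0
    @store = {} # key => [value, expires_at]
  end

  def get(key)
    value, expires_at = @store[key]
    return nil if expires_at.nil? || expires_at <= @now
    value
  end

  def put(key, value, ttl)
    @store[key] = [value, @now + ttl]
  end

  # Same steps as the patch: MAIN key lives longer, STALE key triggers refresh.
  def smart_get(key, ttl, generation_time = 30)
    real_ttl  = ttl + generation_time * 2
    stale_key = "#{key}.stale"

    value = get(key)
    unless get(stale_key)
      put(stale_key, :refresh, generation_time) # lock
      value = nil                               # force re-generation
    end

    unless value
      value = yield
      put(key, value, real_ttl)
      put(stale_key, :created, ttl) # unlock
    end
    value
  end
end

cache = TinyCache.new
runs = 0
generate = -> { runs += 1; "v#{runs}" }

first = cache.smart_get('counter', 100, &generate)  # cold cache: generates
cache.now = 50
second = cache.smart_get('counter', 100, &generate) # fresh: served from cache

cache.now = 101 # STALE key has expired, MAIN key is still alive
bystander = nil
third = cache.smart_get('counter', 100) do
  # A request arriving during re-generation still gets the old value.
  bystander = cache.smart_get('counter', 100, &generate)
  generate.call
end

puts [first, second, bystander, third, runs].inspect
```

In this single-threaded run the generator executes only twice: once on the cold cache and once for the refresh, while the request that arrives mid-refresh is served the old value. The sketch does not model two clients reading the expired STALE key at exactly the same moment.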
Since it is a monkey patch, you can place this piece of code wherever you want, as long as it is loaded AFTER memcache-client (for example, you can put it in your config/initializers/ directory or just copy-paste it into your environment.rb). Example usage of the patch:
# No ttl given: this falls back to the default Cache.get behaviour
Cache.smart_get('test') { some_huge_calc }
# This caches your calculation results for 160 seconds and re-generates the cache after 100 seconds
Cache.smart_get('test', 100) { some_huge_calc }
# This caches your calculation results for 120 seconds and re-generates the cache after 100 seconds
Cache.smart_get('test', 100, 10) { some_huge_calc }
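The 160 and 120 second figures come straight from the TTL arithmetic inside smart_get. A small sketch of that arithmetic, using a hypothetical helper `smart_get_windows` and plain integers for seconds (no ActiveSupport needed):

```ruby
# Hypothetical helper mirroring the TTL arithmetic of smart_get.
def smart_get_windows(ttl, generation_time = 30)
  {
    refresh_after: ttl,                      # TTL of the STALE key
    lock_for:      generation_time,          # TTL of the refresh marker
    kept_for:      ttl + generation_time * 2 # TTL of the MAIN key (real_ttl)
  }
end

default_window = smart_get_windows(100)     # kept_for is 100 + 30 * 2
short_window   = smart_get_windows(100, 10) # kept_for is 100 + 10 * 2

puts default_window[:kept_for]
puts short_window[:kept_for]
```

The gap between refresh_after and kept_for is the window in which other clients can still be served the old value while one client re-generates it.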
So, this is it: with a simple change we’ve fixed a really annoying problem and made our application much more stable.
2008-03-10 at 11.39 pm
If you’re using the cache_fu Rails plugin, you can use reset_cache instead of expire_cache to avoid the ‘dog-pile’ effect.
2008-03-11 at 6.03 am
@Chris: Of course, but what if your data has expired because it was too old?
2008-03-11 at 2.29 pm
You can do this easily without doubling every lookup (and without using more memcached space):
http://www.socialtext.net/memcached/index.cgi?faq#how_to_prevent_clobbering_updates_stampeding_requests
2008-05-06 at 10.30 pm
For some sites that I admin, and for some sections of them, I usually update the cache only for requests from localhost. All requests for which “!local_request?” is true are shown the cached content. Then a simple crontab on the same machine wgets some URLs to update the cache. I know it is very hacky, but it runs nicely.