Built-in cache vs memcached


#1

Hey there!

Aside from the fact that the built-in in-RAM cache is lost when a server restarts, what are the pros and cons of using the built-in cache versus memcached with Zato’s Cache API?

I was thinking of one very large Zato cluster sharing a common cache store, but if requests to and responses from the cache have to travel over the network, then a good portion of the benefit could be lost (it would still be useful for expensive remote calls, of course).

Thanks.


#2

I cannot say much about memcached but I can tell you what Zato caches are about:

  • Each is in RAM only (persistent storage is planned for some point in the future)
  • Each server maintains its own copy of the cache
  • Cache synchronization always happens in the background, e.g. you set a key and other servers are notified of it asynchronously; the first server does not wait for anything
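To make the last point concrete, here is a toy model in plain Python of a per-server cache whose writes return immediately while a background thread pushes them to the other servers. It does not use Zato's own API or reflect its internals; `ReplicatedCache`, `peers` and `wait_until_synced` are hypothetical names made up for this sketch, and an in-process queue stands in for the network.

```python
import queue
import threading


class ReplicatedCache:
    """Toy model of one server's in-RAM cache with background replication."""

    def __init__(self, name):
        self.name = name
        self.data = {}      # this server's own copy of the cache
        self.peers = []     # other ReplicatedCache instances (hypothetical)
        self._outbox = queue.Queue()
        worker = threading.Thread(target=self._replicate, daemon=True)
        worker.start()

    def set(self, key, value):
        # Write locally and return at once; peers are updated later.
        self.data[key] = value
        self._outbox.put((key, value))

    def get(self, key):
        # Reads never leave this server: a plain dict lookup.
        return self.data.get(key)

    def _replicate(self):
        # Background loop standing in for the traffic between servers.
        while True:
            key, value = self._outbox.get()
            for peer in self.peers:
                peer.data[key] = value
            self._outbox.task_done()

    def wait_until_synced(self):
        # Demo helper only: block until all pending updates were delivered.
        self._outbox.join()


server1 = ReplicatedCache("server1")
server2 = ReplicatedCache("server2")
server1.peers.append(server2)

server1.set("user:1:can_read", True)
local_value = server1.get("user:1:can_read")    # available on server1 right away

server1.wait_until_synced()
remote_value = server2.get("user:1:can_read")   # visible on server2 once synced
```

Between the `set` call and the end of the sync, `server2` may still return `None` for that key, which is the kind of short-lived difference between servers that this design accepts.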

This means it was designed for scenarios where:

  • Utmost performance is required: data is in the RAM of each server that receives a request, and the functionality is written in hand-tuned Cython
  • Data is read very often and changed very rarely
  • Under heavy write load, slight differences between cache states across servers are acceptable, e.g. if you have 100k req/s to the same cache and key, then one server will likely hold one value and another server a different one, because of the TCP traffic needed to synchronize them in the background

To give you more perspective, I use it in projects where user permissions are kept in external databases and 10-20 or more permissions need to be checked on each request.

Consulting the remote databases would take several seconds for the permission checks alone, before any business functionality runs. Instead, each such check is cached on first access. Afterwards, the checks take a fraction of a millisecond because they boil down to simple in-RAM dict lookups.

This works nicely because such permissions change maybe a few times a year for each user, so each time they are changed, their respective cache entries are deleted.
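The pattern described above (cache each check on first access, delete the entries when permissions change) can be sketched in plain Python as follows. This is only an illustration under assumptions: a dict stands in for the cache, and `load_permission_from_db`, `has_permission` and `invalidate_user` are hypothetical names, not part of Zato.

```python
permission_cache = {}   # stands in for the in-RAM cache
backend_calls = 0       # counts how often the slow backend is consulted


def load_permission_from_db(user_id, permission):
    # Hypothetical stand-in for a slow query against a remote database.
    global backend_calls
    backend_calls += 1
    return permission in {"read", "list"}


def has_permission(user_id, permission):
    # Ask the backend only on first access; later checks are dict lookups.
    key = (user_id, permission)
    if key not in permission_cache:
        permission_cache[key] = load_permission_from_db(user_id, permission)
    return permission_cache[key]


def invalidate_user(user_id):
    # Called when a user's permissions change: drop that user's entries.
    for key in [k for k in permission_cache if k[0] == user_id]:
        del permission_cache[key]


first = has_permission(42, "read")     # goes to the backend
second = has_permission(42, "read")    # served from the cache
calls_before_invalidation = backend_calls

invalidate_user(42)
third = has_permission(42, "read")     # goes to the backend again
calls_after_invalidation = backend_calls
```

Because entries are removed only when a permission actually changes, the slow path runs a handful of times per user per year and every other request takes the fast path.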

Apart from that, the vocabulary for accessing cache entries is rather rich, i.e. it is not only get/set/expire. There is also a GUI in web-admin. I also have ideas on how to combine it with SQLAlchemy in the future to return objects directly from the cache, but that is something else; the core is what I described above.