Pure Danger Tech



Terracotta Use Cases

16 Nov 2009

Yesterday, I posted a note about what Terracotta has been up to in the last 6 months. Casper Bang commented:

Unlike Ehcache and Hibernate, Terracotta as a magic problem solving layer is somewhat harder to grasp. After having read a bunch of your posts, I am still not 100% sure what Terracotta is or how much is to gain from it. On top of that, an Oracle DBA will claim that Oracle already does the best job at caching.

Have you guys thought of putting together some (simple but concrete) cases for would-be customers to study?

I started answering this in a comment, but I had too much to say. First, on “Terracotta as magic problem solving layer”… At its heart, Terracotta is clustered objects and clustered locks using pure Java + external config. You’re absolutely right that this is pure technology and platform, not a solution to a problem. That technology tends to be useful when you want to stay in your problem domain, assume coherency of your objects with respect to normal Java locking constructs, and get scaling and high availability. We have a bunch of customers that have rolled really amazing solutions to complex problems with those basic tools.
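To make “pure Java + external config” a bit more concrete, here is a minimal sketch of the kind of class this model targets. The class name `SharedCounter` is hypothetical, and nothing in it references Terracotta. The assumption is that the external config is where you would declare the object as shared and its synchronized methods as clustered locks, so the code itself stays ordinary Java.

```java
// Plain Java: no Terracotta imports or APIs appear in the code.
// Assumption: sharing this object across JVMs, and treating its
// synchronized methods as clustered locks, is declared in external config.
final class SharedCounter {
    private long count;

    // Within a single JVM this is a normal monitor lock.
    synchronized long increment() {
        count++;
        return count;
    }

    synchronized long get() {
        return count;
    }
}
```

The point of the sketch is what is absent: there is no messaging, serialization, or cache API in the class, only normal Java locking.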

At Terracotta we see our mission as making scaling your Java application simple (*cough* maybe other JVM langs too). We recognize that many people don’t want powerful low-level building blocks. Rather, they have an application that they are trying to scale and when they do there are certain problems that they typically encounter. Our current product line is designed to be a set of drop-in solutions that use the underlying Terracotta technology to address those problem areas.

Currently, the products we are focusing on are:

  • Clustered HTTP sessions – drop-in support for clustering your HTTP sessions on many popular open-source and commercial containers. Generally no code changes required.
  • Clustered cache – take an application using Ehcache and extend that into a cluster, again with no code changes, just some configuration modifications.
  • Clustered Hibernate second level cache – plug in Terracotta for Hibernate as a second level cache provider and get a coherent cache across your cluster. No coding changes – just configure Hibernate and the cache.

Lots of needs fall outside those products. For many of them, we already have custom solutions using Terracotta and we’re happy to help you through those cases as well. We have also identified a handful of pain points that we think are common enough that they deserve their own product, and you might see more showing up in the near future (hint hint).

With regard to “(simple but concrete) cases”, we have actually spent a lot of time on this, but I guess there’s always room to talk about it more. About a year ago, we built a reference application called the Examinator, which was an online exam-taking application. It was a good illustration of a few different use cases rolled into a real application (that we perf tested and scaled). Some examples in the context of Examinator:

Exam sessions – this is a canonical example of what we call “conversation” data. It happens all the time in user-facing applications where a user will log on, build up some contextual state over a series of pages, and finally complete the overall work. In this case, the “work” is taking an exam. While the user is taking the exam, they are answering questions, flagging them for later review, etc. All of that data would (in an ordinary web application) get needlessly stored in a database on every page change.

By clustering the exam session information, you still retain the availability of the exam data (any node in the cluster sees a consistent view of it), but without any of the intermediate reads and writes to the database. Instead, you just wait till the end and drop the clustered data back into the database through Hibernate.
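As a rough illustration of that conversation pattern, the sketch below keeps the in-progress exam state in an ordinary object and touches the persistent store exactly once, on completion. `ExamSession` and `ExamResultStore` are hypothetical names, not Examinator's actual classes, and the store interface merely stands in for the Hibernate write.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the Hibernate-backed persistence step.
interface ExamResultStore {
    void save(String userId, Map<Integer, String> answers);
}

// Conversation state built up page by page. It is held in memory (clustered,
// in the scenario described above) rather than written on every page change.
final class ExamSession {
    private final String userId;
    private final Map<Integer, String> answers = new LinkedHashMap<>();
    private final Set<Integer> flagged = new HashSet<>();

    ExamSession(String userId) {
        this.userId = userId;
    }

    synchronized void answer(int question, String choice) {
        answers.put(question, choice);
    }

    synchronized void flagForReview(int question) {
        flagged.add(question);
    }

    synchronized Set<Integer> flaggedQuestions() {
        return new HashSet<>(flagged);
    }

    // The only point at which the store is touched.
    synchronized void complete(ExamResultStore store) {
        store.save(userId, new LinkedHashMap<>(answers));
    }
}
```

However many times the user changes an answer or flags a question, the store sees a single `save` call carrying the final answers.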

Exam test data – the exam sections, questions, choices, etc. that are available to take are effectively read-only. They can be modified, but that’s a rare occurrence. This data is long-lived and stored in the database, and loaded by Hibernate into domain model entities. However, there’s no need to keep hitting the database for it over and over. You really want to load that data once, and cache it forever.

We have used different versions of the exam data cache, but the current implementation uses the Hibernate second level cache to store the exam data. Because the cache is clustered, only one node needs to load the data, once, and thereafter every node reads it from the Terracotta-backed Hibernate cache.
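The “load once, cache forever” idea can be sketched locally without Hibernate at all. The class below is a single-JVM stand-in, not the Hibernate second level cache API; `ExamDataCache` and its loader function are hypothetical, with the loader playing the role of the database query.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Read-mostly cache: each key is loaded at most once, then served from memory.
final class ExamDataCache<K, V> {
    private final ConcurrentMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // hypothetical database lookup

    ExamDataCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // computeIfAbsent invokes the loader only when the key is missing.
        return entries.computeIfAbsent(key, loader);
    }

    // Rare modification path: drop the entry so the next read reloads it.
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

In the clustered version described above, the difference is that the first load by any node benefits every node, instead of each JVM warming its own copy.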

User registration codes – as with many web applications, when you register you are sent an email with a registration code so you can confirm your email address. The registration codes are a perfect example of transient data – they need to stick around for a few hours or days, and should be available enough that you don’t have to hit the same server again to verify the code. But you also shouldn’t need to store it in the database at all as it’s purely temporary. Terracotta is a perfect match for that.
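Here is a minimal sketch of such a transient store, assuming a simple time-to-live per code. `RegistrationCodes` is a hypothetical name, the single-use behaviour of `verify` is my own design choice rather than something Examinator is known to do, and the map is a plain in-JVM map standing in for a clustered one.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Transient store: codes expire after a time-to-live and never reach a database.
final class RegistrationCodes {
    private final Map<String, Long> expiryByCode = new ConcurrentHashMap<>();
    private final long ttlMillis;

    RegistrationCodes(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // nowMillis is passed in so the expiry logic is easy to test.
    void issue(String code, long nowMillis) {
        expiryByCode.put(code, nowMillis + ttlMillis);
    }

    // Assumption: a code verifies at most once, and only before it expires.
    boolean verify(String code, long nowMillis) {
        Long expiresAt = expiryByCode.remove(code);
        return expiresAt != null && nowMillis < expiresAt;
    }
}
```

If the map is shared across the cluster, the user can click the confirmation link and land on any server, which is the availability property described above.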

Besides these Examinator use cases, another common scenario for Terracotta-backed caches is inventory. In an inventory use case, you typically have a medium-sized data set of items with a fairly hot subset, and that hot subset is undergoing a high rate of change. Keeping a cache in sync with a database under those conditions requires a read/write cache. A coherent cache is really important here: all parties looking at the data (buyers, sellers, etc.) need to see consistent values, because money and contracts depend on that data. If you’re looking for what our customers are doing with Terracotta, this list might give you some ideas.
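Returning to the inventory case, one simple way to read “read/write cache” is a write-through cache, where every update goes to the database and the cache together. That reading is my assumption for illustration, not a description of how Terracotta or Hibernate implement it, and `InventoryCache` and `InventoryStore` are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the database of record.
interface InventoryStore {
    Integer load(String sku);            // returns null for an unknown item
    void save(String sku, int quantity);
}

// Write-through cache: every update goes to the store and then the cache,
// so readers of the cache see the value that was written.
final class InventoryCache {
    private final Map<String, Integer> quantities = new ConcurrentHashMap<>();
    private final InventoryStore store;

    InventoryCache(InventoryStore store) {
        this.store = store;
    }

    int quantity(String sku) {
        // Loads from the store only on a cache miss.
        Integer q = quantities.computeIfAbsent(sku, store::load);
        return q == null ? 0 : q;
    }

    synchronized void update(String sku, int quantity) {
        store.save(sku, quantity);      // write to the store first
        quantities.put(sku, quantity);  // then publish to the cache
    }
}
```

In a single JVM this is straightforward. The hard part, and the reason coherence matters, is making the `put` visible consistently to every node that holds the hot items.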

With regard to “an Oracle DBA will claim that Oracle already does the best job at caching”, I don’t think you should take any statement like this as Truth (whether it’s from an Oracle DBA or me). You should build your own test and find out for yourself. There are also many dimensions of “best” – performance, price, maintainability, etc.

Terracotta is open source, free to try, and free to use for many needs. We provide licenses for 24×7 support, indemnification, patch delivery as well as some enterprise level products for extreme scale and better monitoring.