Daniel Bushman

Reactive cache invalidation

Two Hard Things

There are only two hard things in Computer Science: cache invalidation and naming things.

Phil Karlton

 

It’s a well-worn saying (see Martin Fowler’s TwoHardThings), but it’s overused for a reason: it immediately rings true for many programmers. I personally love naming things, and believe it’s worth the bit of extra time it takes to do it well. But this post is about an architecture I invented independently in 2011, which I believe to be somewhat novel. It uses a type of dependency inversion that loosely resembles a reactive pattern. The name I originally gave it was Object Graph Modelling.

This “invention” of mine continues to haunt me because the code was proprietary to FamilyLink.com, so I couldn’t take it with me when our company was bought by Ancestry and I moved on, and I haven’t had a chance to take another stab at it since. But I’ve stewed over it a lot, because I’m convinced it was ahead of its time in some ways. We were dealing with 60 million users, and I had to think outside the box to design an architecture that could handle them.

When a user visited the Facebook app, a single API request returned a document containing the complete state of the app for that user, including denormalized data cached about other users, most notably their possible relatives. The clever part was how this cache was invalidated and kept up to date.

“Let me explain . . . No, there is too much. Let me sum up.”

Inigo Montoya

Object Graph Modelling

I named it Object Graph Modelling (OGM) due to its similarity to Object-Relational Mapping (ORM). The documents were stored as MySQL blobs, indexed by primary key, with a homemade sharding setup that did the job: very fast, with sufficient redundancy.

The models were written in PHP but described using a domain-specific language (DSL) which was hypothetically portable. I learned about ways to use graphs from patterns I saw in Facebook’s social graph, which was then newly emerging to the public. It felt like a fitting paradigm for our data about people’s relatives, for much the same reasons Facebook chose a graph for data about social relationships. The primary fields of a document were updated directly by applying a delta. Secondary fields about other people (in this case, relatives) were selectively denormalized copies of the primary fields for those people. In the model, those fields were expressed in a DSL describing the graph relationships of where the data came from. When the query DSL was interpreted, it formed reactive dependencies: every person document in our system created subscriptions to specific fields in the documents for each of their relatives and possible relatives.

When a delta to a document came in, the model broadcast a message, and any document affected by the change was updated according to a set priority. All affected documents were then brought up to date, making the system eventually consistent.
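
As a hedged sketch of that propagation step, here is roughly how the subscription bookkeeping could look in TypeScript. The names (Delta, Subscription, SubscriptionRegistry, patchDocument) are mine for illustration; they are not the original PHP code.

// A change to one document's primary fields.
interface Delta {
  docId: string;                      // document that was changed
  changed: Record<string, unknown>;   // field name -> new value
}

// One denormalized copy that needs to track a source field.
interface Subscription {
  sourceDocId: string;   // document being watched
  sourceField: string;   // field being watched
  targetDocId: string;   // document holding the denormalized copy
  targetField: string;   // where the copy lives in that document
  priority: number;      // lower number = updated sooner
}

// Stand-in for the real storage layer (sharded MySQL blobs in the original).
function patchDocument(docId: string, field: string, value: unknown): void {
  console.log(`update ${docId}: set ${field} = ${JSON.stringify(value)}`);
}

class SubscriptionRegistry {
  private subs: Subscription[] = [];

  subscribe(sub: Subscription): void {
    this.subs.push(sub);
  }

  // Broadcast a delta: queue an update for every affected copy, in priority
  // order. Only subscriptions that match both the document and one of its
  // changed fields are touched.
  broadcast(delta: Delta, queue: Array<() => void>): void {
    this.subs
      .filter(s => s.sourceDocId === delta.docId && s.sourceField in delta.changed)
      .sort((a, b) => a.priority - b.priority)
      .forEach(s =>
        queue.push(() =>
          patchDocument(s.targetDocId, s.targetField, delta.changed[s.sourceField])
        )
      );
  }
}

In this sketch the queue would be drained by a background worker, which is one way to arrive at the eventual consistency described above; reads never wait on it.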

Imagine this pseudo-model:

Person = Node({ 
  attributes: {
    name: string,
    birthday: date
  },
  edges: { 
    jobs: {
      company: .name, 
      position: string
    }
  }
})

This describes a Person object with two attributes. A person also has a graph edge called jobs, and for each job at that edge we want to store two attributes. The first, .name, starts with a dot, which in this pseudo-model indicates that the rest of the value is a reactive query, one might say, on the nodes along that edge. We are indicating that we care only about “name” (we’ll call it “company” in our model) and about no other details of the company. It also indicates that we want to subscribe to the company’s name, so that the field will be updated in our denormalized copy when the company name changes. This pseudo-model also specifies a field labeled position, which is an attribute of the edge (of the person’s relationship to the company), not an attribute of the company itself.

In the database, we keep the ID of each company (just in case), a denormalized copy of the company’s name, and the position as a string.
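
Concretely, the stored document for a person might look something like this. This is an illustrative shape only; the original documents were serialized structures in MySQL blobs, not this schema.

// Illustrative shape of a stored Person document.
interface PersonDocument {
  id: string;
  name: string;
  birthday: string;        // ISO date, e.g. "1815-12-10"
  jobs: Array<{
    companyId: string;     // kept "just in case"
    company: string;       // denormalized copy of the company's name
    position: string;      // attribute of the edge, not of the company
  }>;
}

const example: PersonDocument = {
  id: "person:7",
  name: "Ada Lovelace",
  birthday: "1815-12-10",
  jobs: [
    { companyId: "company:42", company: "Analytical Engines Ltd.", position: "Programmer" },
  ],
};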

Now, if a company changes its name and it has 5,000 employees, then based on our model 5,000 documents will need to be updated. If the company changes some attribute other than name, none of the Person documents need to be updated. If it gets a new CEO, for example, then no one cares, or rather, no one is subscribed to the change. This pattern works specifically when reading data needs to be blazing fast but it’s okay for writing data to lag a bit behind: thus, eventual consistency. Reads are fast and inexpensive, so a site visitor browsing around puts very little stress on the system. Only when they enter something, by adding or changing data, is a propagation required. But for a great many data sets, there is a net efficiency gain.
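
Building on the sketches above (and reusing their hypothetical Delta, SubscriptionRegistry, and PersonDocument types), the asymmetry looks something like this: the read path is a single keyed lookup, and the write path only fans out when a subscribed field changed.

const documentStore = new Map<string, PersonDocument>();

// Read path: one primary-key lookup returns the whole pre-assembled document.
// Nothing is joined or recomputed while the visitor browses around.
function getPersonDocument(id: string): PersonDocument | undefined {
  return documentStore.get(id);
}

// Write path: a company renaming itself fans out to every subscribed employee
// document; a delta touching only an unsubscribed field (say, a new CEO)
// matches no subscriptions and fans out to nothing.
function onCompanyDelta(
  delta: Delta,
  registry: SubscriptionRegistry,
  queue: Array<() => void>
): void {
  registry.broadcast(delta, queue);
}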

It’s fundamental to recognize that this isn’t simply a means of pre-warming caches, but a way of describing when and how specific updates to the caches are required.

When a delta comes in (a change to the data), the change is broadcast so that any subscribed model can evaluate its implications for the documents that represent instances of that model.

Also fundamental to this approach is the way it helps the long-term maintainability of the code through the way it models dependencies. In this example, the company model does not need to know or care, one might say, about who may or may not be subscribed to changes of its name. It can just focus on being the best company it can be.
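
Here is one way to picture that inversion, as a hypothetical sketch: the company side only announces that a field changed, and whatever interprets the Person model’s DSL registers itself as a listener. The CompanyModel class below is mine, not the original code.

type FieldListener = (docId: string, field: string, value: unknown) => void;

class CompanyModel {
  private listeners: FieldListener[] = [];

  constructor(public id: string, private fields: Record<string, unknown>) {}

  // Subscribers (e.g. the interpreter of the Person DSL) register themselves.
  onFieldChange(listener: FieldListener): void {
    this.listeners.push(listener);
  }

  // The company just announces what changed; it holds no reference to Person
  // and no knowledge of who denormalizes its name.
  setField(field: string, value: unknown): void {
    this.fields[field] = value;
    for (const listener of this.listeners) {
      listener(this.id, field, value);
    }
  }
}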

Reactive pattern:

[Company] --> [Person]

This article is unfinished and I don’t feel it explains things well. I will rewrite it. I also want to write supporting articles to explain things like reactive patterns, to avoid trying to explain everything in one article. Whether or not this article was helpful to the reader, writing it allowed me to reload many details of that work into my brain and clarify my thinking about it, especially as it relates to what I’ve learned since about reactive programming and language interpretation, all of which will help me as I write a new implementation.
