Datomic test/live split

2014 July 23 Kevin J. Lynagh

I’ve been building a Datomic-backed API service, and last week I ran into a thorny problem. The problem itself is conceptually simple and can be tackled with a variety of solutions. Since you might run into it too, I thought I’d discuss potential solutions and their tradeoffs.

(The discussion is an architectural one: You don’t need an in-depth understanding of Clojure or Datomic to get the gist.)

The service is a JSON API that effectively adds certain features to Stripe (the awesome payments service). Just like Stripe’s API, our API requires that all requests include an HTTP authorization header containing a secret key. This secret key is used by the server to associate each request with a user.

Technical side note

The server is written in Clojure, using Ring+Compojure for the web stuff and Datomic as the data store. When a request comes in, Ring middleware adds the user entity to the request map for the convenience of downstream handlers. That middleware looks roughly like this (sans logic for rejecting requests that don’t have a valid secret key):

(fn [handler]
  (fn [req]
    (let [key  (util/http-basic-username req)
          user (db/entity db [:customer/secret-key key])]
      (handler (assoc req :user user)))))

Notice that [:customer/secret-key key] serves as a Datomic lookup ref. So far, so good: SSL certs have been set up, staging servers are running, and integration tests are passing.
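For completeness, here is a rough sketch of what the omitted rejection logic might look like. It is not the service's actual code: the lookup is a plain map standing in for the Datomic query, the key extractor is passed in rather than parsed from the authorization header, and the names wrap-user, lookup-user, and extract-key are made up for illustration.

```clojure
;; Hypothetical stand-in for the Datomic lookup: secret key -> user entity.
(def users-by-secret-key
  {"abcd" {:customer/email      "foo@example.com"
           :customer/secret-key "abcd"}})

(defn wrap-user
  "Ring-style middleware that attaches the user to the request under :user,
  or responds with 401 when the secret key is missing or unknown.
  `lookup-user` and `extract-key` are assumed helpers supplied by the caller."
  [handler lookup-user extract-key]
  (fn [req]
    (if-let [user (some-> (extract-key req) lookup-user)]
      (handler (assoc req :user user))
      {:status  401
       :headers {"Content-Type" "application/json"}
       :body    "{\"error\":\"invalid secret key\"}"})))

;; Usage with a trivial handler and a fake extractor that reads :secret-key
;; straight off the request map instead of parsing HTTP basic auth.
(def app
  (wrap-user (fn [req] {:status 200
                        :body   (get-in req [:user :customer/email])})
             users-by-secret-key
             :secret-key))
```

Calling (app {:secret-key "abcd"}) reaches the handler, while a request with an unknown or absent key is stopped before any downstream code runs.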

Enter problem

More precisely, enter new requirement: We want our API to match Stripe’s API, with separate “test” and “live” modes. The difference between modes is the allowed side effects: In “test mode” we don’t want to send out emails, charge credit cards, &c. Providing a separate “test” mode is a good user experience because it’s a safe environment for our customers to develop against and experiment within.

Note the following:

both modes are effectively “production”, since our customers will rely on them
the service must use the right keys when calling Stripe’s API on behalf of our customers (e.g., when our API is in “test mode” it should call Stripe with the customer’s “test mode” key and vice versa)
it’d be great to match Stripe’s user experience of providing separate “test mode” and “live mode” keys

How would you solve this problem? The current application works fine, so it’d be great to avoid extensive schema modifications or touching the many existing business logic functions and Compojure routes. Go ahead, grab some coffee and hop into your hammock to think about it for a minute.

Back? Okay. Here are the solutions I came up with:

Solution 1: Fully separate the test and live environments

Use distinct hostnames (e.g., api-test.example.com and api-live.example.com) that resolve to separate webservers backed by separate Datomic instances. Any server routes that differ between test and live mode would be modified to respond according to a global parameter.
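A minimal sketch of such a global parameter, assuming each deployment sets an API_MODE environment variable (the variable name, the default, and send-receipt! are all hypothetical):

```clojure
(defn parse-mode
  "Turns the raw setting into :test or :live. Anything other than \"live\"
  falls back to :test, so a misconfigured server performs no real side effects."
  [s]
  (if (= "live" s) :live :test))

;; Read once at startup; every route on this server sees the same mode.
(def mode (parse-mode (System/getenv "API_MODE")))

(defn send-receipt!
  "Hypothetical side-effecting function. Only the live deployment would
  actually send email; the test deployment just reports what it skipped."
  [mode email]
  (case mode
    :live {:sent-to email}   ; a real mail call would go here
    :test {:skipped email}))
```

Defaulting to :test is a design choice made for this sketch, not something the original setup prescribes.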

Upsides:

server code would not need extensive modifications to support mode differences on the level of individual requests
very unlikely that information would leak between test and live environments

Downsides:

twice as many servers, which means doubling hosting costs, deploy/monitoring/backup costs, &c.
strict separation is a poor user experience: Customers would need to sign up separately and maintain the same “account plan” and third-party API key pairs across both environments
switching client code between environments could be awkward, as both the API endpoint hostname and secret keys would need to change

This was the most “obvious” solution, and the first I came up with. However, I felt I could do better so I kept thinking.

Solution 2: Same environment, with test/live mode predicates

Use the same hostnames and servers for test and live modes, and have requests indicate their mode.

Upsides:

no need for a duplicate environment
users only need to sign up once

Downsides:

most database objects would need a new :mode attribute
customer entity would need additional attributes for :customer/test-secret-key and :customer/stripe-api-test-key.
queries and other functions need to be modified to either accept a new mode argument (tricky with deeply nested functions) or rely on a *mode* dynamically scoped var (gross!)
subtle mess-ups of the above could leak information (e.g., accidentally writing (count items) instead of (count (filter #(= :live (:mode %)) items)))
unable to permanently wipe test data to save space, as it’s interleaved with production data in Datomic
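To make the leak risk concrete, here is a small sketch using plain maps in place of Datomic entities. The :item/id attribute and the items-in-mode helper are invented for the example.

```clojure
;; Hypothetical items tagged with a :mode attribute, as Solution 2 would require.
(def items
  [{:item/id 1 :mode :live}
   {:item/id 2 :mode :test}
   {:item/id 3 :mode :live}])

(defn items-in-mode
  "Keeps only the items that belong to the given mode."
  [mode items]
  (filter #(= mode (:mode %)) items))

(count items)                        ; counts both modes together
(count (items-in-mode :live items))  ; counts only the live items
```

Both expressions are valid and return a number, which is exactly the problem: nothing flags the unfiltered version as a mistake.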

This seemed like a lot of work to implement and difficult to maintain through potential future features. I spent a few more days stewing on the problem before stumbling on a clean solution.

Solution 3: Same environment, with implicit user-level modes

This solution leverages the following properties:

Datomic entities can have any subset of attributes defined in the database schema
the service strictly isolates users (the system does not model any relationships between users)

In the existing system, user entities in Datomic look like this:

{:db/id                    0000
 :customer/email           "foo@example.com"
 :customer/password        "<hash hash hash>"
 :customer/stripe-key      "stripe-live-key"
 :customer/secret-key      "abcd"}

The system queries users either by secret-key (for API calls) or by email+password (to log in to a web dashboard). All API side effects are associated with the user either via Datomic entity (within our system) or via third-party API key (when we call the third-party API on behalf of our user).

The solution is to update the user entities to look like this:

{:db/id                0000
 :customer/email       "foo@example.com"
 :customer/password    "<hash hash hash>"
 :customer/stripe-key  "stripe-live-key"
 :customer/secret-key  "live-key"
 :customer/test-twin   {:db/id                0001
                        :customer/stripe-key  "stripe-test-key"
                        :customer/secret-key  "test-key"}}

Essentially, what we’re doing is creating a “twin” user that can be used for testing. Since the system already completely isolates users from one another, an isolated “test mode” falls out for free.
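Here is a rough sketch of what the updated “new user” step could look like. It uses plain Clojure maps and an in-memory index rather than Datomic transactions, so treat new-customer, index-by-secret-key, and the argument names as illustrative stand-ins, not the real implementation.

```clojure
(defn new-customer
  "Builds a live customer map with a nested test twin. Key generation is
  left to the caller; all argument names here are illustrative."
  [{:keys [email password-hash live-secret test-secret
           stripe-live-key stripe-test-key]}]
  {:customer/email      email
   :customer/password   password-hash
   :customer/stripe-key stripe-live-key
   :customer/secret-key live-secret
   :customer/test-twin  {:customer/stripe-key stripe-test-key
                         :customer/secret-key test-secret}})

(defn index-by-secret-key
  "In-memory stand-in for looking entities up by :customer/secret-key:
  the live entity and its twin are each reachable by their own key."
  [customers]
  (into {}
        (mapcat (fn [c]
                  (let [twin (:customer/test-twin c)]
                    [[(:customer/secret-key c) c]
                     [(:customer/secret-key twin) twin]])))
        customers))

(def customer
  (new-customer {:email           "foo@example.com"
                 :password-hash   "<hash hash hash>"
                 :live-secret     "live-key"
                 :test-secret     "test-key"
                 :stripe-live-key "stripe-live-key"
                 :stripe-test-key "stripe-test-key"}))

(def by-key (index-by-secret-key [customer]))
```

A request carrying "test-key" resolves to the twin, and one carrying "live-key" resolves to the full customer, so the authentication step picks the mode without any other code being told about it.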

Upsides:

minimal DB schema changes: only a single new database attribute, :customer/test-twin (a reference), is added
minimal code changes: only the “new user” function must be updated to create the “test twin” entity; consider all of the changes this solution does not require:
- web logins: the test twin does not have an email or password, so no one can log in as that entity
- Stripe API calls: all these functions need is something with a :customer/stripe-key attribute, and this is what they get (without knowing anything about the test/live split)
- DB functions: all these functions are written against a Datomic user entity, and they continue to work in the same way; again, no need to know anything about test/live split
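The points above can be sketched with two functions that never mention modes. Both are hypothetical stand-ins written against plain maps: stripe-request only describes an outgoing call (it relies on Stripe accepting the API key as the HTTP basic auth username), and find-by-login imitates the dashboard login query.

```clojure
(def live-user {:customer/email      "foo@example.com"
                :customer/password   "<hash hash hash>"
                :customer/stripe-key "stripe-live-key"
                :customer/secret-key "live-key"})

(def test-twin {:customer/stripe-key "stripe-test-key"
                :customer/secret-key "test-key"})

(defn stripe-request
  "Hypothetical description of an outgoing Stripe call. It only reads
  :customer/stripe-key, so it cannot tell a live user from a test twin."
  [user path]
  {:url        (str "https://api.stripe.com" path)
   :basic-auth [(:customer/stripe-key user) ""]})

(defn find-by-login
  "Stand-in for the dashboard login query. An entity without an email
  can never match, so the twin is unreachable through this path."
  [users email password-hash]
  (first (filter #(and (some? (:customer/email %))
                       (= email (:customer/email %))
                       (= password-hash (:customer/password %)))
                 users)))
```

Passing test-twin to stripe-request yields a call authenticated with the test key, and no combination of login arguments returns the twin.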

Downsides:

like solution #2, we are using the same Datomic DB between test and live environments, so test data cannot be permanently cleared to reclaim storage

Outcome

We implemented the third solution, and thus far things have been going swimmingly.

Note that the primary benefit of this solution stems from keeping things implicit: Very little code needs to know about the difference between “test” and “live” modes. Some might consider this more of a drawback; after all, the only difference between the “real” user entities and the “test” user entities is that the former have more attributes (e.g., email and password).
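For the few places that do need to tell the two apart, one possible check is simply whether the entity carries an email. This is a sketch against plain maps, and test-twin? is a made-up name, not a function from the actual codebase.

```clojure
(defn test-twin?
  "Assumes, as described above, that only live users have an email."
  [user]
  (not (contains? user :customer/email)))

(test-twin? {:customer/secret-key "test-key"})
(test-twin? {:customer/email "foo@example.com"
             :customer/secret-key "live-key"})
```

The check is only as reliable as the convention behind it, which is the trade-off of keeping the split implicit.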

Ultimately, the entities within a Datomic database can be more subtle than the rows of, e.g., the Users and TestUsers tables of a SQL database. Whether or not that’s a good thing probably depends on how you feel about dynamic language hash-maps vs. Java-esque classes-as-data-types.

An earlier version of this article was published on the Keming Labs mailing list 2014 July 14.