Automatic Product Analytics


Analytics within early-stage products can present a complex cost-benefit analysis.

The benefits of an analytics implementation are numerous and have already been written about at length. In short, although not a substitute for direct user feedback or decisive vision, analytics can inform the most valuable next steps to take with a product by identifying which features are being used, how they are being used, and where "drop off" occurs. Additionally, analytics can serve as a rudimentary "sanity check" for core user flows in the absence of sophisticated QA processes or integration tests [1].

The costs of analytics implementations are less widely acknowledged. Implementations tend to be underdefined, fragile, and iterative. Business owners understandably approach the problem at a high level, while engineering is often left on its own to derive dozens of individual events with specific properties. If the product is later stage, there might be a product manager in the middle who can successfully extract or otherwise formulate the scope of what should be tracked. In either case, the initial implementation will have to change in response to product changes or a recalibration of what or how user behavior should be tracked. Maintenance is error-prone, as events requiring specific data must be triggered at specific times while remaining historically consistent as the codebase and product change around the implementation.

I've repeatedly encountered issues of the kind detailed above when building and maintaining analytics implementations within early stage products. Instead of manually defining and reporting events, what if we could automatically derive a tracking scheme from the business logic and implementation details of the application itself?

GraphQL Autotrack


"Autotrack" analytics implementations are not a new idea. The most widespread approach is to derive events from user interactions with the application interface. The problems with an autotrack implementation of this type include reliability, data integrity, and security concerns [2]. Instead of driving "autotrack" logic at the user interface level, we can leverage GraphQL, a structured API query framework used within modern web applications, in order to drive "autotrack" at the level of core application logic.

In the context of analytics, leveraging the existing GraphQL implementation within a codebase has several advantages. GraphQL is declarative. User actions are modeled as independent operations that closely match how a user is interacting with your application. For example, when a user views their homescreen, the application will trigger something like a Query HomeScreen_v1. When the user takes an action that updates some state, the application will trigger something like a Mutation CreateNote_v1. Capturing, transforming, and reporting these operations as analytics events approximates a comprehensive tracking plan with little effort. Because the capture logic extends your application's core network layer, analytics events stay synchronized with product functionality.
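As a rough illustration of the capture-and-transform step, the sketch below turns an operation descriptor into an analytics event. It is written in TypeScript and is not taken from the implementation described in this post: the `OperationInfo` and `AnalyticsEvent` shapes, the `toAnalyticsEvent` function, and the treatment of a trailing `_v<number>` suffix as a version are all assumptions made for the example.

```typescript
// Hypothetical shapes; not part of Apollo or any analytics SDK.
type OperationKind = "query" | "mutation" | "subscription";

interface OperationInfo {
  kind: OperationKind;
  name: string; // e.g. "CreateNote_v1"
  variables: Record<string, unknown>;
}

interface AnalyticsEvent {
  name: string;
  properties: Record<string, unknown>;
}

// Derive an event from an operation. The operation name becomes the event
// name, and a trailing "_v<number>" suffix (an assumed naming convention)
// is split out into a `version` property.
function toAnalyticsEvent(op: OperationInfo): AnalyticsEvent {
  const match = /^(.*)_v(\d+)$/.exec(op.name);
  const baseName = match ? match[1] : op.name;
  const version = match ? Number(match[2]) : undefined;
  return {
    name: baseName,
    // Derived properties are spread last so variables cannot overwrite them.
    properties: { ...op.variables, kind: op.kind, version },
  };
}

const noteEvent = toAnalyticsEvent({
  kind: "mutation",
  name: "CreateNote_v1",
  variables: { noteLength: 120 },
});
```

Forwarding operation variables as event properties is convenient, but in practice you would likely want to filter them first, since they can carry sensitive user input.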

Nothing in this world is a "silver bullet", however. Not all user actions are captured as GraphQL operations, e.g. input focus events. This means you will have to either manually track these exceptions, or introduce a complementary UI-based autotrack approach.

Implementation


At its core, the implementation today is a simple Apollo middleware component that looks like the following:

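/* onOperation is the consumer-supplied callback that receives each operation and its result. */
let link = ApolloLink.make((operation, forward) => {
  /* Pass the operation down the link chain, then observe its result. */
  forward(operation)
  ->ApolloLink.Observable.map(data => {
      onOperation(operation, data);
      /* Return the result unchanged so the rest of the chain is unaffected. */
      data;
    })
  ->Js.Option.some
})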

This is combined with logic in the consumer to optionally transform the GraphQL operations before delivering them to your product analytics provider of choice.
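One way to picture the consumer side is an `onOperation` callback assembled from an optional transform and a delivery function. The TypeScript sketch below is an assumption about how such a consumer might look, not the code from the projects mentioned here: `makeOnOperation`, `Transform`, `Deliver`, and the operation names are hypothetical, and `deliver` stands in for whatever call your analytics provider's SDK exposes.

```typescript
// Hypothetical shapes for illustration only.
interface TrackedOperation {
  operationName: string;
  variables: Record<string, unknown>;
}

interface AnalyticsEvent {
  name: string;
  properties: Record<string, unknown>;
}

// A transform may rewrite an event, or return null to drop it entirely.
type Transform = (op: TrackedOperation) => AnalyticsEvent | null;
type Deliver = (event: AnalyticsEvent) => void;

// Default transform: report the operation under its own name.
const passThrough: Transform = (op) => ({
  name: op.operationName,
  properties: op.variables,
});

function makeOnOperation(deliver: Deliver, transform: Transform = passThrough) {
  return (op: TrackedOperation): void => {
    const event = transform(op);
    if (event !== null) deliver(event);
  };
}

// Example: skip a noisy polling query and strip a sensitive variable.
// The array stands in for a real provider call.
const sent: AnalyticsEvent[] = [];
const onOperation = makeOnOperation(
  (event) => { sent.push(event); },
  (op) => {
    if (op.operationName === "PollNotifications_v1") return null;
    const safe = { ...op.variables };
    delete safe["password"];
    return { name: op.operationName, properties: safe };
  },
);

onOperation({ operationName: "PollNotifications_v1", variables: {} });
onOperation({
  operationName: "SignIn_v1",
  variables: { email: "a@b.c", password: "x" },
});
```

Keeping the transform optional means the default behavior reports every operation as-is, while individual projects can rename, filter, or redact events without touching the link itself.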

The approach has worked well within the couple of projects I've integrated it into, and I intend to add more functionality to it over the coming months. In particular, I'd like to add support for a client-side-only resolver [3] for a Mutation CreateEvent operation. The goal is to enable manual instrumentation, through a unified interface, of events you want to track that are not otherwise coupled to GraphQL operations.