Literal Dev Diary - June 21st, 2020 ■ javamonn

This week I cut a release adding tagging support to highlights. What is currently implemented is basic tag management functionality. The next item on the roadmap is an interface for actually making use of the tags - seeing highlights within the same tag and visualizing relationships between tags.

In terms of the product manifesto and vision as stated on the landing page, tags are fundamental:

Contextualize and associate.

Ideas do not exist in isolation. Ideas are threads that span otherwise disparate contexts. Annotations organized strictly by their original source is an artificial limitation. Create bridges, not silos.

I want to touch up the visual design of the current tag management functionality, and I still have a couple of issues with the user experience as implemented, but for now I want to move on to getting some of the broader contextualization functionality in place. On this front, I anticipate that over the coming weeks I'll be working more on design and evaluation than on code itself.

Cleaning up orphaned Tags

Tags have a many-to-many relationship with highlights. When a tag is removed from a highlight, we want to delete the tag itself if it is no longer associated with any highlights. From a product perspective, anticipating some of the future tag-centric functionality I mentioned above, "orphaned" tags should be excluded, since a tag with no highlights has nothing to show or relate to.
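
To make "orphaned" concrete, here is a minimal in-memory sketch. The `HighlightTag` shape and the `findOrphanedTags` name are assumptions for illustration only, not Literal's actual schema or code.

```typescript
// Hypothetical shape of one row in the highlight-tag association table.
interface HighlightTag {
  highlightId: string;
  tagId: string;
}

// Returns the IDs of tags that no longer have any highlight associated.
function findOrphanedTags(
  tagIds: string[],
  associations: HighlightTag[]
): string[] {
  const inUse = new Set(associations.map((a) => a.tagId));
  return tagIds.filter((tagId) => !inUse.has(tagId));
}

const associations: HighlightTag[] = [
  { highlightId: "h1", tagId: "philosophy" },
  { highlightId: "h2", tagId: "philosophy" },
  { highlightId: "h2", tagId: "ethics" },
];

// Simulate removing the "ethics" tag from highlight h2, then re-check.
const afterRemoval = associations.filter(
  (a) => !(a.highlightId === "h2" && a.tagId === "ethics")
);
const orphaned = findOrphanedTags(["philosophy", "ethics"], afterRemoval);
// orphaned is ["ethics"]: "philosophy" is still used by h1 and h2.
```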

As I mentioned last week, highlight and tag updates are transactional. The client issues a single Mutation describing the set of changes that it wants to apply to the highlight and associated tags, and AppSync performs a TransactWriteItems operation to fulfill it. In the course of this update mutation, we may end up disassociating one or more tags from a highlight, causing those tags to potentially become orphaned unless we delete the tag as well.

In moving to implement this clean-up behavior, I found that there wasn't a way to accomplish this directly within AppSync. The problem is as follows:

AppSync has "linear" resolvers. An incoming GraphQL Mutation or Query is mapped to one of a set of DynamoDB operations (in this case at least ^[1]). There's also a pipeline resolver, which executes several individual "linear" resolvers chained in series. Both of these resolver types are statically configured (via CloudFormation, in the case of Amplify).
In order to identify whether a Tag is still associated with any Highlights, we need to issue a Query against DynamoDB. If several Tags are removed from a Highlight in a single Mutation, we need to issue one Query per removed Tag.
AppSync has a Query request mapping document, but no way to issue more than one Query from a single resolver. TransactGetItems and BatchGetItem, although they both allow bulk get operations, do not allow for issuing query operations. Additionally, because pipeline resolvers are configured ahead of time, we can't add a stage for each removed tag we want to issue a query for.

AppSync seems to try very hard to keep resolvers performant, and I feel like many of the constraints and apparent design shortcomings I've come across are direct results of this philosophy. In a way, AppSync feels like a logical evolution of "Serverless". Within AppSync, the developer is really only responsible for the business logic - you are no longer operating at the level of the HTTP request and response, or managing service connections, or even thinking about the run-time at all. ^[2] One of the trade-offs with an environment like this is that the pasture continues to get smaller and smaller - something that may be possible in other lower-level back-end environments is no longer possible within AppSync.

I ended up implementing the Tag cleanup behavior by subscribing a Lambda function to a DynamoDB stream, watching for remove operations on the appropriate association table, and executing some logic to check whether the disassociated tag still has at least one association and, if not, removing it. You can see the Lambda function here. This is conceptually very similar to a trigger on a SQL table, and comes with a similar set of trade-offs.
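
The general shape of such a handler can be sketched as follows. This is not the actual Literal function: the `tagId` attribute name, the `TagStore` interface and the `cleanUpOrphanedTags` name are all assumptions, and the data access is injected so the sketch does not depend on a particular SDK. It also assumes the stream is configured to include old images, since `OldImage` is only present in that case.

```typescript
// Minimal local types mirroring the parts of a DynamoDB stream event used here.
interface StreamRecord {
  eventName?: "INSERT" | "MODIFY" | "REMOVE";
  dynamodb?: { OldImage?: Record<string, { S?: string }> };
}
interface StreamEvent {
  Records: StreamRecord[];
}

// Hypothetical data-access layer (could be backed by GraphQL or the AWS SDK).
interface TagStore {
  countAssociations(tagId: string): Promise<number>;
  deleteTag(tagId: string): Promise<void>;
}

// Deletes every tag that lost its last association in this batch of records.
// Resolves with the IDs of the tags that were deleted.
async function cleanUpOrphanedTags(
  event: StreamEvent,
  store: TagStore
): Promise<string[]> {
  // Collect the distinct tags that were disassociated in this batch.
  const removedTagIds = new Set<string>();
  for (const record of event.Records) {
    if (record.eventName !== "REMOVE") continue;
    const tagId = record.dynamodb?.OldImage?.["tagId"]?.S; // assumed attribute name
    if (tagId !== undefined) removedTagIds.add(tagId);
  }

  // One query per removed tag: the dynamic fan-out that a statically
  // configured pipeline resolver cannot express.
  const deleted: string[] = [];
  await Promise.all(
    Array.from(removedTagIds).map(async (tagId) => {
      if ((await store.countAssociations(tagId)) === 0) {
        await store.deleteTag(tagId);
        deleted.push(tagId);
      }
    })
  );
  return deleted;
}
```

As with a SQL trigger, the cleanup runs after the original write has committed, so there is a short window in which an orphaned tag still exists.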

GraphQL clients on the server

Literal has a couple of Lambda functions that interface with DynamoDB via GraphQL, instead of directly issuing operations with the AWS SDK. In general, I'm a strong advocate for this approach, for reasons including keeping logic centralized and data consistent among others. Surprisingly, the Amplify JS SDK does not support interfacing with the AppSync GraphQL API from Lambda. The documentation shows examples of issuing requests against AppSync with lower level Node HTTP libraries, and the non-support is mentioned in some issues opened against the repository.

I remember coming across a tweet (which I can no longer find) with a horror story of someone running Apollo client on a server, and debugging an issue they were observing down to Apollo's in-memory cache being shared across connections, resulting in a combination of stale data being returned and a potential security issue. I'm not sure if this affects Amplify's GraphQL client at all, but it highlights the kind of issue that can arise when porting a project designed for clients onto the server.

In Literal's case, building off of the docs, it was relatively simple to implement a client. I'll have to revisit this if server-side rendering becomes prioritized.
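
A client of this kind can be quite small. The sketch below is an assumption-laden illustration rather than Literal's implementation: `makeGraphQLClient` and the `PostJson` transport type are hypothetical names, and the transport is injected so the example stays independent of any specific HTTP library. Authentication is left to the caller via `authHeaders` (for example an `x-api-key` header when the API uses API key authorization).

```typescript
interface GraphQLResponse<T> {
  data?: T;
  errors?: { message: string }[];
}

// Hypothetical transport: POSTs a body to a URL and resolves with the
// response body as text. In a Lambda this could wrap Node's https module.
type PostJson = (
  url: string,
  headers: Record<string, string>,
  body: string
) => Promise<string>;

function makeGraphQLClient(
  endpoint: string,
  authHeaders: Record<string, string>,
  post: PostJson
) {
  // Each call is independent: no cache or other state is kept between requests.
  return async function request<T>(
    query: string,
    variables: Record<string, unknown> = {}
  ): Promise<T> {
    const body = JSON.stringify({ query, variables });
    const text = await post(
      endpoint,
      { "Content-Type": "application/json", ...authHeaders },
      body
    );
    const response = JSON.parse(text) as GraphQLResponse<T>;
    if (response.errors && response.errors.length > 0) {
      throw new Error(response.errors.map((e) => e.message).join("; "));
    }
    if (response.data === undefined) {
      throw new Error("GraphQL response contained no data");
    }
    return response.data;
  };
}
```

Keeping the client stateless sidesteps the shared-cache problem described above, at the cost of giving up caching entirely.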

W3C Web Annotations

I came across the Web Annotation spec this week. I had come across the spec once before, when working on the previous incarnation of Literal Reader ^[3], but it had since fallen off of my radar.

In short, Literal is an annotation management system and it will implement the Web Annotation spec.

There are aspects of the spec that describe functionality or products outside of the current scope of the project. Regardless, building on a well thought-out and standardized data model is more than worth the work of back-tracking a bit and refactoring the Literal data model to be compliant, especially when the project is at as early a stage as it currently is. I'm aiming for close compatibility with the spec, with small changes to better support GraphQL (e.g. making the full object available in all cases, instead of just the ID) and support for DynamoDB index and query patterns.
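
As a rough illustration of the "full object instead of just the ID" adjustment, here is a simplified sketch. The field names (`@context`, `type`, `body`, `target`, `TextualBody`, `TextQuoteSelector`) follow the Web Annotation Data Model, but the types are heavily trimmed, and `expandBodies` and its `lookup` parameter are hypothetical names, not part of the spec or of Literal.

```typescript
interface TextualBody {
  id?: string;
  type: "TextualBody";
  value: string;
  purpose?: string; // e.g. "tagging" or "commenting"
}

interface TextQuoteSelector {
  type: "TextQuoteSelector";
  exact: string;
  prefix?: string;
  suffix?: string;
}

// Simplified: the spec also allows a single body and other target forms.
interface Annotation<Body> {
  "@context": "http://www.w3.org/ns/anno.jsonld";
  id: string;
  type: "Annotation";
  body: Body[];
  target: { source: string; selector: TextQuoteSelector };
}

// In the spec a body may be embedded or referenced by IRI only.
type SpecAnnotation = Annotation<TextualBody | string>;
// GraphQL-friendly variant: every body is always a full object.
type ExpandedAnnotation = Annotation<TextualBody>;

// Replaces IRI-only bodies with the full objects returned by `lookup`.
function expandBodies(
  annotation: SpecAnnotation,
  lookup: (iri: string) => TextualBody | undefined
): ExpandedAnnotation {
  const body = annotation.body.map((b) => {
    if (typeof b !== "string") return b;
    const resolved = lookup(b);
    if (resolved === undefined) throw new Error(`Unknown body: ${b}`);
    return resolved;
  });
  return { ...annotation, body };
}
```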

I'll probably spend some time writing about the spec in more detail as I start implementing it within Literal.
