The MOIST Principle for GraphQL Schema Design
Sorry for the title, I had to.
I've recently seen a slight push for DRY (Don't Repeat Yourself) and the importance of sharing types as much as possible when building GraphQL schemas. This is something I hear a lot, and I've even seen some linting rules being created to enforce this principle.
I have to admit my experience and mistakes have made me a WET (Write Everything Twice) advocate when it comes to schema design. In this post, I'll explain why that's the case, and why it might be important for getting the full benefits of GraphQL.
WET seems to lack nuance; there are certainly parts of the schema that should share types when it makes sense. Instead, I present to you the MOIST principle: Mitigate the Overuse of Illusory Shared Types.
APIs are Hard to Change
At first glance, sharing types as much as possible seems like a great idea in GraphQL:
- Consistency for free across different entry points.
- Collaboration is almost "forced."
- Easy to maintain, changing things only in one place.
These are all great points to consider. They're undeniable.
The WET vs. DRY battle is not new, and it extends well beyond API schemas. There's one key difference between API schemas and your typical statically typed programming language that I think should make everyone err on the side of caution when it comes to sharing and reusing types: how easy they are to change after the fact.
Changing API schemas is usually much harder than refactoring a program. The network boundary and how decoupled callee and caller are means that changing types requires long deprecation cycles, usage analysis, reaching out, running brownouts, keeping backward compatibility, etc. Anyone who's managed a large API surface knows how arduous it is to change an API that is already in use.
All that to say, when it comes to network APIs, mistakes about what should or shouldn't be shared are much more costly, so I like to default to writing twice rather than sharing.
Identical Type Shape != Same Type
One particularly easy mistake to make early on in API development is to assume that because two response "shapes" are identical, they should share a single type. The most infamous example of this is probably User types. I can't tell you how many times I've seen teams having to roll back changes/deprecate their initial structure due to sharing a User type all over the place.
For example, it used to be common to have a viewer root field in GraphQL. This would represent the currently logged-in user. Then, you may have an Organization type that has a list of Team, and Team with a list of User. You may also have a User type used when looking at a profile.
At first, all these users may look exactly the same:
type User {
  name: String!
  profilePictureUrl: URL!
}
After a while, most have the unfortunate realization that these were, in fact, three distinct concepts of the domain. The viewer field starts wanting contextual fields about the viewer, which make no sense in other contexts. The team users start having contextual team-specific information, and a profile has data that can only be seen by the logged-in user. Our preferred schema would have looked like this:
type Viewer {
  name: String!
  profilePictureUrl: URL!
  hasUnreadNotifications: Boolean!
}

type UserProfile {
  name: String!
  profilePictureUrl: URL!
  paymentPlan: PaymentPlan!
}

type TeamMember {
  name: String!
  profilePictureUrl: URL!
  memberSince: Date!
}
(I'm deliberately skipping the fact that team members should probably use an edge/node pattern, but the principle remains the same.)
Keen readers might be bothered by the repetition of name and profilePictureUrl. I'm not bothered by this at all when it comes to a few fields. Fields are relatively cheap in GraphQL, and the maintenance cost can always be reduced by typical code techniques rather than simplifying the schema. But if it starts bothering us too much, I suggest using composition instead:
type UserData {
  name: String!
  profilePictureUrl: URL!
}

type Viewer {
  userData: UserData!
  hasUnreadNotifications: Boolean!
}

type UserProfile {
  userData: UserData!
  paymentPlan: PaymentPlan!
}
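For the non-composed version, the "typical code techniques" mentioned above might look something like the following TypeScript sketch. It is only an illustration: the record shapes (UserRecord, TeamMembershipRecord) and the resolver maps are hypothetical and not tied to any particular GraphQL server library. The idea is that the schema keeps three distinct types while the repeated fields are implemented once on the server.

```typescript
// Hypothetical backing records; these shapes are assumptions for illustration.
interface UserRecord {
  name: string;
  profilePictureUrl: string;
  unreadNotificationCount: number;
}

interface TeamMembershipRecord {
  user: UserRecord;
  memberSince: string; // ISO date string
}

// The fields repeated in the schema are implemented exactly once.
const commonUserFields = {
  name: (user: UserRecord): string => user.name,
  profilePictureUrl: (user: UserRecord): string => user.profilePictureUrl,
};

// Viewer reuses the shared implementation and adds its contextual field.
const viewerResolvers = {
  ...commonUserFields,
  hasUnreadNotifications: (user: UserRecord): boolean =>
    user.unreadNotificationCount > 0,
};

// TeamMember has a different parent object, so it delegates instead of spreading.
const teamMemberResolvers = {
  name: (m: TeamMembershipRecord): string => commonUserFields.name(m.user),
  profilePictureUrl: (m: TeamMembershipRecord): string =>
    commonUserFields.profilePictureUrl(m.user),
  memberSince: (m: TeamMembershipRecord): string => m.memberSince,
};
```

If Viewer and TeamMember later diverge (say, team members get a team-specific display name), only the resolver map for that type changes; clients of the other types never notice.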
Strange authorization problems are a common symptom of this mistake. If the profile of the logged-in user shared a User type, and only the logged-in user could see the paymentPlan field, this authorization logic now becomes a runtime problem. Clients using the API have no idea in which contexts they can select paymentPlan. GraphQL being more static/declarative, this is incredibly annoying. This forces us to make paymentPlan nullable, with a description explaining that context. Using a specific type like UserProfile can mean a stronger schema, and we can guard authorization at places that make more sense, like on a viewer field, for example.
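To make the difference concrete, here is a minimal TypeScript sketch of the two designs as a server might resolve them. Everything in it is an assumption made for illustration: the RequestContext shape, the function names resolveSharedUser and resolveViewerProfile, and the idea that UserProfile is only reachable through the viewer. With the shared type, the check is scattered into a nullable field; with the dedicated type, the guard sits at a single entry point and paymentPlan can stay non-null.

```typescript
type PaymentPlan = "FREE" | "PRO";

// Hypothetical stored user and request context (assumptions, not a real API).
interface UserRecord {
  id: string;
  name: string;
  profilePictureUrl: string;
  paymentPlan: PaymentPlan;
}

interface RequestContext {
  currentUserId: string | null; // null when nobody is logged in
}

// Shared design: one User type everywhere, so paymentPlan has to be nullable
// and clients cannot tell from the schema when it will actually be present.
interface SharedUser {
  name: string;
  profilePictureUrl: string;
  paymentPlan: PaymentPlan | null;
}

function resolveSharedUser(user: UserRecord, ctx: RequestContext): SharedUser {
  const isSelf = ctx.currentUserId === user.id;
  return {
    name: user.name,
    profilePictureUrl: user.profilePictureUrl,
    paymentPlan: isSelf ? user.paymentPlan : null, // runtime-only rule
  };
}

// Dedicated design: UserProfile is only handed out for the logged-in user,
// so paymentPlan is always present once you hold a UserProfile.
interface UserProfile {
  name: string;
  profilePictureUrl: string;
  paymentPlan: PaymentPlan;
}

function resolveViewerProfile(
  ctx: RequestContext,
  users: Map<string, UserRecord>,
): UserProfile | null {
  // The authorization guard lives here, at the viewer entry point.
  if (ctx.currentUserId === null) return null;
  const user = users.get(ctx.currentUserId);
  if (user === undefined) return null;
  return {
    name: user.name,
    profilePictureUrl: user.profilePictureUrl,
    paymentPlan: user.paymentPlan,
  };
}
```

In the first version every consumer of SharedUser has to handle a null paymentPlan, even in contexts where it could never be set. In the second, nullability is pushed up to the one place where "not logged in" is a meaningful answer.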
Can We Be BFFs?
A common use of GraphQL is to allow a multitude of clients to be served with more specificity than with a "One-Size-Fits-All" (OSFA) API. Many (including me) have compared it to the Backend For Frontends (BFF) pattern.
Although they share many similarities and goals, a single GraphQL server can't ever quite replace the full flexibility of a BFF architecture. BFFs allow for completely different behaviors per client, even different serialization or API styles. This comes at a cost, consistency for example, which GraphQL can definitely help with.
My current thinking is that it's best to see GraphQL as an approach somewhere in between a BFF architecture and an OSFA approach to gain most benefits. If you're OK with a low cardinality of server-driven use-cases, a well-designed REST/endpoint-based API with an OpenAPI specification is pretty great. If you require absolute flexibility and client independence, it's hard to argue against a BFF approach. GraphQL is for those of us who are attempting to stay in that sweet spot, powering a bit of flexibility per client, while maintaining some sort of server-driven consistency and modeling of our domain.
I swear I'm going somewhere with this. It's very easy to turn GraphQL into something that resembles a One-Size-Fits-All API. Sharing types across the board and attempting to reduce all use cases to "one true type/fields" can quickly do that. To harness most of the power of GraphQL, I believe we need to find that sweet spot, which may mean allowing certain client-specific fields, different ways of representing use-cases, variability in types, etc. Never forget that GraphQL helps us do this without adding any overhead to other clients; we should be using that power.
An OSFA GraphQL API still provides the capability of selecting subsets of the API surface, which may be enough for simple client differences, like a mobile phone displaying the same kind of data as a desktop client, just less of it. In practice, differences might go further than a subset/superset relationship and require different fields altogether. We should not fear this but rather embrace it and be thankful GraphQL helps us manage this complexity.
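The claim that a client-specific field adds no overhead for other clients can be illustrated with a toy model of field selection. The TypeScript sketch below is not how a real GraphQL executor works internally; the select function, the resolver map, and the avatarInitials field are all hypothetical. It only shows the core idea: a field's resolver runs only when a client asks for that field.

```typescript
// Toy model of field selection; an illustration, not a real GraphQL executor.
type Resolver<S> = (source: S) => unknown;

interface UserSource {
  name: string;
  profilePictureUrl: string;
}

// Records which resolvers actually ran, so the cost of each request is visible.
const resolverCalls: string[] = [];

function tracked<S>(fieldName: string, resolve: Resolver<S>): Resolver<S> {
  return (source: S) => {
    resolverCalls.push(fieldName);
    return resolve(source);
  };
}

const userResolvers: Record<string, Resolver<UserSource>> = {
  name: tracked("name", (u: UserSource) => u.name),
  profilePictureUrl: tracked("profilePictureUrl", (u: UserSource) => u.profilePictureUrl),
  // Hypothetical client-specific field, e.g. for a compact mobile avatar.
  avatarInitials: tracked("avatarInitials", (u: UserSource) =>
    u.name
      .split(" ")
      .map((part) => part.charAt(0))
      .join(""),
  ),
};

// Resolve only the fields the client selected.
function select<S>(
  source: S,
  resolvers: Record<string, Resolver<S>>,
  fields: string[],
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const field of fields) {
    const resolve = resolvers[field];
    if (resolve === undefined) {
      throw new Error(`Unknown field: ${field}`);
    }
    result[field] = resolve(source);
  }
  return result;
}
```

A desktop client selecting name and profilePictureUrl never triggers the avatarInitials resolver, while a mobile client can select avatarInitials alone. Adding the extra field changed nothing for the client that does not ask for it.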
Stay MOIST
I hope you don't take away from this post that I never recommend looking for opportunities to share types/logic. Far from it. However, I do think we shouldn't make that an explicit goal. Chances are new types and fields are cheaper than you think, and your clients will thank you for them.
Thanks for reading!