GraphQL Codegen deep-dive

Christian Budde Christensen
10 min read · Jun 7, 2021


This post is a deep dive into the graphql_codegen Dart library, available at pub.dev/packages/graphql_codegen. It aims to give a better understanding of the ideas behind the code generator, focusing on how different GraphQL language features are mapped into usable Dart types.

In a previous post, I covered the motivations behind graphql_codegen and how it supports a pattern of splitting up your GraphQL operations throughout your app to prevent over-fetching and ensure maintainability. While that post focused specifically on Flutter apps, the principles can easily be applied to any Dart app.

The pattern primarily relies on good serializers being generated from GraphQL operations (mutations, subscriptions, and queries); from these, we can pass data around our app in a type-safe way!

Take the following query:

query FetchPerson {
person { name }
}

The result of executing this operation might look something like

{ "person": {"name": "Lars Larsen"}} 

If we were to create a serializer to parse this, e.g., using json_serializable (pub.dev/packages/json_serializable), we could, ignoring necessary factories and constructors, write it as

@JsonSerializable()
class QueryFetchPerson {
  final QueryFetchPerson$person? person;
}
@JsonSerializable()
class QueryFetchPerson$person {
  final String name;
}

Now, notice how in this example, there's a one-to-one mapping between selection sets of the GraphQL query (stuff in {}) and the serializers. While this is a good initial intuition, the one-to-one mapping doesn’t suffice for all cases.

Take the query

query FetchPerson2 {
person { name }
person { age }
}

This operation will result in something like

{ "person": {"name": "Lars Larsen", "age": 71 } }

In this case, while we had three selection sets (three sets of {}), we’ll only need two serializers, as in the example above: one with a person field and another with name and age fields.

This tells us that there’s no 1–1 mapping between selection sets and serializers, but rather a mapping between fields (with selection sets) and serializers. I.e., for each operation, we’ll have one root serializer, and for each unique field, with a selection set, we’ll have another serializer.
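The merging of repeated fields can be sketched in a few lines of Dart. This is only an illustration, not how graphql_codegen is implemented: the Field class and mergeByName function are hypothetical names, and the sketch ignores aliases and arguments for simplicity.

```dart
/// Hypothetical, minimal model of a selected field: a name plus the
/// fields selected inside its selection set.
class Field {
  final String name;
  final List<Field> selections;
  const Field(this.name, [this.selections = const []]);
}

/// Groups selections by field name so that each unique field ends up with
/// one merged selection set, and therefore one serializer.
Map<String, List<Field>> mergeByName(List<Field> selections) {
  final merged = <String, List<Field>>{};
  for (final field in selections) {
    merged.putIfAbsent(field.name, () => []).addAll(field.selections);
  }
  return merged;
}

void main() {
  // The root selection set of FetchPerson2: `person` is selected twice.
  const root = [
    Field('person', [Field('name')]),
    Field('person', [Field('age')]),
  ];
  final merged = mergeByName(root);
  print(merged.keys.toList()); // [person]
  print(merged['person']!.map((f) => f.name).toList()); // [name, age]
}
```

The two person selections collapse into a single entry with both name and age, which corresponds to the single person serializer described above.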

This is a great start! But we’re not quite there yet. One compelling GraphQL language feature is the concept of abstract types. These are interfaces and unions, which can never be returned directly from a field; i.e., the __typename will never be an interface or union type. Instead, a field with an abstract type always returns one of the implementations or members, respectively.

To utilize abstract types fully, GraphQL supports introspection through the __typename meta-field and spreads with type conditions. Focusing a moment on inline fragment spreads with type conditions, this looks something like

query FetchPerson3 {
person {
__typename
name
... on Author {
numBooks
}
}
}

and the result may look like:

{"person": {"name": "Karl Ove Knausgaard", "numBooks": "at least three"}}

if the person is an author, or

{"person": {"name": "Lars Larsen"}}

if he’s not.

How do we generate a serializer for this? An obvious approach is to make all fields in a spread optional, regardless of the nullability of the field type. This could look something like

@JsonSerializable()
class QueryFetchPerson3$person {
  final String name;
  final String? numBooks;
}

This can work, but we’re missing some crucial information on the concrete type of the returned object. We’ll need to either assume or null check if we’re dealing with an Author or some other concrete implementation.

Instead, we can express this concept of resolved concrete types by creating the serializers:

@JsonSerializable()
class QueryFetchPerson3$person {
  final String name;
  factory QueryFetchPerson3$person.fromJson(Map<String, dynamic> json) {
    // Dispatch on the concrete type reported by the server.
    switch (json['__typename']) {
      case 'Author':
        return QueryFetchPerson3$person$Author.fromJson(json);
      default:
        return _$QueryFetchPerson3$personFromJson(json);
    }
  }
}
@JsonSerializable()
class QueryFetchPerson3$person$Author extends QueryFetchPerson3$person {
  final String numBooks;
}

Again, I’ve omitted constructors, some factories, and other methods for simplicity.

This allows us to do a type check on our parsed data, e.g., parsed is QueryFetchPerson3$person$Author, and from then on we can access the numBooks field with good conscience.
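To make the type check concrete, here is a self-contained sketch of the same idea with hand-written factories instead of json_serializable, so it runs without code generation. The constructors and the describe helper are assumptions added for illustration; they are not part of the generated output.

```dart
class QueryFetchPerson3$person {
  final String name;
  QueryFetchPerson3$person(this.name);

  factory QueryFetchPerson3$person.fromJson(Map<String, dynamic> json) {
    // Pick the concrete serializer based on the __typename meta-field.
    switch (json['__typename']) {
      case 'Author':
        return QueryFetchPerson3$person$Author.fromJson(json);
      default:
        return QueryFetchPerson3$person(json['name'] as String);
    }
  }
}

class QueryFetchPerson3$person$Author extends QueryFetchPerson3$person {
  final String numBooks;
  QueryFetchPerson3$person$Author(String name, this.numBooks) : super(name);

  factory QueryFetchPerson3$person$Author.fromJson(Map<String, dynamic> json) =>
      QueryFetchPerson3$person$Author(
        json['name'] as String,
        json['numBooks'] as String,
      );
}

/// Hypothetical consumer: after the `is` check, Dart promotes `person`
/// to the Author serializer, so numBooks is accessible.
String describe(QueryFetchPerson3$person person) {
  if (person is QueryFetchPerson3$person$Author) {
    return '${person.name} (books: ${person.numBooks})';
  }
  return person.name;
}

void main() {
  final author = QueryFetchPerson3$person.fromJson({
    '__typename': 'Author',
    'name': 'Karl Ove Knausgaard',
    'numBooks': 'at least three',
  });
  print(describe(author));
}
```

Note that the dispatch only works when the response actually contains __typename, which is why the query selects it.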

To summarize, we have three types of serializers:

  1. An operation serializer for each operation (query, mutation, or subscription).
  2. A field serializer for each field with a selection set.
  3. A type serializer for each concrete type.

Figuring out when to create the type serializer isn’t as easy as figuring out when to create the other serializers. We can’t just look directly at the query and create a serializer; we’ll need to take both the type of the field and the type condition of the spread into account.

This is because fields can have abstract and concrete types, spreads can have abstract and concrete type conditions, and there’s no guarantee that the type condition will ever be true. So let’s say that we have the schema:

type Query { t: I1 }
interface I1 { name: String }
interface I2 { age: Int }
interface I3 { isDogPerson: Boolean }
type T1 implements I1 & I2 {
  name: String
  age: Int
  hasTwoHands: Boolean
}
type T2 implements I2 {
  age: Int
}
type T3 implements I3 {
  isDogPerson: Boolean
}

Given this, the following query is completely valid

query Q { 
t {
... on I1 { name }
... on I2 { age }
... on I3 { isDogPerson }
... on T1 { hasTwoHands }
}
}

When creating our serializers, we’ll need to figure out which types to create a type serializer for. We cannot create type serializers for the abstract types (the interfaces I1, I2, and I3), because the value returned from the field will always be of a concrete type.

Making serializers for concrete types T2 and T3 does not make sense since they don’t implement I1 and hence can never be returned from the field t.

Furthermore, if a spread will always be expanded, we should add its fields directly to the parent serializer instead of creating a nested type serializer. This is the case for the ... on I1 spread, whose type condition will always be true since the type of the field, t, is I1.

We can figure out which concrete types to create serializers for with the following algorithm:

Let TField and TCondition be the types of the field and the type condition, respectively. Then, let SConcrete<T> be the set of possible concrete types for a given type T: the type itself, all concrete types implementing it, or all member types, if T is an object, interface, or union type, respectively.

  1. If there’s an exact match between the TField and TCondition, then we should inline the spread on the parent serializer. The spread is basically redundant.
  2. If the intersection between SConcrete<TField> and SConcrete<TCondition> is empty, we can safely ignore the spread since it’ll never apply. I.e., no matter which concrete type the field resolves to, it’ll never fulfill the type condition.
  3. If the field type is a concrete type, we’ll inline the fragment. In this case, since the intersection is not empty, we know that the intersection is exactly the concrete type. Thus, with the same logic as in step 1, the spread is effectively redundant and we can inline it.
  4. Finally, we’re left with a non-empty intersection between the possible types of a guaranteed abstract field type and the possible types of the type condition. For each type in the intersection between SConcrete<TField> and SConcrete<TCondition> we can now generate a type serializer.
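The four steps can be sketched as a small Dart function. This is a minimal illustration under stated assumptions, not the library’s actual code: possibleTypes is a hand-written, hypothetical stand-in for SConcrete<T> built from the example schema, and SpreadAction, SpreadResolution, and resolveSpread are names invented for this sketch.

```dart
enum SpreadAction { inline, ignore, typeSerializers }

/// Hypothetical stand-in for SConcrete<T>, written out by hand for the
/// example schema: each type maps to its possible concrete types.
const possibleTypes = <String, Set<String>>{
  'I1': {'T1'},
  'I2': {'T1', 'T2'},
  'I3': {'T3'},
  'T1': {'T1'},
  'T2': {'T2'},
  'T3': {'T3'},
};

class SpreadResolution {
  final SpreadAction action;

  /// Concrete types needing a type serializer (only set in step 4).
  final Set<String> concreteTypes;
  const SpreadResolution(this.action, [this.concreteTypes = const <String>{}]);
}

SpreadResolution resolveSpread(String fieldType, String condition) {
  final fieldTypes = possibleTypes[fieldType]!;
  final conditionTypes = possibleTypes[condition]!;

  // Step 1: exact match, so the spread is redundant and can be inlined.
  if (fieldType == condition) {
    return const SpreadResolution(SpreadAction.inline);
  }
  // Step 2: no overlap, so the type condition can never be fulfilled.
  final overlap = fieldTypes.intersection(conditionTypes);
  if (overlap.isEmpty) {
    return const SpreadResolution(SpreadAction.ignore);
  }
  // Step 3: a concrete field type has only itself as a possible type.
  final fieldIsConcrete =
      fieldTypes.length == 1 && fieldTypes.contains(fieldType);
  if (fieldIsConcrete) {
    return const SpreadResolution(SpreadAction.inline);
  }
  // Step 4: abstract field type; one type serializer per overlapping type.
  return SpreadResolution(SpreadAction.typeSerializers, overlap);
}

void main() {
  // The field t has type I1 in the example query.
  for (final condition in ['I1', 'I2', 'I3', 'T1']) {
    final result = resolveSpread('I1', condition);
    print('... on $condition: ${result.action} ${result.concreteTypes}');
  }
}
```

For the example query, this sketch inlines ... on I1, ignores ... on I3, and resolves both ... on I2 and ... on T1 to a type serializer for T1, which would then carry age and hasTwoHands.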

While this example was done on inline fragments, the algorithm for fragments doesn’t differ. For the sake of generating the serializers, we can treat the type condition of the fragment and its root selection set as the type condition and the selection set of an inline fragment spread, respectively.

Just like the field serializers are merged for multiple selections on the same field, the type serializers must be merged on the concrete type of the serializer. We cannot have multiple type serializers for the same type as children of another serializer. Think of the switch statement above: how would you resolve two different serializers with the same concrete type?

So to summarize, we have three types of serializers: Operation serializers, field serializers, and type serializers. We’ve gone through how and when to generate these serializers.

Fragments

A powerful, if not the most powerful, feature of graphql_codegen is the ability to map fragments to interfaces! I know Dart doesn’t have interfaces, but abstract classes. However, for the remainder of this post, when I mention interfaces think of abstract classes with no member implementations that are never extended, only implemented.

In a previous post, mentioned above, I made an argument about using fragments to split up your operations and pass data down to the children from a root query, thus preventing over-fetching and keeping your code nice, scalable, and maintainable.

Take the following query responsible for fetching a person:

query Print {
person {
...PrintPerson
siblings {
...PrintPerson
}
}
}

This will have some companion code:

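void printQuery(Map<String, dynamic> rawData) {
  final parsed = QueryPrint.fromJson(rawData);
  printPerson(parsed.person);
  // forEach rather than map: Iterable.map is lazy and would never call printPerson.
  parsed.person.siblings.forEach(printPerson);
}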

This code uses a fragment that can be defined somewhere else and look something like:

fragment PrintPerson on Person { name }

and it’ll have similar accompanying code

printPerson(FragmentPrintPerson person) => print(person.name);

With the serializers we have built until now, the examples above are not possible. We do not have a FragmentPrintPerson type. If we wanted to pass the person to the printPerson method, we’d either have to manually transform it or change the call signature of printPerson to use the serializers directly. The former option is cumbersome and goes against what we’re trying to achieve with code generation. The latter option ties printPerson tightly to the query, thus limiting reusability and removing any benefit from splitting up the operation.

To get around this we can generate an interface from the fragment and make any serializer spreading the fragment implement it! Extending the example from above, the fragment interface would look something like

abstract class FragmentPrintPerson {
String get name;
}

and the serializers:

class QueryPrint$person implements FragmentPrintPerson {
final String name;
final List<QueryPrint$person$siblings> siblings;
}
class QueryPrint$person$siblings implements FragmentPrintPerson {
final String name;
}

Now the code above will work!

In this example, the fragment was quite simple. Notably, it didn’t have any nested selections. If we introduce the fragment,

fragment PrintPersonAdvanced on Person {
  ...PrintPerson
  siblings { ...PrintPerson }
}

the corresponding code

void printPersonWithSiblings(
  FragmentPrintPersonAdvanced person,
) {
  printPerson(person);
  // forEach rather than map: Iterable.map is lazy and would never call printPerson.
  person.siblings.forEach(printPerson);
}

and use this fragment in our query

query Print { person { ...PrintPersonAdvanced } }

the serializers might look like:

class QueryPrint$person implements 
FragmentPrintPersonAdvanced, FragmentPrintPerson {
final String name;
final List<QueryPrint$person$siblings> siblings;
}
class QueryPrint$person$siblings implements FragmentPrintPerson {
final String name;
}

What would the fragment interface look like? If we follow the algorithm from generating serializers, it would look something like:

abstract class FragmentPrintPersonAdvanced 
implements FragmentPrintPerson {
String get name;
List<FragmentPrintPersonAdvanced$siblings> get siblings;
}
abstract class FragmentPrintPersonAdvanced$siblings
implements FragmentPrintPerson {
String get name;
}

The extra interface FragmentPrintPersonAdvanced$siblings might seem redundant, but remember that fragments also need to support multiple fragment spreads, so we can not replace this with FragmentPrintPerson.

Revisiting our query serializer above, you may notice that this is invalid Dart code. The QueryPrint$person serializer implements FragmentPrintPersonAdvanced, but the siblings field is not a list of FragmentPrintPersonAdvanced$siblings.

To fix this, we’ll need to keep track of all child interfaces, generated from a fragment, e.g. FragmentPrintPersonAdvanced$siblings, and add this to the corresponding serializer fields. Thus, if we update our serializers above to something like:

class QueryPrint$person implements 
FragmentPrintPersonAdvanced, FragmentPrintPerson {
final String name;
final List<QueryPrint$person$siblings> siblings;
}
class QueryPrint$person$siblings
implements FragmentPrintPerson,
FragmentPrintPersonAdvanced$siblings {
final String name;
}

we’re in business! This holds for both field serializers and type serializers.
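The following self-contained sketch shows why the extra interface on the siblings serializer makes the code valid. The constructors and the siblingNames helper are assumptions added so the example runs on its own; the class names follow the ones used above.

```dart
abstract class FragmentPrintPerson {
  String get name;
}

abstract class FragmentPrintPersonAdvanced$siblings
    implements FragmentPrintPerson {}

abstract class FragmentPrintPersonAdvanced implements FragmentPrintPerson {
  List<FragmentPrintPersonAdvanced$siblings> get siblings;
}

class QueryPrint$person$siblings
    implements FragmentPrintPerson, FragmentPrintPersonAdvanced$siblings {
  @override
  final String name;
  QueryPrint$person$siblings(this.name);
}

class QueryPrint$person
    implements FragmentPrintPersonAdvanced, FragmentPrintPerson {
  @override
  final String name;

  // A valid override only because QueryPrint$person$siblings implements
  // FragmentPrintPersonAdvanced$siblings: Dart generics are covariant, so
  // this list type is a subtype of the one the interface asks for.
  @override
  final List<QueryPrint$person$siblings> siblings;

  QueryPrint$person(this.name, this.siblings);
}

/// Hypothetical consumer that only knows about the fragment interface.
List<String> siblingNames(FragmentPrintPersonAdvanced person) =>
    person.siblings.map((sibling) => sibling.name).toList();

void main() {
  final person = QueryPrint$person('Lars', [
    QueryPrint$person$siblings('Lone'),
    QueryPrint$person$siblings('Lise'),
  ]);
  print(siblingNames(person));
}
```

If the siblings serializer did not implement FragmentPrintPersonAdvanced$siblings, the siblings override would be rejected by the analyzer, which is the invalid code described earlier.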

Therefore, we’ll need to generate three types of interfaces.

  • Fragment interface: The root interface such as FragmentPrintPerson and FragmentPrintPersonAdvanced.
  • Field interface: Similar to the field-serializer we’ll need to create interfaces for each field. E.g., FragmentPrintPersonAdvanced$siblings.
  • Type interfaces: Not covered here, but with exactly the same algorithm for generating type serializers, we’ll need to create corresponding type interfaces.

The observant reader will notice some similarities between the interfaces and the serializers. Indeed, the interfaces follow the shape of the serializers and can be generated in exactly the same way as the serializers. They’ll need to merge fields and types, and resolve concrete types for type interfaces.

Enums and inputs

Generating serializers for inputs and enums is more or less what you’d expect. Since these are type definitions, rather than executable definitions, we don’t have to deal with fragment spreads or abstract types. We’ll just create a serializer with the expected fields and corresponding types.

For example, from

input SiblingInput { name: String! } 
input PersonInput { name: String! siblings: [SiblingInput!]! }
enum Level { ADVANCED INTERMEDIATE BEGINNER }

we can generate the corresponding serializer

@JsonSerializable()
class InputPersonInput {
  final String name;
  final List<InputSiblingInput> siblings;
}
@JsonSerializable()
class InputSiblingInput {
  final String name;
}

One small caveat on enums, though. While we can generate Dart enums one-to-one with the GraphQL enums, it’s a good idea to design for the addition of new enum values. If we do not account for these, deserialization will break once a new value is added to the API.

This can easily be done by adding a fallback value to the enum. From the enum above, the generated serializer could look like:

enum EnumLevel {
  @JsonValue("BEGINNER")
  beginner,
  @JsonValue("INTERMEDIATE")
  intermediate,
  @JsonValue("ADVANCED")
  advanced,
  $unknown,
}

Then any field using this enum will need to be annotated with the JsonKey annotation:

@JsonKey(unknownEnumValue: EnumLevel.$unknown)
final EnumLevel level;
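To see the effect of the fallback without running the code generator, here is a hand-written sketch of the equivalent parsing logic. The levelFromJson function is a hypothetical helper for illustration; with json_serializable, the generated code plays this role.

```dart
enum EnumLevel { beginner, intermediate, advanced, $unknown }

/// Hypothetical hand-written equivalent of the generated enum decoding:
/// known values map one-to-one, anything else falls back to $unknown.
EnumLevel levelFromJson(String value) {
  switch (value) {
    case 'BEGINNER':
      return EnumLevel.beginner;
    case 'INTERMEDIATE':
      return EnumLevel.intermediate;
    case 'ADVANCED':
      return EnumLevel.advanced;
    default:
      // A value added to the API after this client was built.
      return EnumLevel.$unknown;
  }
}

void main() {
  print(levelFromJson('ADVANCED'));
  // 'EXPERT' is a made-up future value that this client doesn't know about.
  print(levelFromJson('EXPERT'));
}
```

An older client receiving a new value keeps working and can decide how to present the unknown case, instead of throwing during deserialization.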

And that’s it!

Variables

Lastly, we’ll need to generate serializers for the variables. Much like the input serialization above, variables generally behave nicely. This means that we can generate a VariablesQuerySomeQuery serializer for each operation defining variables.

Given the query

query SomeQuery($age: Int!) { ... }

the corresponding variables serializer would look like

@JsonSerializable()
class VariablesQuerySomeQuery {
  final int age;
}
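As a rough illustration of how such a variables serializer is used, the sketch below writes the toJson method by hand and encodes the result. The constructor, the toJson body, and the surrounding 'variables' key are assumptions for this example rather than generated output.

```dart
import 'dart:convert';

class VariablesQuerySomeQuery {
  final int age;
  const VariablesQuerySomeQuery({required this.age});

  /// Hand-written stand-in for the generated toJson method.
  Map<String, dynamic> toJson() => {'age': age};
}

void main() {
  const variables = VariablesQuerySomeQuery(age: 71);
  // The variables map is what gets sent alongside the query document.
  print(jsonEncode({'variables': variables.toJson()}));
}
```

Because the constructor requires age as a non-nullable int, forgetting a required variable or passing the wrong type becomes a compile-time error rather than a server-side one.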

Summary

In this deep dive, I’ve tried to lay out the principles and ideas behind graphql_codegen. I’ve shown the three types of serializers we generate for operations, how to handle fragments in a usable way, and how to generate serializers for input types, enum types, and variables.

If you want to learn more, I suggest checking out the graphql_codegen implementation at github.com/heftapp/graphql_codegen. PRs and feedback are always highly appreciated!

Acknowledgements

graphql_codegen is heavily inspired by Artemis, a great code generator and client that differs in its approach to fragments and spreads but otherwise follows many of the same ideas with respect to serializers.
