Posted on Feb 5, 2022

Using GraphQL as an anti-corruption layer in front of legacy services

In 2018 I presented this topic at Boston Code Camp. At that time we were beginning our journey of using GraphQL at EF Go Ahead Tours. In this post I'll go over what our early steps looked like and what the outcome was for our teams.

The opportunity

Many industries depend on legacy systems that businesses are unwilling or unable to invest in improving or replacing. The risk and cost of replacing those systems can be prohibitive, especially in the short term. Yet the existence of these systems poses a risk to growth by reducing engineering velocity and stifling innovation.

There are many reasons your legacy systems may no longer be sustainable:

Too complex. They lack tests or a clean, scalable architecture, or they rely on outdated or unmaintained tools. Their business rules are undocumented, and some are no longer even relevant.
Too big to fail. These services are often monoliths. They have many interconnected parts, which makes it hard to improve them piece by piece because the impact radius of a single change can be hard to predict.
Not worth the time. If it ain't broke, don't fix it, right? The consequences of looming tech debt have not reached their apex yet.

The weight of legacy systems, and the tech debt that often comes with them, can loom large over your team. It can hurt team culture as people feel they're working on outdated systems that lack an end goal or an opportunity to innovate. This lack of innovation can lead to the following:

Loss of passion. It can be unrewarding to work on a system that everyone has problems with, or whose problems seem endless.
Loss of business. How can we pivot to meet an evolving market?
Loss of trust. Slow progress can cause internal stakeholders and external customers to lose faith in the product.

In a risk-reward assessment you'll often find that replacing these systems all at once incurs too much cost or poses too much risk. As a result, you'll need a plan to retire your legacy over time. The goals, methods, and processes of doing so are worth a separate blog post, but at EF Go Ahead Tours we were looking to do the following:

Redefine our domain based on changes to the business since the system's inception
Improve documentation and ease of use for consumers of the system
Reduce the number of dependencies across domains or systems
Provide a way to better extend domains or systems with new functionality that may not have been possible previously
Allow for the introduction of new technologies such as new frameworks, languages, databases, etc.
Improve developer experience and reduce toil

And of course, we were looking to create new opportunities that drive value for customers.

Why GraphQL?

Contracts between systems are important. These contracts all serve the purpose of making inter-system communication predictable and trackable. In our architecture today contracts get represented by one of three things:

Events on a shared event bus for async communication between services
gRPC + protobufs for the rare need of synchronous communication between services
GraphQL Schema Definition Language for communication between clients and services

With contracts, you can separate the implementation detail from the functionality of a given service. This is an important tool in building an anti-corruption layer, as your first step is often to redefine your domain. It also allows systems to be modular and capable of change while having a predictable impact radius on their dependents.

GraphQL allows you to take disparate data sets and represent them as a cohesive set of models with defined relations. Consider the following example:

[Diagram showing the example described below]

You have a CMS that contains content related to your products:

{
    "id": 12345,
    "name": "Example Product",
    "description": "This is an example product",
    "images": [
        { "url": "https://assets.mystore.com/images/example_product_1.jpg", "alt": "Example product with a colorful background" },
        { "url": "https://assets.mystore.com/images/example_product_2.jpg", "alt": "Example product sitting on a shelf" },
        { "url": "https://assets.mystore.com/images/example_product_1.jpg", "alt": "Materials that example product is built with" }
    ],
    "tags": ["New", "Top Seller"],
    "collections": ["Examples"]
}

And an inventory service that stores pricing and inventory data:

{
    "id": 12345,
    "inventory": 100,
    "available": 56,
    "price": {
        "USD": 100,
        "CAD": 120
    },
    "discounts": [
        {
            "code": "HALFOFF",
            "amount": {
                "USD": 50,
                "CAD": 60
            }
        }
    ]
}

And a review service that stores review information:

[
    {
        "id": 1,
        "productId": 12345,
        "userId": 121212,
        "title": "This product is amazing",
        "description": "This product is incredible!",
        "rating": 5
    },
    {
        "id": 2,
        "productId": 12345,
        "userId": 313131,
        "title": "This product is only ok",
        "description": "This product is just ok.",
        "rating": 3
    }
]

What your clients need is a way to look at a common set of models from multiple different entry points. So your remodeled object in your GraphQL SDL layer could look like the following:

type Query {
    product(id: Int!): Product
    review(id: Int!): ProductReview
}

type Product {
    # Content fields, backed by the CMS
    id: Int!
    name: String
    description: String
    images: [ProductImage!]
    tags: [String!]
    collections: [String!]
    # Stock and pricing fields, backed by the inventory service
    inventory: Int!
    available: Int!
    price: ProductPrice!
    discounts: [ProductDiscount!]
    # Review fields, backed by the review service.
    # Float rather than Int, since an average of ratings is rarely a whole number.
    avgRating: Float!
    reviews: [ProductReview!]
}

type ProductImage {
    url: String!
    alt: String
}

type ProductPrice {
    USD: Int!
    CAD: Int!
}

type ProductDiscount {
    code: String!
    amount: ProductPrice!
}

type ProductReview {
    id: Int!
    productId: Int!
    userId: Int!
    title: String
    description: String
    rating: Int!
    product: Product
}
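To make the remodeling concrete, here is a minimal TypeScript sketch of how the three legacy payloads could be translated into the single Product shape. It merges everything eagerly purely for illustration; a GraphQL layer would normally do this work field by field in resolvers. The interface names, the `toProduct` function, and the choice to report an average of 0 for a product without reviews are all assumptions made for this example, not part of our actual implementation.

```typescript
// Hypothetical TypeScript shapes for the three legacy payloads shown above.
interface CmsProduct {
    id: number;
    name: string;
    description: string;
    images: { url: string; alt?: string }[];
    tags: string[];
    collections: string[];
}

interface Money {
    USD: number;
    CAD: number;
}

interface InventoryRecord {
    id: number;
    inventory: number;
    available: number;
    price: Money;
    discounts: { code: string; amount: Money }[];
}

interface LegacyReview {
    id: number;
    productId: number;
    userId: number;
    title: string;
    description: string;
    rating: number;
}

// The remodeled Product: one shape, regardless of which system owns each field.
type Product = CmsProduct &
    Omit<InventoryRecord, "id"> & {
        avgRating: number;
        reviews: LegacyReview[];
    };

// Translate the legacy payloads into the remodeled Product.
// Assumption: a product with no reviews gets an avgRating of 0.
function toProduct(
    cms: CmsProduct,
    inventory: InventoryRecord,
    allReviews: LegacyReview[],
): Product {
    // Keep only the reviews that belong to this product.
    const reviews = allReviews.filter((r) => r.productId === cms.id);
    const total = reviews.reduce((sum, r) => sum + r.rating, 0);
    return {
        ...cms,
        inventory: inventory.inventory,
        available: inventory.available,
        price: inventory.price,
        discounts: inventory.discounts,
        avgRating: reviews.length === 0 ? 0 : total / reviews.length,
        reviews,
    };
}
```

The point of the sketch is that consumers only ever see `Product`; which legacy system supplied each field stays hidden behind the translation.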

Your clients create their own queries against the graph depending on their use case, without having to care where the data originates.

For example, they may want to query a product and get all of its reviews:

query FindProduct($id: Int!) {
    product(id: $id) {
        id
        name
        avgRating
        reviews {
            title
            description
            rating
        }
    }
}

Or it's a review detail screen and they want to show some associated product information:

query FindReview($id: Int!) {
    review(id: $id) {
        id
        rating
        title
        description
        product {
            name
            description
            avgRating
            images {
                url
                alt
            }
        }
    }
}

In either scenario, the client is able to fetch the data it needs because the SDL acts as a contract that the GraphQL layer promises to fulfill.

In GraphQL, the schema is backed by resolvers. A resolver is a function that resolves the value of a single field. Every field, including queries and mutations, is represented by a resolver. A resolver's signature consists of the following:

The parent object, which is the value returned by the resolver of the enclosing field
Any arguments passed that are represented in the schema
A context object, often constructed per request, that contains claims, attached loaders/services, etc.
An info object that provides metadata about the request
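The four arguments listed above can be sketched as a TypeScript type. This is a simplified illustration rather than the signature of any particular GraphQL server: the `Context` and `Info` interfaces, the `Resolver` type, and the in-memory `REVIEWS` array are hypothetical stand-ins, and real servers pass much richer context and info objects.

```typescript
// Minimal stand-ins for the context and info arguments.
// Real GraphQL servers supply much richer objects; these fields are illustrative.
interface Context {
    userId: number | null; // hypothetical claim taken from the request
}

interface Info {
    fieldName: string; // name of the field being resolved
}

// The four-argument resolver shape described in the list above.
type Resolver<Parent, Args, Result> = (
    parent: Parent,
    args: Args,
    context: Context,
    info: Info,
) => Result | Promise<Result>;

interface Review {
    id: number;
    productId: number;
    rating: number;
}

// In-memory stand-in for the review service.
const REVIEWS: Review[] = [
    { id: 1, productId: 12345, rating: 5 },
    { id: 2, productId: 12345, rating: 3 },
];

// Query.review: a top-level field, so the parent carries nothing useful
// and the lookup is driven by args.
const review: Resolver<unknown, { id: number }, Review | null> = (_parent, args) =>
    REVIEWS.find((r) => r.id === args.id) ?? null;

// Product.reviews: a nested field, so parent is the Product returned by
// the enclosing resolver and there are no arguments.
const productReviews: Resolver<{ id: number }, Record<string, never>, Review[]> = (product) =>
    REVIEWS.filter((r) => r.productId === product.id);
```

Note how the same review data is reachable two ways: by id from the top of the graph, or through the product it belongs to.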

Resolvers allow you to fulfill the SDL requirements of a type from many sources, such as the legacy CMS, inventory, and review services outlined above:

// `Resolvers` is assumed to be a type generated from the schema, and the
// loaders on `context` stand in for clients of the legacy services.
export const resolvers: Resolvers = {
    Query: {
        review(_, args, context) {
            return context.reviewLoader.load(args.id);
        },
        product(_, args, context) {
            return context.productLoader.load(args.id);
        },
    },
    Product: {
        // Review-backed fields
        reviews(product, _, context) {
            return context.reviewLoader.loadByProduct(product.id);
        },
        avgRating(product, _, context) {
            return context.reviewLoader.loadAvgByProduct(product.id);
        },
        // Inventory-backed fields: each one reads from the same inventory record
        async inventory(product, _, context) {
            const result = await context.inventoryLoader.loadByProduct(product.id);
            return result.inventory;
        },
        async available(product, _, context) {
            const result = await context.inventoryLoader.loadByProduct(product.id);
            return result.available;
        },
        async price(product, _, context) {
            const result = await context.inventoryLoader.loadByProduct(product.id);
            return result.price;
        },
        async discounts(product, _, context) {
            const result = await context.inventoryLoader.loadByProduct(product.id);
            return result.discounts;
        },
    },
    // The key must match the schema type name, ProductReview.
    ProductReview: {
        product(review, _, context) {
            return context.productLoader.load(review.productId);
        },
    },
};

In the example code above, you can imagine the loaders as ways to fetch data from the backing systems. These calls could be made over any number of protocols, such as gRPC, HTTP, or SOAP. The idea is to have your model fulfilled from many backing services. Picture how easy it would be to add new functionality relevant to our product models: you would need to add a new field to the schema and a new resolver that calls your new data source.
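Several of the Product resolvers above ask the inventory loader for the same record, so a loader usually caches within a request. The following is a minimal sketch of that idea, not necessarily how our loaders are implemented: `MemoizingLoader`, `fetchInventory`, `resolveStock`, and the fake data are hypothetical names invented for this example, and the backend call is simulated in memory.

```typescript
// Record shape returned by the (hypothetical) legacy inventory service.
interface InventoryRecord {
    inventory: number;
    available: number;
}

// Caches the in-flight promise per key, so repeated loads of the same key
// share a single backend call. Intended to be created once per request.
class MemoizingLoader<K, V> {
    private readonly cache = new Map<K, Promise<V>>();
    private readonly fetcher: (key: K) => Promise<V>;

    constructor(fetcher: (key: K) => Promise<V>) {
        this.fetcher = fetcher;
    }

    load(key: K): Promise<V> {
        let pending = this.cache.get(key);
        if (pending === undefined) {
            pending = this.fetcher(key);
            this.cache.set(key, pending);
        }
        return pending;
    }
}

// Stand-in for a call to the legacy service (HTTP, gRPC, SOAP, ...).
let backendCalls = 0;
const FAKE_INVENTORY = new Map<number, InventoryRecord>([
    [12345, { inventory: 100, available: 56 }],
]);

async function fetchInventory(productId: number): Promise<InventoryRecord> {
    backendCalls += 1;
    const record = FAKE_INVENTORY.get(productId);
    if (record === undefined) {
        throw new Error(`No inventory record for product ${productId}`);
    }
    return record;
}

// Two field lookups for the same product share one loader and one backend call.
async function resolveStock(productId: number): Promise<[number, number]> {
    const loader = new MemoizingLoader(fetchInventory);
    const [inventory, available] = await Promise.all([
        loader.load(productId).then((r) => r.inventory),
        loader.load(productId).then((r) => r.available),
    ]);
    return [inventory, available];
}
```

Because the cache stores promises rather than values, a failed fetch is cached as well. Creating a fresh loader for each request keeps that from outliving the request.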

Where to start?

Getting started can be intimidating. Rome wasn't built in a day, and your graph shouldn't be either. This is an opportunity to rethink your domain and establish a ubiquitous language. Start there. Work with product, senior engineers, and stakeholders to start building a dictionary. Define what your domain should look like. Then take a piece of it, represent it in your graph, and build from there.

If your team tends to build backends for frontends (BFFEs) or to work in isolation, you'll need to get buy-in and collaborate. You want to avoid the graph becoming another place where the same objects get defined in many ways.

At EF Go Ahead Tours we have a staff engineer who serves as a graph champion. Part of their job is to help teams understand how they can benefit from the new system. They also help run trainings and provide architectural guidance and general oversight of the graph.

Principled GraphQL is a great resource from Apollo to guide you on best practices.

Outcomes

At EF Go Ahead Tours we used this method and were able to achieve the following:

Remodeled our domain and established a ubiquitous language with the business
Improved documentation and data democratization for our engineers
Consolidated many BFFEs into a single graph used by many websites, mobile apps, PDF generators, and more
Provided a way to extend legacy-backed functionality with new services, improving the velocity of new development
Allowed for the introduction of new programming languages, frameworks, and databases
Improved developer experience by reducing toil and reducing the number of bugs

Since we took this initial approach, new technology such as Apollo Federation has emerged. This helped us continue to break apart our legacy systems over the last 3 years while innovating. In a follow-up blog post, I will go over how we expanded this approach using a series of GraphQL microservices.

What's next?

GraphQL adoption at EF continues to increase. In the current phase of our journey, we are working on ways to integrate many businesses through the graph. This has posed unique challenges around performance, domain modeling, and more. We'll be writing blog posts on these topics as we go. So stay tuned!