Posted on 2022-06-17

The Pulumi framework lets us define cloud infrastructure as code using popular programming languages and runtimes such as Node.js, Python, Go, .NET, and Java.

Typically, an infrastructure project begins its life as a monolithic Pulumi program that is deployed as a single unit, or Stack. As the project evolves, it might become necessary to split it into multiple Pulumi projects and stacks. This kind of project structure is generally referred to as Micro-Stacks in the Pulumi universe. A Pulumi project might be broken into smaller ones for organisational reasons, for example when two teams take ownership of core and service infrastructure resources respectively, or to facilitate iterative development (developers may step on each other's toes while iterating on the same Pulumi stack).

In a Pulumi project with multiple stacks, we can use a stack reference to retrieve the outputs of another stack. This is commonly the case when a resource in a stack depends on a resource managed by another stack:

// "networking" stack
import * as awsx from "@pulumi/awsx";

const vpc = new awsx.ec2.Vpc("my-vpc", {
  cidrBlock: `10.10.0.0/16`,
  numberOfAvailabilityZones: 1
});

export const vpcId = vpc.vpcId;
...

// another stack
import * as pulumi from "@pulumi/pulumi";

const networking = new pulumi.StackReference("networking");
const vpcId = networking.requireOutput("vpcId");
...

As soon as we introduce such a dependency relationship between stacks, two problems arise: the stacks must be deployed in the right order, and there is a risk of stale references and outputs. Unfortunately, Pulumi does not provide an out-of-the-box solution for these concerns (refer to GitHub issues #2309 and #2209). If we had a dependency graph of stacks, we could ensure the correct order of deployment and also eliminate the risk of stale stack references and outputs.
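To make the idea concrete, here is a minimal sketch of deriving a deployment order from such a graph. The `StackGraph` type and the `deploymentOrder` function are hypothetical names used for illustration only; they are not part of Pulumi or Turborepo. The graph simply maps each stack to the stacks whose outputs it reads.

```typescript
// Hypothetical model: each stack maps to the stacks whose outputs it consumes.
type StackGraph = Record<string, string[]>;

// Depth-first topological sort: a stack is emitted only after all of its dependencies.
function deploymentOrder(graph: StackGraph): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (stack: string, path: string[]): void => {
    if (state.get(stack) === "done") return;
    if (state.get(stack) === "visiting") {
      throw new Error(`Circular stack dependency: ${[...path, stack].join(" -> ")}`);
    }
    state.set(stack, "visiting");
    for (const dependency of graph[stack] ?? []) {
      visit(dependency, [...path, stack]);
    }
    state.set(stack, "done");
    order.push(stack);
  };

  for (const stack of Object.keys(graph)) visit(stack, []);
  return order;
}

const order = deploymentOrder({
  service: ["networking"],
  networking: [],
});
console.log(order); // "networking" comes before "service"
```

A cycle between stacks has no valid deployment order, which is why the sketch throws instead of guessing.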

In this post I'll demonstrate how we can have such a dependency graph constructed for us by a Turborepo pipeline, and use it to build and deploy Pulumi stacks in a unified fashion. The techniques demonstrated here are platform-agnostic, but for illustration we'll be using AWS resources.

Find the code on GitHub.

Deploy all the things!

We have two Pulumi stacks, which are implemented as Node packages, that make up our infrastructure: networking, and service. The networking stack is in charge of the VPC, and the service stack is responsible for resources required by an arbitrary service:

// "networking" package.json
{
  "name": "infra-networking",
  "scripts": {
    "build": "tsc -p tsconfig.build.json",
    "deploy": "pulumi up --yes -s \"networking\""
  },
  "dependencies": {
    "@pulumi/aws": "^5.4.0",
    "@pulumi/awsx": "1.0.0-beta.7",
    "@pulumi/pulumi": "^3.33.1"
  }
}

First, we establish a dependency relationship between the packages by adding the networking package as a dependency of the service package:

{
  "name": "infra-service",
  "scripts": {
    "build": "tsc -p tsconfig.build.json",
    "deploy": "pulumi up --yes -s \"service\""
  },
  "dependencies": {
    "infra-networking": "*"
  }
}

Then, we add the deploy task to turbo.json. It depends on the package's own build task, as well as on the build and deploy tasks of the package's dependencies:

{
  "pipeline": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": ["dist/**"]
    },
    "deploy": {
      "dependsOn": ["^build", "build", "^deploy"],
      "outputs": []
    }
  }
}
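The `^` prefix is what ties tasks to the package graph: `^deploy` refers to the `deploy` task of every package the current package depends on, while a bare `build` refers to the `build` task of the same package. The following is a simplified model of that expansion, written only to illustrate the semantics. `Pipeline`, `PackageDeps`, and `expandTask` are hypothetical names, and this is not how Turborepo is implemented internally.

```typescript
type Pipeline = Record<string, { dependsOn: string[] }>;
type PackageDeps = Record<string, string[]>;

// Mirrors the `dependsOn` entries from turbo.json.
const pipeline: Pipeline = {
  build: { dependsOn: ["^build"] },
  deploy: { dependsOn: ["^build", "build", "^deploy"] },
};

// Mirrors the `dependencies` declared in each package.json.
const packageDeps: PackageDeps = {
  "infra-networking": [],
  "infra-service": ["infra-networking"],
};

// Lists the `package:task` pairs that must finish before `pkg:task` may start.
function expandTask(pkg: string, task: string): string[] {
  const prerequisites: string[] = [];
  for (const entry of pipeline[task]?.dependsOn ?? []) {
    if (entry.startsWith("^")) {
      // "^task": the same-named task in each direct dependency of this package.
      for (const dep of packageDeps[pkg] ?? []) {
        prerequisites.push(`${dep}:${entry.slice(1)}`);
      }
    } else {
      // "task": another task in the same package.
      prerequisites.push(`${pkg}:${entry}`);
    }
  }
  return prerequisites;
}

console.log(expandTask("infra-service", "deploy"));
console.log(expandTask("infra-networking", "deploy"));
```

In this model, `infra-service:deploy` waits for `infra-networking:build`, `infra-service:build`, and `infra-networking:deploy`, whereas `infra-networking:deploy` only waits for its own build because the package has no dependencies.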

From this point on, yarn turbo run deploy will calculate the dependency graph and ensure that the Pulumi stacks are deployed in the correct order (first the networking stack, followed by the service stack):

 Packages in scope: infra-networking, infra-service, node-config
 Running deploy in 3 packages
infra-networking:build: cache hit, replaying output 9e7586217b899874
infra-networking:build: $ tsc -p tsconfig.build.json
infra-networking:deploy: cache hit, replaying output 5be5b2bd52290212
infra-networking:deploy: $ pulumi up --yes -s "networking"
infra-networking:deploy: Previewing update (networking):
...
infra-service:build: cache hit, replaying output 00e33efbab132751
infra-service:build: $ tsc -p tsconfig.build.json
infra-service:deploy: cache hit, replaying output d3fa7c7bac6e2a40
infra-service:deploy: $ pulumi up --yes -s "service"
infra-service:deploy: Previewing update (service):

Can I have a bit of type safety, please?

Even though we've eliminated the risk of stale stack outputs and references during deployment, type safety remains an issue with Pulumi's StackReference, as StackReference#getOutput (and other similar methods) resolves to Output<any>:

// @pulumi/pulumi/stackReference.d.ts
export declare class StackReference extends CustomResource {
  /**
   * Fetches the value of the named stack output,
   * or undefined if the stack output was not found.
   * @param name The name of the stack output to fetch.
   */
  getOutput(name: Input<string>): Output<any>;
}

As it stands, we don't get any help from the TypeScript compiler when consuming outputs via a StackReference. This certainly makes sense, as Pulumi itself does not calculate a dependency graph and therefore can't help us resolve outputs statically. But now that we are able to calculate the dependency graph, it should be possible, at least in theory, to resolve stack outputs statically.

Concretely speaking, we want something like this:

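// `infra-networking` package
import { Output, StackReference } from "@pulumi/pulumi";
import * as awsx from "@pulumi/awsx";

// Assume that the `networking` stack creates the following outputs
// (VPC and subnet IDs are strings):
interface Outputs {
  vpcId: Output<string>;
  privateSubnetIds: Output<string[]>;
}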

export const getOutput = <T extends keyof Outputs>(key: T): Outputs[T] => {
  const ref = new StackReference("networking");
  return ref.getOutput(key) as unknown as Outputs[T];
}

const vpc = new awsx.ec2.Vpc("my-vpc", {
  cidrBlock: `10.10.0.0/16`,
  numberOfAvailabilityZones: 1
});

export const vpcId = vpc.vpcId;
export const privateSubnetIds = vpc.privateSubnetIds;

// `infra-service` package
import { getOutput } from "infra-networking";
// TS compiler should help us spot invalid output keys during development
// and make sure that the `vpcId` constant infers the output type correctly
const vpcId = getOutput("vpcId");

At first glance the above looks very promising. But soon enough we realise that anything exported from a Node module executed by Pulumi becomes a stack output, and it wouldn't make much sense to export a function as an output. We also don't want to trigger Pulumi every time the infra-networking package is imported by another Node package. Essentially, what we want is a Node package that can be used both as a regular Node dependency and as a Pulumi executable. The solution boils down to having a Node package with multiple entrypoints. A naive solution for CJS modules may look like this:

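import * as pulumi from "@pulumi/pulumi";

// Runs `main` and exposes its result as the stack outputs only when the calling
// module was loaded by the Pulumi Node.js runtime. Always returns a typed getter.
export function onPulumiRun<T>(caller: typeof module, stack: string, main: () => T) {
  // Naive check: `module.parent` is deprecated in Node.js, and the path
  // comparison assumes POSIX-style separators.
  if (caller.parent?.path.endsWith("@pulumi/pulumi/cmd/run")) {
    caller.exports = main();
  }

  return createOutputGetter<T>(stack);
}

// Cache of stack references, so that each stack is referenced at most once.
const stackReferences: Record<string, pulumi.StackReference> = {};
function createOutputGetter<T>(stack: string) {
  return <K extends keyof T>(key: Extract<K, string>): T[K] => {
    let stackRef = stackReferences[stack];
    if (!stackRef) {
      stackReferences[stack] = stackRef = new pulumi.StackReference(stack);
    }
    // `requireOutput` returns `Output<any>`; the cast is what carries the type.
    return stackRef.requireOutput(key) as unknown as T[K];
  };
}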

And, here it is in action:

// `infra-networking` package
export const getNetworkingOutput = onPulumiRun(module, "networking", () => {
  const vpc = new awsx.ec2.Vpc("my-vpc", {
    cidrBlock: `10.10.0.0/16`,
    numberOfAvailabilityZones: 1,
  });

  return {
    vpcId: vpc.vpcId,
    privateSubnetIds: vpc.privateSubnetIds,
  };
});

// `infra-service` package
import { getNetworkingOutput } from "infra-networking";
const vpcId = getNetworkingOutput("vpcId");
vpcId.apply(console.log);
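It may not be obvious why assigning `caller.exports` inside `onPulumiRun` does not clash with the `getNetworkingOutput` export. In a CommonJS build, `export const x = ...` compiles to an assignment on the `exports` object that the module wrapper captured before the module body ran, so replacing `module.exports` afterwards leaves that assignment pointing at the old, now discarded object. The sketch below imitates this with plain objects. `FakeModule`, `onRun`, `loadModule`, and the `runByPulumi` flag are stand-ins for Node's module loader and the `caller.parent` check, and the string value stands in for a Pulumi output.

```typescript
type FakeModule = { exports: Record<string, unknown> };

// Simplified stand-in for `onPulumiRun`: a boolean replaces the `caller.parent` check.
function onRun<T extends Record<string, unknown>>(
  caller: FakeModule,
  runByPulumi: boolean,
  main: () => T,
) {
  if (runByPulumi) {
    caller.exports = main(); // replaces the object that the loader will hand out
  }
  return (key: keyof T) => `reference to ${String(key)}`;
}

// Imitates Node's CommonJS wrapper, which passes both `exports` and `module`
// to the module body, with `exports === module.exports` at the start.
function loadModule(runByPulumi: boolean): FakeModule {
  const fakeModule: FakeModule = { exports: {} };
  const fakeExports = fakeModule.exports;

  // Equivalent of `export const getNetworkingOutput = onPulumiRun(...)` after compilation.
  fakeExports.getNetworkingOutput = onRun(fakeModule, runByPulumi, () => ({
    vpcId: "vpc-123",
  }));

  return fakeModule;
}

// Loaded by Pulumi: only the values returned by `main` are visible as outputs.
console.log(Object.keys(loadModule(true).exports));
// Imported as a dependency: only the typed getter is visible.
console.log(Object.keys(loadModule(false).exports));
```

The same package therefore presents two different faces depending on who loads it, which is the "multiple entrypoints" behaviour described earlier.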

Asterix and Pedantix

There is more than one way to skin a cat, and the techniques demonstrated in this post are just one way of establishing a dependency graph for Pulumi stacks and type safety for StackReferences. Exploring alternative solutions is encouraged.

Thanks for reading and happy coding!