
Using Custom Resources to Extend your CloudFormation

· 20 min read
Alex DeBrie

In a previous post, we looked at how to use CloudFormation Macros to provide a simpler DSL around CloudFormation or to provide company-wide defaults around particular resources.

However, sometimes you need more than what CloudFormation currently offers. Perhaps CloudFormation doesn't have support for a resource that you need. Or maybe you want to use a third-party resource, like Auth0 or Algolia, in your application.

In this post, we'll learn about CloudFormation custom resources. Custom resources greatly expand what you can do with CloudFormation as you can run custom logic as part of your CloudFormation deployment.

And with custom logic, you can do anything you want. *twirls moustache*

But custom resources can be complicated, and using them incorrectly can wreak havoc on your CloudFormation stack. In this post, we'll learn when, why, and how to use custom resources.

This post covers:

But background information will only take you so far. There's no substitute for hands-on learning, so this post also includes two walkthroughs of creating custom resources:

This is a heavy one, so let's get started!

What are CloudFormation custom resources and when should I use them?

CloudFormation custom resources are bits of logic to run during the provisioning phase of your CloudFormation template. They allow you to extend CloudFormation to do things it could not normally do.

CloudFormation custom resources work by firing a webhook while processing your CloudFormation template. Your handler will receive this webhook and run any logic you want.

Because you are in charge of writing the logic in your custom resource handler, you have significant power in what you can do with CloudFormation custom resources.

Generally, CloudFormation custom resource behavior falls into one of the following buckets:

  1. Provisioning AWS resources that are not supported by CloudFormation.

    While CloudFormation coverage is pretty good, there are still gaps in support for resources. You can use custom resources to add in support for missing resources, allowing you to maintain infrastructure-as-code even where AWS doesn't allow it.

    A few examples in this bucket are:

    Tip: If you want to see other AWS resources that are unsupported in CloudFormation, check the responses to this Twitter thread. I'm particularly grateful for Ben Bridt's CloudFormation Gaps repository on Github.

  2. Provisioning non-AWS resources with CloudFormation.

    The second reason to use custom resources is to add infrastructure-as-code properties to non-AWS resources.

    AWS is the Wal-Mart of the cloud, offering you a wide selection of resources in a single place. However, there are times when you need to use non-AWS solutions in your architecture. This is usually for one of two reasons.

    First, AWS may not offer a solution that you need. Examples here include an incident response platform, such as PagerDuty, or certain types of database offerings, such as a time-series database (while Timestream is still in preview).

    Second, AWS may offer a solution in a category but perhaps a third-party solution better fits your needs. Examples here include:

    Using custom resources in this way nudges CloudFormation a little closer to Terraform. Like Terraform, you can provision resources across providers. However, you still retain the service-based nature of CloudFormation.

  3. Performing provisioning steps not related to infrastructure.

    A third category is to perform provisioning steps that aren't strictly infrastructure-related.

    The core example here is running relational database initialization or migration scripts. When deploying a new version of your application, you want to ensure that your database tables are created or that any recent migrations have been applied. This is a one-time operation on each deployment, but there's not a native AWS::Database::Script resource in CloudFormation.

    With custom resources, you could write a script in a Lambda function that is triggered after your RDS database is configured to execute any migration scripts needed.

    A second option in this category could be to bust a cache on the deployment of new code.

  4. Any. Thing. You. Want.

    The beauty (and danger) of custom resources is that you control the code, so you can do anything you please.

    Want to record a successful deployment in your deployment management system? You can do it.

    Want to use the ApproveAPI to require manual approval before starting a deploy? No problem.

    One of my favorite examples of innovative custom resource usage is from Chase Douglas at Stackery where he mentions running a smoke test in a custom resource as the very last step in a deploy. If the smoke test fails, it rolls back the entire deployment.

    These use cases are neat but remember that with great power comes great responsibility. Think carefully about how far you want to extend CloudFormation's capabilities.

How to use CloudFormation custom resources

Now that we know what custom resources are and when you might use them, let's see how to use custom resources.

We'll break this section into two parts. First, we'll see the overall architecture of custom resources and how they interact with other CloudFormation stacks. Then we'll do a deeper dive into the mechanics of writing a custom resource handler.

CloudFormation custom resource architecture

To use a CloudFormation custom resource, you'll need to do three things:

CloudFormation Custom Resource Usage

  1. Write the logic for your custom resource;

  2. Make your custom resource logic available by deploying to an AWS Lambda function or by subscribing to an SNS topic.

  3. Use the custom resource in your CloudFormation template that references the Lambda function or SNS topic.

To use a custom resource in a CloudFormation stack, you need to create a resource of either type AWS::CloudFormation::CustomResource or Custom::<YourName>. I prefer using the latter as it helps to identify the type of custom resource you're using.

Here's an example use of a custom resource:

Resources:
  GithubWebhook:
    Type: "Custom::GithubWebhook"
    Version: "1.0"
    Properties:
      ServiceToken: arn:aws:lambda:us-east-1:123456789012:function:GithubCustomResource
      Repo: alexdebrie/test-repo
      Events: "push, pull_request"
      Endpoint: https://webhook.api.com

Notice that the resource type is Custom::GithubWebhook, which is not a resource type provided natively by CloudFormation.

As inputs to your custom resource, you must provide a ServiceToken property. The ServiceToken is an ARN of either an AWS Lambda function or an SNS Topic that will receive your custom resource request. You may also include additional properties to send into your custom resource for configuration.

Writing a custom resource handler

Most of the tricky bits around custom resources are in actually writing the handler. There are a few "gotchas" which can leave your CloudFormation stack in a bad state.

In this section, we'll cover the custom resource programming model, the three event types for custom resources, and the inputs and outputs to your invocations.

Custom resource programming model

Custom resources are implemented in an asynchronous, callback-style programming model. It's important to understand what that means for your custom resource and its failure modes.

When CloudFormation invokes your custom resource, it doesn't wait on that invocation for a result. Instead, the payload it sends includes a presigned S3 URL. When your custom resource is done processing, it should use that presigned S3 URL to upload a JSON object containing the output of the custom resource.

This asynchronous model makes it easier and faster for CloudFormation to provision many resources in a stack in parallel, but it also adds complexity. Rather than returning a simple response in your Lambda function, you need to save your output to S3. Forgetting to do so or saving the data incorrectly will cause CloudFormation to hang until it times out.
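
As a rough sketch of what "saving your output to S3" involves, the snippet below builds a response document and PUTs it to the presigned URL using only the Python standard library. The names `build_response` and `send_response` are hypothetical helpers made up for illustration; the field names follow the request and response payloads shown later in this post.

```python
import json
import urllib.request


def build_response(event, status, physical_id, data=None, reason=None):
    """Build the JSON document CloudFormation expects at the ResponseURL."""
    body = {
        "Status": status,  # "SUCCESS" or "FAILED"
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "StackId": event["StackId"],
        "PhysicalResourceId": physical_id,
        "Data": data or {},
    }
    if reason:
        body["Reason"] = reason
    return body


def send_response(event, body):
    """PUT the response document to the presigned S3 URL from the event."""
    payload = json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        event["ResponseURL"],
        data=payload,
        method="PUT",
        # An empty Content-Type is what AWS's cfnresponse helper sends;
        # other values may not match the presigned URL's signature.
        headers={"Content-Type": "", "Content-Length": str(len(payload))},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Keeping the body construction separate from the network call makes the response shape easy to unit test without touching S3.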

Event types

In writing a custom resource handler, you'll need to handle three different actions:

  • Create: A Create event is invoked whenever a resource is being provisioned for the first time, either because a new stack is being deployed or because it was added to an existing stack;

  • Update: An Update event is invoked when the custom resource itself has a property that has changed as part of a CloudFormation deploy.

  • Delete: A Delete event is invoked when the custom resource is being deleted, either because it was removed from the template as part of a deploy or because the entire stack is being removed.

Your handler function must be able to handle each of these event types and know how to return a proper response to avoid hanging your deployment.
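
One way to picture this is a small dispatcher keyed on the request type. This is a minimal sketch, not the approach any particular library takes; `on_create`, `on_update`, `on_delete`, and `dispatch` are hypothetical names, and the bodies are placeholders for your real provisioning logic.

```python
def on_create(event):
    # Provision the resource, then return its physical ID (placeholder value here).
    return "example-physical-id"


def on_update(event):
    # Returning the same physical ID signals an in-place update.
    return event["PhysicalResourceId"]


def on_delete(event):
    # Clean up the resource identified by the physical ID.
    return event["PhysicalResourceId"]


HANDLERS = {"Create": on_create, "Update": on_update, "Delete": on_delete}


def dispatch(event):
    """Route a custom resource request to the function for its RequestType."""
    request_type = event["RequestType"]
    try:
        handler = HANDLERS[request_type]
    except KeyError:
        raise ValueError(f"Unexpected RequestType: {request_type}")
    return handler(event)
```

Raising on an unknown request type, rather than silently ignoring it, gives the surrounding error handling a chance to report a failure back to CloudFormation.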

Custom resources inputs and outputs

When your custom resource is invoked, it will include a payload similar to the following:

{
  "RequestType": "Create",
  "RequestId": "9db53695-b0a0-47d6-908a-ea2d8a3ab5d7",
  "ResponseURL": "https://...",
  "ResourceType": "Custom::GithubWebhook",
  "LogicalResourceId": "GithubWebhook",
  "StackId": "arn:aws:cloudformation:us-east-1:955617200811:stack/github-webhook-test-3/1351a360-4fd0-11e9-b201-0a20b68b404c",
  "ResourceProperties": {
    "Repo": "alexdebrie/test-repo",
    "Events": ["push", "pull_request"],
    "Endpoint": "https://webhook.api.com"
  }
}

A few notable points:

  • The request type -- Create, Update, or Delete -- is shown in the RequestType parameter.

  • The ResponseURL parameter includes the presigned S3 URL for you to send your output.

  • The ResourceProperties parameter includes all of the properties passed into your resource in the template.

If the request type is Update or Delete, the payload will also include a PhysicalResourceId parameter. This is an identifier for the resource you create and is particularly important in Update scenarios. Check out the Tips and Tricks section below for more information on the PhysicalResourceId.

For the output that you write to the presigned S3 URL, it should look similar to the following:

{
  "Status": "SUCCESS",
  "RequestId": "9db53695-b0a0-47d6-908a-ea2d8a3ab5d7",
  "LogicalResourceId": "GithubWebhook",
  "StackId": "arn:aws:cloudformation:us-east-1:955617200811:stack/github-webhook-test-3/1351a360-4fd0-11e9-b201-0a20b68b404c",
  "PhysicalResourceId": "GitHubWebhookZZ97363670ZZalexdebrie/alexdebrie.com",
  "Data": {
    "Id": "97363670"
  }
}

Two important notes here:

  • The Status property indicates whether the custom resource succeeded or failed. You should provide SUCCESS for a successful run or FAILED for an unsuccessful run. If the run was unsuccessful, include a Reason property explaining why; CloudFormation's documentation lists Reason as required for FAILED responses.

  • The Data property allows you to return outputs that can be referenced by other resources using the Fn::GetAtt function in CloudFormation.

There's a lot to take in with the custom resources, so check out the two examples below for a more complete walkthrough.

Tips and Tricks for writing Custom Resources

Below are a few key tips for writing resilient custom resources:

  • Catch every exception to prevent hanging CloudFormation stacks

    Remember that custom resources use an asynchronous, callback-driven model. If your custom resource handler has an uncaught error that prevents it from writing a result to S3, your CloudFormation stack will remain in the CREATE_IN_PROGRESS stage until it times out.

  • Use a helper library

    Managing a custom resource can be tricky, both due to the exception problem noted above and because you need to write your data to S3 using a presigned S3 URL.

    Fortunately, there are a number of libraries that ease the burden of writing custom resources. A few of them are:

    • custom-resource-helper: a Python-based library provided by AWS that uses decorators;

    • cfn-wrapper-python: another Python-based library that was the inspiration for custom-resource-helper. Written by Ryan Scott Brown, an all-around AWS wizard.

    • cfn-lambda: For our Node.js friends, cfn-lambda provides an easy way to build custom resources with JavaScript.

    • cfn-custom-resource: Another Python-based library, this one uses classes over decorators. Created by Ben Kehoe, robot hacker and the Godfather of serverless architecture.

    While all of these libraries are solid, the two examples below use the custom-resource-helper library.

  • Understand how the Physical Resource Id works

    After creating or updating your custom resource, you'll need to return a PhysicalResourceId property. This property is important, as it can be used to identify a created resource apart from its input properties.

    In the Github webhook example below, we use the Physical Resource Id to encode the Id of the GitHub webhook. You cannot look up a GitHub webhook without the Id, so it would be difficult to perform an update operation on an existing webhook without that Id.

    Encoding the webhook Id into the Physical Resource Id allows us to identify and update an existing webhook when its input properties change.

  • Use AWS Lambda for your handler

    While you can use an SNS topic as the ingest mechanism for custom resource requests, I recommend using Lambda functions unless you have a strong need otherwise.

    A custom resource is basically a webhook, and webhooks are one of the core use cases for AWS Lambda. You won't have any management burden associated with it, and your custom resource is essentially free given Lambda's pricing structure.
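
The first tip above, catching every exception, can be sketched as a wrapper that always reports a result. This is an illustration under stated assumptions rather than how any of the listed libraries work internally: `always_respond`, `do_work`, and `send` are hypothetical names, where `do_work(event, context)` returns a `(physical_id, data)` tuple and `send(url, body)` delivers the response document to the presigned URL.

```python
import traceback


def always_respond(do_work, send):
    """Wrap do_work so that send() is called even when do_work raises."""
    def handler(event, context):
        # Reuse the known ID on Update/Delete; use a placeholder on a failed Create.
        physical_id = event.get("PhysicalResourceId", "resource-not-created")
        data, status, reason = {}, "SUCCESS", None
        try:
            physical_id, data = do_work(event, context)
        except Exception as exc:  # deliberately broad: any failure must be reported
            traceback.print_exc()
            status, reason = "FAILED", f"{type(exc).__name__}: {exc}"
        body = {
            "Status": status,
            "RequestId": event["RequestId"],
            "LogicalResourceId": event["LogicalResourceId"],
            "StackId": event["StackId"],
            "PhysicalResourceId": physical_id,
            "Data": data,
        }
        if reason:
            body["Reason"] = reason
        send(event["ResponseURL"], body)
        return body
    return handler
```

Note that this only covers failures inside `do_work`. If `send` itself fails, or the function times out before reaching it, the stack will still wait for a response, which is one more reason to lean on a helper library.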

Walkthrough: Provisioning a Github Webhook with CloudFormation

We've done a lot of background on custom resources, but there's no substitute for actually walking through some examples.

In this first example, we'll use CloudFormation to provision a Github webhook. This falls into the second use case we discussed for when to use custom resources -- Provisioning non-AWS resources with CloudFormation. A custom resource gives us the same infrastructure-as-code mechanics that we love even with non-AWS resources.

Custom resource logic

We will use the custom-resource-helper library to assist in building our logic. It helps with a few things:

  • Capturing errors and handling failures gracefully;

  • Writing output to the S3 presigned URL;

  • Logging output for easier debugging;

  • Easy polling for long-running provisioning tasks.

A skeleton file for starting with the custom-resource-helper is as follows:

import logging

from crhelper import CfnResource

# The functions below log through this module-level logger.
logger = logging.getLogger(__name__)

helper = CfnResource(
    json_logging=False,
    log_level='DEBUG',
    boto_level='CRITICAL'
)


def handler(event, context):
    helper(event, context)


@helper.create
def create(event, context):
    logger.info("Got Create")

    # Items stored in helper.Data will be saved
    # as outputs in your resource in CloudFormation
    helper.Data.update({"test": "testdata"})
    return "MyResourceId"


@helper.update
def update(event, context):
    logger.info("Got Update")
    return "MyNewResourceId"


@helper.delete
def delete(event, context):
    logger.info("Got Delete")

You'll create a CfnResource object with some options. In your Lambda's entrypoint handler() function, you pass the event and context to the CfnResource for handling all control flow.

Then, for each of the Create, Update, and Delete request types, you make a function wrapped with a decorator to handle the request. The custom-resource-helper library will call the proper function depending on the request type.

Posting the full logic here would get a little verbose, so I'll spare your eyeballs. You can see the handler logic here, and it's fairly basic -- around 120 lines of code.

I do want to call out one aspect. A Github webhook is tied to a particular repo and is identified by a unique Id provided by Github. Thus, there's a little bit of state involved with maintaining this resource to ensure proper updates and deletes.

To handle this state, I used the PhysicalResourceId property that is returned by the custom resource to our CloudFormation template. This will be passed in for future updates and deletes, so I can tell if the resource has fundamentally changed (e.g. by changing the repository to which it applies). I can also use it to store the Id for updating or deleting a particular webhook.

For now, I'm just encoding the data as GithubWebhookZZ{Id}ZZ{Repo}. I use ZZ as a cheap separator, partly because I initially misread the instructions on what characters were allowed in a Physical Resource Id. 😁 A more standard approach might use other characters as separators (e.g. $, _, or -).
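
To make the encoding concrete, here is a small sketch of how a `GithubWebhookZZ{Id}ZZ{Repo}` identifier could be built and parsed. The function names `encode_physical_id` and `decode_physical_id` are hypothetical and are not taken from the actual handler code.

```python
PREFIX = "GithubWebhook"
SEPARATOR = "ZZ"


def encode_physical_id(webhook_id, repo):
    """Pack the webhook Id and repo into a single Physical Resource Id."""
    return SEPARATOR.join([PREFIX, str(webhook_id), repo])


def decode_physical_id(physical_id):
    """Recover (webhook_id, repo) from a Physical Resource Id."""
    # Split at most twice so a repo name containing "ZZ" stays intact.
    prefix, webhook_id, repo = physical_id.split(SEPARATOR, 2)
    if prefix != PREFIX:
        raise ValueError(f"Not a webhook Physical Resource Id: {physical_id}")
    return webhook_id, repo
```

On an Update, comparing the decoded repo against the incoming Repo property is one way to tell whether the webhook can be updated in place or must be replaced.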

Deploying the custom resource

To deploy the custom resource, I use the Serverless Framework. My serverless.yml file looks as follows:

service: gh-custom-resource

provider:
  name: aws
  runtime: python3.7
  stage: dev
  region: us-east-1
  environment:
    GITHUB_TOKEN: "" # <-- Add your token here!

functions:
  githubWebhook:
    handler: handler.handler

resources:
  Outputs:
    GitHubWebhookFunction:
      Description: "ARN for Github Webhook custom resource function"
      Value: !GetAtt GithubWebhookLambdaFunction.Arn
      Export:
        Name: "GithubWebhookFunction"

plugins:
  - serverless-python-requirements
It deploys a single function, then registers the ARN of that function as a CloudFormation export so that I can import the value into another CloudFormation stack in my account.

Note that you'll need to provision your own Github token before deploying.

Using the custom resource in another template

Once the custom resource is deployed and exported, we can easily use it in another template.

Here's an example CloudFormation template for using our custom webhook:

AWSTemplateFormatVersion: "2010-09-09"
Description: Example template for using the Github Webhook custom resource
Parameters:
  REPO:
    Type: String
    Description: The Github repository for which the webhook is configured
  EVENTS:
    Type: CommaDelimitedList
    Description: Events for which you want to subscribe
    Default: "push, pull_request"
  ENDPOINT:
    Type: String
    Description: The endpoint to which events will be sent

Resources:
  GithubWebhook:
    Type: "Custom::GithubWebhook"
    Version: "1.0"
    Properties:
      ServiceToken: !ImportValue GithubWebhookFunction
      Repo: !Ref REPO
      Events: !Ref EVENTS
      Endpoint: !Ref ENDPOINT

Note that we are provisioning a single resource in the Resources section. The ServiceToken is the only required property, and we use the ImportValue CloudFormation function to use the exported value from our other stack.

We can deploy this template using the following command:

aws cloudformation deploy \
  --template-file template.yaml \
  --stack-name github-webhook-test \
  --parameter-overrides REPO=alexdebrie/alexdebrie.com ENDPOINT=http://requestbin.fullcontact.com/z0azobz0

Make sure you paste in your own unique values for REPO and ENDPOINT in the parameter overrides.

After a few minutes, you should see the webhook configured in your repository:

Github webhook

Boom! 💥 Github webhooks infrastructure-as-code!

Walkthrough: Provisioning and Validating an ACM Certificate

Hat tip to Richard Boyd for his assistance with this example. Check out his blog here.

One example isn't quite enough, so let's do another. In this second example, we're going to use a custom resource to provision and validate an SSL certificate with AWS Certificate Manager.

This use case sits somewhere between the first and third buckets mentioned above. You could call it provisioning an AWS resource that CloudFormation doesn't support (first bucket), except that CloudFormation does support creating an ACM Certificate. There's just not support for validating that certificate. That might put it more in the third bucket -- performing provisioning steps not related to infrastructure.

Tomato, to-mah-to -- the important thing is that we can automate something that was previously manual.

In this example, we also see how the custom-resource-helper helps us with long-running provisioning steps that may rely on waiting for other systems to complete a task.

Let's get started.

Custom resource logic -- polling for slower resources

I'm only going to highlight the important parts of the logic here. Feel free to check out all the custom resource code here.

In our create() function for our custom resource, we'll be doing the following things:

  1. Requesting an ACM certificate and specifying DNS validation;

  2. Creating the DNS record in Route53 to validate our certificate;

  3. Waiting for the certificate to be marked verified in ACM.

Notably, there's a potentially large gap between steps 2 and 3. ACM states it can take up to 30 minutes for the DNS record to propagate and for the certificate to be verified.

With Lambda, this is a problem. The max duration for Lambda is only 15 minutes. 😱 How can we handle this?

Fortunately, the custom-resource-helper library makes it easy. In addition to the normal create() function, you can add an optional poll_create() function. The syntax is as follows:

@helper.create
def create(event, context):
    # All your normal create logic here

    # Add the certificate arn to the
    # Data object on the helper.
    helper.Data.update({"Arn": cert_arn})

    return


# In the poll_create function, check
# to see if the certificate is validated.
@helper.poll_create
def poll_create(event, context):
    cert_arn = event['CrHelperData']['Arn']
    acm = _client(event, "acm")
    validated = _await_validation(cert_arn, acm)
    if validated:
        return True

    return False

I have both a create() function and a poll_create() function. The create() function will be run first when I get a Create request for my custom resource. In addition to running its logic, it will also create a CloudWatch Scheduled Event that will re-trigger my function in two minutes.

That re-trigger will run the poll_create() function. If I return a truthy value from that function, it will tear down the CloudWatch Scheduled Event and write the custom resource output to the presigned S3 URL.

If I return a falsey value, the function will be retriggered in 2 minutes to try again.

Let's walk through an example.

Imagine that validating my certificate takes 5 minutes. The flow would look as follows:

  1. The initial request would come in and run the create() logic. This makes the request to create the certificate and add the DNS record. Additionally, the custom-resource-helper library configures a CloudWatch Scheduled Event to trigger this function in two minutes with the same input. After all this happens, the function finishes while the CloudFormation stack is still awaiting a response.

  2. Two minutes later, the function is triggered again. This time it runs the poll_create() function. The certificate still isn't validated, so the function completes without writing a result to S3.

  3. Two minutes later, the function is triggered a third time. It runs the poll_create() logic again but the certificate still isn't ready.

  4. Two minutes later, the function is triggered a fourth time. It runs the poll_create() logic again. This time, the certificate is ready. The custom-resource-helper tears down the CloudWatch Scheduled Event so that it won't trigger again, then it writes the custom resource's output to the presigned S3 URL.

This polling logic is extremely helpful. You won't be paying for idle compute in your Lambda function, and you don't need to worry about hitting the Lambda timeout. Hurrah!
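
The timeline above can be reproduced with a toy simulation. This is not how the custom-resource-helper is implemented; `simulate_polling` is a hypothetical function that only models the two-minute re-trigger schedule described in this post.

```python
def simulate_polling(ready_after_minutes, interval_minutes=2, max_polls=30):
    """Return a list of (minute, action) tuples for one Create request."""
    # The first invocation runs create() at minute 0.
    timeline = [(0, "create")]
    for poll in range(1, max_polls + 1):
        now = poll * interval_minutes
        if now >= ready_after_minutes:
            # poll_create() returns a truthy value: stop polling and respond.
            timeline.append((now, "poll_create: ready, respond to CloudFormation"))
            return timeline
        # poll_create() returns a falsy value: wait for the next trigger.
        timeline.append((now, "poll_create: not ready"))
    raise TimeoutError("resource was not ready within the polling budget")
```

With `ready_after_minutes=5`, the timeline has four entries: the create call at minute 0 and polls at minutes 2, 4, and 6, matching the four invocations in the walkthrough.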

Deploying and usage

There are instructions in the Github repo for deploying the custom resource. It's using AWS SAM to deploy the stack, but the principles are similar -- deploy a Lambda function and register the function's ARN as an Export.

Once your function is deployed and registered, you can use the following stack to test it out:

AWSTemplateFormatVersion: "2010-09-09"
Description: Example template for using the ACM custom resource
Parameters:
  DOMAIN:
    Type: String
    Description: Domain used for certificate
  RECORD:
    Type: String
    Description: Record used for certificate

Resources:
  ACMCertificate:
    Type: "Custom::ACMCertificate"
    Version: "1.0"
    Properties:
      ServiceToken: !ImportValue ACMRegisterFunction
      Region: !Ref "AWS::Region"
      HostedZoneName: !Ref DOMAIN
      RecordName: !Ref RECORD

It takes DOMAIN and RECORD parameters to indicate the certificate you want to provision.

You can deploy the template using the following command:

aws cloudformation deploy \
  --template-file template.yaml \
  --stack-name acm-register-test \
  --parameter-overrides DOMAIN=<DOMAIN> RECORD=<RECORD>

Make sure to use your own values for DOMAIN and RECORD.

For example, if you wanted to create a certificate for api.my-app.com, you would use:

aws cloudformation deploy \
  --template-file template.yaml \
  --stack-name acm-register-test \
  --parameter-overrides DOMAIN=my-app.com RECORD=api

Your stack will likely take about 5-10 minutes to complete. After that, you should see a verified ACM certificate in the AWS console!

Conclusion

CloudFormation custom resources are awesome for filling gaps in the CloudFormation ecosystem or for bringing third-party resources under the CloudFormation umbrella.

In this post, we learned what custom resources are and when you would want to use them. Then, we learned about the workflow for creating and using CloudFormation custom resources, as well as some tips and tricks. Finally, we walked through two examples of custom resources.