How can I quickly and effectively debug CloudFormation templates?

Amazon Web-Services Problem Overview


CloudFormation is a powerful AWS offering that allows the programmatic creation of AWS resource stacks, such as the web tier of an application, a high performance computing cluster, or an entire application stack, with a single API call. It is immensely powerful. Using it is surely considered a good AWS practice, especially when it's combined with Chef, Puppet, or cloud-init. Debugging it drives me to vice.

Take a production example: the stock MongoDB cluster templates won't work for me. I don't particularly know why. I'm sure it's something simple, as it almost always is. My problem isn't that I can't figure out what's wrong. It's that it takes the stack between 20 and 30 minutes to fail, and then another three or four minutes to delete, assuming it deletes the resources properly at all.

What am I missing? I know about the --disable-rollback flag and use it like oxygen. I learned long ago to wrap exit messages with cfn-signal and to throw them like ballast off a sinking ship. How can I make the template debugging process faster, or am I stuck forever noticing my mistakes half an hour after I make them?

Amazon Web-Services Solutions


Solution 1 - Amazon Web-Services

Use the aws cloudformation validate-template command in the AWS CLI tool. It only checks template syntax (valid JSON or YAML); it does not verify that your keys and values are correct (for example, it will not catch a misspelled property name).
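To make the limitation concrete, here is a minimal local sketch of what a syntax-only check can and cannot see. It is a stand-in that uses Python's json module, not what the AWS service does internally, and the template text and the misspelled property name are made up for illustration.

```python
import json


def is_well_formed_json(template_text: str) -> bool:
    """Return True if the text parses as JSON; says nothing about its meaning."""
    try:
        json.loads(template_text)
    except json.JSONDecodeError:
        return False
    return True


# Hypothetical template: "GroupDescripton" is a misspelled property name.
TYPO_TEMPLATE = """
{
  "Resources": {
    "WebGroup": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {"GroupDescripton": "web tier"}
    }
  }
}
"""

# Hypothetical template with a trailing comma, which is a real syntax error.
BROKEN_TEMPLATE = '{"Resources": {"WebGroup": {"Type": "AWS::EC2::SecurityGroup",}}}'

print(is_well_formed_json(TYPO_TEMPLATE))    # True: the typo is invisible to a syntax check
print(is_well_formed_json(BROKEN_TEMPLATE))  # False: the trailing comma is caught
```

A syntax check like this is cheap enough to run on every save, but the first template would still fail only at stack creation time.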

Solution 2 - Amazon Web-Services

Another option, a year later, is to abstract these templates with a third-party library such as troposphere. That library constructs the JSON payload for you and does a lot of validation along the way. This also solves the "Wow, managing a 1000-line JSON file sure is sad" problem.
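The idea of generating the template from code can be sketched with the standard library alone. The following is not troposphere's API; TemplateBuilder and REQUIRED_PROPERTIES are hypothetical names invented here to show how a generator can reject a bad resource in Python, seconds after the mistake, instead of half an hour into a stack creation.

```python
import json

# Hypothetical, deliberately tiny rule table; a real library knows far more.
REQUIRED_PROPERTIES = {
    "AWS::EC2::SecurityGroup": {"GroupDescription"},
}


class TemplateBuilder:
    """Collects resources and rejects obviously incomplete ones immediately."""

    def __init__(self):
        self.resources = {}

    def add_resource(self, logical_id, resource_type, **properties):
        required = REQUIRED_PROPERTIES.get(resource_type, set())
        missing = required - set(properties)
        if missing:
            raise ValueError(f"{logical_id}: missing {sorted(missing)}")
        self.resources[logical_id] = {"Type": resource_type, "Properties": properties}
        return self

    def to_json(self):
        # Emit the JSON payload that would be handed to CloudFormation.
        return json.dumps({"Resources": self.resources}, indent=2, sort_keys=True)


builder = TemplateBuilder()
builder.add_resource("WebGroup", "AWS::EC2::SecurityGroup", GroupDescription="web tier")
print(builder.to_json())

try:
    builder.add_resource("BadGroup", "AWS::EC2::SecurityGroup")
except ValueError as error:
    print(error)  # raised locally, before any stack is created
```

The design point is only that validation moves to the moment the resource is declared; which checks are worth encoding depends on the resource types you use.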

Solution 3 - Amazon Web-Services

> How can I make the template debugging process faster, or am I stuck forever noticing my mistakes half an hour after I make them?

Here are a few best-practice suggestions, focusing specifically on improving the iteration speed of complex CloudFormation-template development:

Use CloudFormation tools to validate templates and stack updates

AWS has already outlined these in its own Best Practices document, so I won't repeat them.

The point of this step is to catch obvious syntax or logical errors before actually performing a Stack creation/update.

Test Resources in isolation

Before using any individual CloudFormation Resource in a complex Stack, make sure you thoroughly understand the full extent of that Resource's creation/update/delete behavior, including any limits on usage and typical startup/teardown times, by testing its behavior in a smaller, standalone Stack first.

  • If you are developing or using any third-party Custom Resources, write unit tests using appropriate libraries for the language platform, to make sure the application logic behaves as expected across all use-cases.
  • Be aware that the amount of time for an individual Resource to create/update/delete can vary widely between Resource Types, depending on the behavior of the underlying API calls. For example, a complex AWS::CloudFront::Distribution resource can sometimes take 30-60 minutes to create/update/delete, while an AWS::EC2::SecurityGroup updates in seconds.
  • Individual Resources may have bugs/issues/limitations in their implementation, which are much easier to debug and develop workarounds for when tested in isolation, rather than within a much larger Stack. Keep in mind limitations such as AWS Service Limits depending on your individual AWS Account settings, or Region Availability of services depending on the Region within which you create your Stack.

Build complicated stacks in small increments

When performing a Stack creation/update, a failure in any single Resource will cause the Stack to roll back the entire set of Resource changes, which can unnecessarily destroy other successfully-created Resources and take a very long time when building a complicated stack with a long dependency-graph of associated Resources.

The solution to this is to build your Stack incrementally in smaller Update batches, adding Resources one (or a few) at a time. This way, if/when a failure occurs in a resource creation/update, the rollback doesn't cause your entire Stack's resources to be destroyed, just the set of Resources changed in the latest Update.
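One way to plan such increments is to order resources by their dependencies and grow the template one batch at a time. The sketch below is an assumption-laden illustration, not an AWS tool: it only follows Ref, list-form Fn::GetAtt, and DependsOn, it drops Outputs from intermediate steps, and the example template is hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def find_dependencies(node, known_ids):
    """Collect logical IDs referenced via Ref or Fn::GetAtt anywhere in node."""
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "Ref" and isinstance(value, str) and value in known_ids:
                found.add(value)
            elif key == "Fn::GetAtt" and isinstance(value, list) and value:
                if value[0] in known_ids:
                    found.add(value[0])
            else:
                found |= find_dependencies(value, known_ids)
    elif isinstance(node, list):
        for item in node:
            found |= find_dependencies(item, known_ids)
    return found


def update_batches(template):
    """Return lists of logical IDs; each batch depends only on earlier ones."""
    resources = template["Resources"]
    known = set(resources)
    graph = {}
    for logical_id, body in resources.items():
        deps = find_dependencies(body.get("Properties", {}), known)
        depends_on = body.get("DependsOn", [])
        deps |= {depends_on} if isinstance(depends_on, str) else set(depends_on)
        graph[logical_id] = (deps & known) - {logical_id}
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


def incremental_templates(template):
    """Yield growing templates: batch 1, then batches 1-2, and so on."""
    batches = update_batches(template)
    included = {}
    for index, batch in enumerate(batches):
        for logical_id in batch:
            included[logical_id] = template["Resources"][logical_id]
        if index == len(batches) - 1:
            yield template  # final step is the full template, Outputs included
        else:
            partial = {k: v for k, v in template.items() if k != "Outputs"}
            partial["Resources"] = dict(included)
            yield partial


# Hypothetical example template.
EXAMPLE = {
    "Resources": {
        "Vpc": {"Type": "AWS::EC2::VPC", "Properties": {"CidrBlock": "10.0.0.0/16"}},
        "Subnet": {"Type": "AWS::EC2::Subnet",
                   "Properties": {"VpcId": {"Ref": "Vpc"}, "CidrBlock": "10.0.1.0/24"}},
        "WebGroup": {"Type": "AWS::EC2::SecurityGroup",
                     "Properties": {"GroupDescription": "web tier", "VpcId": {"Ref": "Vpc"}}},
        "WebIngress": {"Type": "AWS::EC2::SecurityGroupIngress",
                       "Properties": {"GroupId": {"Fn::GetAtt": ["WebGroup", "GroupId"]},
                                      "IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
                                      "CidrIp": "0.0.0.0/0"}},
    }
}

print(update_batches(EXAMPLE))
```

Each template yielded by incremental_templates could be applied as one stack update, so a failure rolls back only the most recent batch.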

Monitor the progress of stack updates

Be sure to monitor the progress of your stack update by viewing the stack's events while a creation/update is performed. This will be the starting point for debugging further issues with individual resources.
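When the event list gets long, it can help to pull out just the failures. The sketch below assumes the JSON shape printed by aws cloudformation describe-stack-events (a StackEvents list whose entries carry Timestamp, LogicalResourceId, ResourceStatus, and ResourceStatusReason) and assumes timestamps are ISO 8601 strings in one time zone, so they sort as text. The sample events and their reason strings are made up.

```python
def first_failures(describe_events_output):
    """Return failed events, oldest first, from parsed describe-stack-events JSON."""
    events = describe_events_output["StackEvents"]
    failed = [e for e in events if e["ResourceStatus"].endswith("_FAILED")]
    # The API lists events newest first; the oldest failure is usually the
    # most informative one, so sort ascending by timestamp.
    return sorted(failed, key=lambda e: e["Timestamp"])


# Made-up sample data in the assumed shape (newest event first).
SAMPLE = {
    "StackEvents": [
        {"Timestamp": "2015-01-01T10:07:00Z", "LogicalResourceId": "demo-stack",
         "ResourceStatus": "CREATE_FAILED",
         "ResourceStatusReason": "hypothetical stack-level reason"},
        {"Timestamp": "2015-01-01T10:06:30Z", "LogicalResourceId": "WebServer",
         "ResourceStatus": "CREATE_FAILED",
         "ResourceStatusReason": "hypothetical resource-level reason"},
        {"Timestamp": "2015-01-01T10:01:00Z", "LogicalResourceId": "WebGroup",
         "ResourceStatus": "CREATE_COMPLETE"},
        {"Timestamp": "2015-01-01T10:00:00Z", "LogicalResourceId": "demo-stack",
         "ResourceStatus": "CREATE_IN_PROGRESS"},
    ]
}

for event in first_failures(SAMPLE):
    print(event["Timestamp"], event["LogicalResourceId"],
          event.get("ResourceStatusReason", ""))
```

To use it on a real stack, save the CLI output to a file, load it with json.load, and pass the resulting dictionary to first_failures.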

Solution 4 - Amazon Web-Services

Have you looked at the AWS CloudFormation Template Editor that is included in the AWS Toolkit for Eclipse? It has syntax highlighting, statement completion, and deployment to AWS CloudFormation.

Solution 5 - Amazon Web-Services

The AWS CloudFormation linter (cfn-lint) provides additional static analysis beyond aws cloudformation validate-template.

It will tell you which resource types and instance types are unavailable in certain regions, validate property values against allowed values, and catch circular resource dependencies, syntax errors, exceeded template limits, and much more.

In addition to running it from the CLI, one of the most popular ways to make sure the linter actually gets run is to install an editor plugin, such as the Visual Studio Code extension, which runs it on every file save.

Other mechanisms, such as pre-commit Git hooks, are described in the linter's documentation.

Solution 6 - Amazon Web-Services

Late to the party but I might also add that it is worthwhile spending a bit of time configuring and learning your editor. I know that sounds laughably basic as an answer but try it.

In my case, with vim, I performed much better once I took some time to install JSON syntax plugins, and also (finally) understood folding techniques to navigate large CF files easily. My editor now flags typos (commas where they shouldn't be, etc.), and the color highlighting saves a lot of time by giving clear visual clues.

This might help mitigate syntax errors, but in-template logical errors are better fixed by other tools. Hopefully one day there will be a "preview" mode on CF.

Solution 7 - Amazon Web-Services

For JetBrains IDEs (IntelliJ IDEA, PhpStorm, WebStorm, PyCharm, RubyMine, AppCode, CLion, Gogland, DataGrip, Rider, Android Studio), there is an AWS CloudFormation plugin that supports deep checking of JSON and YAML CFN templates.

Solution 8 - Amazon Web-Services

If you are dealing with EC2 machines, I would recommend logging in to the EC2 machine and tailing the boot.log file (/var/log/boot.log on RHEL 6/CentOS). This file is updated with all your shell activities (installing packages, downloading files, copying files, etc.).

Also, use editors like http://www.jsoneditoronline.org/ to get a TREE representation of your JSON. This helps you to check the order of JSON elements.

And when you update files, always use diff tools like those at http://www.git-tower.com/blog/diff-tools-mac/ or an actual version control system to ensure that you did not accidentally change something that might break your script.

Solution 9 - Amazon Web-Services

In addition to the AWS CLI aws cloudformation validate-template command, there is a Node-based cfn-check tool that does deeper validation.

Solution 10 - Amazon Web-Services

A feature added to CloudFormation this past December is a set of additional Parameter Types. These new Types allow your templates to perform stronger data checking, and can also "fail fast" when creating resources and nested CloudFormation stacks. You can also provide nicer, human-readable custom error messages for invalid values by using the new ConstraintDescription attribute.

The new types are especially helpful when dealing with various VPC resources. You can ensure that Parameters for your templates are the correct type, and are explicit about expecting a single value vs. a List.

For example:

"Parameters" : {
  "SingleGroup": { "Type": "AWS::EC2::SecurityGroup::Id", ...},
  "GroupList": {"Type": "List<AWS::EC2::SecurityGroup::Id>", ...}
}
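The single-value versus list distinction can be mimicked locally to see what "fail fast" buys you. This is only a sketch of the idea: CloudFormation performs the real check itself, the sg- plus hex digits ID shape is an assumption, and check_security_group_parameter is a hypothetical helper that reuses ConstraintDescription as its error text.

```python
import re

# Assumption for illustration: a security group ID is "sg-" plus hex digits.
SECURITY_GROUP_ID = re.compile(r"sg-[0-9a-f]+")


def check_security_group_parameter(name, declaration, raw_value):
    """Reject a value that does not fit the declared single/list type.

    declaration is the parameter's entry from the template's "Parameters"
    section; raw_value is the string a caller would pass for it.
    """
    declared_type = declaration["Type"]
    if declared_type == "List<AWS::EC2::SecurityGroup::Id>":
        candidates = [item.strip() for item in raw_value.split(",")]
    elif declared_type == "AWS::EC2::SecurityGroup::Id":
        candidates = [raw_value]
    else:
        raise ValueError(f"{name}: type {declared_type!r} is not handled by this sketch")
    bad = [c for c in candidates if not SECURITY_GROUP_ID.fullmatch(c)]
    if bad:
        message = declaration.get("ConstraintDescription", "invalid security group ID")
        raise ValueError(f"{name}: {message} (got {bad})")
    return candidates


single = {"Type": "AWS::EC2::SecurityGroup::Id",
          "ConstraintDescription": "must be exactly one existing security group ID"}
group_list = {"Type": "List<AWS::EC2::SecurityGroup::Id>"}

print(check_security_group_parameter("GroupList", group_list, "sg-0a1b2c3d, sg-4e5f6a7b"))

try:
    # Passing a list where a single value is declared fails immediately.
    check_security_group_parameter("SingleGroup", single, "sg-0a1b2c3d,sg-4e5f6a7b")
except ValueError as error:
    print(error)
```

The point is the timing: a mistyped or wrongly shaped parameter is rejected before any resource creation starts, rather than surfacing as a failed resource many minutes later.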

Solution 11 - Amazon Web-Services

Please check out my CloudFormation validator at https://pypi.org/project/cloudformation-validator/

This will validate the schema and then validate against a list of rules, and it allows for custom rules. It also allows for easy integration with deployment tools.

Solution 12 - Amazon Web-Services

You can also make use of the CloudFormation Designer, available from Amazon here: https://console.aws.amazon.com/cloudformation/designer/home?region=us-east-1

Simply paste your template (JSON) on the "Template" pane and then click on the tick symbol to validate your template. Any errors will show up in the "Error" pane.

Hope this helps.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Christopher | View Question on Stackoverflow
Solution 1 - Amazon Web-Services | Luciano Issoe | View Answer on Stackoverflow
Solution 2 - Amazon Web-Services | Christopher | View Answer on Stackoverflow
Solution 3 - Amazon Web-Services | wjordan | View Answer on Stackoverflow
Solution 4 - Amazon Web-Services | Wade Matveyenko | View Answer on Stackoverflow
Solution 5 - Amazon Web-Services | Pat Myron | View Answer on Stackoverflow
Solution 6 - Amazon Web-Services | Aitch | View Answer on Stackoverflow
Solution 7 - Amazon Web-Services | Jason | View Answer on Stackoverflow
Solution 8 - Amazon Web-Services | ChaitanyaBhatt | View Answer on Stackoverflow
Solution 9 - Amazon Web-Services | Jason | View Answer on Stackoverflow
Solution 10 - Amazon Web-Services | Mikelax | View Answer on Stackoverflow
Solution 11 - Amazon Web-Services | WillRubel | View Answer on Stackoverflow
Solution 12 - Amazon Web-Services | VictorPro | View Answer on Stackoverflow