TL;DR
It’s best to test Lambda "inside-out" by first making sure the Lambda function works, then the invocation (in this case API Gateway), then external access. Pipeline error logging and CloudWatch logging are your best friends for troubleshooting.
Background
In this series of posts we walked through the following steps for using the AWS Serverless Application Model (SAM) for setting up a GitHub Actions pipeline for deploying serverless functions written in Node.js. Previous posts covered:
- Part 1: The Deployment Stack – Writing a reusable CloudFormation template to create required AWS resources for the pipeline
- Part 2: The Code – Writing a basic REST API in Node.js
- Part 3: The Pipeline – Configuring the GitHub pipeline generated by SAM to work with Node.js and different branch naming conventions
When building a SAM application, you have two choices for how you configure your API gateway: Api and HttpApi. You can review the AWS Documentation for a comparison of these two options. We will discuss techniques that apply to each of these options.
Linting
I can't say this enough: when you are looking for problems with your application, look for horses and not zebras. The longer you spend trying to track down the cause of a problem without finding a solution, the more likely the answer is staring you right in the face. Very often, problems can be found with static code analysis ("linting"), specifically ESLint if you are using Node.js. While working on the proof of concept for this post, I chased a bug for way too long that turned out to be an invalid reference inserted by my IDE. The error message pointed me to the exact line, but since it said it couldn’t find a reference (and I thought I hadn’t added any new references), I assumed there was something wrong with loading the dependencies. Since the code I was using for this series was so simple, I hadn't bothered adding ESLint. As soon as I did, I found the issue because it highlighted the unused reference introduced by the IDE. The moral of the story: use linting to find easily fixed problems.
Unit Testing
I don't want to get into a philosophical conversation on what is and is not a unit test. I might venture into that conversation another day. For the sake of this post, let's consider "unit testing" analogous to "local testing": any test that can be run outside of the AWS ecosystem. This way you can run the test in your local dev environment or in the pipeline. These tests are extremely important to successful development for microservices and the cloud. You need to be able to test the atomic transaction your Lambda is supposed to perform. The great part of Lambda is that you can invoke the code multiple ways. The same function can be invoked from an API Gateway like in our example or from an SQS queue, SNS topic, CloudWatch event, etc. If your code works, it should work across any use case. Of course, if you are integrating with other AWS services like S3 or need network connectivity, then the permissions and resources need to be configured correctly in AWS. However, none of this matters if there is a bug in your code. Test your code thoroughly and shoot for 100% code coverage even if that means your "unit test" smells more like an "integration test" (e.g., use docker run or docker-compose to spin up a database in a container to test CRUD transactions). Structure your code based on business logic and then have a handler function that only handles the routing of parameters from the event parameter to your function(s). Then test those functions.
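The structure described above can be sketched roughly as follows. The function names (buildGreeting, handler) and the name query parameter are hypothetical and are not the code from Part 2; the point is only that the business logic is a plain function that can be tested without any AWS resources, while the handler does nothing but routing.

```javascript
// Business logic: a plain function that knows nothing about Lambda
// or API Gateway, so it can be tested anywhere Node.js runs.
function buildGreeting(name) {
  const trimmed = typeof name === "string" ? name.trim() : "";
  return `Hello, ${trimmed || "World"}!`;
}

// Handler: only routes values from the event to the business logic
// and shapes the response expected by an API Gateway proxy integration.
async function handler(event) {
  const params = (event && event.queryStringParameters) || {};
  return {
    statusCode: 200,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message: buildGreeting(params.name) }),
  };
}

// In a real Lambda source file you would export the handler,
// for example: exports.handler = handler;
```

A local test can then call buildGreeting directly for the edge cases and call handler with a hand-written event object to confirm the routing, all without deploying anything.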
Note that you can also run Lambda functions in a container. The easiest way to do this is with sam local invoke, which uses the information in your SAM template to create the Lambda function in a container. For this simple Hello World example, I think this is a perfectly valid technique. However, as you start adding other AWS services to your Lambda, you would need to extend any permissions needed to run the Lambda to an access key shared with the developer (i.e., the developer’s personal IAM user and role). In other words, you have to achieve all of the same security requirements in your local environment that need to be met in the AWS account. I would argue you are better off always running in a dev AWS account rather than locally. This might seem unnecessarily painful at first, but if you follow these other testing techniques, you are very unlikely to have issues, and you will actually move more quickly: everything is developed and tested within the ecosystem of the pipeline and AWS account, so you don’t run into configuration problems due to differences between the local and AWS environments.
Validating the SAM Template
Even though we are using SAM for the GitHub Actions pipeline, you can follow all of the steps in this series of posts without ever having to use the SAM CLI. This is by design. I am a firm believer that you should be able to develop for AWS using only code and your standard development tools and services, so these posts are intended to document a process that follows that belief. However, since we are using SAM for deployment, it is good to use SAM for local features where it makes sense. Before you can run SAM, you will need to make sure you have installed it as described in the AWS documentation. To validate your SAM template (template.yaml in our example), simply run sam validate. Note that you may need to specify your region with the --region option if you have not configured this in your default AWS configuration. If this is the case, sam validate will respond with this information. Below is an example of an error found using sam validate:
$ sam validate
2021-11-04 11:17:30 Loading policies from IAM...
2021-11-04 11:17:32 Finished loading policies from IAM.
Template provided at '/Users/doug/code/aws-sam-demo/template.yaml' was invalid SAM Template.
Error: [InvalidResourceException('LambdaNodeApi', "Invalid value for 'Cors' property.")] ('LambdaNodeApi', "Invalid value for 'Cors' property.")
In this example, I had used the AllowOrigin key inside the CorsConfiguration section for an HttpApi, but the correct key is AllowOrigins. Note that this error does not point you to the exact line, so it is important to review the exact syntax for the section referenced by the error.
If everything is good, you should see output that looks something like this:
$ sam validate
2021-11-04 10:58:12 Loading policies from IAM...
2021-11-04 10:58:14 Finished loading policies from IAM.
/Users/doug/code/aws-sam-demo/template.yaml is a valid SAM Template
Troubleshooting Lambda Function
Testing Invocation
You can test your Lambda function in the AWS console. Navigate to the function (you can start with the CloudFormation stack for your most recent deployment if you aren’t sure about the name of your function) and select the "Test" tab. The default test data will be based off of the "hello-world" template. This does NOT match the schema for a call from an API Gateway, so you will most likely need to modify the data to match the values your code expects from the event parameter.
Finding information on the syntax of the event parameter for your Lambda function is surprisingly difficult since there are multiple ways to invoke a Lambda function and each option has its own unique schema for the event value. This matrix provided in the AWS documentation points to all of the various invocation methods. Since we are invoking our Lambdas from an API Gateway in our example, you might want to review the schema for the event as defined for API Gateway invocation provided in the AWS documentation.
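To make the difference between the schemas concrete, here is a trimmed sketch of the two event shapes you are likely to meet in this series, assuming the Api option sends the REST proxy format and the HttpApi option uses its default payload format version 2.0. Real events carry many more fields. The sample values and the describeRequest helper are hypothetical; only the field names come from the documented formats.

```javascript
// Trimmed sample event for the Api option (REST API proxy format).
const restApiEvent = {
  resource: "/",
  path: "/",
  httpMethod: "GET",
  headers: { Accept: "application/json" },
  queryStringParameters: { name: "Ada" },
  pathParameters: null,
  body: null,
  isBase64Encoded: false,
};

// Trimmed sample event for the HttpApi option (payload format version 2.0).
const httpApiEvent = {
  version: "2.0",
  routeKey: "GET /",
  rawPath: "/",
  rawQueryString: "name=Ada",
  headers: { accept: "application/json" },
  queryStringParameters: { name: "Ada" },
  requestContext: { http: { method: "GET", path: "/" } },
  isBase64Encoded: false,
};

// Hypothetical helper: read the method, path, and query parameters
// from either shape so the business logic does not care which one it got.
function describeRequest(event) {
  const isV2 = event.version === "2.0";
  return {
    method: isV2 ? event.requestContext.http.method : event.httpMethod,
    path: isV2 ? event.rawPath : event.path,
    query: event.queryStringParameters || {},
  };
}
```

Either sample object can be pasted (as JSON) into the "Test" tab in the console or passed straight to your handler in a local test.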
Debug Logging
By default, logs and metrics are enabled for all Lambda functions created with AWS SAM. To view the CloudWatch logs, simply navigate to the Lambda function, select the "Monitor" tab, and then click "View logs in CloudWatch". Lambda creates a log stream for each instance (execution environment) of your function, so a single stream may contain several invocations and a burst of invocations may be spread across several streams. Select the most recent log stream to see the latest logs. Note that anything you write to the standard output (console.log in Node.js) is written to the CloudWatch log stream.
Note that even though the event schema is defined and documented, not all values are implemented for every configuration or use case. Therefore you may want to write the event value to the CloudWatch logs as follows:
console.log(event);
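One caveat with the call above: console.log abbreviates deeply nested values as [Object], and the event may contain credentials such as a JWT in the Authorization header. A hedged sketch of a slightly safer variant follows. The helper names (redactEvent, logEvent) and the list of sensitive headers are assumptions for illustration.

```javascript
// Hypothetical helper: copy the event and mask headers that carry
// credentials before the event is written to the logs.
function redactEvent(event) {
  if (!event || !event.headers) {
    return { ...event };
  }
  const sensitive = ["authorization", "cookie", "x-api-key"];
  const headers = {};
  for (const [name, value] of Object.entries(event.headers)) {
    headers[name] = sensitive.includes(name.toLowerCase()) ? "[REDACTED]" : value;
  }
  return { ...event, headers };
}

// JSON.stringify prints the whole structure, including nested objects
// that console.log(event) would show only as [Object].
function logEvent(event) {
  console.log(JSON.stringify(redactEvent(event), null, 2));
}
```

Calling logEvent(event) at the top of the handler while debugging, and removing it afterwards, keeps the logs readable without leaving tokens in CloudWatch.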
Troubleshooting Api Option
Testing via AWS Console
The Api option supports testing directly in the AWS console. You can navigate to the API gateway (again, the CloudFormation stack is your friend here), select “Resources” from the menu, and then select a method ("GET" in our example). Then click the lightning bolt icon to access the test page. On this page, you can enter any content required for the request (path/query parameters, headers, body, etc.) and then click the "Test" button to test the API.
After you test the API, you will see the response status, body, headers, and log output on the right-hand side of the page. One important item in the log output is Endpoint request body after transformations. This will show you the value of the event parameter passed to your Lambda function. You can also see the return value of your Lambda displayed as Endpoint response body before transformations. If your response status or body isn’t what you expected, you should review your response object compared to the syntax as defined in the AWS documentation.
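As a rough aid for that review, the sketch below checks a handler's return value for the mistakes I would look for first in a proxy response, such as returning an object instead of a string as the body. The helper name and the exact checks are assumptions based on my reading of the documented response format, not an official validator.

```javascript
// Hypothetical helper: list common problems in a Lambda proxy response.
// An empty array means none of these particular problems were found.
function findResponseProblems(response) {
  if (response === null || typeof response !== "object") {
    return ["response must be an object"];
  }
  const problems = [];
  if (!Number.isInteger(response.statusCode)) {
    problems.push("statusCode must be an integer");
  }
  if (response.body !== undefined && typeof response.body !== "string") {
    problems.push("body must be a string (use JSON.stringify)");
  }
  if (
    response.headers !== undefined &&
    (response.headers === null || typeof response.headers !== "object")
  ) {
    problems.push("headers must be an object");
  }
  return problems;
}
```

Running this against the handler's return value in a local test is a cheap way to catch a malformed response before it shows up as an unexpected status in the console test page.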
Troubleshooting HttpApi Option
Debug Logging
Before you can log API Gateway activity for APIs created with the HttpApi option, you have to enable CloudWatch at the account level. Review this gist, which you can add to the deployment stack template as we did in Part 1.
One key benefit of the HttpApi option is support for generic JWT authorizers, which are convenient if you are using a third-party authentication provider such as Auth0. The HttpApi resource supports a FailOnWarnings property, which defaults to false. You can change this value to true as shown below:
LambdaNodeApi:
  Type: AWS::Serverless::HttpApi
  Properties:
    FailOnWarnings: true
Enabling this setting will provide information on "Warnings" which could actually be preventing AWS from creating resources required for your API to function. The example below shows a failure in the pipeline that occurred during sam deploy due to missing the Audience configuration for the JWT authorizer.
-------------------------------------------------------------------------------------------------
ResourceStatus           ResourceType             LogicalResourceId        ResourceStatusReason
-------------------------------------------------------------------------------------------------
UPDATE_IN_PROGRESS       AWS::ApiGatewayV2::Api   LambdaNodeApi            -
UPDATE_FAILED            AWS::ApiGatewayV2::Api   LambdaNodeApi            Warnings found during
                                                                           import: CORS
                                                                           Scheme is malformed,
                                                                           ignoring.
                                                                           Unable to create
                                                                           Authorizer 'LambdaNode
                                                                           Authorizer': Audience
                                                                           list must have at
                                                                           least 1 item for JWT
                                                                           Authorizer. Ignoring.
                                                                           Unable to put method
                                                                           'GET' on resource at
                                                                           path '/': Invalid
                                                                           authorizer definition.
                                                                           Setting the
                                                                           authorization type to
                                                                           JWT requires a valid
                                                                           authorizer. Ignoring.
                                                                           (Service:
                                                                           AmazonApiGatewayV2;
                                                                           Status Code: 400;
                                                                           Error Code:
                                                                           BadRequestException;
                                                                           Request ID: 43a34e55-d
                                                                           0d0-4ed2-8571-eb473e71
                                                                           a9e2; Proxy: null)
                                                                           (Service: null; Status
                                                                           Code: 404; Error Code:
                                                                           BadRequestException;
                                                                           Request ID: null;
                                                                           Proxy: null)
Remote Testing
The final phase of testing is to actually execute the API "in the field". This can be done using a tool such as Postman or curl.
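If you would rather script this check so it can run from your machine or as a pipeline step, a minimal Node.js sketch might look like the following. The function name, the option names, and the placeholder URL are hypothetical. It assumes Node.js 18 or later for the built-in fetch; on older versions you would pass in your own fetch-compatible function.

```javascript
// Hypothetical smoke test: call the deployed endpoint and return the
// status and raw body so a failure is easy to inspect.
async function smokeTest(url, { token, fetchImpl = globalThis.fetch } = {}) {
  const headers = { Accept: "application/json" };
  if (token) {
    // Only send a bearer token when the route has a JWT authorizer.
    headers.Authorization = `Bearer ${token}`;
  }
  const response = await fetchImpl(url, { method: "GET", headers });
  const body = await response.text();
  return { ok: response.ok, status: response.status, body };
}

// Example usage with a placeholder URL (replace with your deployed endpoint):
// smokeTest("https://example.execute-api.us-east-1.amazonaws.com/").then(console.log);
```

The fetchImpl option exists so the function itself can be tested locally with a stub instead of a live endpoint.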
What Do You Mean CloudFront Error?
One error to watch out for when testing your function through the API Gateway is a CloudFront error. This may seem like a strange error since nowhere in this series of posts do we mention CloudFront, but I have a skill for finding strange errors that aren’t easy to find solutions for. I invoked a function that supported path parameters but passed in an invalid path and a JWT for an authorizer that was also not expected. The body of the response looked like this:
{
  "message": "'[JWT VALUE GOES HERE]' not a valid key=value pair (missing equal-sign) in Authorization header: 'Bearer [JWT VALUE GOES HERE]'."
}
More confusing was the X-Cache header in the response, which stated "Error from cloudfront". My invalid request was being blocked by CloudFront, which sits in front of your API Gateway as part of the AWS infrastructure. This was particularly difficult to discover since CloudFront was blocking the API from being called, so even once I enabled logging for my API Gateway built on the HttpApi option, I was still not seeing any activity (or error).
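Since nothing reaches the API Gateway logs in this situation, the response headers are the only clue. A small sketch of a check for that X-Cache marker is shown below. The helper name is hypothetical, and it expects the headers as a plain object (with fetch, something like Object.fromEntries(response.headers) would produce one).

```javascript
// Hypothetical helper: true when the response headers carry the
// "Error from cloudfront" marker in X-Cache. Header names are
// compared case-insensitively.
function isCloudFrontError(headers) {
  for (const [name, value] of Object.entries(headers || {})) {
    if (name.toLowerCase() === "x-cache" && /error from cloudfront/i.test(String(value))) {
      return true;
    }
  }
  return false;
}
```

If this returns true, check the URL, path, and headers of the request itself before digging into the Lambda or API Gateway configuration.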
Summary
I recommend an "inside-out" approach to troubleshooting as follows:
- Use static code analysis or "linting".
- Structure your code based on business logic and shoot for 100% code coverage testing this code.
- Validate your SAM template with sam validate.
- Test your Lambda function in the AWS console.
- Use debug logging (possibly logging the event parameter) to troubleshoot the Lambda.
- Test APIs created using the Api option using the AWS console.
- Use FailOnWarnings and enable CloudWatch logs to find issues with the HttpApi option.
- Test from outside AWS with a tool such as curl or Postman.
- CloudFront errors usually mean you are sending a request that is way off target for your API Gateway (probably calling the wrong API).