Access AWS S3 from Lambda within VPC

Amazon Web Services, Amazon S3, AWS Lambda, Amazon VPC

Amazon Web Services Problem Overview


Overall, I'm pretty confused about using AWS Lambda within a VPC. The problem is that Lambda times out while trying to access an S3 bucket. The solution seems to be a VPC Endpoint.

I've added the Lambda function to a VPC so it can access an RDS hosted database (not shown in the code below, but functional). However, now I can't access S3 and any attempt to do so times out.

I tried creating a VPC S3 Endpoint, but nothing has changed.

VPC Configuration

I'm using the simple default VPC that was created when I first made an EC2 instance. It has four subnets, all created by default.

VPC Route Table

Destination 	                            Target 	        Status 	    Propagated
172.31.0.0/16 	                            local 	        Active 	    No
pl-63a5400a (com.amazonaws.us-east-1.s3) 	vpce-b44c8bdd 	Active 	    No
0.0.0.0/0 	                                igw-325e6a56 	Active 	    No

Simple S3 Download Lambda:

import io

import boto3
import pymysql  # used for the RDS access mentioned above (not shown here)

def lambda_handler(event, context):
    # download_fileobj fills a binary file-like object and returns None,
    # so return the buffer's contents rather than the call's result
    s3Obj = io.BytesIO()
    boto3.resource('s3').Bucket('marineharvester').download_fileobj(
        'Holding - Midsummer/sample', s3Obj)
    # assuming the object contains UTF-8 text
    return s3Obj.getvalue().decode('utf-8')

Amazon Web Services Solutions


Solution 1 - Amazon Web Services

There is another solution related to VPC endpoints.

In the AWS Console, choose the VPC service and then Endpoints. Create a new endpoint and associate it with the S3 service

_(screenshot: VPC S3 endpoint selection)_

and then select the VPC and its Route Table.

Then select the access level (full or custom) and it will work.
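
If you'd rather script this step than click through the console, here is a minimal boto3 sketch of the same endpoint creation (the VPC ID, route table ID, and region are hypothetical placeholders):

import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')  # use your VPC's region

# Create a Gateway endpoint for S3 and attach it to the route table
# used by the Lambda's subnets (IDs below are placeholders)
response = ec2.create_vpc_endpoint(
    VpcEndpointType='Gateway',
    VpcId='vpc-0123456789abcdef0',
    ServiceName='com.amazonaws.us-east-1.s3',
    RouteTableIds=['rtb-0123456789abcdef0'],
)
print(response['VpcEndpoint']['VpcEndpointId'])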

Solution 2 - Amazon Web Services

With boto3, S3 URLs are virtual-hosted-style by default, which requires internet access to resolve them to region-specific URLs. This causes the Lambda function to hang until it times out.

Resolving this requires a Config object when creating the client, which tells boto3 to create path-style S3 URLs instead:

import boto3
from botocore.config import Config

# Path-style addressing avoids the virtual-hosted URL lookup that
# requires internet access; the region must match the Lambda and endpoint
client = boto3.client(
    's3',
    'ap-southeast-2',
    config=Config(s3={'addressing_style': 'path'}),
)

Note that the region in the call must be the region to which you are deploying the Lambda and the VPC Endpoint.

Then you will be able to use the pl-xxxxxx prefix list for the VPC Endpoint within the Lambda's security group, and still access S3.

Here is a working [CloudFormation script][1] that demonstrates this. It creates an S3 bucket, a Lambda (which puts records into the bucket) associated with a VPC containing only private subnets and the VPC Endpoint, and the necessary IAM roles.

[1]: https://github.com/gford1000-aws/lambda_s3_access_using_vpc_endpoint "CloudFormation script"

Solution 3 - Amazon Web Services

There's another issue, having to do with subnets and routes, that is not addressed in the other answers, so I am creating a separate answer with the proviso that all the above answers apply. You have to get them all right for the Lambda function to access S3.

When you create a new AWS account, as I did last fall, there is no route table automatically associated with your default VPC (see Route Tables -> Subnet Associations in the Console).

So if you follow the instructions to create an Endpoint and a route for that Endpoint, no route gets added, because there's no subnet to put it on. And, as usual with AWS, you don't get an error message...

What you should do is create a subnet for your Lambda function, associate that subnet with the route table and the Lambda function, and then rerun the Endpoint instructions. If successful, you will find a route table that has three entries like this:

Destination 	Target
10.0.0.0/16 	Local
0.0.0.0/0 	    igw-1a2b3c4d
pl-1a2b3c4d 	vpce-11bb22cc

If you only have two entries (no 'pl-xxxxx' entry), then you have not yet succeeded.

In the end I guess it should be no surprise that a Lambda function needs a subnet to live on, like any other entity in a network. And it's probably advisable that it not live on the same subnet as your EC2 instances, because Lambda might need different routes or security permissions. Note that the Lambda GUI really wants you to have two subnets in two different AZs, which is also a good idea.
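
For those who'd rather check this from code than the console, here is a small boto3 sketch, assuming hypothetical subnet and route table IDs, that associates the subnet and then verifies the pl-xxxxx route exists:

import boto3

ec2 = boto3.client('ec2')

# Associate the Lambda's subnet with the route table (IDs are placeholders)
ec2.associate_route_table(RouteTableId='rtb-1a2b3c4d', SubnetId='subnet-1a2b3c4d')

# Verify the endpoint route was added: look for a pl-xxxxx destination
route_table = ec2.describe_route_tables(RouteTableIds=['rtb-1a2b3c4d'])['RouteTables'][0]
for route in route_table['Routes']:
    if 'DestinationPrefixListId' in route:
        print('S3 endpoint route:', route['DestinationPrefixListId'],
              '->', route['GatewayId'])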

Solution 4 - Amazon Web Services

The cause of my issue was that I had not properly configured the Outbound Rules of my security group. Specifically, I needed to add a Custom Protocol outbound rule with a destination of pl-XXXXXXXX (the S3 prefix list; the actual value was provided by the AWS Console).
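
As a sketch, the equivalent rule can be added with boto3 (the security group ID is a placeholder; the prefix list ID is the one the console shows for S3 in your region):

import boto3

ec2 = boto3.client('ec2')

# Allow outbound HTTPS from the Lambda's security group to the S3 prefix list
ec2.authorize_security_group_egress(
    GroupId='sg-0123456789abcdef0',  # the Lambda's security group (placeholder)
    IpPermissions=[{
        'IpProtocol': 'tcp',
        'FromPort': 443,
        'ToPort': 443,
        'PrefixListIds': [{'PrefixListId': 'pl-63a5400a'}],  # S3 prefix list for your region
    }],
)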

Solution 5 - Amazon Web Services

I just want to add one other answer amongst the others, which might affect those running functions with slow cold-start times.

I'd followed all the instructions about setting up a gateway for S3, but it still didn't work. I created a test Node.js function which simply listed the buckets. I verified that this didn't work without the S3 gateway, but did once the gateway was established, so I knew that part of things was working fine.

As I was debugging this I was changing the timeout of the function to ensure the function was updated and I was using the latest version of the code when invoking and testing.

I'd reduced the timeout to 10s, but it turned out my function needed more like 15s on a cold boot. Once I'd increased the timeout again, it worked.
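
If you want to rule this out quickly, the timeout can also be raised from a script; a sketch with a hypothetical function name:

import boto3

# Give the function enough headroom for a cold start (name is a placeholder)
boto3.client('lambda').update_function_configuration(
    FunctionName='my-s3-test-function',
    Timeout=30,  # seconds
)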

Solution 6 - Amazon Web Services

To access S3 from a Lambda function that is within a VPC, you can use a NAT gateway (a much more expensive solution than the VPC endpoint). If you have two private subnets within the VPC whose route tables include a route to a NAT gateway, and you associate them with the Lambda, it can access the S3 bucket like any Lambda outside the VPC; a sketch of the setup appears after the gotchas below.

Gotchas:

  1. If you associate a public subnet with the Lambda and expect it to work, it will not.
  2. Make sure your security group is in place to accept ingress.

This approach makes any service available on the internet accessible to the Lambda function. For detailed steps, you can follow this blog: https://blog.theodo.com/2020/01/internet-access-to-lambda-in-vpc/
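
Here is a rough boto3 sketch of the NAT setup described above (all resource IDs are placeholders):

import boto3

ec2 = boto3.client('ec2')

# The NAT gateway itself lives in a *public* subnet and needs an Elastic IP
eip = ec2.allocate_address(Domain='vpc')
nat = ec2.create_nat_gateway(
    SubnetId='subnet-public-1',
    AllocationId=eip['AllocationId'],
)

# Route the private subnets' traffic through the NAT gateway;
# the Lambda is then associated with those private subnets
ec2.create_route(
    RouteTableId='rtb-private',
    DestinationCidrBlock='0.0.0.0/0',
    NatGatewayId=nat['NatGateway']['NatGatewayId'],
)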

Solution 7 - Amazon Web Services

Adding to the answer from Luis RM, this is a construct that can be used in the CDK:

import * as ec2 from 'aws-cdk-lib/aws-ec2'
import * as iam from 'aws-cdk-lib/aws-iam'

// Gateway endpoint for S3, attached to the VPC's route tables
const vpcEndpoint = new ec2.GatewayVpcEndpoint(this, 'S3GatewayVpcEndpoint', {
  vpc: myVpc,
  // ec2.GatewayVpcEndpointAwsService.S3 also works and resolves the region automatically
  service: { name: 'com.amazonaws.us-west-1.s3' },
})

// Endpoint policy restricting access to a specific bucket
const rolePolicies = [
  {
    Sid: 'AccessToSpecificBucket',
    Effect: 'Allow',
    Action: [
      's3:ListBucket',
      's3:GetObject',
      's3:PutObject',
      's3:DeleteObject',
      's3:GetObjectVersion',
    ],
    Resource: ['arn:aws:s3:::myBucket', 'arn:aws:s3:::myBucket/*'],
    Principal: '*',
  },
]
rolePolicies.forEach((policy) => {
  vpcEndpoint.addToPolicy(iam.PolicyStatement.fromJson(policy))
})

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type 	                    Original Author 	Original Content on Stackoverflow
Question 	                        musingsole 	        View Question on Stackoverflow
Solution 1 - Amazon Web Services 	Luis RM 	        View Answer on Stackoverflow
Solution 2 - Amazon Web Services 	Geoff 	            View Answer on Stackoverflow
Solution 3 - Amazon Web Services 	Paul S 	            View Answer on Stackoverflow
Solution 4 - Amazon Web Services 	musingsole 	        View Answer on Stackoverflow
Solution 5 - Amazon Web Services 	Dan Gravell 	    View Answer on Stackoverflow
Solution 6 - Amazon Web Services 	Subrata Fouzdar 	View Answer on Stackoverflow
Solution 7 - Amazon Web Services 	alayor 	            View Answer on Stackoverflow