How to process SQS queue with lambda function (not via scheduled events)?

Tags: Amazon Web Services, AWS Lambda, Amazon SQS

Amazon Web-Services Problem Overview


Here is the simplified scheme I am trying to make work:

> http requests --> (API Gateway + lambda A) --> SQS --> (lambda B > ?????) --> DynamoDB

So it should work as shown: data coming from many HTTP requests (up to 500 per second, for example) is placed into an SQS queue by my Lambda function A. Then the other function, B, processes the queue: it periodically reads up to 10 items and writes them to DynamoDB with BatchWriteItem.

The problem is that I can't figure out how to trigger the second Lambda function. It should be called frequently, multiple times per second (or at least once per second), because I need all the data from the queue to get into DynamoDB ASAP. That's why calling Lambda function B via scheduled events as described [here][1] is not an option.


Why don't I want to write directly into DynamoDB, without SQS?

I would be glad to avoid SQS altogether. The problem I am trying to address with SQS is DynamoDB throttling. Not even the throttling itself, but the way it is handled when writing data to DynamoDB with the AWS SDK: when records are written one by one and get throttled, the AWS SDK silently retries the write, which increases the request processing time from the HTTP client's point of view.

So I would like to temporarily store data in the queue, send a "200 OK" response back to the client, and then have the queue processed by a separate function that writes multiple records with a single DynamoDB BatchWriteItem call (which returns unprocessed items instead of retrying automatically when throttled). I would even prefer to lose some records rather than increase the lag between a record being received and being stored in DynamoDB.
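The "take the unprocessed items back and decide what to do with them" idea can be sketched without any AWS dependency. This is only an illustration: `BatchSink` is a hypothetical stand-in for a BatchWriteItem call (it returns whatever it could not write), and `BoundedBatchWriter` is an invented name. The sketch retries only the leftovers, a bounded number of times and without any backoff delay, and hands back whatever is still unwritten so the caller can drop it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a BatchWriteItem call: returns the items it could not write. */
interface BatchSink<T> {
    List<T> writeBatch(List<T> items);
}

class BoundedBatchWriter {
    /**
     * Submits the items, then resubmits only the unprocessed ones,
     * for at most maxAttempts calls in total.
     * Returns the items that are still unwritten so the caller can drop or log them.
     */
    static <T> List<T> write(BatchSink<T> sink, List<T> items, int maxAttempts) {
        List<T> pending = new ArrayList<>(items);
        for (int attempt = 0; attempt < maxAttempts && !pending.isEmpty(); attempt++) {
            // Copy the result so we never hold a view of the previous list.
            pending = new ArrayList<>(sink.writeBatch(pending));
        }
        return pending;
    }
}
```

With `maxAttempts` set to 1 this behaves as "write once, drop the rest", which matches the preference for losing records over adding lag.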

Update: If anyone is interested, I have found how to make aws-sdk skip automatic retries when requests are throttled: there is a special parameter, [maxRetries][2]. Anyway, I am going to use Kinesis as suggested below.

[1]: http://docs.aws.amazon.com/lambda/latest/dg/with-scheduled-events.html "Using AWS Lambda with Scheduled Events"
[2]: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Config.html#maxRetries-property

Amazon Web-Services Solutions


Solution 1 - Amazon Web-Services

[This doesn't directly answer your explicit question, so in my experience it will be downvoted :) However, I will answer the fundamental problem you are trying to solve.]

The way we take a flood of incoming requests and feed them to AWS Lambda functions for writing in a paced manner to DynamoDB is to replace SQS in the proposed architecture with Amazon Kinesis streams.

Kinesis streams can drive AWS Lambda functions.

Kinesis streams guarantee ordering of the delivered messages for any given key (nice for ordered database operations).

Kinesis streams let you specify how many AWS Lambda functions can be run in parallel (one per shard), which can be coordinated with your DynamoDB write capacity.

Kinesis streams can pass multiple available messages in one AWS Lambda function invocation, allowing for further optimization.

Note: It's really the AWS Lambda service that reads from Amazon Kinesis streams then invokes the function, and not Kinesis streams directly invoking AWS Lambda; but sometimes it's easier to visualize as Kinesis driving it. The result to the user is nearly the same.
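The per-key ordering mentioned above follows from the fact that every record with the same partition key lands in the same shard: Kinesis hashes the partition key with MD5 into a 128-bit hash-key space, and each shard owns a range of that space. The sketch below illustrates the idea only. `ShardRouter` is an invented name, and it assumes the shards split the hash space evenly, which need not hold for a stream that has been resharded.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ShardRouter {
    // 2^128: the size of the MD5 hash-key space.
    private static final BigInteger HASH_SPACE = BigInteger.ONE.shiftLeft(128);

    /**
     * Returns the index of the shard a partition key would map to,
     * ASSUMING shardCount shards that divide the hash space evenly.
     */
    static int shardFor(String partitionKey, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(partitionKey.getBytes(StandardCharsets.UTF_8));
            // Interpret the 16 digest bytes as an unsigned 128-bit integer.
            BigInteger hashKey = new BigInteger(1, digest);
            // hashKey < 2^128, so the quotient is always in [0, shardCount).
            return hashKey.multiply(BigInteger.valueOf(shardCount))
                    .divide(HASH_SPACE)
                    .intValue();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every Java platform", e);
        }
    }
}
```

Because the mapping is deterministic, all writes for one key are handled by one shard, and therefore by one Lambda invocation at a time, in order.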

Solution 2 - Amazon Web-Services

You can't do this by integrating SQS and Lambda directly, unfortunately. But don't fret too much yet. There is a solution! You need to add another Amazon service into the mix and all your problems will be solved.

http requests --> (API Gateway + lambda A) --> SQS + SNS --> lambda B --> DynamoDB

You can trigger an SNS notification to the second Lambda function to kick it off. Once it is started, it can drain the queue and write all the results into DynamoDB. To better understand the possible event sources for Lambda, check out these docs.
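The "drain the queue" step might look roughly like the loop below. It is a sketch under assumptions, not real SQS code: `QueueDrainer` is an invented name, `receive` stands in for an SQS receive call that returns up to 10 messages, `write` stands in for the DynamoDB batch write, and `remainingMillis` would typically be backed by `context.getRemainingTimeInMillis()` inside a Lambda handler. Note that with a real queue a single empty receive does not prove the queue is empty, so treating it as the stop condition is a simplification.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

class QueueDrainer {
    /**
     * Repeatedly fetches a batch and hands it to the writer until a fetch
     * comes back empty or the remaining time drops to the safety margin.
     * Returns the number of messages handed to the writer.
     */
    static <T> int drain(Supplier<List<T>> receive,
                         Consumer<List<T>> write,
                         LongSupplier remainingMillis,
                         long safetyMarginMillis) {
        int handled = 0;
        // Stop early enough that the function is not killed mid-write.
        while (remainingMillis.getAsLong() > safetyMarginMillis) {
            List<T> batch = receive.get();
            if (batch.isEmpty()) {
                break; // Simplification: treat an empty fetch as "queue drained".
            }
            write.accept(batch);
            handled += batch.size();
        }
        return handled;
    }
}
```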

Solution 3 - Amazon Web-Services

As of June 28, 2018, you can now use SQS to trigger AWS Lambda functions natively. A workaround is no longer needed!

https://aws.amazon.com/blogs/aws/aws-lambda-adds-amazon-simple-queue-service-to-supported-event-sources/

And in Nov 2019, support for FIFO queues was added:

https://aws.amazon.com/blogs/compute/new-for-aws-lambda-sqs-fifo-as-an-event-source/

Solution 4 - Amazon Web-Services

Another solution would be to just add the item to SQS, then call the target Lambda function with invocation type Event so that the call is asynchronous.

The asynchronous Lambda can then fetch as many items from SQS as you want and process them.

I would also add a scheduled call to the asynchronous Lambda to handle any items in the queue whose processing failed.

[UPDATE] You can now set up a Lambda trigger for new messages on a queue.

Solution 5 - Amazon Web-Services

Maybe a more cost-efficient solution would be to keep everything in SQS (as it is), then run a scheduled event that invokes a multi-threaded Lambda function that processes items from the queue?

This way, your queue worker can match your limits exactly. If the queue is empty, the function can finish early or fall back to polling in a single thread.

Kinesis sounds like overkill for this case: you don't need the original order, for instance. Plus, running multiple Lambdas simultaneously is surely more expensive than running just one multi-threaded Lambda.

Your Lambda will be all about I/O, making external calls to AWS services, so one function may fit very well.
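A minimal sketch of the multi-threaded worker idea, using only the JDK. `ParallelQueueWorker` is an invented name, and the in-memory `Queue` is a stand-in for SQS; in a real function each worker would call the SQS receive API instead of `poll()`. The queue passed in is assumed to be thread-safe (for example a `ConcurrentLinkedQueue`).

```java
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

class ParallelQueueWorker {
    /**
     * Starts the given number of worker threads; each pulls items from the
     * shared (thread-safe) queue until it is empty.
     * Returns the number of items processed before the timeout.
     */
    static <T> int process(Queue<T> queue, Consumer<T> handler, int threads, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger processed = new AtomicInteger();
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                T item;
                // poll() returns null once the queue is empty, which ends this worker.
                while ((item = queue.poll()) != null) {
                    handler.accept(item);
                    processed.incrementAndGet();
                }
            });
        }
        pool.shutdown(); // No new tasks; let the running workers finish.
        try {
            if (!pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                pool.shutdownNow(); // Out of time: interrupt the remaining workers.
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }
}
```

The thread count is the knob that lets the worker match a write-capacity limit, since the work is I/O-bound rather than CPU-bound.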

Solution 6 - Amazon Web-Services

Here's how I collect messages from an SQS queue:

package au.com.redbarn.aws.lambda2lambda_via_sqs;

import java.util.List;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.SQSEvent;
import com.amazonaws.services.lambda.runtime.events.SQSEvent.SQSMessage;

import lombok.extern.log4j.Log4j2;

@Log4j2
public class SQSConsumerLambda implements RequestHandler<SQSEvent, String> {

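    @Override
    public String handleRequest(SQSEvent input, Context context) {

        log.info("message received");

        // Each SQS message in the delivered batch arrives as one record.
        List<SQSMessage> records = input.getRecords();

        for (SQSMessage record : records) {
            // The message payload is available as a plain string.
            log.info(record.getBody());
        }

        return "Ok";
    }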
}

Add your DynamoDB code to handleRequest() and Lambda B is done.
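One detail to watch when adding the DynamoDB code: BatchWriteItem accepts at most 25 put or delete requests per call, so if the trigger is configured to deliver larger batches, the records have to be split first. A minimal sketch of that split, using only the JDK (`BatchChunker` is an invented helper name, not part of any AWS library):

```java
import java.util.ArrayList;
import java.util.List;

class BatchChunker {
    /** BatchWriteItem accepts at most 25 put or delete requests per call. */
    static final int MAX_BATCH_WRITE_ITEMS = 25;

    /** Splits the items into consecutive chunks of at most the given size. */
    static <T> List<List<T>> chunk(List<T> items, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += size) {
            int end = Math.min(start + size, items.size());
            // Copy the slice so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }
}
```

Inside handleRequest() this could be called as `BatchChunker.chunk(input.getRecords(), BatchChunker.MAX_BATCH_WRITE_ITEMS)`, with one BatchWriteItem call per chunk.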

Solution 7 - Amazon Web-Services

Here's my solution to this problem:

HTTP request --> DynamoDB --> Stream --> Lambda Function

In this solution, you have to set up a stream for the table. The stream is handled with a Lambda function that you'll write and that's it. No need to use SQS or anything else.

Of course, this is a simplified design and it works only for simple problems. For more complicated scenarios, use Kinesis (as mentioned in the other answers).

Here's a link to AWS documentation on the topic.

Solution 8 - Amazon Web-Services

I believe AWS has now come up with a way for SQS to trigger a Lambda function. So I guess we can use SQS to smooth out burst loads of data to DynamoDB, in case you don't care about the order of messages. Check their blog post on this update: https://aws.amazon.com/blogs/aws/aws-lambda-adds-amazon-simple-queue-service-to-supported-event-sources/

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | xtx | View Question on Stackoverflow |
| Solution 1 - Amazon Web-Services | Eric Hammond | View Answer on Stackoverflow |
| Solution 2 - Amazon Web-Services | Chris Franklin | View Answer on Stackoverflow |
| Solution 3 - Amazon Web-Services | Trenton | View Answer on Stackoverflow |
| Solution 4 - Amazon Web-Services | loopingz | View Answer on Stackoverflow |
| Solution 5 - Amazon Web-Services | Denis Mysenko | View Answer on Stackoverflow |
| Solution 6 - Amazon Web-Services | Peter Svehla | View Answer on Stackoverflow |
| Solution 7 - Amazon Web-Services | Mehran | View Answer on Stackoverflow |
| Solution 8 - Amazon Web-Services | Dipayan | View Answer on Stackoverflow |