AWS API Performance Comparison: Serverless vs. Containers vs. API Gateway integration

In my last post, I showed how to connect AWS API Gateway directly to SNS using a service integration.

A few people asked me about the performance implications of this architecture.

Is it significantly faster than using a Lambda-based approach?

How does it compare to EC2 or ECS?

My answer: I don't know! But I know how to find out (sort of).

In this post, we do a performance bake-off of three ways to deploy the same HTTP endpoint in AWS:

Using an API Gateway service proxy
With the new hotness, AWS Lambda
With the old hotness, Docker containers on AWS Fargate

We'll deploy our three services and throw 15,000 requests at each of them. Who will win?

If you're impatient, skip ahead to see the full results.

Table of Contents:

Background
Performance results
Conclusion

Background

Before we review the results, let's set up the problem.

I wanted to keep our example as simple as possible so that the comparison is limited to the architecture itself rather than the application code. Further, I wanted an example that would work with the API Gateway service proxy so we could use it as a comparison as well.

I decided to set up a simple endpoint that receives an HTTP POST request and forwards the request payload into an AWS SNS topic.

Let's take a look at the architecture and deployment methods for each of our three approaches.

Go Serverless with AWS Lambda

The first approach is to use AWS API Gateway and AWS Lambda. Our architecture will look like this:

SNS Publish with Lambda

A user will make an HTTP POST request to our endpoint, which will be handled by API Gateway. API Gateway will forward the request to our AWS Lambda function for processing. The Lambda function will send our request payload to the SNS topic before returning a response.
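As a rough sketch of what such a function might look like (this is not the code from the linked repo), here is a minimal Python handler. It assumes the Lambda proxy integration, where API Gateway passes the request body as a string under the `body` key. The `make_handler` helper and the `TOPIC_ARN` environment variable are made up for illustration; `publish` stands in for the `publish` method of a boto3 SNS client.

```python
import json


def make_handler(publish, topic_arn):
    """Build a Lambda-style handler that forwards the request body to SNS.

    `publish` is any callable accepting the keyword arguments TopicArn and
    Message, such as the `publish` method of a boto3 SNS client.
    """
    def handler(event, context):
        # With the Lambda proxy integration, the raw request body arrives
        # as a string (or None) under the "body" key.
        body = event.get("body")
        if not body:
            # SNS rejects empty messages, so fail fast without calling it.
            return {"statusCode": 400,
                    "body": json.dumps({"message": "empty body"})}
        publish(TopicArn=topic_arn, Message=body)
        return {"statusCode": 200,
                "body": json.dumps({"message": "published"})}
    return handler


# Hypothetical wiring inside the deployed function:
#   import os, boto3
#   handler = make_handler(boto3.client("sns").publish, os.environ["TOPIC_ARN"])
```

Passing `publish` in as an argument keeps the handler easy to exercise locally with a fake, without touching AWS.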

If you want to deploy this example, the code is available here. I use the Serverless Framework for deploying the architecture because I think it's the easiest way to do it.*

*Full disclosure: I work for Serverless, Inc., creators of the Serverless Framework. Want to come work with me on awesome stuff? We're hiring engineers. Please reach out if you have any interest.

Skipping the middleman with API Gateway service proxy

The second approach is similar to the first, but we remove Lambda from the equation. We use an API Gateway service proxy integration to publish directly to our SNS topic from API Gateway:

APIG Service Proxy

Before doing any testing, my hunch was that this would be faster than the previous method since we're cutting out a network hop in the middle. Check below for full results. Note that API Gateway service proxies won't work for all parts of your infrastructure, even if they are faster.

If you want additional details on how, when, and why to use this, check out my earlier post on using an API Gateway service proxy integration. It does a step-by-step walkthrough of setting up your first service proxy.
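Conceptually, the service proxy just maps the incoming HTTP body onto the parameters of the SNS Publish action, with none of your own code in between. The snippet below only illustrates that mapping in Python. It is not how API Gateway itself is configured (that lives in the integration settings), and the function name is hypothetical.

```python
from urllib.parse import urlencode


def sns_publish_query(topic_arn, request_body):
    """Encode the parameters of an SNS Publish call as a query string."""
    params = {
        "Action": "Publish",        # the SNS API action being invoked
        "Version": "2010-03-31",    # SNS Query API version
        "TopicArn": topic_arn,      # fixed by the integration config
        "Message": request_body,    # taken from the incoming HTTP body
    }
    return urlencode(params)


print(sns_publish_query(
    "arn:aws:sns:us-east-1:123456789012:example-topic",
    '{"hello": "world"}',
))
```

The topic ARN here is a placeholder. The point is only that the topic is fixed and the message comes straight from the request.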

To deploy this example, there is a CloudFormation template here. This will let you quickly spin up the stack for testing.

Containerizing your workload with Docker and AWS Fargate

The final approach is to run our compute in Docker containers. There are a few different approaches for doing this on AWS, but I chose to use AWS Fargate.

The architecture will look as follows:

Fargate to SNS

Users will make HTTP POST requests to an HTTP endpoint, which will be handled by an Application Load Balancer (ALB). This ALB will forward requests to our Fargate container instances. The application on our Fargate container instances will forward the request payload to SNS.
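To make the container side concrete, here is a minimal sketch of an application that accepts a POST and forwards the body to SNS. It is a plain WSGI app built on the standard library, used here as a stand-in rather than the actual application from the linked repo. The `make_app` name is made up, and `publish` again stands in for a boto3 SNS client's `publish` method.

```python
def make_app(publish, topic_arn):
    """Return a WSGI app that forwards POST bodies to SNS via `publish`."""
    def app(environ, start_response):
        if environ.get("REQUEST_METHOD") != "POST":
            start_response("405 Method Not Allowed",
                           [("Content-Type", "text/plain")])
            return [b"POST only\n"]

        # Read exactly as many bytes as the client declared.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length).decode("utf-8")
        if not body:
            start_response("400 Bad Request",
                           [("Content-Type", "text/plain")])
            return [b"empty body\n"]

        publish(TopicArn=topic_arn, Message=body)
        start_response("200 OK", [("Content-Type", "application/json")])
        return [b'{"message": "published"}']
    return app
```

For a quick local check, an app like this can be served with `wsgiref.simple_server.make_server`. A real deployment would put a production WSGI server in the container.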

With Fargate, you can run tasks or services. A task is a one-off container that will run until it dies or finishes execution. A service is a set of a specified number of instances of a task. Fargate will ensure the correct number of instances of your service are running.

We'll use a service so that we can run a sufficient number of instances. Further, you can easily set up a load balancer for managing HTTP traffic across your service instances.

You can find code and instructions for deploying this architecture to Fargate here. I use the incredible fargate CLI tool, which makes it dead simple to go from Dockerfile to running container.

Now that we know our architecture, let's jump into the bakeoff!

Performance results

After I deployed all three of the architectures, I wanted to do testing in two phases.

First, I ran a small sample of 2000 requests to check the performance of new deploys. This was running at around 40 requests per second.

Then, I ran a larger test of 15000 requests to see how each architecture performed once it was warmed up. For this larger test, I was sending around 100 requests per second.
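For a sense of scale, the request counts and rates above imply fairly short test runs. A back-of-the-envelope calculation, assuming the rates held steady for the whole run (the helper name is made up):

```python
def approx_duration_seconds(total_requests, requests_per_second):
    """Rough run length, assuming a constant request rate."""
    return total_requests / requests_per_second


print(approx_duration_seconds(2000, 40))    # warmup: 50.0 seconds
print(approx_duration_seconds(15000, 100))  # full test: 150.0 seconds
```

So the warmup ran for under a minute and the full test for a few minutes, which is worth keeping in mind when reading the tail latencies.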

Let's check the results in order.

When I ran my initial Fargate warmup, I got the following results:

Around 10% of my requests were failing altogether!

When I dug in, it looked like I was overwhelming my container instances, causing them to die.

I'm not a Docker or Flask performance expert, and that's not the goal of this exercise. To remedy this, I decided to bump the specs on my deployments.

The general goal for this bakeoff is to get a best-case outcome for each of these architectures, rather than an apples-to-apples comparison of cost vs performance.

For Fargate, this meant deploying 50 instances of my container with pretty beefy settings -- 8 GB of memory and 4 full CPU units per container instance.

For the Lambda service, I set memory to the maximum of 3 GB.

For APIG service proxy, there are no knobs to tune. 🎉

With that out of the way, let's check the initial results.

Initial warmup results

For the first 2000 requests to each type of endpoint, the performance results are as follows:

api performance results -- warmup

Note: Chart using a log scale

The raw data for the results (response times in milliseconds) are:

Endpoint type	# requests	50%	66%	75%	80%	90%	95%	98%	99%	100%
APIG Service Proxy	2051	80	90	110	120	150	190	220	250	520
AWS Lambda	2084	94	100	110	120	150	180	210	290	5100
Fargate	2047	68	73	76	80	110	110	130	140	550
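If percentile columns are unfamiliar, here is a small sketch of one common definition, the nearest-rank percentile. This is an illustration only: the load testing tool's own calculation may differ in detail, and the sample latencies are made up.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


latencies_ms = [70, 80, 90, 100, 500]  # made-up values, one slow outlier
print(percentile(latencies_ms, 50))    # 90
print(percentile(latencies_ms, 80))    # 100
print(percentile(latencies_ms, 100))   # 500
```

The 100% column is just the single slowest request, so one outlier can dominate it while leaving the median untouched.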

Takeaways from the warmup test

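Fargate was the fastest at every percentile through the 99th. Only its single slowest request (550ms) trailed the API Gateway service proxy's (520ms).
AWS Lambda had the longest tail of the three, with a slowest request of 5100ms. This is due to the cold start problem.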
API Gateway service proxy outperformed AWS Lambda at the median, but performance in the upper-middle of the range (75% - 99%) was pretty similar between the two.

Now that we've done our warmup test, let's check out the results from the full performance test.

Full performance test results

For the main part of the performance test, I ran 15,000 requests at each of the three architectures. I planned to use 500 'users' in Locust to accomplish this, though, as noted below, I had to make some modifications for Fargate.

First, let's check the results:

api performance results -- full test

Note: Chart using a log scale

The raw data for the results (response times in milliseconds) are:

Endpoint type	# requests	50%	66%	75%	80%	90%	95%	98%	99%	100%
APIG Service Proxy	15185	73	79	84	90	130	180	250	290	670
AWS Lambda	15249	86	92	98	110	140	160	180	220	920
Fargate	15057	69	72	75	77	91	110	130	170	800

Takeaways from the full performance test

Fargate was still the fastest at every percentile through the 99th, though the gap narrowed. API Gateway service proxy was nearly as fast as Fargate at the median, and AWS Lambda wasn't far behind.
The real differences show up between the 80th and 99th percentile. Fargate had a lot more consistent performance as it moved up the percentiles. The 98th percentile request for Fargate is less than double the median (130ms vs 69ms, respectively). In contrast, the 98th percentile for API Gateway service proxy was more than triple the median (250ms vs 73ms, respectively).
AWS Lambda outperformed the API Gateway service proxy at some higher percentiles. Between the 95th and 99th percentiles, AWS Lambda was actually faster than the API Gateway service proxy. This was surprising to me.
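The consistency comparison above boils down to the ratio of the 98th percentile to the median. Here is a quick sketch of that arithmetic, using the numbers from the full test table (the function and key names are made up):

```python
# 50th and 98th percentile response times (ms) from the full test table.
full_test_ms = {
    "APIG Service Proxy": {"p50": 73, "p98": 250},
    "AWS Lambda": {"p50": 86, "p98": 180},
    "Fargate": {"p50": 69, "p98": 130},
}


def tail_ratios(table):
    """Ratio of the 98th percentile to the median for each endpoint."""
    return {name: row["p98"] / row["p50"] for name, row in table.items()}


for name, ratio in tail_ratios(full_test_ms).items():
    print(f"{name}: {ratio:.2f}x")
# APIG Service Proxy: 3.42x
# AWS Lambda: 2.09x
# Fargate: 1.88x
```

A lower ratio means the slow requests stay closer to the typical request.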

I mentioned above that I wanted to use 500 Locust 'users' when testing the application. Both AWS Lambda and API Gateway service proxy handled 15000+ requests without a single error.

With Fargate, I consistently had failed requests:

I finally throttled it down to 200 Locust users when testing for Fargate, which got my error rate down to around 3% of overall requests. Still, this was infinitely higher than the error rate with AWS Lambda.

I'm not saying you have to tolerate a certain percentage of failures to deploy a Fargate service. Rather, performance tuning Docker containers would have taken more time than I wanted to spend on a quick performance test.

UPDATED NOTES ON FARGATE ERRORS

I've gotten some pushback saying that the test is worthless due to the Fargate errors, or that I was way over-provisioned on Fargate.

¯\_(ツ)_/¯

A few notes on that:

First, Nathan Peck, an awesome and helpful container advocate at AWS, reached out to say the failures were likely around some system settings like the 'nofile' ulimit.

That sounds pretty reasonable to me, but I haven't taken the time to test it out. I don't have huge interest in digging deep into container performance tuning for this. If that's something you're into, let me know and I'll link to your results if they're interesting!
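If you do want to dig in, a first step could be to check what open-file limit the process inside the container actually sees. Here is a minimal sketch using Python's standard `resource` module (Unix only). The helper name is made up, and this only reads the limit. It does not confirm that the limit caused the failures.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors
    for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
# Every open socket uses a file descriptor, so a low soft limit caps the
# number of connections a single process can hold open at once.
print(f"nofile soft limit: {soft}, hard limit: {hard}")
```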

The key points on Fargate are:

You can get much lower failure rates than I got. You'll just need to tune it.
I didn't use 50 instances with a ton of CPU and memory because I thought Fargate needed it. I used it because I didn't want to think about resource exhaustion at all (even though I did end up hitting the open file limits). I was going for a best-case scenario -- if the load balancer, container, and SNS are all humming, what kind of latency can we get?
I don't think this invalidates the general results of what a basic 'optimistic-case' could look like with Fargate within these general constraints (multiple instances + Python + calling SNS).

If you're making a million-dollar decision on this, you should run your own tests.

If you want a quick, fun read, these results should be directionally correct.

Conclusion

This was a fun and enlightening experience for me, and I hope it was helpful for you. There's not a clear right answer on which architecture you should use based on these performance results.

Here's how I think about it:

Do you need high performance? Using dedicated instances with Fargate (or ECS/EKS/EC2) is your best bet. This will require more setup and infrastructure management, but that may be necessary for your use case.
Is your business logic limited? If so, use API Gateway service proxy. API Gateway service proxy is a performant, low-maintenance way to stand up endpoints and forward data to another AWS service.
In the vast majority of other situations, use AWS Lambda. Lambda is dead-simple to deploy (if you're using a deployment tool). It's reliable and scalable. You don't have to worry about tuning a bunch of knobs to get solid performance. And it's code, so you can do anything you want. I use it for almost everything.
