API Gateway Throttling: Protecting Your Backend Services
In your organisation, APIs serve as the front door to your business functionality. Without proper protections in place, a sudden surge in traffic—whether from legitimate users or malicious actors—can overwhelm your backend services, leading to poor performance, elevated costs, or even complete system failure.
Amazon API Gateway offers robust throttling capabilities that help you safeguard your backend services while maintaining a smooth experience for your users. Let's dive into how you can implement these protections effectively.
Why Throttling Matters
Before we explore the implementation details, let's look at why API throttling is critical:
Backend Protection: Prevents your Lambda functions, EC2 instances, or other backend services from being overwhelmed
Cost Control: Limits excessive AWS resource consumption during traffic spikes
Abuse Prevention: Mitigates the impact of denial-of-service attacks
Fair Usage: Ensures resources are fairly distributed among all API consumers
API Gateway Throttling Layers
API Gateway implements throttling at two distinct levels:
1. Account-Level Throttling
By default, AWS applies throttling limits across your entire AWS account within each region:
Steady-state rate: 10,000 requests per second (RPS)
Burst capacity: 5,000 requests (a short-lived allowance for spikes above the steady-state rate)
These default limits can be increased by contacting AWS Support, but they serve as a baseline protection for your account.
2. Stage/Method-Level Throttling
This is where the real power lies—you can define custom throttling at various levels:
API stage (e.g., prod, dev)
Specific method (e.g., GET /products)
API key (for client-specific throttling)
Implementing Throttling in API Gateway
Setting Up Method-Level Throttling
When you need to limit requests to a specific API method:
Navigate to your API in the API Gateway console
Go to the "Stages" section and select the deployed stage
Expand the method you want to throttle and enable the method-level override
Set the method's rate and burst limits and save (stage settings take effect without redeploying)
Remember that method-level throttling takes precedence over stage-level settings.
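The console steps above can also be scripted. A minimal sketch of the patch operations that boto3's `update_stage` expects; the API ID, stage name, and method path in the comment are placeholder assumptions:

```python
def method_throttle_patch_ops(resource_path, http_method, rate, burst):
    """Build update_stage patch operations for one method's throttling.

    Slashes in the resource path are escaped as '~1', following the
    API Gateway patch-path convention.
    """
    prefix = "/{}/{}/throttling".format(resource_path.replace("/", "~1"), http_method)
    return [
        {"op": "replace", "path": prefix + "/rateLimit", "value": str(rate)},
        {"op": "replace", "path": prefix + "/burstLimit", "value": str(burst)},
    ]


# Applied with boto3 (placeholder IDs), this would throttle GET /products:
#   boto3.client("apigateway").update_stage(
#       restApiId="abc123", stageName="prod",
#       patchOperations=method_throttle_patch_ops("/products", "GET", 10, 15))
print(method_throttle_patch_ops("/products", "GET", 10, 15)[0]["path"])
# /~1products/GET/throttling/rateLimit
```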
Implementing Stage-Level Throttling
To apply consistent throttling across an entire API stage:
Go to the "Stages" section of your API
Select the target stage
Go to "Stage Settings"
Configure your throttling settings:
Rate: steady-state requests per second
Burst: the maximum number of extra requests allowed in a short spike
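The rate and burst settings behave like a token bucket: the bucket holds up to "burst" tokens and refills at "rate" tokens per second, and each request consumes one token. A small simulation of that behavior (illustrative only, not AWS's internal implementation):

```python
# Minimal token-bucket simulation of rate/burst throttling semantics.

class TokenBucket:
    def __init__(self, rate, burst):
        self.rate = rate            # tokens refilled per second (Rate)
        self.capacity = burst       # bucket size (Burst)
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now):
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                # the caller would receive HTTP 429


bucket = TokenBucket(rate=10, burst=15)
# A spike of 20 requests at t=0: the first 15 pass (burst), 5 are throttled.
results = [bucket.allow(now=0.0) for _ in range(20)]
print(results.count(True))  # 15
```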
Client-Specific Throttling with API Keys
For more granular control based on API consumers:
Create usage plans in API Gateway
Define throttling limits within each usage plan
Generate API keys for your clients
Associate API keys with usage plans
Require an API key on the methods you want to protect
This approach lets you offer different tiers of access to your API—premium clients might get higher limits than free-tier users.
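Those wiring steps can be sketched with boto3. The helper below builds the throttle and quota arguments; the names and IDs in the comment are placeholders, not real resources:

```python
def tier_settings(rate, burst, daily_quota):
    """Throttle and quota arguments for apigateway create_usage_plan."""
    return {
        "throttle": {"rateLimit": rate, "burstLimit": burst},
        "quota": {"limit": daily_quota, "period": "DAY"},
    }


# With boto3 (placeholder names and IDs), the steps above look like:
#   client = boto3.client("apigateway")
#   plan = client.create_usage_plan(
#       name="free-tier",
#       apiStages=[{"apiId": "abc123", "stage": "prod"}],
#       **tier_settings(rate=5, burst=10, daily_quota=1000))
#   key = client.create_api_key(name="client-a", enabled=True)
#   client.create_usage_plan_key(usagePlanId=plan["id"],
#                                keyId=key["id"], keyType="API_KEY")
print(tier_settings(5, 10, 1000)["quota"])
# {'limit': 1000, 'period': 'DAY'}
```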
What Are API Keys?
API keys are alphanumeric string values that you distribute to your API consumers. They serve multiple purposes:
Client identification: Recognize which client is making requests
Access control: Restrict API access to authenticated clients
Usage tracking: Monitor consumption patterns by client
Throttling enforcement: Apply client-specific rate limits
API keys are passed in the x-api-key header with each request. While not a replacement for proper authentication mechanisms like OAuth or IAM, they provide a simple way to identify API consumers.
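On the client side, attaching the key is a one-liner. The endpoint URL and key below are placeholders for illustration:

```python
from urllib.request import Request

# Placeholder invoke URL and key, for illustration only.
req = Request("https://abc123.execute-api.us-east-1.amazonaws.com/prod/products")
req.add_header("x-api-key", "demo-key-123")  # API Gateway matches this header

# urllib stores header names in capitalized form.
print(req.get_header("X-api-key"))
# demo-key-123
```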
What Are Usage Plans?
Usage plans in API Gateway are a powerful feature that enables you to:
Define throttling limits for groups of API consumers
Set quotas for the maximum number of requests during a specified time period
Associate API stages and methods with the plan
Attach API keys to the plan
Think of usage plans as service tiers for your API—you might have a "Basic" plan with lower limits and a "Premium" plan with higher thresholds.
Practical Use Case: Multi-Tiered API for a Weather Service
Let's explore a practical example of when and how to implement API Gateway throttling.
Scenario
You've built a weather forecasting API that provides:
Current weather conditions
7-day forecasts
Historical weather data
Severe weather alerts
Your backend is powered by Lambda functions that query weather databases and third-party weather services. Your API is growing in popularity, and you need to implement tiered access while protecting your backend.
Implementation
1. Define Your Service Tiers
Create three usage plans:
Free Tier:
Rate: 5 requests per second
Burst: 10 requests
Quota: 1,000 requests per day
Access to basic endpoints only
Standard Tier:
Rate: 15 requests per second
Burst: 20 requests
Quota: 10,000 requests per day
Access to all endpoints except real-time alerts
Premium Tier:
Rate: 50 requests per second
Burst: 100 requests
Quota: Unlimited requests
Access to all endpoints with priority handling
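The three tiers above can be captured as data, with a toy quota check alongside (illustrative only; in practice API Gateway enforces quotas for you):

```python
# The tier limits from the scenario, as plain data.
TIERS = {
    "free":     {"rate": 5,  "burst": 10,  "daily_quota": 1_000},
    "standard": {"rate": 15, "burst": 20,  "daily_quota": 10_000},
    "premium":  {"rate": 50, "burst": 100, "daily_quota": None},  # unlimited
}


def within_quota(tier, requests_today):
    """True if another request fits in the tier's daily quota."""
    quota = TIERS[tier]["daily_quota"]
    return quota is None or requests_today < quota


print(within_quota("free", 999))       # True
print(within_quota("free", 1_000))     # False: quota exhausted
print(within_quota("premium", 10**9))  # True: no quota configured
```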
2. Protect Critical Endpoints
Apply method-level throttling to your most resource-intensive endpoint, the historical weather data:
Limit to 10 RPS regardless of tier to protect your database
Set burst capacity to 15 requests
3. Safeguard Third-Party Dependencies
For endpoints that call third-party weather services:
Implement stage-level throttling that aligns with your third-party API limits
This prevents cascading failures if the third-party experiences issues
4. Monitor and Adjust
Set up CloudWatch dashboards to monitor:
Overall API usage trends
Throttling events by endpoint and by client
Error rates and latency
Use this data to periodically adjust your throttling settings to match actual usage patterns.
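One way to get notified is a CloudWatch alarm on the stage's 4XXError metric, where 429 throttling responses are counted. A hedged sketch; the API and stage names are placeholders:

```python
def throttling_alarm_params(api_name, stage, threshold):
    """Arguments for cloudwatch put_metric_alarm on API Gateway 4XXError."""
    return {
        "AlarmName": "{}-{}-throttling".format(api_name, stage),
        "Namespace": "AWS/ApiGateway",
        "MetricName": "4XXError",   # includes 429 throttling responses
        "Dimensions": [
            {"Name": "ApiName", "Value": api_name},
            {"Name": "Stage", "Value": stage},
        ],
        "Statistic": "Sum",
        "Period": 300,              # 5-minute windows
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
    }


# Applied with boto3 (placeholder names):
#   boto3.client("cloudwatch").put_metric_alarm(
#       **throttling_alarm_params("weather-api", "prod", threshold=100))
print(throttling_alarm_params("weather-api", "prod", 100)["AlarmName"])
# weather-api-prod-throttling
```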
5. Special Handling for Weather Emergencies
Implement higher throttling limits for the severe weather alerts endpoint during emergencies:
Use automation (for example, a Lambda function that calls the API Gateway service API) to raise the throttling limits dynamically
This ensures critical information flows during high-demand periods
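A hypothetical sketch of such automation: a Lambda handler that raises the alerts endpoint's limits via `update_stage`. The endpoint path, API ID, and emergency limits are all placeholder assumptions:

```python
EMERGENCY_LIMITS = {"rate": 200, "burst": 400}    # placeholder emergency limits


def emergency_patch_ops(limits):
    """update_stage patch operations raising the alerts endpoint's limits."""
    prefix = "/~1alerts/GET/throttling"           # hypothetical GET /alerts
    return [
        {"op": "replace", "path": prefix + "/rateLimit", "value": str(limits["rate"])},
        {"op": "replace", "path": prefix + "/burstLimit", "value": str(limits["burst"])},
    ]


def handler(event, context):
    """Lambda entry point: apply the emergency limits to the prod stage."""
    import boto3  # available in the Lambda runtime

    boto3.client("apigateway").update_stage(
        restApiId="abc123",                       # placeholder API ID
        stageName="prod",
        patchOperations=emergency_patch_ops(EMERGENCY_LIMITS),
    )
```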
Benefits Realized
By implementing this tiered throttling strategy:
Backend Protection: During severe weather events when traffic spikes, your backend services remain responsive
Business Model Support: Throttling enforces your tiered pricing strategy
Cost Control: Prevents free-tier users from consuming excessive resources
Fair Usage: Ensures premium customers always get the throughput they've paid for
Best Practices for API Throttling
Start Conservative
Begin with stricter throttling limits and gradually loosen them based on real-world usage patterns. It's easier to increase limits than to recover from a service outage.
Monitor Throttling Metrics
API Gateway automatically publishes metrics to CloudWatch:
Count: total number of requests
4XXError: client errors (including 429 throttling responses)
Set up CloudWatch alarms on these metrics to be notified when your throttling is actively protecting your services.
Implement Client-Side Retry Logic
Advise your API consumers to implement exponential backoff retry strategies when they receive 429 "Too Many Requests" responses. This helps clients gracefully handle throttling events.
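A minimal sketch of that retry strategy, using full jitter; `fake_call` stands in for any client call that raises on a 429 (in real code you would inspect the HTTP status):

```python
import random
import time


class Throttled(Exception):
    """Raised when the server answers 429 Too Many Requests."""


def with_backoff(call, max_attempts=5, base=0.5, sleep=time.sleep):
    """Retry `call` on Throttled, waiting a random time up to base * 2^attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Throttled:
            if attempt == max_attempts - 1:
                raise
            sleep(random.uniform(0, base * 2 ** attempt))


# Simulated client: the first two calls are throttled, the third succeeds.
responses = iter([Throttled(), Throttled(), "forecast data"])


def fake_call():
    r = next(responses)
    if isinstance(r, Exception):
        raise r
    return r


print(with_backoff(fake_call, sleep=lambda s: None))  # forecast data
```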
Consider Regional API Deployments
For high-traffic applications, deploying your API to multiple regions with Route 53 routing can distribute load and effectively increase your overall throughput.
When Throttling Occurs
When a request exceeds your defined throttling limits, API Gateway returns an HTTP 429 "Too Many Requests" error response. This happens without the request ever reaching your backend services, providing protection at the edge.
Conclusion
API Gateway's throttling capabilities offer an essential layer of protection for your backend services. By implementing thoughtful throttling strategies, you can ensure your applications remain responsive even during traffic spikes, protect yourself from malicious actors, and keep your AWS costs predictable.
Remember that throttling is just one aspect of a comprehensive API management strategy. Combine it with other API Gateway features like caching, request validation, and WAF integration for a robust API solution.
As your application evolves, regularly review your throttling configuration to ensure it aligns with your current business needs and usage patterns.