Learn how OpenAPI throttling works and how to manage your API rate limit quotas.
What is throttling
Throttling is the mechanism Alibaba Cloud uses to control how frequently OpenAPI operations are called. A quota defines the maximum number of calls allowed for a specific OpenAPI operation of an Alibaba Cloud product within a given time window.
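To make the idea of "a maximum number of calls within a time window" concrete, here is a minimal sketch of a sliding-window counter. It is purely illustrative: the class name WindowQuota and its method are hypothetical, and nothing here is meant to describe how Alibaba Cloud implements throttling on the server side.

```python
from collections import deque


class WindowQuota:
    """Allow at most `limit` calls in any span of `window_seconds` (illustrative only)."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self._calls = deque()  # timestamps of accepted calls, oldest first

    def allow(self, now):
        """Return True and record the call if it fits in the quota, else False."""
        # Discard timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.window_seconds:
            self._calls.popleft()
        if len(self._calls) < self.limit:
            self._calls.append(now)
            return True
        return False
```

A client could run a check like this before each call, for example with time.monotonic() as the clock, to keep its own request rate under a known quota.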
Why throttling is needed
Ensure the stability of cloud services
Each cloud product supports a different maximum level of API concurrency. For products with lower concurrency limits, a high call volume from a single user can degrade response times or block other users from accessing the service.
Ensure OpenAPI service stability
Concentrated high-frequency calls to a single endpoint can overload the OpenAPI gateway. This affects all users sharing that endpoint—slowing responses across other cloud products and, in severe cases, making the endpoint completely unavailable.
Protect user assets
Incorrect or malicious OpenAPI calls can rapidly create large numbers of unwanted cloud resources. Throttling detects and blocks these requests, and triggers anomaly alerts to protect user assets.
Check throttling information
The throttling information for an API operation is listed on its documentation page. To view quota details for all OpenAPI operations of a cloud product, go to Quota Center > API Rate Limits.
Examples:
Request Rate: 200/60 s — the API can be called up to 200 times per minute.
Request Rate: 5/1 s — the API can be called up to 5 times per second.
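The arithmetic behind these examples can be sketched as follows. The sketch assumes the rate is always written as "<calls>/<seconds> s", as in the two examples above; the function names parse_request_rate and min_average_interval are hypothetical helpers, not part of any Alibaba Cloud SDK.

```python
import re


def parse_request_rate(text):
    """Parse a string such as '200/60 s' into (max_calls, window_seconds)."""
    # Assumed format: integer call count, a slash, integer seconds, then "s".
    match = re.fullmatch(r"\s*(\d+)\s*/\s*(\d+)\s*s\s*", text)
    if match is None:
        raise ValueError(f"unrecognized rate format: {text!r}")
    return int(match.group(1)), int(match.group(2))


def min_average_interval(text):
    """Seconds between calls if they are spread evenly across the window."""
    max_calls, window_seconds = parse_request_rate(text)
    return window_seconds / max_calls
```

For a rate of 200/60 s, spreading calls evenly means about one call every 0.3 seconds; for 5/1 s, about one call every 0.2 seconds. Pacing calls this way is one conservative way to stay within the quota, though the quota itself only limits the total number of calls per window.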
Request a quota increase
Alibaba Cloud assigns a default throttling quota to the OpenAPI operations of each cloud product. To request a higher quota, go to Quota Center > API Rate Limits.