Throttling and quota management

Learn how OpenAPI throttling works and how to manage your API rate limit quotas.

What is throttling

Throttling is the mechanism Alibaba Cloud uses to control how frequently OpenAPI operations are called. A quota defines the maximum number of calls allowed for a specific OpenAPI operation of an Alibaba Cloud product within a given time window.
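
The idea of "a maximum number of calls within a time window" can be illustrated with a small counter. The following is only a sketch of a fixed-window check; this page does not say which algorithm Alibaba Cloud actually uses, and the class name `FixedWindowQuota` and its parameters are hypothetical names chosen for this example.

```python
import time


class FixedWindowQuota:
    """Illustrative only: allow at most `limit` calls per `window_seconds`.

    This is an assumed fixed-window model, not Alibaba Cloud's implementation.
    """

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self.window_start = clock()
        self.count = 0

    def allow(self):
        """Return True if the call fits in the current window, else False."""
        now = self.clock()
        # Start a fresh window once the current one has elapsed.
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```

With `FixedWindowQuota(5, 1)`, the first five calls inside one second are allowed, the sixth is rejected, and calls are accepted again once the next one-second window begins.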

Why throttling is needed

Ensure the stability of cloud services

Each cloud product supports a different maximum level of API concurrency. For products with lower concurrency limits, a high call volume from a single user can degrade response times or block other users from accessing the service.

Ensure OpenAPI service stability

Concentrated high-frequency calls to a single endpoint can overload the OpenAPI gateway. This affects all users sharing that endpoint—slowing responses across other cloud products and, in severe cases, making the endpoint completely unavailable.

Protect user assets

Incorrect or malicious OpenAPI calls can rapidly create large numbers of unwanted cloud resources. Throttling detects and blocks these requests, and triggers anomaly alerts to protect user assets.

Check throttling information

Find the throttling information for an API operation on its documentation page. To view quota details for all OpenAPI operations of a cloud product, go to Quota Center > API Rate Limits.

Note

Examples:

Request Rate: 200/60 s — the API can be called up to 200 times per minute.

Request Rate: 5/1 s — the API can be called up to 5 times per second.
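
To pace calls on the client side, a Request Rate value can be converted into a call count, a window length, and an average spacing between calls. The sketch below assumes the value always has the form `<calls>/<seconds> s`, as in the two examples above; the function names `parse_request_rate` and `min_interval_seconds` are hypothetical helpers, not part of any Alibaba Cloud SDK.

```python
import re


def parse_request_rate(text):
    """Parse a string such as '200/60 s' into (max_calls, window_seconds).

    Assumes the '<calls>/<seconds> s' form shown in the examples above.
    """
    match = re.fullmatch(r"\s*(\d+)\s*/\s*(\d+)\s*s\s*", text)
    if match is None:
        raise ValueError(f"unrecognized request rate: {text!r}")
    return int(match.group(1)), int(match.group(2))


def min_interval_seconds(text):
    """Average spacing between calls for a caller that uses the full quota."""
    max_calls, window_seconds = parse_request_rate(text)
    return window_seconds / max_calls
```

For `200/60 s` the spacing works out to 0.3 seconds per call, and for `5/1 s` to 0.2 seconds per call. A caller that spaces requests slightly wider than this leaves some margin below the quota.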

Request a quota increase

Alibaba Cloud assigns a default throttling quota to the OpenAPI operations of each cloud product. To request a higher quota, go to the Quota Center > API Rate Limits page.