Deployments, scaling, and restarts are routine operations for any online application. The graceful start feature in Microservices Engine (MSE) protects applications during each stage of startup. It includes three main capabilities: delayed service registration, low-traffic warm-up, and a service readiness probe. This topic describes the graceful start feature provided by MSE.
Feature overview
Delayed registration
A microservice provider instance registers itself with a registry during application startup. Once registration is complete, consumer applications can subscribe to and call the provider. For Java applications built on the Spring framework, this registration typically happens after the Spring context is refreshed. If the application has asynchronous initialization tasks that are not yet complete, registering the service immediately can cause request errors when consumers try to connect. For example, a MaxCompute application might need to pull several hundred megabytes of data from Object Storage Service (OSS) before it can serve requests. If the application registers itself immediately after startup, incoming traffic will fail because resources are not ready. The delayed service registration feature allows you to set a delay period, which postpones service registration. This ensures the application is fully initialized before registering with the registry and serving traffic, which prevents consumers from calling an unprepared provider.
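The idea of postponing registration can be illustrated with a minimal Python sketch. This is not how the MSE agent is implemented. The `schedule_registration` function and the `register` callback are hypothetical names used only to show that the provider becomes visible to consumers after the configured delay rather than immediately at startup.

```python
import threading


def schedule_registration(register, delay_seconds):
    """Call the (hypothetical) `register` callback once, after `delay_seconds`.

    In a real application the callback would publish the instance to the
    registry; here it is any zero-argument callable.
    """
    timer = threading.Timer(delay_seconds, register)
    timer.daemon = True  # do not keep the process alive just for the timer
    timer.start()
    return timer


if __name__ == "__main__":
    registered = threading.Event()

    # Startup has finished, but registration is held back for 0.2 seconds,
    # leaving time for asynchronous initialization to complete.
    schedule_registration(registered.set, delay_seconds=0.2)
    print("registered immediately?", registered.is_set())

    registered.wait(timeout=2)
    print("registered after delay?", registered.is_set())
```

The delay should cover the slowest asynchronous initialization task, such as the data download in the example above.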
Low-traffic warm-up
A newly started instance is often in a "cold state." In this state, an instance must still perform tasks such as lazily loading connection pools, pre-warming caches, and just-in-time (JIT) compilation of hot-spot code. Consequently, its request-handling capacity is much lower than that of a long-running instance. In a worst-case scenario, the service could hang, leading to a large number of request timeouts and errors.
The following example shows the difference in response time for two requests to an instance: one made before resources are fully loaded, and one made after. If a large number of requests arrive while the instance is still loading resources, they may all be blocked.
[arthas@37035]$ trace com.alibaba.mse.consumer.TestController eurekaRest -n 5 \
--skipJDKMethod false
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 105 ms, listenerId: 1
`---ts=2022-02-14 21:28:02;thread_name=http-nio-18099-exec-1;id=39;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@60e5272
`---[464.275852ms] com.alibaba.mse.consumer.TestController:eurekaRest()
`---[464.018509ms] org.springframework.web.client.RestTemplate:getForObject() #50
`---ts=2022-02-14 21:28:08;thread_name=http-nio-18099-exec-3;id=3b;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@60e5272
`---[8.46028ms] com.alibaba.mse.consumer.TestController:eurekaRest()
`---[8.402525ms] org.springframework.web.client.RestTemplate:getForObject() #50
The low-traffic warm-up feature works by controlling the traffic from consumer applications to a new service instance at launch. This prevents performance degradation from a cold start and protects the new instance from being overwhelmed by a sudden surge in traffic. Traffic to the instance gradually increases over time. When the configured warm-up duration is reached, the warm-up process ends, and the instance begins to receive traffic normally.
Low-traffic warm-up uses traffic from online consumers. This requires the service's consumers to also be connected to MSE Microservices Governance. For more information about how this feature works, see How low-traffic warm-up works.
Service readiness probe
Kubernetes provides a readiness probe mechanism. During a service deployment, once a new instance passes its readiness probe, the old instance is terminated (the exact behavior depends on the deployment strategy). However, Kubernetes cannot determine when a microservice is truly ready. It considers an application ready as soon as a port is open. As a result, Kubernetes can mark a new instance as ready before it has successfully registered with the registry, proceed with the deployment, and terminate the old, running instance. Consumers then see errors such as "no provider" or "no instance" for the service.
The service readiness probe feature provides a non-intrusive HTTP endpoint via an agent that checks if the application has completed registration. The endpoint returns a 500 status code if registration is incomplete and a 200 status code after registration is successful. By configuring the application's readiness probe to use this endpoint, you can help Kubernetes accurately determine if the application is ready. This ensures that consumers always have an available provider during deployments in a Kubernetes environment, preventing "no provider" errors.
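The behavior of such an endpoint can be sketched in Python as follows. This is an illustration under stated assumptions, not the agent's code: the `registered` flag, `ReadinessHandler`, `start_server`, and `probe` are hypothetical names, and the example binds to a free local port instead of the agent's port 55199 so that it can run anywhere. Only the `/readiness` path and the 500/200 status codes come from the description above.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Set once service registration has completed (simulated in this sketch).
registered = threading.Event()


class ReadinessHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/readiness":
            self.send_error(404)
            return
        # 200 only after registration; 500 while registration is pending.
        self.send_response(200 if registered.is_set() else 500)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the example output quiet


def start_server(port=0):
    """Serve the endpoint on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), ReadinessHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def probe(port):
    """Return the HTTP status that a readiness probe would observe."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=2)
    try:
        conn.request("GET", "/readiness")
        return conn.getresponse().status
    finally:
        conn.close()


if __name__ == "__main__":
    server = start_server()
    port = server.server_address[1]
    print(probe(port))   # 500: registration still pending
    registered.set()     # simulate successful registration
    print(probe(port))   # 200: instance may now be marked ready
    server.shutdown()
    server.server_close()
```

Because a Kubernetes HTTP probe treats a 500 response as a failure, the Pod stays out of the ready state until registration succeeds.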
Using graceful start
Prerequisites
Usage notes
- Graceful start is currently supported only for instances that use a microservice registry (such as Nacos) for service discovery. It is not supported for microservice instances that rely on Kubernetes Services for discovery.
- For Spring Cloud applications, low-traffic warm-up is only supported for applications that use Nacos, ZooKeeper, or Eureka as the registry.
- The low-traffic warm-up feature for Spring Cloud is implemented based on the default Spring Cloud load balancers: ZoneAwareLoadBalancer, RoundRobinLoadBalancer, or RandomLoadBalancer. If you modify your application's load balancer configuration, this feature will not work.
- For low-traffic warm-up to work, both the provider and consumer applications must be connected to MSE Microservices Governance. This feature does not apply to applications like gateways that receive external traffic directly through exposed APIs.
Procedure
Step 1: Enable graceful start
Log on to the MSE console, and select a region in the top navigation bar.
In the left-side navigation pane, choose . On the page that appears, click the resource card of the application that you want to manage.
On the application details page, click Traffic management in the left-side navigation pane, and click the Graceful Start/Shutdown tab.
In the Configuration Information section, click Edit, enable the Graceful Start toggle, and then click OK.
Step 2: Configure Kubernetes readiness probe
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the target cluster. In the left-side navigation pane, choose Workload > Stateless. Find your deployed application and click Edit in the Actions column. In the Health Check section, click Enable next to Readiness and configure the following parameters. When you are finished, click Update.
- Path: /readiness. (If your application uses an agent version earlier than 4.1.10, set the path to /health. To check the agent version, go to the MSE console. Navigate to Microservices Governance > Application Governance, click your application, and then select Node details. The agent version is displayed on the right.)
- Port: 55199.
- Initial Delay (s): We recommend setting this value greater than the sum of the application startup time and the configured delayed registration duration (default is 0s). However, the feature still functions correctly even if you do not follow this recommendation.
For information about other parameters, see Create a stateless workload (Deployment). After the application restarts, the readiness probe will pass only after service registration is complete.
This operation will immediately restart your application. If you are in a production environment, perform this operation during a scheduled maintenance window.
(Optional) Configure delayed registration duration
This setting is optional and can be configured based on your business needs. For more details, see delayed service registration. Follow these steps:
Follow Step 1 and Step 2 to navigate to the graceful start page, enable the feature, and configure the Kubernetes service readiness probe.
Modify the Configuration Information for Graceful Start and Shutdown. Click the arrow to the left of the Graceful Start module to expand its settings. In the Delayed Registration Duration (s) field, enter a value, and then click OK.
The configured delayed registration duration takes effect the next time the application starts.
(Optional) Adjust low-traffic warm-up duration
Enabling graceful start automatically enables this feature with a default warm-up duration of 120 seconds. You can adjust this duration based on your business needs:
Follow Step 1 and Step 2 to navigate to the graceful start page, enable the feature, and configure the Kubernetes service readiness probe.
Modify the Configuration Information for Graceful Start and Shutdown. Click the arrow to the left of the Graceful Start module to expand its settings. Click Advanced Options. In the Low-traffic Warm-up Duration (s) field, enter a value, and then click OK.
- If the consumer of the service being warmed up is an MSE cloud native gateway, the low-traffic warm-up configured here will not take effect. Instead, configure the warm-up settings in the MSE cloud native gateway. In the gateway console, click the target gateway instance. In the left-side navigation pane, choose Routes > Services. Find the service and, in the Actions column, click More > Policies. On the Policies tab, under Traffic Management > Load Balancing Configuration, click Edit and adjust the Warm-up Time setting. Note that the gateway's default warm-up QPS curve is linear, which is slightly different from the quadratic curve provided by MSE Microservices Governance, but the practical effect is similar.
- The adjusted low-traffic warm-up duration will take effect the next time the application starts.
- The low-traffic warm-up feature works on the consumer side by calculating weights for each provider instance based on their startup times. It then uses a load balancing algorithm to gradually increase traffic to a newly started application. This process helps warm up the service. This also requires the service's consumer to be connected to MSE Microservices Governance.
- When using the low-traffic warm-up feature for the first time, we recommend using the default warm-up duration. If you observe that the warm-up is ineffective or causes traffic loss, you can then optimize by adjusting this parameter.
- To ensure a sufficient warm-up, see Best practices for low-traffic warm-up.
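The consumer-side weighting described in the notes above can be sketched in Python. This topic does not give the exact formula that MSE uses, so the sketch assumes a simple quadratic ramp over the warm-up duration, a full weight of 100, and a minimum weight of 1. The function names `warmup_weight` and `pick_instance` are hypothetical.

```python
import random


def warmup_weight(uptime_s, warmup_s=120, full_weight=100):
    """Weight a consumer might assign to a provider, based on its uptime.

    Assumption: a quadratic ramp from startup to `warmup_s` seconds. The
    real MSE curve may differ in its exact shape and constants.
    """
    if warmup_s <= 0 or uptime_s >= warmup_s:
        return full_weight
    ratio = max(uptime_s, 0) / warmup_s
    # Keep a weight of at least 1 so a new instance still receives a
    # trickle of traffic to warm up with.
    return max(1, int(full_weight * ratio ** 2))


def pick_instance(uptimes, warmup_s=120, rng=random):
    """Weighted random choice over a mapping of instance name to uptime (s)."""
    names = list(uptimes)
    weights = [warmup_weight(uptimes[name], warmup_s) for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Weight of a new instance at several points of the default 120 s window.
    for uptime in (0, 30, 60, 90, 120):
        print(uptime, warmup_weight(uptime))

    # A long-running instance and one started 30 seconds ago.
    print(pick_instance({"old": 3600, "new": 30}))
```

Under these assumptions, an instance that is halfway through a 120-second warm-up has a weight of 25, so it receives roughly a quarter of the traffic that a fully warmed instance receives. A linear ramp, like the gateway default mentioned above, would give it half.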
Observing graceful start
After applying these configurations, the next time your application starts, you can view the specific start and shutdown times for your instance and its QPS curve on the Graceful Start and Shutdown page.
Log on to the MSE console, and select a region in the top navigation bar.
In the left-side navigation pane, choose . On the page that appears, click the resource card of the application that you want to manage.
On the application details page, click Traffic management in the left-side navigation pane, and click the Graceful Start/Shutdown tab.
On the Start and Shutdown Overview subtab, click an instance on the left. On the right, you can view the QPS changes and related events that occurred during the startup phase.

You will see events such as service registration, warm-up started, and warm-up ended occur in sequence. The 'Kubernetes readiness probe passed' event also occurs after the 'service registration' event. The QPS curve gradually increases to its maximum value during the warm-up duration (default 120s) instead of spiking suddenly. If the event sequence or the QPS curve shape does not meet your expectations during startup, see the FAQ for troubleshooting.
In the example shown in the figure, the application's Kubernetes readiness probe is configured to use port 55199 and the path /readiness, and its minimum ready time (minReadySeconds) is set to 120 seconds, matching the default warm-up duration.