In a stable production environment, container scheduling is handled entirely by Kubernetes (k8s), and microservice governance is handled by the service framework or by operations staff. When a new version is released or the service is scaled up or down, old container instances are terminated and replaced with new ones. In a high-traffic production environment, a poorly handled replacement can produce a large number of failed requests within a short time, trigger alarms, and even disrupt normal business operations. For larger organizations, the losses caused by a problematic release can be enormous. This is why graceful shutdown is needed: beyond stable service invocation and traditional service governance, the service framework must also keep requests stable while an instance goes offline, which reduces operational costs and improves application stability.
In a complete RPC call chain, an intermediate service often acts as both a service provider and a service consumer. After receiving a request from an upstream service, it processes the request, calls downstream service interfaces as needed, and returns the result to the upstream service. Graceful shutdown therefore has to guarantee stability on both the provider side and the consumer side, and can be broken down into the following steps:
Following these steps ensures that the dubbo-go service instance stops safely and smoothly, without affecting in-flight business requests.
Note: The instance must not unsubscribe from the registry in step 1, because downstream service information may still change while the intermediate service is sending requests to downstream services.
Stop the dubbo-go instance using the kill pid command.
The following are configurable settings that users can customize in the yaml configuration file:
consumer-update-wait-time field: defaults to 3s.
step-timeout field: defaults to 3s.
offline-request-window-timeout field: defaults to 0s.
internal-signal field: enabled by default.
timeout field: defaults to 60s.

dubbo:
  shutdown:
    timeout: 60s
    step-timeout: 3s
    consumer-update-wait-time: 3s
    internal-signal: true
    offline-request-window-timeout: 0s
Additionally, if users wish to execute custom callback operations after the offline logic is completely finished, they can use the following code:
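import "dubbo.apache.org/dubbo-go/v3/common/extension" // import path for dubbo-go v3

extension.AddCustomShutdownCallback(func() {
	// User-defined operations, executed after the offline logic has finished.
})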