Deploying an update of any application can be risky, because new code may contain new bugs. Unit testing is an advisable way to reduce the risk. However, some problems appear only under a real workload, and some workloads are harder to simulate than others. Service Fabric therefore monitors the health of the application after the new version is deployed to the cluster. If the new version is not healthy, the upgrade is rolled back to the old version automatically. Setting up this protection against failures caused by upgrades is relatively easy.
Create a new Service Fabric Stateful service, open the Stateful1.cs file and replace its content with the following code:
using Microsoft.ServiceFabric.Services.Communication.Runtime;
using Microsoft.ServiceFabric.Services.Runtime;
using System;
using System.Collections.Generic;
using System.Fabric;
using System.Fabric.Health;
using System.Threading;
using System.Threading.Tasks;

namespace Stateful1 {
    internal sealed class Stateful1 : StatefulService {
        public Stateful1(StatefulServiceContext context) : base(context) { }

        // The service exposes no endpoints; it only reports its own health.
        protected override IEnumerable<ServiceReplicaListener> CreateServiceReplicaListeners() {
            return new ServiceReplicaListener[0];
        }

        protected override async Task RunAsync(CancellationToken cancellationToken) {
            // The service manifest version decides which behavior is simulated.
            var version = Context.CodePackageActivationContext.GetServiceManifestVersion();
            ServiceEventSource.Current.ServiceMessage(Context, $"version: {version}", Context.ServiceName);
            while (!cancellationToken.IsCancellationRequested) {
                if (version == "1.0.0") {
                    // Original version: the report lives for 1 minute and is refreshed every 10 seconds.
                    var healthInformation = new HealthInformation(nameof(Stateful1), "Watchdog", HealthState.Ok) {
                        TimeToLive = TimeSpan.FromMinutes(1)
                    };
                    FabricRuntime.GetActivationContext().ReportDeployedServicePackageHealth(healthInformation);
                    await Task.Delay(TimeSpan.FromSeconds(10), cancellationToken);
                } else {
                    // Newer version: the report lives for 10 seconds but is refreshed only every 30 seconds.
                    var healthInformation = new HealthInformation(nameof(Stateful1), "Watchdog", HealthState.Ok) {
                        TimeToLive = TimeSpan.FromSeconds(10)
                    };
                    FabricRuntime.GetActivationContext().ReportDeployedServicePackageHealth(healthInformation);
                    await Task.Delay(TimeSpan.FromSeconds(30), cancellationToken);
                }
            }
        }
    }
}
As you can see, there is a HealthInformation class. It reports the health state of a single property. The health of the entire service is aggregated from multiple properties. A health report can stay valid until it is overwritten, or it can be required to be refreshed periodically. In this case, the TimeToLive interval must be set. When the interval expires and no new health information has been reported, the health state automatically changes to Error.
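The expiration behavior described above can be made explicit when the report is built. The following is only a minimal sketch: the WatchdogReports class and its CreateHeartbeat method are hypothetical helpers that are not part of the original service, while TimeToLive, RemoveWhenExpired and Description are properties of HealthInformation.

```csharp
using System;
using System.Fabric.Health;

// Hypothetical helper, introduced only for illustration.
internal static class WatchdogReports
{
    // Builds an Ok report that has to be refreshed within 'timeToLive'.
    public static HealthInformation CreateHeartbeat(string sourceId, TimeSpan timeToLive)
    {
        return new HealthInformation(sourceId, "Watchdog", HealthState.Ok)
        {
            TimeToLive = timeToLive,
            // false keeps the expired report in the health store,
            // where it is evaluated as Error until a fresh report arrives.
            RemoveWhenExpired = false,
            Description = "Heartbeat of the watchdog loop."
        };
    }
}
```

If RemoveWhenExpired were set to true instead, the expired report would be removed from the health store rather than evaluated as an error, so it would no longer work as a watchdog.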
The code above simulates two versions of the same service. In the first version, the unit of work (10 seconds) finishes before the health information expires (1 minute). Then the loop starts again, the health information is refreshed and the whole cycle repeats.
In the newer version, the unit of work takes longer (30 seconds), so the health information (valid for 10 seconds) expires before the work is done. This simulates an unexpected decrease in service performance. The service will be unhealthy most of the time, so Service Fabric can detect it and roll the upgrade back.
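The difference between the two versions can be estimated with simple arithmetic. The sketch below is a rough model only: HeartbeatModel and ExpiredFraction are hypothetical names, and the model ignores any delay between sending a report and its evaluation by the health store.

```csharp
using System;

// Hypothetical helper, introduced only for illustration.
internal static class HeartbeatModel
{
    // Returns the fraction of each reporting cycle during which
    // the last report has already expired.
    public static double ExpiredFraction(TimeSpan timeToLive, TimeSpan reportInterval)
    {
        if (reportInterval <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(reportInterval));
        // The report outlives the cycle, so it never expires.
        if (timeToLive >= reportInterval)
            return 0.0;
        return (reportInterval - timeToLive).TotalSeconds / reportInterval.TotalSeconds;
    }
}
```

With the values from the service code, the first version (TimeToLive of 1 minute, report every 10 seconds) gives 0, and the newer version (TimeToLive of 10 seconds, report every 30 seconds) gives 20/30, so the report is expired for about two thirds of the time.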
Publish the application as usual and then update its version:
- In the ApplicationManifest.xml file, update the ApplicationTypeVersion attribute of the ApplicationManifest element.
- In the ServiceManifest.xml file, update the Version attribute of the ServiceManifest element.
- In the ServiceManifest.xml file, update the Version attribute of the CodePackage element.
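The version changes listed above might look roughly like the following fragments. The type and package names (Application1Type, Stateful1Pkg) and the version 2.0.0 are assumptions used only for illustration, and namespace attributes and unrelated elements are omitted. The ServiceManifestRef element in the application manifest also carries the service manifest version, so it has to match the new value as well.

```xml
<!-- ApplicationManifest.xml (fragment, names and versions are assumed) -->
<ApplicationManifest ApplicationTypeName="Application1Type" ApplicationTypeVersion="2.0.0">
  <ServiceManifestImport>
    <!-- Must match the Version attribute of the ServiceManifest element. -->
    <ServiceManifestRef ServiceManifestName="Stateful1Pkg" ServiceManifestVersion="2.0.0" />
  </ServiceManifestImport>
</ApplicationManifest>
```

```xml
<!-- ServiceManifest.xml (fragment, names and versions are assumed) -->
<ServiceManifest Name="Stateful1Pkg" Version="2.0.0">
  <CodePackage Name="Code" Version="2.0.0" />
</ServiceManifest>
```

Any service manifest version other than 1.0.0 makes the sample service take the slower branch of its loop.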
Publish the application again, but this time modify the publish settings. Check the Upgrade the Application option.
Click the Configure Upgrade Settings link and select the Monitored upgrade mode. Verify that the FailureAction property is set to Rollback.
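The same settings can also be expressed in code through FabricClient. This is a sketch under several assumptions, not the way the publish dialog works internally: the MonitoredUpgrade class is hypothetical, the timeout values are arbitrary examples, and the new application type version is assumed to be already copied to and provisioned in the cluster.

```csharp
using System;
using System.Fabric;
using System.Fabric.Description;
using System.Threading.Tasks;

// Hypothetical helper, introduced only for illustration.
internal static class MonitoredUpgrade
{
    // Describes a monitored rolling upgrade that rolls back on failure.
    public static ApplicationUpgradeDescription Describe(Uri applicationName, string targetVersion)
    {
        var policy = new MonitoredRollingApplicationUpgradePolicyDescription
        {
            UpgradeMode = RollingUpgradeMode.Monitored,
            MonitoringPolicy = new RollingUpgradeMonitoringPolicy
            {
                FailureAction = UpgradeFailureAction.Rollback,
                // Example values only; tune them for the real workload.
                HealthCheckWaitDuration = TimeSpan.FromSeconds(30),
                HealthCheckRetryTimeout = TimeSpan.FromMinutes(2)
            }
        };
        return new ApplicationUpgradeDescription
        {
            ApplicationName = applicationName,
            TargetApplicationTypeVersion = targetVersion,
            UpgradePolicyDescription = policy
        };
    }

    // Starts the upgrade on the cluster the default FabricClient connects to.
    public static async Task StartAsync(Uri applicationName, string targetVersion)
    {
        using (var client = new FabricClient())
        {
            await client.ApplicationManager.UpgradeApplicationAsync(
                Describe(applicationName, targetVersion));
        }
    }
}
```

A call such as MonitoredUpgrade.StartAsync(new Uri("fabric:/Application1"), "2.0.0") would start the upgrade, where both the application name and the version are placeholders.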
Click the Publish button and open Service Fabric Explorer. You can see that an upgrade is in progress.
While the upgrade is being processed, the health of the service is monitored.
When the service is unhealthy after the upgrade, it is rolled back to the original version.
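The upgrade and the rollback can also be observed from code instead of Service Fabric Explorer. The following sketch polls the upgrade progress until it reaches a final state. The UpgradeWatcher class and the 5 second polling interval are assumptions made for illustration.

```csharp
using System;
using System.Fabric;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, introduced only for illustration.
internal static class UpgradeWatcher
{
    // True when the upgrade has either completed, rolled back or failed.
    public static bool IsFinished(ApplicationUpgradeState state)
    {
        return state == ApplicationUpgradeState.RollingForwardCompleted
            || state == ApplicationUpgradeState.RollingBackCompleted
            || state == ApplicationUpgradeState.Failed;
    }

    // Polls the cluster and returns the final upgrade state.
    public static async Task<ApplicationUpgradeState> WaitAsync(Uri applicationName, CancellationToken token)
    {
        using (var client = new FabricClient())
        {
            while (true)
            {
                ApplicationUpgradeProgress progress =
                    await client.ApplicationManager.GetApplicationUpgradeProgressAsync(applicationName);
                Console.WriteLine($"{progress.TargetApplicationTypeVersion}: {progress.UpgradeState}");
                if (IsFinished(progress.UpgradeState))
                    return progress.UpgradeState;
                await Task.Delay(TimeSpan.FromSeconds(5), token);
            }
        }
    }
}
```

For the sample service, the expected final state of the upgrade to the slower version is RollingBackCompleted, because its health report is expired for most of each cycle.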
Health monitoring can reflect the quality of the service and block the upgrade if the quality of the service decreases.