Health Checks Advanced
Implement sophisticated health checks for production monitoring.
A Simple Analogy
Health checks are like doctor visits for your application. Regular checks catch problems early, prevent bigger issues, and keep the system running smoothly.
Why Health Checks?
- Early detection: Catch issues before users see them
- Load balancing: Exclude unhealthy instances
- Orchestration: Kubernetes restarts containers, or stops routing traffic to them, when their probes fail
- Monitoring: Dashboards and alerts surface failures
- Debugging: Understand system state
Basic Implementation
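// Program.cs
var builder = WebApplication.CreateBuilder(args);

// Register the health check service and the individual checks
builder.Services.AddHealthChecks()
    .AddCheck<DatabaseHealthCheck>("database")
    .AddCheck<CacheHealthCheck>("cache");

var app = builder.Build();

// Expose the aggregated result over HTTP
app.MapHealthChecks("/health");

app.Run();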
Custom Health Checks
public class DatabaseHealthCheck : IHealthCheck
{
    private readonly IDbContextFactory<AppDbContext> _contextFactory;

    // The factory is supplied by dependency injection
    public DatabaseHealthCheck(IDbContextFactory<AppDbContext> contextFactory)
    {
        _contextFactory = contextFactory;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            await using var dbContext = await _contextFactory.CreateDbContextAsync(cancellationToken);

            // Verify that the database is reachable
            var canConnect = await dbContext.Database.CanConnectAsync(cancellationToken);

            if (canConnect)
                return HealthCheckResult.Healthy("Database connection successful");
            else
                return HealthCheckResult.Unhealthy("Cannot connect to database");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy(
                "Database health check failed",
                exception: ex);
        }
    }
}
public class CacheHealthCheck : IHealthCheck
{
    private readonly IDistributedCache _cache;

    // The cache is supplied by dependency injection
    public CacheHealthCheck(IDistributedCache cache)
    {
        _cache = cache;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            // Round-trip a sentinel value; the extension method names this parameter "token"
            await _cache.SetStringAsync("health-check", "ok", token: cancellationToken);
            var value = await _cache.GetStringAsync("health-check", cancellationToken);

            return value == "ok"
                ? HealthCheckResult.Healthy("Cache is operational")
                : HealthCheckResult.Unhealthy("Cache check failed");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy("Cache not available", exception: ex);
        }
    }
}
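The best practices later in this article also list external APIs as a critical dependency. The same pattern could be applied to one, as in the minimal sketch below. The class name `ExternalApiHealthCheck`, the named client `"payments-api"`, and the `/status` path are assumptions for illustration only; the sketch also assumes the named client is registered with `AddHttpClient` and has a `BaseAddress` configured.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical check for an external HTTP dependency
public class ExternalApiHealthCheck : IHealthCheck
{
    private readonly IHttpClientFactory _httpClientFactory;

    public ExternalApiHealthCheck(IHttpClientFactory httpClientFactory)
    {
        _httpClientFactory = httpClientFactory;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            // "payments-api" is an assumed named client with a BaseAddress set
            var client = _httpClientFactory.CreateClient("payments-api");

            // "/status" is an assumed lightweight endpoint on the remote service
            using var response = await client.GetAsync("/status", cancellationToken);

            return response.IsSuccessStatusCode
                ? HealthCheckResult.Healthy("External API reachable")
                : HealthCheckResult.Unhealthy($"External API returned {(int)response.StatusCode}");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy("External API not reachable", exception: ex);
        }
    }
}
```

It would be registered like the other checks, for example with `.AddCheck<ExternalApiHealthCheck>("external-api")`.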
Detailed Health Response
// Program.cs with detailed output
using System.Text.Json;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var options = new HealthCheckOptions
{
    ResponseWriter = WriteResponse
};

app.MapHealthChecks("/health", options);

// Writes the overall status plus one entry per registered check as JSON
static async Task WriteResponse(HttpContext context, HealthReport report)
{
    context.Response.ContentType = "application/json";

    var response = new
    {
        status = report.Status.ToString(),
        checks = report.Entries.Select(entry => new
        {
            name = entry.Key,
            status = entry.Value.Status.ToString(),
            description = entry.Value.Description,
            duration = entry.Value.Duration.TotalMilliseconds // milliseconds
        })
    };

    var json = JsonSerializer.Serialize(response);
    await context.Response.WriteAsync(json);
}
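Load balancers and probes usually look only at the HTTP status code, not the response body. By default the endpoint returns 200 for Healthy and Degraded and 503 for Unhealthy. The sketch below spells that mapping out through `ResultStatusCodes` so it can be adjusted; it is a minimal standalone Program.cs, not the only way to configure it.

```csharp
// Program.cs
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddHealthChecks();

var app = builder.Build();

app.MapHealthChecks("/health", new HealthCheckOptions
{
    // These values match the framework defaults; change them to suit your load balancer
    ResultStatusCodes =
    {
        [HealthStatus.Healthy] = StatusCodes.Status200OK,
        [HealthStatus.Degraded] = StatusCodes.Status200OK,
        [HealthStatus.Unhealthy] = StatusCodes.Status503ServiceUnavailable
    }
});

app.Run();
```

Mapping Degraded to 503 instead would take an instance out of rotation earlier, at the cost of less capacity while a dependency is slow.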
Liveness vs Readiness
builder.Services.AddHealthChecks()
    // Liveness: Is app still running?
    .AddCheck("liveness", () => HealthCheckResult.Healthy(), tags: new[] { "liveness" })
    // Readiness: Can app handle requests?
    .AddCheck<DatabaseHealthCheck>("database", tags: new[] { "readiness" })
    .AddCheck<CacheHealthCheck>("cache", tags: new[] { "readiness" });

// Separate endpoints for Kubernetes
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    Predicate = registration => registration.Tags.Contains("liveness")
});

app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = registration => registration.Tags.Contains("readiness")
});
Kubernetes Integration
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  # apps/v1 requires a selector that matches the pod template labels;
  # the label "app: api" is an assumed example
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: myapi:latest
          # Startup check (other probes wait until this succeeds)
          startupProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 30
          # Liveness check (restart if fails)
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          # Readiness check (remove from service if fails)
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
Best Practices
- Include critical checks: Database, cache, external APIs
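- Set appropriate timeouts: Bound each check so a hung dependency fails the check instead of stalling the probe
- Separate concerns: Keep liveness free of external dependencies and put dependency checks behind readiness
- Tolerate transient failures: Retry before reporting unhealthy
- Monitor check performance: Checks run on every probe, so keep them cheap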
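The timeout and retry points above can be sketched as follows. `RetryingHealthCheck` is a hypothetical wrapper written for this example, not a framework type, and the attempt count and delay are assumed defaults. It re-runs an inner check a few times and reports Unhealthy only if every attempt fails.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical wrapper: retries the inner check before reporting Unhealthy
public sealed class RetryingHealthCheck : IHealthCheck
{
    private readonly IHealthCheck _inner;
    private readonly int _maxAttempts;
    private readonly TimeSpan _delay;

    public RetryingHealthCheck(IHealthCheck inner, int maxAttempts = 3, TimeSpan? delay = null)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        _maxAttempts = maxAttempts;
        _delay = delay ?? TimeSpan.FromMilliseconds(200); // assumed default pause between attempts
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        var result = await _inner.CheckHealthAsync(context, cancellationToken);

        // Retry only while the result is Unhealthy and attempts remain
        for (var attempt = 2; attempt <= _maxAttempts && result.Status == HealthStatus.Unhealthy; attempt++)
        {
            await Task.Delay(_delay, cancellationToken);
            result = await _inner.CheckHealthAsync(context, cancellationToken);
        }

        return result;
    }
}
```

A registration might then combine the wrapper with the framework's per-check `timeout` parameter. The timeout values here are assumptions to tune per dependency, and the overall timeout should leave room for all retry attempts:

```csharp
// Program.cs
builder.Services.AddHealthChecks()
    // The check is reported as failed if it runs longer than 2 seconds
    .AddCheck<CacheHealthCheck>("cache", tags: new[] { "readiness" }, timeout: TimeSpan.FromSeconds(2))
    // Wrap the database check in the hypothetical retry wrapper
    .Add(new HealthCheckRegistration(
        "database",
        sp => new RetryingHealthCheck(ActivatorUtilities.CreateInstance<DatabaseHealthCheck>(sp)),
        failureStatus: null,
        tags: new[] { "readiness" },
        timeout: TimeSpan.FromSeconds(5)));
```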
Related Concepts
- Metrics and monitoring
- Alerting strategies
- Graceful degradation
- Circuit breakers
Summary
Health checks enable proactive monitoring and automatic remediation. Implement comprehensive checks that give your orchestration platform visibility into application state.