Health Check API
The Health Check API provides operational endpoints for monitoring service availability and health status. These endpoints are designed for use with container orchestration platforms (Kubernetes, Docker Swarm) and monitoring systems.
Endpoints Summary
| Method | Path | Description |
|---|---|---|
| GET | /health | Basic health check with version info |
| GET | /health/live | Liveness probe (server responsive) |
| GET | /health/ready | Readiness probe (database connectivity) |
| GET | /metrics | Prometheus metrics endpoint |
GET /health
Basic health check that confirms the service process is running.
Request
curl http://localhost:8080/health
Response
Status: 200 OK
{
  "status": "ok",
  "version": "0.1.0"
}
Response Fields
| Field | Type | Description |
|---|---|---|
| status | string | Always "ok" when the service is running |
| version | string | Current application version from Cargo.toml |
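As an illustration of consuming this endpoint, the following Python sketch validates a /health response body against the two documented fields. The helper names `parse_health` and `fetch_version`, the 2-second timeout, and the error handling are assumptions made for this example; they are not part of the service.

```python
import json
from urllib.request import urlopen


def parse_health(body: str) -> str:
    """Validate a /health response body and return the reported version.

    Hypothetical helper: it relies only on the documented "status" and
    "version" fields.
    """
    data = json.loads(body)
    if not isinstance(data, dict) or data.get("status") != "ok":
        raise ValueError(f"unexpected health payload: {body!r}")
    version = data.get("version")
    if not isinstance(version, str):
        raise ValueError("missing or non-string 'version' field")
    return version


def fetch_version(base_url: str = "http://localhost:8080", timeout: float = 2.0) -> str:
    """Call GET /health and return the version string (performs network I/O)."""
    with urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return parse_health(resp.read().decode("utf-8"))
```

Calling `fetch_version()` against a running instance should return the version string from the response, for example "0.1.0" for the sample payload shown above.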
GET /health/live
Liveness probe endpoint that performs a minimal check to confirm the server process is responsive. This endpoint is intended for container orchestration systems to detect if the application needs to be restarted.
Request
curl http://localhost:8080/health/live
Response
Status: 200 OK
The response has no body; only the HTTP status code is returned.
Usage
Configure Kubernetes liveness probe:
livenessProbe:
  httpGet:
    path: /health/live
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 30
GET /health/ready
Readiness probe endpoint that verifies the service is ready to accept traffic by checking database connectivity. This endpoint executes a SELECT 1 query against the database to confirm the connection is active.
Request
curl http://localhost:8080/health/ready
Response - Healthy
Status: 200 OK
{
  "status": "ready",
  "version": "0.1.0",
  "database": "connected"
}
Response - Unhealthy
Status: 503 Service Unavailable
{
  "status": "unhealthy",
  "version": "0.1.0",
  "database": "disconnected"
}
Response Fields
| Field | Type | Description |
|---|---|---|
| status | string | "ready" when healthy, "unhealthy" when the database is disconnected |
| version | string | Current application version from Cargo.toml |
| database | string | "connected" or "disconnected", based on the database query result |
Usage
Configure Kubernetes readiness probe:
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
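Outside Kubernetes, a deployment script or smoke test may need to make the same ready/not-ready decision. The Python sketch below shows one way to interpret the documented 200 and 503 responses. The function names `interpret_readiness` and `is_ready`, the timeout, and the choice to treat connection errors as "not ready" are assumptions for illustration.

```python
import json
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def interpret_readiness(status_code: int, body: str) -> bool:
    """Return True only for a 200 response that reports a connected database."""
    if status_code != 200:
        return False
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    return data.get("status") == "ready" and data.get("database") == "connected"


def is_ready(base_url: str = "http://localhost:8080", timeout: float = 2.0) -> bool:
    """Query GET /health/ready (performs network I/O)."""
    try:
        with urlopen(f"{base_url}/health/ready", timeout=timeout) as resp:
            return interpret_readiness(resp.status, resp.read().decode("utf-8"))
    except HTTPError as err:
        # urllib raises HTTPError for a 503; its JSON body is still readable.
        return interpret_readiness(err.code, err.read().decode("utf-8"))
    except (URLError, OSError):
        # Connection refused, DNS failure, or timeout: assume not ready.
        return False
```

Keeping the decision logic in `interpret_readiness` separate from the HTTP call makes it straightforward to unit-test against the two documented payloads.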
GET /metrics
Exposes operational metrics for scraping by Prometheus.
Request
curl http://localhost:8080/metrics
Response
Status: 200 OK
Content-Type: text/plain; version=0.0.4
Returns metrics in Prometheus text exposition format:
# HELP sso_http_request_duration_seconds HTTP request duration in seconds by method, route pattern, and status class
# TYPE sso_http_request_duration_seconds histogram
sso_http_request_duration_seconds_bucket{method="GET",route="/api/user",status="2xx",le="0.005"} 142
sso_http_request_duration_seconds_bucket{method="GET",route="/api/user",status="2xx",le="0.01"} 287
...
# HELP sso_db_pool_connections_total Current number of connections in the database pool
# TYPE sso_db_pool_connections_total gauge
sso_db_pool_connections_total{backend="sqlite"} 5
# HELP sso_job_queue_depth Number of pending jobs in the system job queue
# TYPE sso_job_queue_depth gauge
sso_job_queue_depth 0
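For ad hoc inspection without a Prometheus server, the text output can be parsed directly. The Python sketch below is a deliberately simplified parser for sample lines like those shown above: it skips comment lines, ignores optional timestamps, and does not unescape label values. The name `parse_metrics` is hypothetical; for production use, an established parser such as the one shipped with the `prometheus_client` package is likely a better choice.

```python
import re

# metric_name, optional {labels}, then the sample value
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# label="value" pairs inside the braces
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text: str) -> list[tuple[str, dict[str, str], float]]:
    """Parse exposition text into (name, labels, value) tuples.

    Simplified sketch: lines that are blank, start with '#', or do not
    look like a sample are skipped.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

Applied to the sample output, the `sso_db_pool_connections_total` line yields the tuple `("sso_db_pool_connections_total", {"backend": "sqlite"}, 5.0)`.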
Available Metrics
| Metric | Type | Labels | Description |
|---|---|---|---|
| sso_http_request_duration_seconds | Histogram | method, route, status | HTTP request latency |
| sso_db_pool_connections_total | Gauge | backend | Current pool connections |
| sso_db_pool_connections_idle | Gauge | backend | Idle pool connections |
| sso_db_pool_connections_max | Gauge | backend | Max configured connections |
| sso_job_queue_depth | Gauge | - | Pending background jobs |
| sso_job_processing_duration_seconds | Histogram | job_type | Job execution latency |
| sso_webhook_delivery_latency_seconds | Histogram | - | Webhook delivery time |
| sso_active_users_total | Gauge | - | Total active users |
| sso_total_organizations | Gauge | - | Total organizations |
| sso_mfa_enabled_users_total | Gauge | - | Users with MFA enabled |
| sso_mfa_adoption_percentage | Gauge | - | MFA adoption rate |
| sso_login_failures_total | Counter | reason | Failed login attempts |
| sso_auth_tokens_issued_total | Counter | - | Tokens issued |
Prometheus Configuration
Add to your prometheus.yml:
scrape_configs:
  - job_name: 'sso'
    static_configs:
      - targets: ['sso-server:8080']
    scrape_interval: 15s
    metrics_path: /metrics
Example PromQL Queries
# P95 HTTP request latency
histogram_quantile(0.95, rate(sso_http_request_duration_seconds_bucket[5m]))
# Request rate by route
sum by (route) (rate(sso_http_request_duration_seconds_count[5m]))
# Connection pool utilization
sso_db_pool_connections_total / sso_db_pool_connections_max
# Background job backpressure
sso_job_queue_depth > 100
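To make the P95 query less opaque, the Python sketch below approximates how a quantile is estimated from cumulative histogram buckets: find the bucket that contains the target rank, then interpolate linearly inside it. This is a simplified model of the idea, not Prometheus's actual implementation. In the usage note, the counts 142 and 287 come from the sample output above, while the 0.025 and +Inf buckets are hypothetical values added to complete the example.

```python
import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate quantile q from (upper_bound, cumulative_count) pairs.

    Assumes the highest bucket is +Inf and that the lowest bucket
    starts at 0, as is typical for latency histograms.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    if not buckets:
        raise ValueError("at least one bucket is required")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations recorded
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank and count > prev_count:
            if math.isinf(bound):
                # Quantile falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            # Linear interpolation within the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound
```

With buckets `[(0.005, 142), (0.01, 287), (0.025, 300), (float("inf"), 300)]`, the 0.95 rank is 285, which falls in the 0.005 to 0.01 bucket, so the estimate is roughly 0.0099 seconds. The PromQL query applies the same interpolation to per-second bucket rates rather than raw counts, and the estimate can never be more precise than the bucket boundaries allow.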
Implementation Details
- Authentication: None required. All health and metrics endpoints are publicly accessible.
- Rate Limiting: Health endpoints are not rate-limited.
- Database Check: The /health/ready endpoint executes SELECT 1 to verify database connectivity. This is a lightweight query that works across SQLite, PostgreSQL, and MySQL.
- Response Time: All endpoints are designed for fast responses (<100 ms under normal conditions).
- Metrics Update: Gauge metrics (users, organizations, pool stats) are refreshed every 30 seconds by a background task, so a scraped value may be up to 30 seconds stale.