Health Check API

Operational health check endpoints for monitoring service availability, liveness probes, and database readiness checks.

Updated Dec 13, 2025

The Health Check API provides operational endpoints for monitoring service availability and health status. These endpoints are designed for use with container orchestration platforms (Kubernetes, Docker Swarm) and monitoring systems.

Endpoints Summary

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Basic health check with version info |
| GET | /health/live | Liveness probe (server responsive) |
| GET | /health/ready | Readiness probe (database connectivity) |
| GET | /metrics | Prometheus metrics endpoint |

GET /health

Basic health check that confirms the service process is running and reports the application version.

Request

curl http://localhost:8080/health

Response

Status: 200 OK

{
  "status": "ok",
  "version": "0.1.0"
}

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| status | string | Always "ok" while the service is running |
| version | string | Current application version from Cargo.toml |
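
A client can check this contract in a few lines. The following is an illustrative Python sketch, not part of the service: `parse_health` and `fetch_health` are hypothetical helper names, and the default base URL is simply the localhost address used in the request example above.

```python
import json
import urllib.request


def parse_health(body: bytes) -> dict:
    """Validate a /health response body and return its two documented fields."""
    payload = json.loads(body)
    if payload.get("status") != "ok":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")
    if not isinstance(payload.get("version"), str):
        raise ValueError("missing or non-string 'version' field")
    return {"status": payload["status"], "version": payload["version"]}


def fetch_health(base_url: str = "http://localhost:8080", timeout: float = 2.0) -> dict:
    """GET /health and return the parsed fields (raises on HTTP or network errors)."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return parse_health(resp.read())
```

Keeping the parsing separate from the network call makes the validation easy to unit-test without a running server.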

GET /health/live

Liveness probe that performs a minimal check to confirm the server process is responsive. Container orchestration systems use it to decide whether the application needs to be restarted.

Request

curl http://localhost:8080/health/live

Response

Status: 200 OK

No response body; only the HTTP status code is returned.

Usage

Configure Kubernetes liveness probe:

livenessProbe:
  httpGet:
    path: /health/live
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 30
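
To illustrate how an orchestrator might act on this endpoint, here is a rough Python sketch of a prober. It is not Kubernetes' implementation: `is_live` and `should_restart` are hypothetical names, and the threshold of three consecutive failures mirrors the Kubernetes default `failureThreshold`, which the probe above does not set explicitly.

```python
import urllib.error
import urllib.request


def is_live(base_url: str = "http://localhost:8080", timeout: float = 1.0) -> bool:
    """Return True if GET /health/live answers 200 within the timeout."""
    try:
        with urllib.request.urlopen(f"{base_url}/health/live", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status all count as "not live".
        return False


def should_restart(results: list[bool], failure_threshold: int = 3) -> bool:
    """True once the most recent `failure_threshold` probe results have all failed."""
    if len(results) < failure_threshold:
        return False
    return not any(results[-failure_threshold:])
```

A single failed probe does not trigger a restart in this sketch; only an unbroken run of failures does, which avoids restarting on a transient hiccup.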

GET /health/ready

Readiness probe endpoint that verifies the service is ready to accept traffic by checking database connectivity. This endpoint executes a SELECT 1 query against the database to confirm the connection is active.

Request

curl http://localhost:8080/health/ready

Response - Healthy

Status: 200 OK

{
  "status": "ready",
  "version": "0.1.0",
  "database": "connected"
}

Response - Unhealthy

Status: 503 Service Unavailable

{
  "status": "unhealthy",
  "version": "0.1.0",
  "database": "disconnected"
}

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| status | string | "ready" when healthy, "unhealthy" when the database is disconnected |
| version | string | Current application version from Cargo.toml |
| database | string | "connected" or "disconnected", based on the database query result |
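
Because the unhealthy case arrives as a 503, a client needs to read the body from the error path as well. The Python sketch below shows one way to do that with the standard library; `interpret_readiness` and `check_ready` are hypothetical names, and the "unknown" fallback is an assumption for responses that carry no parseable body.

```python
import json
import urllib.error
import urllib.request


def interpret_readiness(status_code: int, body: bytes) -> tuple[bool, str]:
    """Map a /health/ready response to (ready, database_state)."""
    try:
        payload = json.loads(body) if body else {}
    except ValueError:
        payload = {}  # malformed JSON or undecodable bytes
    if not isinstance(payload, dict):
        payload = {}
    database = payload.get("database", "unknown")
    ready = status_code == 200 and payload.get("status") == "ready"
    return ready, database


def check_ready(base_url: str = "http://localhost:8080", timeout: float = 2.0) -> tuple[bool, str]:
    """GET /health/ready and interpret both the 200 and the 503 response."""
    try:
        with urllib.request.urlopen(f"{base_url}/health/ready", timeout=timeout) as resp:
            return interpret_readiness(resp.status, resp.read())
    except urllib.error.HTTPError as err:
        # urllib raises on 503, but the JSON body is still readable from the error.
        return interpret_readiness(err.code, err.read())
```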

Usage

Configure Kubernetes readiness probe:

readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10

GET /metrics

Exposes operational metrics for scraping by Prometheus.

Request

curl http://localhost:8080/metrics

Response

Status: 200 OK

Content-Type: text/plain; version=0.0.4

Returns metrics in Prometheus text exposition format:

# HELP sso_http_request_duration_seconds HTTP request duration in seconds by method, route pattern, and status class
# TYPE sso_http_request_duration_seconds histogram
sso_http_request_duration_seconds_bucket{method="GET",route="/api/user",status="2xx",le="0.005"} 142
sso_http_request_duration_seconds_bucket{method="GET",route="/api/user",status="2xx",le="0.01"} 287
...

# HELP sso_db_pool_connections_total Current number of connections in the database pool
# TYPE sso_db_pool_connections_total gauge
sso_db_pool_connections_total{backend="sqlite"} 5

# HELP sso_job_queue_depth Number of pending jobs in the system job queue
# TYPE sso_job_queue_depth gauge
sso_job_queue_depth 0

Available Metrics

| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| sso_http_request_duration_seconds | Histogram | method, route, status | HTTP request latency |
| sso_db_pool_connections_total | Gauge | backend | Current pool connections |
| sso_db_pool_connections_idle | Gauge | backend | Idle pool connections |
| sso_db_pool_connections_max | Gauge | backend | Max configured connections |
| sso_job_queue_depth | Gauge | - | Pending background jobs |
| sso_job_processing_duration_seconds | Histogram | job_type | Job execution latency |
| sso_webhook_delivery_latency_seconds | Histogram | - | Webhook delivery time |
| sso_active_users_total | Gauge | - | Total active users |
| sso_total_organizations | Gauge | - | Total organizations |
| sso_mfa_enabled_users_total | Gauge | - | Users with MFA enabled |
| sso_mfa_adoption_percentage | Gauge | - | MFA adoption rate |
| sso_login_failures_total | Counter | reason | Failed login attempts |
| sso_auth_tokens_issued_total | Counter | - | Tokens issued |
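
For ad hoc checks outside Prometheus, the text exposition format is simple enough to read directly. The Python sketch below is a deliberately simplified parser, not a complete implementation of the format: it ignores timestamps, leaves escape sequences in label values untouched, and the names `parse_metrics` and `gauge` are hypothetical.

```python
import re

# Metric name, optional {labels}, then the sample value (any timestamp is ignored).
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text: str) -> list[tuple[str, dict[str, str], float]]:
    """Return (name, labels, value) for every sample line in a /metrics payload."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match is None:
            continue  # skip anything that is not a sample line
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def gauge(samples, name: str, **labels: str):
    """Return the first sample value matching the name and labels, or None."""
    for sample_name, sample_labels, value in samples:
        if sample_name == name and all(sample_labels.get(k) == v for k, v in labels.items()):
            return value
    return None
```

With the parsed samples, pool utilization can be derived as `gauge(samples, "sso_db_pool_connections_total", backend="sqlite")` divided by the matching `sso_db_pool_connections_max` value.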

Prometheus Configuration

Add to your prometheus.yml:

scrape_configs:
  - job_name: 'sso'
    static_configs:
      - targets: ['sso-server:8080']
    scrape_interval: 15s
    metrics_path: /metrics

Example PromQL Queries

# P95 HTTP request latency
histogram_quantile(0.95, rate(sso_http_request_duration_seconds_bucket[5m]))

# Request rate by route
sum by (route) (rate(sso_http_request_duration_seconds_count[5m]))

# Connection pool utilization
sso_db_pool_connections_total / sso_db_pool_connections_max

# Background job backpressure
sso_job_queue_depth > 100
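
The P95 query relies on `histogram_quantile`, which estimates a quantile from cumulative bucket counts by interpolating linearly inside the bucket that contains the target rank. The Python sketch below reproduces that idea in simplified form to show where the number comes from; `estimate_quantile` is a hypothetical name, and edge cases that Prometheus handles (quantiles outside the 0 to 1 range, non-monotonic buckets) are left out or rejected.

```python
import math


def estimate_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate quantile q from cumulative (upper_bound, count) buckets.

    `buckets` must be sorted by upper bound and end with the +Inf bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("this sketch only supports 0 < q <= 1")
    if not buckets or not math.isinf(buckets[-1][0]):
        raise ValueError("buckets must end with the +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations recorded
    rank = q * total
    # First bucket whose cumulative count reaches the target rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    if index == len(buckets) - 1:
        # The quantile falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0] if len(buckets) > 1 else math.nan
    upper, count = buckets[index]
    if index == 0 and upper <= 0:
        return upper
    # The lowest bucket is assumed to start at 0 when its upper bound is positive.
    lower, below = (0.0, 0.0) if index == 0 else buckets[index - 1]
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)
```

Because of the interpolation, the accuracy of a reported P95 depends on how finely the bucket boundaries divide the latency range of interest.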

Implementation Details

  • Authentication: None required. All health and metrics endpoints are publicly accessible.
  • Rate Limiting: Health endpoints are not rate-limited.
  • Database Check: The /health/ready endpoint executes SELECT 1 to verify database connectivity. This is a lightweight query that works across SQLite, PostgreSQL, and MySQL.
  • Response Time: All endpoints are designed to respond in under 100 ms under normal conditions.
  • Metrics Update: Gauge metrics (users, organizations, pool stats) are refreshed every 30 seconds by a background task.
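
The database check described above can be sketched as follows. This is an illustration in Python against SQLite, not the service's own code: the `readiness` function and the `VERSION` constant are hypothetical, and the status codes and response bodies are copied from the /health/ready examples.

```python
import sqlite3

VERSION = "0.1.0"  # placeholder; the service reads its version from Cargo.toml


def readiness(conn: sqlite3.Connection) -> tuple[int, dict]:
    """Run SELECT 1 and build the (HTTP status, JSON body) pair for /health/ready."""
    try:
        connected = conn.execute("SELECT 1").fetchone() == (1,)
    except sqlite3.Error:
        # Any driver error (closed or unreachable database) means "not ready".
        connected = False
    if connected:
        return 200, {"status": "ready", "version": VERSION, "database": "connected"}
    return 503, {"status": "unhealthy", "version": VERSION, "database": "disconnected"}
```

The query touches no tables, so it exercises only the connection itself and stays cheap enough to run on every probe.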