Production Checklist
New to acton-service?
Start with the homepage to understand what acton-service is, then explore Core Concepts for foundational explanations. See the Glossary for technical term definitions.
This checklist covers security, observability, performance, and operational considerations for acton-service deployments.
Configuration Review
Service Configuration
- [ ] Set explicit service name in configuration
- [ ] Configure appropriate port binding (default: 8080)
- [ ] Enable TLS/HTTPS for production traffic
- [ ] Set connection timeout values appropriate for workload
- [ ] Configure graceful shutdown timeout
- [ ] Review and set request body size limits
- [ ] Enable compression for responses (gzip, br, deflate, zstd)
Environment Variables
- [ ] Use environment-specific configuration files
- [ ] Validate all required environment variables are set
- [ ] Store sensitive values in secure secret management
- [ ] Never commit secrets to version control
- [ ] Use XDG-compliant config directories (`~/.config/acton-service/`)
- [ ] Document all required configuration values
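One way to act on the "validate all required environment variables" item is a startup check that reports every missing variable at once instead of failing on the first. The sketch below is a minimal, standard-library-only illustration; the variable names and the `missing_vars` helper are hypothetical and are not part of acton-service.

```rust
use std::env;

/// Returns every name in `required` whose lookup yields no value or only whitespace.
/// The lookup is injected so the check can be exercised without touching the real environment.
fn missing_vars<F>(required: &[&str], lookup: F) -> Vec<String>
where
    F: Fn(&str) -> Option<String>,
{
    let mut missing = Vec::new();
    for &name in required {
        let is_set = lookup(name).map_or(false, |value| !value.trim().is_empty());
        if !is_set {
            missing.push(name.to_string());
        }
    }
    missing
}

fn main() {
    // Hypothetical variable names, chosen for illustration only.
    let required = ["DATABASE_URL", "REDIS_URL", "NATS_URL"];
    let missing = missing_vars(&required, |name| env::var(name).ok());

    if missing.is_empty() {
        println!("all required environment variables are set");
    } else {
        // A real service would likely refuse to start at this point.
        eprintln!("missing required environment variables: {}", missing.join(", "));
    }
}
```

Reporting the full list in one pass tends to shorten the fix-and-redeploy loop when several values are missing from a new environment.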
Feature Flags
- [ ] Enable only required features to minimize dependencies
- [ ] Review enabled middleware and their performance impact
- [ ] Disable development-only features (verbose logging, debug endpoints)
- [ ] Configure observability features (`observability` flag)
Security Hardening
Authentication & Authorization
- [ ] Implement JWT authentication for protected endpoints
- [ ] Use strong signing algorithms (RS256, ES256 preferred over HS256)
- [ ] Rotate JWT signing keys regularly
- [ ] Set appropriate token expiration times
- [ ] Implement token revocation with Redis backend
- [ ] Configure Cedar policy-based authorization for fine-grained access
- [ ] Review and test all Cedar policies before deployment
- [ ] Validate JWT claims structure (roles, permissions, user/client ID)
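The claims-structure and expiration items can be illustrated with a small validation step that runs after signature verification. This is a sketch under stated assumptions: the `Claims` struct, its field names, and the `ClaimsError` variants are hypothetical and do not reflect the actual acton-service claims type, and signature checking is assumed to happen elsewhere.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical decoded claims; field names are illustrative only.
struct Claims {
    sub: String,        // user or client ID
    roles: Vec<String>, // coarse-grained roles
    perms: Vec<String>, // fine-grained permissions
    exp: u64,           // expiry, seconds since the Unix epoch
}

#[derive(Debug, PartialEq)]
enum ClaimsError {
    MissingSubject,
    NoRolesOrPermissions,
    Expired,
}

/// Checks the shape of already-verified claims. `now_secs` is passed in
/// so the expiry rule can be tested deterministically.
fn validate_claims(claims: &Claims, now_secs: u64) -> Result<(), ClaimsError> {
    if claims.sub.trim().is_empty() {
        return Err(ClaimsError::MissingSubject);
    }
    if claims.roles.is_empty() && claims.perms.is_empty() {
        return Err(ClaimsError::NoRolesOrPermissions);
    }
    if claims.exp <= now_secs {
        return Err(ClaimsError::Expired);
    }
    Ok(())
}

fn main() {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);

    let claims = Claims {
        sub: "user-123".to_string(),
        roles: vec!["admin".to_string()],
        perms: Vec::new(),
        exp: now + 900, // 15-minute lifetime, an arbitrary example value
    };
    println!("validation result: {:?}", validate_claims(&claims, now));
}
```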
Network Security
- [ ] Enable CORS with restrictive origins (not `*` in production)
- [ ] Configure TLS with modern cipher suites
- [ ] Use HTTPS-only in production
- [ ] Implement rate limiting (per-user and per-client)
- [ ] Set up distributed rate limiting with Redis for multi-instance deployments
- [ ] Configure firewall rules to restrict service access
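To make the per-user and per-client rate limiting items concrete, here is a minimal fixed-window limiter keyed by caller ID. It is an in-memory sketch for a single process, not the acton-service middleware; the `FixedWindowLimiter` type is hypothetical. A multi-instance deployment would need shared state, for example in Redis as the checklist suggests.

```rust
use std::collections::HashMap;

/// Allows at most `limit` requests per key within each window of `window_secs` seconds.
struct FixedWindowLimiter {
    limit: u32,
    window_secs: u64,
    // key -> (window index, requests counted in that window)
    counters: HashMap<String, (u64, u32)>,
}

impl FixedWindowLimiter {
    fn new(limit: u32, window_secs: u64) -> Self {
        assert!(window_secs > 0, "window must be at least one second");
        Self { limit, window_secs, counters: HashMap::new() }
    }

    /// Returns true if the request is allowed. `now_secs` is injected
    /// so the behavior does not depend on the wall clock.
    fn allow(&mut self, key: &str, now_secs: u64) -> bool {
        let window = now_secs / self.window_secs;
        let entry = self.counters.entry(key.to_string()).or_insert((window, 0));
        if entry.0 != window {
            // A new window has started for this key: reset its counter.
            *entry = (window, 0);
        }
        if entry.1 < self.limit {
            entry.1 += 1;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut limiter = FixedWindowLimiter::new(2, 60);
    let requests = [("alice", 0u64), ("alice", 10), ("alice", 20), ("bob", 20), ("alice", 60)];
    for (user, t) in requests {
        println!("t={}s user={}: allowed={}", t, user, limiter.allow(user, t));
    }
}
```

Fixed windows are simple but permit short bursts at window boundaries; a sliding window or token bucket smooths that out at the cost of more state.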
Input Validation
- [ ] Set maximum request body sizes
- [ ] Validate all user inputs
- [ ] Implement request timeouts to protect against slow clients
- [ ] Enable panic recovery middleware
- [ ] Configure sensitive header masking in logs
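As an illustration of sensitive header masking, the sketch below redacts header values before they reach a log line. The list of sensitive header names and the `mask_header` helper are assumptions made for this example; the masking built into acton-service may cover a different set and use a different placeholder.

```rust
/// Header names treated as sensitive in this sketch (an assumed list).
const SENSITIVE_HEADERS: [&str; 4] = ["authorization", "cookie", "set-cookie", "x-api-key"];

/// Returns the value to log for a header: a fixed placeholder for
/// sensitive headers, the original value otherwise.
fn mask_header(name: &str, value: &str) -> String {
    let is_sensitive = SENSITIVE_HEADERS
        .iter()
        .any(|sensitive| sensitive.eq_ignore_ascii_case(name));
    if is_sensitive {
        "[REDACTED]".to_string()
    } else {
        value.to_string()
    }
}

fn main() {
    let headers = [
        ("Authorization", "Bearer abc.def.ghi"),
        ("Content-Type", "application/json"),
        ("Cookie", "session=secret"),
    ];
    for (name, value) in headers {
        println!("{}: {}", name, mask_header(name, value));
    }
}
```

Header names are compared case-insensitively because HTTP header names are case-insensitive.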
Secrets Management
- [ ] Never hardcode credentials in code or config files
- [ ] Use environment variables or secret managers (Vault, AWS Secrets Manager)
- [ ] Rotate database credentials regularly
- [ ] Secure Redis connection strings
- [ ] Protect NATS authentication tokens
- [ ] Review file permissions on configuration files
Health Check Validation
Kubernetes Probes
- [ ] Test `/health` endpoint returns 200 when service is healthy
- [ ] Test `/ready` endpoint validates all dependencies
- [ ] Configure appropriate `initialDelaySeconds` for startup time
- [ ] Set `periodSeconds` to balance responsiveness and overhead
- [ ] Configure `failureThreshold` to prevent flapping
- [ ] Test liveness probe triggers pod restart on failure
- [ ] Verify readiness probe removes unhealthy pods from load balancer
Dependency Health
- [ ] Verify database connectivity in readiness check
- [ ] Verify Redis connectivity in readiness check
- [ ] Verify NATS connectivity in readiness check
- [ ] Configure appropriate health check timeouts
- [ ] Test service behavior when dependencies are unavailable
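The readiness items above amount to aggregating per-dependency results into a single answer. The following sketch shows one possible shape for that logic; the `DependencyStatus` type and `readiness_status` function are hypothetical, and the use of 503 for "not ready" is a common convention assumed here rather than documented acton-service behavior.

```rust
#[derive(Debug, PartialEq)]
enum DependencyStatus {
    Up,
    Down(String), // carries a short reason for diagnostics
}

/// Combines named dependency checks into an HTTP status code and a list
/// of failure descriptions. Any failing dependency makes the service not ready.
fn readiness_status(checks: &[(&str, DependencyStatus)]) -> (u16, Vec<String>) {
    let failures: Vec<String> = checks
        .iter()
        .filter_map(|(name, status)| match status {
            DependencyStatus::Up => None,
            DependencyStatus::Down(reason) => Some(format!("{}: {}", name, reason)),
        })
        .collect();

    let code = if failures.is_empty() { 200 } else { 503 };
    (code, failures)
}

fn main() {
    let checks = [
        ("database", DependencyStatus::Up),
        ("redis", DependencyStatus::Down("connection timed out".to_string())),
        ("nats", DependencyStatus::Up),
    ];
    let (code, failures) = readiness_status(&checks);
    println!("status {} failures {:?}", code, failures);
}
```

Returning the failure list alongside the status code makes it easier to see which dependency took the pod out of rotation.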
Observability Setup
Logging
- [ ] Enable structured JSON logging for production
- [ ] Set appropriate log level (`info` or `warn` for production)
- [ ] Configure log aggregation (ELK, Loki, CloudWatch)
- [ ] Verify sensitive data is masked in logs
- [ ] Enable request ID generation and propagation
- [ ] Configure correlation IDs for distributed tracing
Metrics
- [ ] Enable OpenTelemetry metrics collection
- [ ] Configure Prometheus scraping endpoint (`/metrics`)
- [ ] Set up ServiceMonitor for Kubernetes deployments
- [ ] Monitor HTTP request count and duration
- [ ] Monitor active request count
- [ ] Monitor request/response sizes
- [ ] Set up alerting on key metrics (error rate, latency, throughput)
Distributed Tracing
- [ ] Enable OpenTelemetry tracing
- [ ] Configure trace exporter (Jaeger, Zipkin, OTLP)
- [ ] Propagate trace headers (x-request-id, x-trace-id, x-span-id)
- [ ] Sample traces appropriately for production load
- [ ] Verify trace context propagation across services
Alerting
- [ ] Set up alerts for high error rates (>1%)
- [ ] Set up alerts for high latency (p99 > threshold)
- [ ] Set up alerts for pod restart loops
- [ ] Set up alerts for resource exhaustion
- [ ] Set up alerts for circuit breaker trips
- [ ] Configure PagerDuty/Opsgenie for critical alerts
Performance Optimization
Resource Configuration
- [ ] Set appropriate memory requests and limits
- [ ] Set appropriate CPU requests and limits
- [ ] Configure connection pool sizes for database
- [ ] Configure connection pool sizes for Redis
- [ ] Configure NATS connection pool settings
- [ ] Tune request timeout values
Resilience Patterns
- [ ] Enable circuit breaker middleware
- [ ] Configure failure rate threshold (recommended: 0.5 = 50%)
- [ ] Enable retry logic with exponential backoff
- [ ] Set maximum retry attempts (recommended: 3)
- [ ] Enable bulkhead pattern for concurrency limiting
- [ ] Set appropriate concurrent request limits
- [ ] Test circuit breaker behavior under failure
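The retry items can be sketched as a capped exponential backoff: the delay doubles after each failed attempt up to a maximum, and the operation is abandoned after a fixed number of retries. This is a standard-library illustration only; the `backoff_delay` and `retry` helpers are hypothetical and are not the acton-service retry middleware, which may also add jitter.

```rust
use std::thread;
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Runs `op` once and then up to `max_retries` more times, sleeping
/// between attempts. Returns the first success or the last error.
fn retry<T, E, F>(max_retries: u32, base: Duration, max: Duration, mut op: F) -> Result<T, E>
where
    F: FnMut(u32) -> Result<T, E>,
{
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err(err) if attempt >= max_retries => return Err(err),
            Err(_) => {
                thread::sleep(backoff_delay(attempt, base, max));
                attempt += 1;
            }
        }
    }
}

fn main() {
    let base = Duration::from_millis(100);
    let cap = Duration::from_secs(2);
    for attempt in 0..3 {
        println!("retry {} waits {:?}", attempt + 1, backoff_delay(attempt, base, cap));
    }

    // Simulated operation that fails twice and then succeeds.
    let mut calls = 0;
    let result: Result<&str, &str> =
        retry(3, Duration::from_millis(1), Duration::from_millis(4), |_| {
            calls += 1;
            if calls < 3 { Err("dependency unavailable") } else { Ok("ok") }
        });
    println!("result after {} calls: {:?}", calls, result);
}
```

With the recommended maximum of 3 retries, an operation is attempted at most four times in total.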
Caching
- [ ] Configure Redis caching for Cedar policy decisions
- [ ] Set appropriate cache TTLs
- [ ] Implement cache warming for critical data
- [ ] Monitor cache hit rates
- [ ] Configure cache eviction policies
Database Optimization
- [ ] Enable connection pooling with appropriate size
- [ ] Set connection timeout and idle timeout
- [ ] Create database indexes for frequent queries
- [ ] Test query performance under load
- [ ] Enable prepared statement caching
Deployment Strategy
Container Configuration
- [ ] Use multi-stage Docker builds
- [ ] Run containers as non-root user
- [ ] Use minimal base images (debian:bookworm-slim)
- [ ] Install only required dependencies
- [ ] Add Docker health checks
- [ ] Scan images for vulnerabilities (Docker Scout, Trivy)
- [ ] Use specific version tags (not `latest`)
Kubernetes Configuration
- [ ] Deploy with 3+ replicas for high availability
- [ ] Configure rolling update strategy
- [ ] Set `maxSurge: 1` and `maxUnavailable: 0` for zero-downtime
- [ ] Enable horizontal pod autoscaling (HPA)
- [ ] Configure pod disruption budget (PDB)
- [ ] Set resource requests and limits
- [ ] Use dedicated namespaces per environment
API Versioning
- [ ] All routes use `VersionedApiBuilder`
- [ ] Version deprecated APIs with deprecation warnings
- [ ] Document API version lifecycle
- [ ] Plan for version sunset timeline
- [ ] Test backward compatibility
- [ ] Provide migration guides for version changes
Pre-Deployment Testing
Functional Testing
- [ ] All unit tests passing
- [ ] All integration tests passing
- [ ] End-to-end tests covering critical paths
- [ ] Test all API versions
- [ ] Test health and readiness endpoints
- [ ] Test graceful shutdown behavior
Load Testing
- [ ] Perform load testing at expected production traffic levels
- [ ] Test at 2x expected peak traffic
- [ ] Verify latency under load (p50, p95, p99)
- [ ] Test circuit breaker activation under failure
- [ ] Test rate limiting behavior
- [ ] Monitor resource usage under load
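When verifying latency under load, it helps to be explicit about how p50, p95, and p99 are computed from raw samples. The sketch below uses the nearest-rank method, one of several common percentile definitions; load-testing tools may interpolate and report slightly different values. The `percentile` helper is hypothetical.

```rust
/// Nearest-rank percentile of latency samples in milliseconds.
/// Returns None for an empty sample set or a percentile outside 0..=100.
fn percentile(samples_ms: &[u64], p: f64) -> Option<u64> {
    if samples_ms.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    let mut sorted = samples_ms.to_vec();
    sorted.sort_unstable();

    // Rank of the smallest sample with at least p percent of samples at or below it.
    let rank = (p * sorted.len() as f64 / 100.0).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}

fn main() {
    // Made-up latencies for illustration; real values come from the load test.
    let samples = [12, 15, 11, 230, 14, 13, 18, 16, 95, 17];
    for p in [50.0, 95.0, 99.0] {
        println!("p{}: {:?} ms", p, percentile(&samples, p));
    }
}
```

Tail percentiles are only meaningful with enough samples: with ten samples, p95 and p99 both resolve to the single slowest request.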
Security Testing
- [ ] Run security scanner on dependencies (`cargo audit`)
- [ ] Test JWT authentication with invalid tokens
- [ ] Test authorization with unauthorized users
- [ ] Test CORS configuration
- [ ] Test rate limiting enforcement
- [ ] Validate input sanitization
Monitoring & Maintenance
Operational Procedures
- [ ] Document runbook for common issues
- [ ] Document rollback procedures
- [ ] Set up log aggregation and search
- [ ] Configure automated backups for databases
- [ ] Test disaster recovery procedures
- [ ] Document on-call escalation process
Continuous Monitoring
- [ ] Monitor error rates and latency trends
- [ ] Monitor resource utilization (CPU, memory, disk)
- [ ] Monitor database connection pool usage
- [ ] Monitor cache hit rates
- [ ] Track API version usage
- [ ] Monitor circuit breaker metrics
Regular Maintenance
- [ ] Update dependencies regularly (`cargo update`)
- [ ] Scan for security vulnerabilities regularly
- [ ] Review and rotate credentials quarterly
- [ ] Review and update Cedar policies as needed
- [ ] Monitor and archive old logs
- [ ] Review and optimize database queries
Go-Live Checklist
Final checks before deploying to production:
- [ ] All above sections reviewed and completed
- [ ] Configuration reviewed and approved
- [ ] Security audit completed
- [ ] Load testing passed
- [ ] Monitoring and alerting configured
- [ ] Runbook documented
- [ ] Rollback plan prepared
- [ ] Team trained on operational procedures
- [ ] Deploy to staging environment first
- [ ] Verify staging deployment before production
- [ ] Schedule deployment during low-traffic window
- [ ] Monitor deployment closely for first 24 hours
Post-Deployment
After deploying to production:
- [ ] Verify health endpoints return 200
- [ ] Verify metrics are being collected
- [ ] Verify logs are being aggregated
- [ ] Verify traces are being exported
- [ ] Test critical API endpoints
- [ ] Monitor error rates for 1 hour
- [ ] Verify autoscaling works as expected
- [ ] Update documentation with production URLs
- [ ] Notify stakeholders of successful deployment
Next Steps
- Configuration for detailed configuration options
- Observability for monitoring and tracing setup
- Docker for container best practices
- Kubernetes for orchestration details