Kubernetes Deployment
Kubernetes (k8s) orchestrates containers: it schedules them, restarts crashed instances, scales replicas, and routes traffic. To run a Spring Boot app well on Kubernetes you need a container image (see Dockerizing) plus manifests that tell the cluster how to check health, inject configuration, bound resources, and shut down cleanly. Spring Boot 3 has dedicated Kubernetes support that makes each of these align naturally.
A Deployment and a Service
A Deployment manages a set of identical Pods (your app replicas). A Service gives them a stable virtual IP and load-balances across them.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders
  labels:
    app: orders
spec:
  replicas: 3
  selector:
    matchLabels:
      app: orders
  template:
    metadata:
      labels:
        app: orders
    spec:
      containers:
        - name: orders
          image: registry.example.com/orders:1.4.0
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: orders
spec:
  selector:
    app: orders
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP
Wiring probes to Actuator health groups
Kubernetes uses two probes with distinct meanings: the liveness probe answers “is this process broken and in need of a restart?”, and the readiness probe answers “can it accept traffic right now?”. Spring Boot Actuator exposes both as dedicated endpoints when running on Kubernetes.
Enable the probe endpoints:
management:
  endpoint:
    health:
      probes:
        enabled: true
  endpoints:
    web:
      exposure:
        include: health
This publishes /actuator/health/liveness and /actuator/health/readiness. Spring Boot auto-detects Kubernetes and enables these, but the property makes it explicit. Point the probes at them:
livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
| Probe | Spring endpoint | If it fails |
|---|---|---|
| Liveness | /actuator/health/liveness | Pod is restarted |
| Readiness | /actuator/health/readiness | Pod removed from Service endpoints (no traffic) |
| Startup (optional) | /actuator/health/liveness | Holds off liveness and readiness checks until the app finishes booting |
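The optional startup probe from the table could be declared as in the sketch below. The thresholds are illustrative assumptions, not values from this guide: probing every 5 seconds with up to 30 failures gives the app as long as 150 seconds to boot before Kubernetes restarts the container.

```yaml
# Container-level fragment; sits alongside livenessProbe and readinessProbe.
startupProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  periodSeconds: 5       # assumed value: probe every 5 seconds
  failureThreshold: 30   # assumed value: 30 x 5s = up to 150s to start
```

With a startup probe in place, the liveness probe's initialDelaySeconds can usually stay small, because liveness checks do not begin until the startup probe has succeeded.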
Warning: Do not point the liveness probe at the full /actuator/health endpoint. If a downstream dependency (database, broker) goes down, the full health check fails and Kubernetes will restart your perfectly healthy Pods in a crash loop. Liveness should reflect only the process itself; let readiness reflect dependencies.
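One way to let readiness reflect a dependency is to add its health indicator to the readiness health group. A minimal sketch, assuming the app configures a DataSource so that Spring Boot registers a health indicator named db (the choice of db here is an assumption for illustration):

```yaml
# application.yml
management:
  endpoint:
    health:
      group:
        readiness:
          # readinessState is the built-in availability state;
          # db is assumed to exist because a DataSource is configured
          include: readinessState,db
```

This is a trade-off rather than a default to copy blindly: if the database is down, every replica reports not ready and the Service has no endpoints left, so include only dependencies the app genuinely cannot serve traffic without.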
Injecting configuration: ConfigMaps and Secrets
Keep configuration out of the image. A ConfigMap holds non-sensitive settings; a Secret holds credentials (base64-encoded, and ideally encrypted at rest).
apiVersion: v1
kind: ConfigMap
metadata:
  name: orders-config
data:
  SPRING_PROFILES_ACTIVE: "prod"
  APP_FEATURE_FLAG: "true"
---
apiVersion: v1
kind: Secret
metadata:
  name: orders-secret
type: Opaque
stringData:
  SPRING_DATASOURCE_PASSWORD: "s3cr3t"
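The stringData field is a write-time convenience; the cluster stores the value under data in base64. As a sketch of what that stored form looks like for the example password above:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: orders-secret
type: Opaque
data:
  # base64 of "s3cr3t"; anyone who can read the Secret can decode it
  SPRING_DATASOURCE_PASSWORD: czNjcjN0
```

Base64 is an encoding, not encryption, which is why encryption at rest and tight access control on Secrets still matter.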
Expose them to the container as environment variables. Spring Boot maps SPRING_DATASOURCE_PASSWORD to spring.datasource.password automatically via relaxed binding.
envFrom:
  - configMapRef:
      name: orders-config
  - secretRef:
      name: orders-secret
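To make the relaxed-binding rule concrete, the sketch below shows roughly what the injected environment variables amount to if written as application.yml. It is for illustration only; the point of envFrom is that these values never need to live in a file inside the image.

```yaml
# Illustration only: property names the environment variables bind to.
spring:
  profiles:
    active: prod          # from SPRING_PROFILES_ACTIVE
  datasource:
    password: s3cr3t      # from SPRING_DATASOURCE_PASSWORD
app:
  feature:
    flag: true            # from APP_FEATURE_FLAG
```

Each underscore in the variable name stands for a dot, so APP_FEATURE_FLAG binds to app.feature.flag. A property spelled app.feature-flag would need the variable APP_FEATUREFLAG instead, because dashes are dropped rather than turned into underscores.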
For stronger secret handling than base64-encoded Secrets, see Secrets & Vault.
Resource requests and limits
Tell the scheduler how much CPU and memory the Pod needs (requests) and the ceiling it may not exceed (limits). Without limits a single Pod can starve its neighbours.
resources:
  requests:
    cpu: "250m"
    memory: "512Mi"
  limits:
    cpu: "1"
    memory: "768Mi"
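Tip: A modern JVM is container-aware and sizes the heap from the cgroup memory limit rather than from host RAM; by default the maximum heap is about a quarter of that limit unless you set -XX:MaxRAMPercentage. Set the memory limit high enough to cover heap plus metaspace, thread stacks, and off-heap buffers, so the kernel OOM killer does not terminate the JVM unexpectedly.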
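If the default heap fraction is too conservative for the limit above, one option is to pass a JVM flag through the JAVA_TOOL_OPTIONS environment variable, which the JVM reads at startup. A minimal sketch; the 75% figure is an assumption chosen for illustration and should be tuned against the app's actual non-heap usage:

```yaml
# Container-level fragment, next to resources.
env:
  - name: JAVA_TOOL_OPTIONS
    # assumed value: 75% of the 768Mi limit = 576Mi maximum heap,
    # leaving 192Mi for metaspace, thread stacks, and off-heap memory
    value: "-XX:MaxRAMPercentage=75.0"
```

Expressing the heap as a percentage rather than a fixed -Xmx keeps it in step with the memory limit if that limit is changed later.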
Aligning graceful shutdown
When Kubernetes scales down or rolls out a new version, it sends SIGTERM and waits for the Pod’s terminationGracePeriodSeconds before sending SIGKILL. Spring Boot’s graceful shutdown lets in-flight requests finish first.
# application.yml
server:
  shutdown: graceful
spring:
  lifecycle:
    timeout-per-shutdown-phase: 25s

# Pod template spec in the Deployment manifest
spec:
  terminationGracePeriodSeconds: 30 # must exceed the Spring shutdown timeout
The sequence on a rolling update:
1. Pod marked Terminating; its removal from Service endpoints begins
2. Kubernetes sends SIGTERM (in parallel with step 1, so a few requests may still arrive briefly)
3. Spring stops accepting new requests, lets in-flight ones complete (<= 25s)
4. JVM exits cleanly before the 30s grace period ends -> no SIGKILL
Keep terminationGracePeriodSeconds strictly larger than timeout-per-shutdown-phase, or Kubernetes will hard-kill the Pod mid-request.
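Because endpoint removal propagates asynchronously, a common mitigation is a short preStop sleep that delays SIGTERM until traffic has stopped arriving. The sketch below is one way to do that under stated assumptions: the 5-second sleep is an illustrative value, and the command only works if the image ships a shell.

```yaml
# Pod template fragment.
spec:
  # assumed value: the countdown includes the preStop hook,
  # so it must exceed 5s sleep + 25s Spring shutdown timeout
  terminationGracePeriodSeconds: 40
  containers:
    - name: orders
      lifecycle:
        preStop:
          exec:
            command: ["sh", "-c", "sleep 5"]   # assumes the image ships a shell
```

The grace period starts when the Pod is marked Terminating, not when SIGTERM is delivered, so any preStop delay has to be budgeted on top of the Spring shutdown timeout.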
Best Practices
- Use the readiness probe for dependencies; keep the liveness probe limited to the process to avoid restart storms.
- Externalize all config via ConfigMaps and Secrets — never bake them into the image.
- Set both resource requests and limits; let the JVM size its heap from the container limit.
- Enable server.shutdown: graceful and make the grace period longer than the shutdown timeout.
- Run multiple replicas behind a Service so rolling updates cause zero downtime.
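For the last point, the Deployment's update strategy controls how many replicas may be missing during a rollout. A minimal sketch, with values that are assumptions for illustration rather than settings from this guide:

```yaml
# Deployment spec fragment.
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # assumed value: never drop below 3 ready replicas
      maxSurge: 1         # assumed value: add one new Pod at a time
```

With maxUnavailable set to 0, an old Pod is only terminated after its replacement passes the readiness probe, which ties zero-downtime rollouts back to the probe configuration earlier in this page.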