Kubernetes Deployment

Kubernetes (k8s) orchestrates containers: it schedules them, restarts crashed instances, scales replicas, and routes traffic. To run a Spring Boot app well on Kubernetes you need a container image (see Dockerizing) plus manifests that tell the cluster how to check health, inject configuration, bound resources, and shut down cleanly. Spring Boot 3 has dedicated Kubernetes support that lines up with each of these concerns.

A Deployment and a Service

A Deployment manages a set of identical Pods (your app replicas). A Service gives them a stable virtual IP and load-balances across them.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders
  labels:
    app: orders
spec:
  replicas: 3
  selector:
    matchLabels:
      app: orders
  template:
    metadata:
      labels:
        app: orders
    spec:
      containers:
        - name: orders
          image: registry.example.com/orders:1.4.0
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: orders
spec:
  selector:
    app: orders
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP

Wiring probes to Actuator health groups

Kubernetes uses two main probes with distinct meanings: the liveness probe answers “is this process broken and in need of a restart?”, and the readiness probe answers “can it accept traffic right now?”. An optional startup probe holds off both while a slow application boots. Spring Boot Actuator exposes liveness and readiness as dedicated endpoints when running on Kubernetes.

Enable the probe endpoints:

management:
  endpoint:
    health:
      probes:
        enabled: true
  endpoints:
    web:
      exposure:
        include: health

This publishes /actuator/health/liveness and /actuator/health/readiness. Spring Boot auto-detects Kubernetes and enables these, but the property makes it explicit. Point the probes at them:

          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5

Probe	Spring endpoint	Effect
Liveness	`/actuator/health/liveness`	On failure, the container is restarted
Readiness	`/actuator/health/readiness`	On failure, the Pod is removed from Service endpoints (no traffic)
Startup (optional)	`/actuator/health/liveness`	Holds off liveness and readiness checks until the app finishes booting
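
The optional startup probe from the table can be sketched as follows. This is a minimal fragment that sits alongside livenessProbe and readinessProbe in the container spec; the periodSeconds and failureThreshold values are assumptions chosen for illustration and should be tuned to the application's real boot time.

```yaml
          startupProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            periodSeconds: 5
            failureThreshold: 30   # up to 30 x 5s = 150s to finish booting (assumed budget)
```

With a startup probe in place, the liveness probe's initialDelaySeconds can usually be dropped, because liveness checks do not begin until the startup probe succeeds.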

Warning: Do not point the liveness probe at the full /actuator/health endpoint. If a downstream dependency (database, broker) goes down, the full health check fails and Kubernetes will restart your perfectly healthy Pods in a crash loop. Liveness should reflect only the process itself; let readiness reflect dependencies.
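
One way to let readiness reflect a dependency is to add that dependency's health indicator to the readiness group. The sketch below assumes the app has a DataSource, so that Actuator registers an indicator named db; substitute the indicators your application actually exposes.

```yaml
management:
  endpoint:
    health:
      group:
        readiness:
          include: "readinessState,db"   # db is assumed: present only when a DataSource is configured
```

The liveness group is left at its default, so a database outage takes the Pod out of rotation without triggering restarts.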

Injecting configuration: ConfigMaps and Secrets

Keep configuration out of the image. A ConfigMap holds non-sensitive settings; a Secret holds credentials (stored base64-encoded, which is an encoding and not encryption, so ideally also encrypted at rest).

apiVersion: v1
kind: ConfigMap
metadata:
  name: orders-config
data:
  SPRING_PROFILES_ACTIVE: "prod"
  APP_FEATURE_FLAG: "true"
---
apiVersion: v1
kind: Secret
metadata:
  name: orders-secret
type: Opaque
stringData:
  SPRING_DATASOURCE_PASSWORD: "s3cr3t"

Expose them to the container as environment variables. Spring Boot maps SPRING_DATASOURCE_PASSWORD to spring.datasource.password automatically via relaxed binding.

          envFrom:
            - configMapRef:
                name: orders-config
            - secretRef:
                name: orders-secret
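
To make the relaxed-binding mapping concrete, the sketch below shows the application.yaml properties that the injected variables correspond to. The app.feature.flag property name is an assumption inferred from the variable name; it is not defined anywhere in this guide.

```yaml
# Property equivalents of the environment variables injected above.
# Shown for illustration only; the values still come from the ConfigMap and Secret.
spring:
  profiles:
    active: prod                   # SPRING_PROFILES_ACTIVE
  datasource:
    password: "<from the Secret>"  # SPRING_DATASOURCE_PASSWORD (placeholder, never commit the real value)
app:
  feature:
    flag: true                     # APP_FEATURE_FLAG (hypothetical property)
```

Each underscore becomes a dot, so a property written with a dash, such as app.feature-flag, would need the variable APP_FEATUREFLAG instead.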

For stronger secret handling than base64-encoded Kubernetes Secrets, see Secrets & Vault.

Resource requests and limits

Tell the scheduler how much CPU and memory the Pod needs (requests) and the ceiling it may not exceed (limits). Without limits a single Pod can starve its neighbours.

          resources:
            requests:
              cpu: "250m"
              memory: "512Mi"
            limits:
              cpu: "1"
              memory: "768Mi"

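Tip: A modern JVM is container-aware and sizes the heap from the cgroup memory limit automatically (on HotSpot the default maximum heap is typically a quarter of that limit unless you set -XX:MaxRAMPercentage or -Xmx). Set the memory limit high enough to cover heap plus metaspace, threads, and off-heap memory, so the kernel OOM killer does not terminate the JVM unexpectedly.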
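
If the JVM's default heap share is too small for the limit above, the heap fraction can be raised explicitly. The fragment below is a sketch that assumes a HotSpot-based JVM, which reads JAVA_TOOL_OPTIONS at startup; the 75% figure is an illustrative assumption, not a recommendation from this guide.

```yaml
          env:
            - name: JAVA_TOOL_OPTIONS
              value: "-XX:MaxRAMPercentage=75.0"   # heap may grow to about 576Mi of the 768Mi limit
```

The remaining quarter (about 192Mi here) has to cover metaspace, thread stacks, and off-heap buffers, so verify the split under load before relying on it.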

Aligning graceful shutdown

When Kubernetes scales down or rolls out a new version, it sends SIGTERM and waits for the Pod’s terminationGracePeriodSeconds before sending SIGKILL. Spring Boot’s graceful shutdown lets in-flight requests finish first.

server:
  shutdown: graceful
spring:
  lifecycle:
    timeout-per-shutdown-phase: 25s

    spec:
      terminationGracePeriodSeconds: 30   # must exceed the Spring shutdown timeout

The sequence on a rolling update:

1. Pod marked Terminating; removal from Service endpoints begins
2. Kubernetes sends SIGTERM (concurrently with step 1, so a few requests may still arrive briefly)
3. Spring stops accepting new requests, lets in-flight ones complete (<= 25s)
4. JVM exits cleanly before the 30s grace period ends -> no SIGKILL

Keep terminationGracePeriodSeconds strictly larger than timeout-per-shutdown-phase, or Kubernetes will hard-kill the Pod mid-request.
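
In practice endpoint removal propagates asynchronously, so a Pod can still receive a few requests after SIGTERM arrives. A common mitigation is a short preStop pause that delays SIGTERM until traffic has stopped. The fragment below is a sketch under assumptions: the 5-second sleep is an arbitrary illustrative value, and the exec form requires that the image contains sh and sleep.

```yaml
    spec:
      terminationGracePeriodSeconds: 40   # 5s preStop + 25s Spring timeout, plus headroom
      containers:
        - name: orders
          lifecycle:
            preStop:
              exec:
                command: ["sh", "-c", "sleep 5"]   # assumes the image ships a shell
```

The preStop time counts against the grace period, which is why the value is raised from 30 to 40 here.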

Best Practices

Use the readiness probe for dependencies; keep the liveness probe limited to the process to avoid restart storms.
Externalize all config via ConfigMaps and Secrets — never bake them into the image.
Set both resource requests and limits; let the JVM size its heap from the container limit.
Enable server.shutdown: graceful and make the grace period longer than the shutdown timeout.
Run multiple replicas behind a Service so rolling updates can proceed without downtime.
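
How well multiple replicas protect availability depends on how the Deployment replaces Pods. A minimal sketch of a conservative rollout strategy follows; it belongs in the Deployment spec, and the maxUnavailable and maxSurge values are assumptions for illustration that trade rollout speed for capacity.

```yaml
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never drop below the desired replica count
      maxSurge: 1         # bring up one extra Pod at a time
```

With maxUnavailable set to 0, an old Pod is only terminated after a replacement passes its readiness probe.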