Node.js and microservices 5 min read

Service Discovery

In a microservices system, instances come and go constantly. Containers are rescheduled, autoscalers add and remove replicas, and IP addresses change on every deploy. Hard-coding hostnames quickly breaks. Service discovery solves this by letting a service ask “where is the payments service right now?” and get back a current, healthy set of network locations. This page covers the two main discovery models, registries like Consul and etcd, health checking, and how DNS-based discovery works natively in Kubernetes.

Why discovery is needed

A static configuration assumes endpoints never move. But a typical orders service might run across five pods spread over three nodes, each with a private, ephemeral IP. When you scale to ten pods or roll out a new version, that list changes within seconds. Service discovery decouples the logical name of a service from its physical instances, and keeps the mapping fresh as the topology shifts.

The core building block is a service registry: a database of (service name → list of healthy instances). Instances register themselves on startup, deregister on shutdown, and prove they are alive through health checks. Callers query the registry to resolve a name into an address.
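To make the registry idea concrete, here is a minimal in-memory sketch of the (service name → healthy instances) mapping. It is not how Consul or etcd are implemented; ServiceRegistry, ttlMs, and the heartbeat-by-re-registering convention are assumptions chosen for illustration, and the clock is injectable only to make expiry easy to exercise.

```javascript
// Minimal in-memory registry: service name -> Map(instance id -> record).
// Illustrative only; ServiceRegistry and ttlMs are hypothetical names.
class ServiceRegistry {
  constructor({ ttlMs = 15_000, now = Date.now } = {}) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock
    this.services = new Map();
  }

  // Called by an instance on startup, and again as a periodic heartbeat.
  register(name, id, address, port) {
    if (!this.services.has(name)) this.services.set(name, new Map());
    this.services.get(name).set(id, { address, port, lastSeen: this.now() });
  }

  // Called by an instance on graceful shutdown.
  deregister(name, id) {
    this.services.get(name)?.delete(id);
  }

  // Called by clients: returns only instances whose heartbeat is still fresh.
  resolve(name) {
    const cutoff = this.now() - this.ttlMs;
    const instances = this.services.get(name) ?? new Map();
    return [...instances.values()]
      .filter((inst) => inst.lastSeen >= cutoff)
      .map(({ address, port }) => ({ address, port }));
  }
}
```

An instance that crashes without calling deregister simply stops refreshing lastSeen and drops out of resolve once ttlMs has elapsed, which is the same role health checks play in a real registry.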

Client-side vs server-side discovery

There are two architectural patterns for how a caller turns a service name into a connection.

In client-side discovery, the calling service queries the registry directly, gets the full list of instances, and picks one itself (round-robin, least-connections, etc.). The client owns load balancing. This is efficient — no extra network hop — but every client must embed discovery and balancing logic.

In server-side discovery, the client sends the request to a fixed endpoint (a load balancer or gateway), and that intermediary consults the registry and forwards the request. The client stays simple; the routing logic lives in one place. Kubernetes Services and most API gateways work this way.

Aspect               | Client-side         | Server-side
Load-balancing logic | In each client      | In LB / gateway
Network hops         | One (direct)        | Two (via LB)
Client complexity    | Higher              | Lower
Example              | Consul + client lib | Kubernetes Service, API gateway
Failure isolation    | Per client          | Centralized
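Because the client owns load balancing in client-side discovery, the selection strategy is ordinary application code. As one possible illustration, the sketch below implements least-connections selection over a list of { address, port } objects. LeastConnections is a hypothetical helper written for this page, not part of Consul or any client library.

```javascript
// Tracks in-flight requests per instance and picks the least busy one.
// LeastConnections is a hypothetical helper, not part of any library.
class LeastConnections {
  constructor() {
    this.inFlight = new Map(); // "address:port" -> number of open requests
  }

  // Number of requests currently running against an instance.
  count(inst) {
    return this.inFlight.get(`${inst.address}:${inst.port}`) ?? 0;
  }

  // instances: [{ address, port }], e.g. the list returned by the registry.
  // Ties go to the instance that appears first in the list.
  pick(instances) {
    if (instances.length === 0) throw new Error("no instances available");
    let best = instances[0];
    for (const inst of instances) {
      if (this.count(inst) < this.count(best)) best = inst;
    }
    return best;
  }

  // Runs fn(instance) while counting the request as in-flight.
  async run(instances, fn) {
    const inst = this.pick(instances);
    const key = `${inst.address}:${inst.port}`;
    this.inFlight.set(key, this.count(inst) + 1);
    try {
      return await fn(inst);
    } finally {
      this.inFlight.set(key, this.count(inst) - 1);
    }
  }
}
```

With server-side discovery this logic would live in the load balancer or gateway instead, which is the "client complexity" trade-off in the table above.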

Service registries: Consul and etcd

Consul (HashiCorp) and etcd (CNCF) are the two most common registries. Consul is purpose-built for discovery with HTTP/DNS interfaces and built-in health checking. etcd is a strongly consistent key-value store (it backs Kubernetes itself) often used for discovery via watches.

Registering an instance with Consul is a single HTTP call. Here is a small helper that registers on boot and deregisters on SIGTERM.

// registry.js — ES module
import { hostname } from "node:os";
import { randomUUID } from "node:crypto";

const CONSUL = process.env.CONSUL_URL ?? "http://localhost:8500";

export async function register({ name, port }) {
  const id = `${name}-${hostname()}-${randomUUID().slice(0, 8)}`;
  const body = {
    ID: id,
    Name: name,
    Address: hostname(),
    Port: port,
    Check: {
      HTTP: `http://${hostname()}:${port}/health`,
      Interval: "10s",
      Timeout: "2s",
      DeregisterCriticalServiceAfter: "1m",
    },
  };

  const res = await fetch(`${CONSUL}/v1/agent/service/register`, {
    method: "PUT",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`register failed: ${res.status}`);

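  const deregister = () =>
    fetch(`${CONSUL}/v1/agent/service/deregister/${encodeURIComponent(id)}`, {
      method: "PUT",
    });
  // Deregister on SIGTERM, and exit even if the registry call fails.
  process.once("SIGTERM", async () => {
    try {
      await deregister();
    } catch (err) {
      console.error("deregister failed:", err);
    } finally {
      process.exit(0);
    }
  });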

  return { id, deregister };
}

CommonJS users: replace the import lines with const { hostname } = require("node:os") and const { randomUUID } = require("node:crypto"), and export via module.exports. The node: prefix works identically in both module systems.

Resolving a name is just a query for healthy instances. Consul’s /v1/health/service/<name>?passing endpoint returns only instances passing their checks.

// resolve.js
const CONSUL = process.env.CONSUL_URL ?? "http://localhost:8500";

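export async function resolve(name) {
  const res = await fetch(
    `${CONSUL}/v1/health/service/${encodeURIComponent(name)}?passing=true`,
  );
  if (!res.ok) throw new Error(`resolve failed: ${res.status}`);
  const entries = await res.json();
  return entries.map((e) => ({
    // Service.Address is empty when the service uses its node's address.
    address: e.Service.Address || e.Node.Address,
    port: e.Service.Port,
  }));
}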

// client-side round-robin
let counter = 0;
export async function pickInstance(name) {
  const instances = await resolve(name);
  if (instances.length === 0) throw new Error(`no healthy ${name}`);
  return instances[counter++ % instances.length];
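}

// Usage, in a separate module that imports pickInstance from resolve.js
const { address, port } = await pickInstance("payments");
const res = await fetch(`http://${address}:${port}/charge`, {
  method: "POST",
  headers: { "content-type": "application/json" },
  body: JSON.stringify({ amount: 4200 }),
});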
console.log("status", res.status);

Output:

status 200
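As written, pickInstance queries the registry on every call. One way to reduce that load is a short-lived cache in front of the resolver. The sketch below wraps any async resolver function; cachedResolver, ttlMs, and the choice to serve the last known list when the registry is unreachable are assumptions made for illustration, and it refreshes lazily on expiry instead of in the background.

```javascript
// Wraps any async resolver (name -> instances) with a short-lived cache.
// cachedResolver and ttlMs are hypothetical names for this sketch.
function cachedResolver(resolveFn, { ttlMs = 5_000, now = Date.now } = {}) {
  const cache = new Map(); // name -> { instances, expires, pending }

  return async function resolveCached(name) {
    const entry = cache.get(name);
    if (entry && entry.expires > now()) return entry.instances;
    // Share one in-flight lookup between concurrent callers.
    if (entry?.pending) return entry.pending;

    const pending = resolveFn(name)
      .then((instances) => {
        cache.set(name, { instances, expires: now() + ttlMs });
        return instances;
      })
      .catch((err) => {
        // Registry failure: clear the pending marker so the next call retries,
        // and fall back to the last known list if there is one.
        cache.set(name, { ...entry, pending: undefined });
        if (entry?.instances) return entry.instances;
        throw err;
      });
    cache.set(name, { ...entry, pending });
    return pending;
  };
}
```

It could be used as const resolveCached = cachedResolver(resolve), with pickInstance calling resolveCached instead of resolve. Serving a stale list during a registry outage trades freshness for availability, which may or may not suit a given service.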

Health checks

A registry is only as good as its health information. An instance that crashed but never deregistered is a zombie that will sink requests into a black hole. Health checks let the registry evict such instances automatically.

Expose a lightweight /health endpoint that verifies the instance can actually do work — for example, that its database pool is reachable — rather than just returning 200 unconditionally.

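import { createServer } from "node:http";
// `db` is assumed to be an already-initialized database client
// (any object with an async query method), created elsewhere.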

const server = createServer(async (req, res) => {
  if (req.url === "/health") {
    try {
      await db.query("SELECT 1");
      res.writeHead(200).end("ok");
    } catch {
      res.writeHead(503).end("db unavailable");
    }
    return;
  }
  // ... normal request handling
  res.writeHead(404).end();
});

server.listen(3000, () => console.log("listening on :3000"));

Consul polls this endpoint every Interval; after the instance is critical for DeregisterCriticalServiceAfter, Consul removes it entirely.
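A dependency check that hangs can make the health endpoint itself exceed the registry's Timeout. One way to guard against that is to bound the check with a deadline of its own. The sketch below is a possible shape for this; passesWithin, healthHandler, and the 1500 ms default (picked to sit under the 2s Timeout used earlier) are assumptions, not part of Consul or Node.js.

```javascript
// Resolves to true if check() settles successfully within timeoutMs,
// and to false if it rejects, throws, or takes too long.
// passesWithin and healthHandler are hypothetical helper names.
async function passesWithin(check, timeoutMs) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("health check timed out")), timeoutMs);
  });
  try {
    await Promise.race([check(), timeout]);
    return true;
  } catch {
    return false;
  } finally {
    clearTimeout(timer);
  }
}

// Builds a /health request handler. checkDependency is any async probe,
// for example () => db.query("SELECT 1").
function healthHandler(checkDependency, timeoutMs = 1500) {
  return async (req, res) => {
    const healthy = await passesWithin(checkDependency, timeoutMs);
    res.writeHead(healthy ? 200 : 503).end(healthy ? "ok" : "unhealthy");
  };
}
```

Inside the server above, the /health branch could then delegate to a handler created once at startup, so a stalled database connection produces a prompt 503 instead of a check timeout.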

DNS-based discovery in Kubernetes

Kubernetes provides server-side discovery out of the box, so you usually do not run a separate registry. Every Service object gets a stable virtual IP and a DNS name of the form <service>.<namespace>.svc.cluster.local. The cluster’s DNS (CoreDNS) resolves that name, and kube-proxy load-balances across the healthy pods behind it.

// Inside the cluster, just use the Service DNS name — no registry client needed.
const res = await fetch("http://payments.default.svc.cluster.local/charge", {
  method: "POST",
  body: JSON.stringify({ amount: 4200 }),
});

Within the same namespace you can shorten it to http://payments. Readiness probes are the Kubernetes equivalent of health checks: a pod that fails its readiness probe is removed from the Service’s endpoint list, so kube-proxy stops routing traffic to it. The DNS name itself keeps resolving to the same virtual IP.

$ kubectl get endpoints payments
NAME       ENDPOINTS                                   AGE
payments   10.1.2.3:8080,10.1.2.7:8080,10.1.4.5:8080   6d

For client-side load balancing in Kubernetes you can use a headless Service (clusterIP: None), which makes DNS return all pod IPs instead of a single virtual IP — useful for gRPC, where a single connection would otherwise pin to one pod.
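The headless-Service approach can be sketched as a round-robin picker over whatever addresses the DNS lookup returns. The lookup function is passed in so the sketch stays independent of any particular resolver; createPodPicker and the payments-headless name used below are hypothetical.

```javascript
// Round-robin over the pod IPs that a headless Service's DNS name resolves to.
// lookup is any async (hostname) => string[]; createPodPicker is a hypothetical name.
function createPodPicker(hostname, port, lookup) {
  let counter = 0;
  return async function nextPod() {
    const ips = await lookup(hostname); // one address per ready pod
    if (ips.length === 0) throw new Error(`no pods behind ${hostname}`);
    // Sort so rotation is predictable even if the DNS answer order changes.
    const sorted = [...ips].sort();
    const address = sorted[counter++ % sorted.length];
    return { address, port };
  };
}
```

In Node.js, lookup could be resolve4 from node:dns/promises, which resolves to an array of IPv4 address strings, for example createPodPicker("payments-headless.default.svc.cluster.local", 8080, resolve4). This version performs a DNS query on every call; a real client would likely cache the answer for a few seconds.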

Best Practices

  • Deregister cleanly on SIGTERM and also set a DeregisterCriticalServiceAfter so crashed instances cannot linger as zombies.
  • Make health checks meaningful — verify downstream dependencies, but keep them fast and side-effect free.
  • Cache resolution results briefly (a few seconds) and refresh in the background to avoid hammering the registry on every request.
  • Prefer the platform’s native discovery (Kubernetes DNS) before adding a standalone registry; fewer moving parts means fewer failure modes.
  • Use headless Services or a client-side balancer for long-lived connections like gRPC and HTTP/2, where virtual-IP balancing breaks down because it is applied per connection, not per request.
  • Treat the registry as a critical, replicated component — run Consul or etcd as a quorum cluster, never a single node.
Last updated June 14, 2026