Embedded vs Referenced

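MongoDB does not join collections implicitly the way a relational database does (the aggregation `$lookup` stage exists, but it is an explicit operation you opt into), so modeling relationships is a deliberate design choice. You either embed related data inside the parent document or reference it by id in another collection. This choice is one of the most consequential MongoDB modeling decisions, because it determines how many queries each read needs and how much data each write touches.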

The three approaches

Approach	How it is stored	Reads	Best when
Embedded	Child nested inside parent	One query, no join	Child is owned by and read with the parent
`@DBRef`	Pointer document `{ $ref, $id }`	Extra query to resolve	Shared child, managed by Spring
Manual reference	Plain id field (`String`)	Explicit lookup you control	Shared child, you want full control

Embedded documents

Embedding stores the child inline. There is no separate collection and no second query — the whole aggregate is read and written together.

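import java.math.BigDecimal;
import java.util.List;

import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;

// Aggregate root: stored as one document in the "orders" collection
@Document("orders")
public record Order(
        @Id String id,
        String customerName,
        List<LineItem> items,   // embedded array of sub-documents
        Address shippingAddress // embedded sub-document
) { }

// Value types without @Document: they exist only inside an Order
public record LineItem(String sku, String name, int quantity, BigDecimal price) { }
public record Address(String street, String city, String zip) { }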

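The stored document is fully self-contained (simplified here: by default Spring Data also writes a `_class` type-hint field, and how a BigDecimal is serialized depends on the library version and field configuration):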

{
  "_id": "65f1c2a8e4b0a1d2c3e4f567",
  "customerName": "Jane Doe",
  "items": [
    { "sku": "KB-001", "name": "Keyboard", "quantity": 1, "price": 79.99 },
    { "sku": "MO-002", "name": "Mouse", "quantity": 2, "price": 39.50 }
  ],
  "shippingAddress": { "street": "1 Main St", "city": "Austin", "zip": "78701" }
}

Tip: Embed when the child has no independent life cycle and is almost always read with its parent — order line items, an address, audit metadata. This is the idiomatic MongoDB model and usually the right default.
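
As a rough illustration of why embedding suits data that is read together, the sketch below computes an order total directly from the embedded line items. It uses plain-Java copies of the records above with the Spring annotations removed so that it runs without a database; the class name `EmbeddedOrderDemo` and the helpers `total` and `sampleOrder` are assumptions made for this example only.

```java
import java.math.BigDecimal;
import java.util.List;

final class EmbeddedOrderDemo {

    // Plain-Java copies of the records above, without Spring annotations.
    record LineItem(String sku, String name, int quantity, BigDecimal price) { }
    record Address(String street, String city, String zip) { }
    record Order(String id, String customerName, List<LineItem> items, Address shippingAddress) { }

    // Sums quantity * price over the embedded items. No second lookup is
    // needed because the items travel inside the order itself.
    static BigDecimal total(Order order) {
        return order.items().stream()
                .map(item -> item.price().multiply(BigDecimal.valueOf(item.quantity())))
                .reduce(BigDecimal.ZERO, BigDecimal::add);
    }

    // Mirrors the sample document shown above.
    static Order sampleOrder() {
        return new Order(
                "65f1c2a8e4b0a1d2c3e4f567",
                "Jane Doe",
                List.of(
                        new LineItem("KB-001", "Keyboard", 1, new BigDecimal("79.99")),
                        new LineItem("MO-002", "Mouse", 2, new BigDecimal("39.50"))),
                new Address("1 Main St", "Austin", "78701"));
    }

    public static void main(String[] args) {
        System.out.println(total(sampleOrder())); // prints 158.99
    }
}
```

With a referenced model, the same calculation would first have to fetch the line items from their own collection.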

@DBRef references

@DBRef stores a pointer to a document in another collection and lets Spring resolve it on load. Use it when the same child is shared by many parents and should not be duplicated.

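import java.util.List;

import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.DBRef;
import org.springframework.data.mongodb.core.mapping.Document;

@Document("orders")
public class Order {
    @Id
    private String id;

    @DBRef
    private Customer customer;   // stored as a pointer into the separate "customers" collection

    private List<LineItem> items; // still embedded in the order
}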

On load, Spring issues an extra query per @DBRef to fetch the referenced Customer. The stored order keeps only a reference:

{
  "_id": "65f1c2a8e4b0a1d2c3e4f567",
  "customer": { "$ref": "customers", "$id": "65f0a1b2c3d4e5f600112233" },
  "items": [ ... ]
}

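Warning: @DBRef resolves eagerly by default, adding a query for each reference, so loading a list of N orders can trigger N additional customer queries (the N+1 pattern). Add @DBRef(lazy = true) to defer the lookup until the property is first used; the field then holds a proxy rather than the real object, which can surprise code that inspects the concrete class. For hot paths, a manual reference is often faster because you decide when to look up and can batch the lookups.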

Manual references

The lightest option is to store just the id as a plain field and look the child up yourself when you need it. This avoids @DBRef overhead and keeps the query explicit.

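// customerId is a plain id field, not a @DBRef
@Document("orders")
public record Order(@Id String id, String customerId, List<LineItem> items) { }

// Resolve the reference explicitly, and only when the customer is actually needed
Order order = orderRepository.findById(orderId).orElseThrow();
Customer customer = customerRepository.findById(order.customerId()).orElseThrow();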

You decide exactly when (and whether) to resolve the reference — ideal when the customer is not always needed, or when you prefer to batch lookups with findAllById.
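
The batching idea can be sketched in plain Java. The example below is not Spring code: `InMemoryCustomers` is a hypothetical in-memory stand-in for a customer repository, and its `findAllById` method only imitates the role of the repository method of the same name. It counts calls so that the saving from batching is visible.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class BatchLookupDemo {

    record Customer(String id, String name) { }
    record Order(String id, String customerId) { }

    // Hypothetical in-memory stand-in for a customer repository.
    // It counts calls so per-order and batched lookups can be compared.
    static final class InMemoryCustomers {
        private final Map<String, Customer> byId = new HashMap<>();
        int queryCount = 0;

        void save(Customer customer) {
            byId.put(customer.id(), customer);
        }

        // Plays the role of findAllById: one call resolves many ids.
        List<Customer> findAllById(Collection<String> ids) {
            queryCount++;
            List<Customer> found = new ArrayList<>();
            for (String id : ids) {
                Customer customer = byId.get(id);
                if (customer != null) {
                    found.add(customer);
                }
            }
            return found;
        }
    }

    // Resolves the customers for a list of orders with a single batched lookup.
    static Map<String, Customer> customersFor(List<Order> orders, InMemoryCustomers customers) {
        Set<String> ids = new LinkedHashSet<>();
        for (Order order : orders) {
            ids.add(order.customerId()); // the set de-duplicates shared customers
        }
        Map<String, Customer> result = new HashMap<>();
        for (Customer customer : customers.findAllById(ids)) {
            result.put(customer.id(), customer);
        }
        return result;
    }

    public static void main(String[] args) {
        InMemoryCustomers customers = new InMemoryCustomers();
        customers.save(new Customer("c1", "Jane Doe"));
        customers.save(new Customer("c2", "John Roe"));

        List<Order> orders = List.of(
                new Order("o1", "c1"), new Order("o2", "c2"), new Order("o3", "c1"));

        Map<String, Customer> resolved = customersFor(orders, customers);
        System.out.println(resolved.get("c1").name()); // prints Jane Doe
        System.out.println(customers.queryCount);      // prints 1
    }
}
```

Three orders are resolved with one lookup here, whereas resolving each order's customer individually would take three.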

When to embed vs reference

Embed when all of these hold:

The child belongs to exactly one parent.
You read the child together with the parent.
The combined document stays well under MongoDB’s 16 MB per-document (BSON) size limit.
The child does not need to be queried independently across parents.

Reference (manual or @DBRef) when:

The child is shared by many parents (a Customer across many Orders).
The child is large or grows unbounded (avoid huge embedded arrays).
The child must be queried or updated on its own.
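
The two checklists can be condensed into a small decision helper. This is only an illustrative sketch: `ChildTraits`, its field names, and `choose` are hypothetical, and real modeling decisions usually weigh more context than four booleans.

```java
final class ModelingChecklist {

    // Hypothetical description of a child entity; field names are illustrative.
    record ChildTraits(boolean singleParent, boolean readWithParent,
                       boolean boundedSize, boolean queriedIndependently) { }

    enum Choice { EMBED, REFERENCE }

    // Embed only when every embedding condition holds; otherwise reference.
    static Choice choose(ChildTraits traits) {
        boolean embed = traits.singleParent()
                && traits.readWithParent()
                && traits.boundedSize()
                && !traits.queriedIndependently();
        return embed ? Choice.EMBED : Choice.REFERENCE;
    }

    public static void main(String[] args) {
        ChildTraits lineItem = new ChildTraits(true, true, true, false);
        ChildTraits customer = new ChildTraits(false, false, true, true);
        System.out.println(choose(lineItem)); // prints EMBED
        System.out.println(choose(customer)); // prints REFERENCE
    }
}
```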

Note: Some designs denormalize — duplicate a few fields (e.g. customerName) into the parent for fast reads while keeping a reference for the full record. This trades storage and update complexity for read speed, a common and accepted MongoDB pattern.
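
A minimal plain-Java sketch of that trade-off follows. The order keeps the `customerId` reference plus a duplicated `customerName`, and a rename has to be fanned out to every order that copied the old name. The class `DenormalizedNameDemo` and the method `renameCustomer` are hypothetical names for this example; against a real database the fan-out would be a multi-document update rather than a list copy.

```java
import java.util.ArrayList;
import java.util.List;

final class DenormalizedNameDemo {

    // The order keeps the reference (customerId) plus a duplicated display field.
    record Order(String id, String customerId, String customerName) { }

    // Returns a copy of the orders with the duplicated name refreshed for one
    // customer. This fan-out write is the cost paid for the faster read.
    static List<Order> renameCustomer(List<Order> orders, String customerId, String newName) {
        List<Order> updated = new ArrayList<>();
        for (Order order : orders) {
            updated.add(order.customerId().equals(customerId)
                    ? new Order(order.id(), order.customerId(), newName)
                    : order);
        }
        return updated;
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
                new Order("o1", "c1", "Jane Doe"),
                new Order("o2", "c2", "John Roe"),
                new Order("o3", "c1", "Jane Doe"));

        List<Order> updated = renameCustomer(orders, "c1", "Jane Smith");
        for (Order order : updated) {
            System.out.println(order.id() + " -> " + order.customerName());
        }
    }
}
```

Reads can show the customer name without touching the customers collection, but every rename now touches all of that customer's orders.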

Contrast with relational modeling

In SQL you normalize aggressively and join at read time; in MongoDB you usually embed the aggregate and accept controlled duplication. If your data is highly relational with many cross-cutting joins, that is a signal a relational store may fit better — compare with Spring Data JPA.