Content & Data Questions

Content management is one of Astro’s strongest selling points, so interviewers probe it heavily for content-heavy roles. They want to know whether you understand how Astro turns Markdown, MDX, and remote data into type-safe, queryable collections, and how the Content Layer API in Astro 5 generalized that beyond local files. The questions below cover collections, Zod schemas, loaders, and where data fetching actually happens in the rendering lifecycle.

What are content collections and why use them?

Content collections are Astro’s way of organizing and validating sets of related content — blog posts, docs, authors, products. Instead of scattering frontmatter parsing and import.meta.glob calls across your app, you define a collection once with a schema, and Astro gives you type-safe query functions plus build-time validation.

The headline benefits are: type safety (your editor knows every frontmatter field), schema validation (builds fail fast on bad data), and a stable query API (getCollection, getEntry) that decouples your pages from where content lives.

// src/content.config.ts
import { defineCollection, z } from 'astro:content';
import { glob } from 'astro/loaders';

const blog = defineCollection({
  loader: glob({ pattern: '**/*.md', base: './src/data/blog' }),
  schema: z.object({
    title: z.string(),
    pubDate: z.coerce.date(),
    draft: z.boolean().default(false),
    tags: z.array(z.string()).default([]),
  }),
});

export const collections = { blog };

In Astro 5 the config file moved to src/content.config.ts, and each Content Layer collection declares an explicit loader. The older src/content/config.ts with folder-based collections still works, but Astro now treats it as the legacy Content Collections API.

What is the Content Layer API and how does it differ from legacy collections?

The Content Layer API, stabilized in Astro 5, decouples collections from the filesystem. Legacy collections could only read local files (Markdown, MDX, Markdoc, JSON, YAML) inside src/content/. The Content Layer introduces loaders: functions that fetch content from anywhere (local globs, a CMS, a remote API, a database) and emit entries into a store.

Astro ships two built-in loaders, glob() and file(), and you can write custom ones. The result is the same unified getCollection API regardless of source, plus persistent caching and faster builds for large content sets.

Aspect	Legacy collections	Content Layer (Astro 5)
Config file	`src/content/config.ts`	`src/content.config.ts`
Content location	Only `src/content/`	Anywhere via loaders
Data source	Local files only	Files, APIs, CMS, DB
Loader	Implicit, folder-based	Explicit (`glob`, `file`, custom)
Caching	None	Persistent data store
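
As a rough sketch of the custom-loader idea: an inline loader can be an async function that returns an array of entries, each with a unique id. The API URL, the RawProduct shape, and the toProductEntry/loadProducts helper names below are assumptions for illustration, and the fetch function is passed in so the mapping can be exercised without a network.

```typescript
// Hypothetical shape of one record from the remote API (an assumption).
interface RawProduct {
  sku: string;
  name: string;
  price: number;
}

// Shape of the entries handed to Astro: every entry needs a unique string id.
interface ProductEntry {
  id: string;
  name: string;
  price: number;
}

// Minimal subset of fetch that this sketch relies on.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Map one API record to a collection entry, using the SKU as the entry id.
function toProductEntry(raw: RawProduct): ProductEntry {
  return { id: raw.sku, name: raw.name, price: raw.price };
}

// Fetch all products and convert them to entries; throw on a failed response.
async function loadProducts(fetchImpl: FetchLike): Promise<ProductEntry[]> {
  const res = await fetchImpl('https://api.example.com/products');
  if (!res.ok) {
    throw new Error(`Products API responded with ${res.status}`);
  }
  const raw = (await res.json()) as RawProduct[];
  return raw.map(toProductEntry);
}

// Wiring in src/content.config.ts (shown as a comment, not executed here):
//   const products = defineCollection({
//     loader: () => loadProducts(fetch),
//     schema: z.object({ name: z.string(), price: z.number() }),
//   });
```

Once wired up this way, pages would query the collection with getCollection('products') exactly as they would a file-backed one.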

How do Zod schemas validate content?

Each collection’s schema is a Zod schema, or a function that returns one when you need the image() helper. At build time Astro parses every entry’s frontmatter and runs it through the schema. If validation fails (a missing title, a string where a date was expected), the build errors with the offending file and field. This shifts content bugs to build time instead of runtime.

Schemas also transform data: z.coerce.date() turns an ISO string into a Date, and .default() fills omitted fields.

import { defineCollection, reference, z } from 'astro:content';
import { glob } from 'astro/loaders';

const authors = defineCollection({
  loader: glob({ pattern: '**/*.json', base: './src/data/authors' }),
  schema: z.object({ name: z.string(), avatar: z.string().url() }),
});

const blog = defineCollection({
  loader: glob({ pattern: '**/*.md', base: './src/data/blog' }),
  schema: ({ image }) =>
    z.object({
      title: z.string(),
      cover: image(),                 // validated, optimizable image
      author: reference('authors'),   // FK-style link to authors collection
    }),
});

export const collections = { authors, blog };

The image() helper validates the path and returns metadata Astro’s <Image /> can optimize, while reference() creates a typed link you resolve later with getEntry.
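
To make the validate-then-transform step concrete, here is a dependency-free sketch of what a schema does to one entry’s frontmatter. It is a plain-TypeScript stand-in, not Zod and not Astro’s internals; parseBlogFrontmatter and the BlogData shape are hypothetical and simply mirror the blog schema shown earlier.

```typescript
// Output type: what pages see after validation and transformation.
interface BlogData {
  title: string;
  pubDate: Date;
  draft: boolean;
  tags: string[];
}

// Hypothetical helper: validate raw frontmatter and return typed data.
// Errors name the file and the field, similar to a failed build.
function parseBlogFrontmatter(file: string, raw: Record<string, unknown>): BlogData {
  // Required field: a missing or non-string title is an error.
  const title = raw.title;
  if (typeof title !== 'string') {
    throw new Error(`${file}: "title" is required and must be a string`);
  }

  // Coercion (the role z.coerce.date() plays): pass the value to new Date().
  const pubDate = new Date(raw.pubDate as string | number | Date);
  if (Number.isNaN(pubDate.getTime())) {
    throw new Error(`${file}: "pubDate" is not a valid date`);
  }

  // Defaults (the role .default() plays): only an omitted field is filled in.
  const draft = raw.draft === undefined ? false : raw.draft;
  if (typeof draft !== 'boolean') {
    throw new Error(`${file}: "draft" must be a boolean`);
  }

  const tags = raw.tags === undefined ? [] : raw.tags;
  if (!Array.isArray(tags) || !tags.every((t) => typeof t === 'string')) {
    throw new Error(`${file}: "tags" must be an array of strings`);
  }

  return { title, pubDate, draft, tags };
}

// Minimal frontmatter in, fully typed data out.
const post = parseBlogFrontmatter('hello.md', { title: 'Hello', pubDate: '2024-05-01' });
// post.pubDate is a Date, post.draft is false, post.tags is an empty array.
```

The point to make in an interview is the asymmetry: authors write minimal frontmatter, while pages receive a complete, typed object or the build stops.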

How do you query collections in a page?

You query collections inside the component script (the --- fence), which runs at build time for static pages or per-request in SSR. getCollection returns an array of entries; getEntry fetches one. Markdown body is rendered via the render() function from astro:content.

---
import { getCollection, getEntry, render } from 'astro:content';

// list page
const posts = (await getCollection('blog', ({ data }) => !data.draft))
  .sort((a, b) => b.data.pubDate.valueOf() - a.data.pubDate.valueOf());

// resolve a reference and render body
const first = posts[0];
const author = await getEntry(first.data.author);
const { Content } = await render(first);
---
<h1>{first.data.title}</h1>
<p>By {author.data.name}</p>
<Content />
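
If several pages need the same filter and sort, one option is to pull them into small helpers. The sketch below assumes a hypothetical src/lib/posts.ts module and a simplified PostEntry type standing in for the entry type that astro:content generates; isPublished and newestFirst are illustrative names.

```typescript
// Simplified stand-in for a blog entry; real entries come from astro:content.
interface PostEntry {
  id: string;
  data: { title: string; pubDate: Date; draft: boolean };
}

// Predicate for published posts, usable as the getCollection filter callback.
const isPublished = (entry: PostEntry): boolean => !entry.data.draft;

// Return a new array sorted newest first; the input array is left untouched.
function newestFirst(posts: PostEntry[]): PostEntry[] {
  return [...posts].sort(
    (a, b) => b.data.pubDate.valueOf() - a.data.pubDate.valueOf(),
  );
}

// In a page's component script (shown as a comment, not executed here):
//   const posts = newestFirst(await getCollection('blog', isPublished));
```

Copying before sorting avoids mutating an array that other code may still be reading, which is a detail interviewers sometimes probe.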

Where does data fetching happen in Astro?

This one comes up often. The component script (between the --- fences) always runs on the server: at build time for prerendered pages, or per request for pages rendered on demand (output: 'server', or prerender = false on a single page, with an adapter installed). Top-level await is supported, so you fetch directly there. None of that code ships to the browser, which is the essence of zero-JS-by-default.

---
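const res = await fetch('https://api.example.com/products');
// Fail the build (or the request, in SSR) instead of rendering bad data.
if (!res.ok) throw new Error(`Products API responded with ${res.status}`);
const products = await res.json();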
---
<ul>{products.map((p) => <li>{p.name}</li>)}</ul>

Client-side fetching happens only in code that runs in the browser: hydrated island components (client:load, etc.) or <script> tags. If the data should be in the initial HTML, fetch it in the frontmatter; if it depends on user interaction, fetch it in an island.
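
For the interactive case, a browser-side helper might look like the sketch below. The /api/search endpoint and the buildSearchUrl/searchProducts names are hypothetical, and the fetch function is injected so the helper can be tested outside a browser.

```typescript
// Minimal subset of fetch that this sketch relies on.
type FetchFn = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Build the request URL; encodeURIComponent keeps spaces and symbols
// from breaking the query string.
function buildSearchUrl(query: string): string {
  return `/api/search?q=${encodeURIComponent(query.trim())}`;
}

// Called from an island's event handler in response to user input.
async function searchProducts(query: string, fetchFn: FetchFn): Promise<unknown> {
  const res = await fetchFn(buildSearchUrl(query));
  if (!res.ok) {
    throw new Error(`Search failed with status ${res.status}`);
  }
  return res.json();
}

// Inside a hydrated component (shown as a comment, not executed here):
//   const results = await searchProducts(inputValue, fetch);
```

Nothing here runs during the build or on the server, so the results never appear in the initial HTML; that is the trade-off to name when asked which side should fetch.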

Best practices

Define one schema per collection and lean on z.coerce and .default() so frontmatter stays minimal but typed.
Prefer the Content Layer (src/content.config.ts with explicit loaders) for new projects; write custom loaders for CMS or API sources.
Use reference() for relationships between collections instead of duplicating data, then resolve with getEntry.
Filter drafts with getCollection’s filter callback and sort the result in the component script, rather than in templates, to keep pages declarative.
Fetch page data in the frontmatter (server-side) by default; reserve client-side fetching for genuinely interactive islands.
Run astro sync (or rely on astro dev) to regenerate astro:content types after schema changes.