glob & file Loaders

Astro’s Content Layer ships with two built-in loaders that cover the vast majority of local content needs: glob() and file(). The glob() loader creates one entry for every file matching a pattern (perfect for a folder of Markdown blog posts), while file() reads a single file containing many entries (perfect for a JSON catalog). Both are imported from astro/loaders and wired into a collection’s loader property, giving you type-safe, schema-validated data without writing a custom loader.

When to reach for each loader

The two loaders answer different shapes of the same question — “where does this collection’s data live?” Choose based on how your content is physically laid out on disk.

Loader	Source shape	One entry per…	Typical use
`glob()`	Many files matching a glob pattern	File	Blog posts, docs pages, MDX articles
`file()`	A single structured data file	Array element / key	Authors, products, i18n strings, config data

Both loaders are part of the Content Layer API introduced in Astro 5. They replace the legacy “content folder is magic” behavior — collections are now explicit and can live anywhere in your project, not just under src/content/.

Using the glob() loader

glob() scans a directory for files matching a pattern and produces one collection entry per file. It handles Markdown (.md, .markdown), JSON, YAML (.yaml/.yml), and TOML files out of the box; MDX and Markdoc files work once the matching integration (for example @astrojs/mdx) is installed. You configure it in src/content.config.ts by passing a pattern (a glob string or an array of glob strings) and a base directory to resolve it against.

// src/content.config.ts
import { defineCollection, z } from 'astro:content';
import { glob } from 'astro/loaders';

const blog = defineCollection({
  loader: glob({ pattern: '**/*.{md,mdx}', base: './src/data/blog' }),
  schema: z.object({
    title: z.string(),
    pubDate: z.coerce.date(),
    draft: z.boolean().default(false),
    tags: z.array(z.string()).default([]),
  }),
});

export const collections = { blog };

Each Markdown file’s frontmatter is parsed and validated against the schema, and the body becomes renderable content. The entry id is derived from the file path relative to base, minus the extension, so src/data/blog/hello-world.md becomes the id hello-world. If the frontmatter contains a slug field, that value is used as the id instead.

Querying and rendering glob entries

Because glob() Markdown entries carry a body, you get a render() helper for turning them into a <Content /> component.

---
// src/pages/blog/[...slug].astro
import { getCollection, render } from 'astro:content';

export async function getStaticPaths() {
  const posts = await getCollection('blog', ({ data }) => !data.draft);
  return posts.map((post) => ({
    params: { slug: post.id },
    props: { post },
  }));
}

const { post } = Astro.props;
const { Content } = await render(post);
---

<article>
  <h1>{post.data.title}</h1>
  <time datetime={post.data.pubDate.toISOString()}>
    {post.data.pubDate.toLocaleDateString()}
  </time>
  <Content />
</article>

This page renders fully static HTML with zero client-side JavaScript — the Markdown is compiled at build time and shipped as plain HTML.

Customizing entry ids

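Pass generateId to override the default id derivation, which is handy for clean URLs that strip date prefixes. A custom function replaces the default logic entirely, so it has to strip the file extension and honor a frontmatter slug itself if you want to keep that behavior.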

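loader: glob({
  pattern: '**/*.md',
  base: './src/data/blog',
  // `entry` is the file path relative to `base`, extension included,
  // so strip the extension as well as any leading date prefix.
  generateId: ({ entry, data }) =>
    (data.slug as string | undefined) ??
    entry.replace(/\.md$/, '').replace(/^\d{4}-\d{2}-\d{2}-/, ''),
}),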
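The id logic can also live in a small standalone function, which makes it easy to unit test outside Astro. The following is a minimal sketch rather than Astro's own implementation: cleanId and GenerateIdInput are hypothetical names, and it assumes entry is the file path relative to base (extension included) and that posts may carry an optional slug in their frontmatter.

```typescript
// Reduced shape of the argument passed to generateId; only the two fields
// this sketch uses are declared (assumption: `entry` is relative to `base`).
interface GenerateIdInput {
  entry: string;
  data: Record<string, unknown>;
}

// Hypothetical helper: prefer a frontmatter slug, otherwise clean up the path.
function cleanId({ entry, data }: GenerateIdInput): string {
  if (typeof data.slug === 'string' && data.slug.length > 0) {
    return data.slug;
  }
  return entry
    .replace(/\.[^./]+$/, '') // drop the file extension
    .split('/')
    // drop a leading YYYY-MM-DD- prefix from every path segment
    .map((segment) => segment.replace(/^\d{4}-\d{2}-\d{2}-/, ''))
    .join('/');
}
```

Passing it as generateId: cleanId in the glob() options should then give a file such as 2024-01-15-hello-world.md the id hello-world, while nested folders keep their directory part.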

Using the file() loader

file() reads exactly one file and emits multiple entries from it. By default it expects a JSON array of objects (each needing a unique id field) or a JSON object keyed by id. It also reads YAML out of the box; for other formats such as CSV, supply a custom parser function.

// src/content.config.ts
import { defineCollection, z } from 'astro:content';
import { file } from 'astro/loaders';

const authors = defineCollection({
  loader: file('src/data/authors.json'),
  schema: z.object({
    id: z.string(),
    name: z.string(),
    twitter: z.string().url().optional(),
  }),
});

export const collections = { authors };

// src/data/authors.json
[
  { "id": "ada",    "name": "Ada Lovelace", "twitter": "https://x.com/ada" },
  { "id": "linus",  "name": "Linus Torvalds" }
]

If your JSON nests the array under a property, point at it with the parser option:

loader: file('src/data/db.json', {
  parser: (text) => JSON.parse(text).people,
}),
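The parser option also covers formats that need custom handling. As a rough sketch, assuming a simple CSV file with a header row that includes an id column, a hand-rolled parser could look like the following. parseSimpleCsv is a hypothetical helper, and it is deliberately naive: it does not support quoted fields, embedded commas, or multi-line values, so a dedicated CSV library is the safer choice for real data.

```typescript
// Hypothetical, deliberately naive CSV parser: assumes a header row, comma
// separators, and no quoted fields, embedded commas, or embedded newlines.
function parseSimpleCsv(text: string): Record<string, string>[] {
  const lines = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  const [headerLine, ...rowLines] = lines;
  if (headerLine === undefined) return []; // empty file, no entries

  const headers = headerLine.split(',').map((header) => header.trim());

  // Turn each remaining line into an object keyed by the header names.
  return rowLines.map((line) => {
    const cells = line.split(',').map((cell) => cell.trim());
    const row: Record<string, string> = {};
    headers.forEach((header, index) => {
      row[header] = cells[index] ?? ''; // missing trailing cells become ''
    });
    return row;
  });
}
```

It would be wired in the same way as the JSON example, for instance file('src/data/products.csv', { parser: parseSimpleCsv }), where the file path is only an illustration. Note that every value arrives as a string, so the schema would need z.coerce for numeric or date columns.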

You then query file() collections exactly like any other — getEntry('authors', 'ada') or getCollection('authors'). Since these entries have no Markdown body, there is nothing to render(); you read fields straight off entry.data.

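For example, getEntry('authors', 'ada') resolves to an entry shaped roughly like this (other fields, such as collection, are omitted):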

{ id: 'ada', data: { id: 'ada', name: 'Ada Lovelace', twitter: 'https://x.com/ada' } }

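Gotcha: every object the file() loader returns must have a unique id (or you must use the object-keyed form). Do not count on the build failing loudly when this goes wrong: depending on the Astro version, an item without an id is reported in the log and skipped, and a duplicate id overwrites the earlier entry with a warning. Check the build output after editing the data file.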

Referencing across collections

A common pattern is linking glob() posts to file() authors with reference(), keeping data normalized and type-checked.

import { defineCollection, reference, z } from 'astro:content';
import { glob } from 'astro/loaders';

const blog = defineCollection({
  loader: glob({ pattern: '**/*.md', base: './src/data/blog' }),
  schema: z.object({
    title: z.string(),
    // Frontmatter holds an author id (e.g. "ada") from the authors collection.
    author: reference('authors'),
  }),
});

At query time, resolve the reference with getEntry(post.data.author) to fetch the full author record.
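To resolve authors for a whole list of posts, the same lookup can be run in parallel. The sketch below is illustrative only: the interfaces are pared-down stand-ins for the types Astro generates, withAuthors is a hypothetical helper, and the lookup parameter stands in for getEntry from astro:content so that the code runs outside an Astro project. It assumes a validated reference is stored as an object holding the collection name and the entry id.

```typescript
// Pared-down stand-ins for the generated collection types (assumptions).
interface AuthorRef { collection: 'authors'; id: string }
interface Author { id: string; data: { name: string } }
interface Post { id: string; data: { title: string; author: AuthorRef } }

// `lookup` plays the role of getEntry: it turns a reference into the full
// entry, or undefined when no entry with that id exists.
async function withAuthors(
  posts: Post[],
  lookup: (ref: AuthorRef) => Promise<Author | undefined>,
): Promise<{ post: Post; author: Author | undefined }[]> {
  // Resolve all references concurrently instead of one after another.
  return Promise.all(
    posts.map(async (post) => ({
      post,
      author: await lookup(post.data.author),
    })),
  );
}
```

Inside an Astro page, a thin wrapper such as (ref) => getEntry(ref) could be passed as lookup, after which each item exposes author?.data.name next to its post.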

Best Practices

Keep one collection per content shape — a glob() for prose, a file() for structured lookup tables.
Always define a Zod schema; it turns malformed frontmatter and bad JSON into build-time errors instead of runtime surprises.
Use a stable base directory and let glob() derive ids rather than hand-maintaining slugs, unless you need custom URLs via generateId.
Filter drafts in getCollection() callbacks, not in templates, so unpublished content never reaches the build output.
Prefer reference() over duplicating author or category data across entries to keep your content DRY and type-safe.
Co-locate content with your app (anywhere under the project root) — the Content Layer no longer requires the src/content/ directory.