Using the Astro Content Layer with a GitHub repository

While migrating my content from src/content in my Astro website to a dedicated GitHub repository, I discovered the power of Astro’s new Content Layer for loading external content. This guide walks you through implementing the same approach.

Note: You need to be using Astro 5 or later for these features!

Prerequisites

Before getting started, ensure you have:

  • An Astro project (version 5+)
  • A GitHub repository to host your content
  • A GitHub API token

Step 1: Integrating the Octokit SDK

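Begin by installing the GitHub Octokit library, along with gray-matter, which the loader later in this guide uses to parse frontmatter:

npm install octokit gray-matter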

Configuring GitHub Token and Environment

Create a GitHub API token by visiting GitHub’s token settings. Then, configure the token in your Astro configuration:

// astro.config.ts
import { defineConfig, envField } from "astro/config";

export default defineConfig({
  // ...
  env: {
    // Astro 5 expects environment variables to be declared under `schema`
    schema: {
      GITHUB_TOKEN: envField.string({
        context: "server",
        access: "secret",
      }),
    },
  },
  // ...
});

Then set the variable in your .env file or, if you use the Cloudflare adapter, in .dev.vars:

GITHUB_TOKEN="<TOKEN HERE>"

Creating an Octokit Instance

Create a file to initialize the Octokit client:

// /src/lib/octokit.ts
import { GITHUB_TOKEN } from "astro:env/server";
import { Octokit } from "octokit";

export const octokit = new Octokit({
  auth: GITHUB_TOKEN,
});
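
For orientation, the getContent calls used later in this guide correspond to GitHub's REST "repository contents" endpoint. The sketch below builds that request by hand, without Octokit, just to show roughly what gets sent. The name buildContentsRequest and its option names are my own hypothetical choices, and the token is a placeholder.

```typescript
// Hypothetical helper: describes the HTTP request behind a "get repository content" call.
interface ContentsRequestOptions {
  owner: string;
  repo: string;
  path: string;
  token: string;
  // Value of a previously received Last-Modified header, if any
  lastModified?: string;
}

interface ContentsRequest {
  url: string;
  headers: Record<string, string>;
}

function buildContentsRequest(options: ContentsRequestOptions): ContentsRequest {
  // Encode each path segment separately so the slashes between folders survive
  const encodedPath = options.path
    .split("/")
    .filter((segment) => segment.length > 0)
    .map((segment) => encodeURIComponent(segment))
    .join("/");

  const headers: Record<string, string> = {
    Accept: "application/vnd.github+json",
    Authorization: `Bearer ${options.token}`,
    "X-GitHub-Api-Version": "2022-11-28",
  };

  // A conditional request lets the server answer 304 Not Modified when nothing changed
  if (options.lastModified) {
    headers["If-Modified-Since"] = options.lastModified;
  }

  return {
    url: `https://api.github.com/repos/${options.owner}/${options.repo}/contents/${encodedPath}`,
    headers,
  };
}

// Example: the request for the reviews/books folder of the content repository
const request = buildContentsRequest({
  owner: "guilhermohounie",
  repo: "metadata",
  path: "reviews/books",
  token: "<TOKEN HERE>",
});
```

Octokit takes care of all of this for you; the point is only that the token ends up in the Authorization header, which is why it has to stay on the server.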

Content Repository Structure

My content repository (named metadata) is organized as follows:

.
├── archives
│   └── *.md
├── dictionary
│   └── *.md
├── logs
│   └── *.md
├── reviews
│   ├── books
│   │   └── *.md
│   ├── games
│   │   └── *.md
│   ├── movies
│   │   └── *.md
│   ├── music
│   │   └── *.md
│   └── series
│       └── *.md
└── writings
    └── *.md
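
Each folder that directly contains Markdown files becomes one collection later on. As a small sketch, assuming you want to name a collection after the last segment of its folder path (the way reviews/books becomes books further down), the mapping could be derived like this. The helper collectionNameFor is hypothetical and not part of Astro.

```typescript
// Folder paths copied from the tree above
const contentFolders = [
  "archives",
  "dictionary",
  "logs",
  "reviews/books",
  "reviews/games",
  "reviews/movies",
  "reviews/music",
  "reviews/series",
  "writings",
];

// Hypothetical convention: the last path segment is the collection name
function collectionNameFor(folder: string): string {
  const segments = folder.split("/").filter((segment) => segment.length > 0);
  return segments.length > 0 ? segments[segments.length - 1] : folder;
}

// Pair every collection name with the folder it is loaded from
const collectionFolders: Record<string, string> = {};
for (const folder of contentFolders) {
  collectionFolders[collectionNameFor(folder)] = folder;
}
```

This only works while the last segments are unique; two folders ending in the same name would overwrite each other in the map.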

Implementing the Custom Content Loader

The Astro Content Loader is a powerful feature that lets you load content from various sources, both local and remote. For the full picture, refer to the official Astro Content Loader documentation. In short, as the documentation puts it:

Astro’s Content Loader API allows you to load your data from any source, local or remote, and interact with Astro’s content layer to manage your content collections.

Create a generic loader in src/loaders/content.ts that works with the structure given above and fetches data from your repository:

import { octokit } from "@/lib/octokit";
import type { Loader } from "astro/loaders";
import { z } from "astro/zod";
import matter from "gray-matter";

export function githubContentLoader<T extends z.ZodTypeAny>({
  owner,
  repo,
  folder,
  schema,
}: {
  owner: string;
  repo: string;
  folder: string;
  schema: T;
}): Loader {
  return {
    name: `${folder}-loader`,
    schema,
    load: async ({ store, logger, parseData, generateDigest, meta }) => {
      logger.info(`Loading ${folder}`);

      // Get the last modified date from metadata
      const lastModified = meta.get("last-modified");

      try {
        // Fetch the content of the folder from GitHub
        const response = await octokit.rest.repos.getContent({
          owner,
          repo,
          path: `/${folder}`,
          // Only send the conditional header when a previous value exists
          headers: lastModified ? { "If-Modified-Since": lastModified } : {},
        });

        // If content hasn't changed since last fetch, return early
        if (
          lastModified &&
          response.headers["last-modified"] === lastModified
        ) {
          logger.info(`No changes in ${folder}`);
          return;
        }

        // Update the last-modified metadata when GitHub returns the header
        const newLastModified = response.headers["last-modified"];
        if (newLastModified) {
          meta.set("last-modified", newLastModified);
        }

        // Ensure response.data is an array (folder contents)
        if (!Array.isArray(response.data)) {
          logger.info(`No ${folder} found`);
          return;
        }

        // Clear existing store before loading new content
        store.clear();

        // Process each Markdown file in the folder (skip subfolders and other files)
        await Promise.all(
          response.data
            .filter((file) => file.type === "file" && file.name.endsWith(".md"))
            .map(async (file) => {
              // Fetch individual file content
              const { data } = await octokit.rest.repos.getContent({
                owner,
                repo,
                path: `/${folder}/${file.name}`,
              });

              // Decode base64 content if available
              const fileContent =
                "content" in data
                  ? Buffer.from(data.content, "base64").toString("utf-8")
                  : "";

              // Use the file name without its trailing .md extension as the entry id
              const id = file.name.replace(/\.md$/, "");

              // Parse frontmatter and content using gray-matter
              const { data: frontmatter, content } = matter(fileContent);
              const parsedData = await parseData({ id, data: frontmatter });

              // Store the processed content. Note that `content` is still raw
              // Markdown at this point; it is converted to HTML on the page.
              store.set({
                id,
                data: parsedData,
                rendered: { html: content },
                digest: generateDigest(content),
              });
            })
        );
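      } catch (error) {
        // Octokit throws on a 304 Not Modified response, so treat it as "no changes"
        if ((error as { status?: number }).status === 304) {
          logger.info(`No changes in ${folder}`);
          return;
        }
        logger.error(`Error loading ${folder}`);
        throw error;
      }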
    },
  };
}
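
To make the per-file steps of the loader easier to follow, here is a small standalone sketch of them: decoding the base64 payload, separating the frontmatter from the body, and deriving the entry id. It is not the loader's actual code. splitFrontmatter is a simplified stand-in for gray-matter that does not parse YAML, and all function names and the sample file are hypothetical.

```typescript
// Decode the base64 payload GitHub returns for a file into a UTF-8 string
function decodeBase64Utf8(base64: string): string {
  // The API wraps base64 content in newlines, so strip whitespace first
  const binary = atob(base64.replace(/\s/g, ""));
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new TextDecoder("utf-8").decode(bytes);
}

// Simplified stand-in for gray-matter: returns the raw frontmatter block and the body.
// It only illustrates where the split happens; it does not parse the YAML.
function splitFrontmatter(source: string): { frontmatter: string; body: string } {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (!match) {
    return { frontmatter: "", body: source };
  }
  return { frontmatter: match[1] ?? "", body: match[2] ?? "" };
}

// Strip only a trailing ".md", leaving any earlier ".md" in the name untouched
function entryIdFor(fileName: string): string {
  return fileName.replace(/\.md$/, "");
}

// Hypothetical dictionary entry, encoded locally to simulate an API response
const sampleFile =
  "---\nconcept: Entropy\nsource: https://example.com/entropy\n---\nA measure of disorder.\n";
const decoded = decodeBase64Utf8(btoa(sampleFile));
const parts = splitFrontmatter(decoded);
const sampleId = entryIdFor("entropy.md");
```

In the loader, the frontmatter goes through parseData so the schema can validate it, while the body is stored untouched for rendering on the page.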

Configuring Content Collections

Now we can define our content collections in src/content/config.ts (Astro 5's documentation places this file at src/content.config.ts instead):

import { githubContentLoader } from "@/loaders/content";
import { z } from "astro/zod";
import { defineCollection } from "astro:content";

const dictionary = defineCollection({
  type: "content_layer",
  loader: githubContentLoader({
    folder: "dictionary",
    owner: "guilhermohounie",
    repo: "metadata",
    schema: z.object({
      concept: z.string(),
      source: z.string().url(),
    }),
  }),
});

const books = defineCollection({
  type: "content_layer",
  loader: githubContentLoader({
    folder: "reviews/books",
    owner: "guilhermohounie",
    repo: "metadata",
    schema: z.object({
      title: z.string(),
      author: z.string(),
      year: z.string(),
      score: z.number().min(0).max(5),
      link: z.string().url(),
      date: z.date(),
    }),
  }),
});

// ...

export const collections = {
  books,
  dictionary,
  // ...
};
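
One thing worth keeping in mind: the schema validates whatever gray-matter parsed from the YAML frontmatter, so how a value is written in the file matters. As far as I know, an unquoted value such as 1965 comes back as a number and an unquoted ISO date comes back as a Date object, so with the books schema above year would need quotes while date should stay unquoted. The sketch below mirrors that schema in plain TypeScript, without Zod, purely to illustrate those rules. checkBook and the sample values are hypothetical.

```typescript
// Accepts only strings that the URL constructor can parse
function isUrl(value: unknown): boolean {
  if (typeof value !== "string") return false;
  try {
    new URL(value);
    return true;
  } catch {
    return false;
  }
}

// Plain-TypeScript mirror of the books schema (illustrative only, not Zod).
// Returns the names of the fields that would fail validation.
function checkBook(data: Record<string, unknown>): string[] {
  const failures: string[] = [];
  if (typeof data.title !== "string") failures.push("title");
  if (typeof data.author !== "string") failures.push("author");
  if (typeof data.year !== "string") failures.push("year");
  const score = data.score;
  if (typeof score !== "number" || isNaN(score) || score < 0 || score > 5) {
    failures.push("score");
  }
  if (!isUrl(data.link)) failures.push("link");
  const date = data.date;
  if (!(date instanceof Date) || isNaN(date.getTime())) failures.push("date");
  return failures;
}

// Roughly what parsed frontmatter looks like with a quoted year and an unquoted date
const quotedYear: Record<string, unknown> = {
  title: "Dune",
  author: "Frank Herbert",
  year: "1965",
  score: 4.5,
  link: "https://example.com/dune",
  date: new Date("2024-01-15"),
};

// The same entry with an unquoted year and a quoted date
const unquotedYear: Record<string, unknown> = {
  ...quotedYear,
  year: 1965,
  date: "2024-01-15",
};

const passing = checkBook(quotedYear);
const failing = checkBook(unquotedYear);
```

If you would rather not think about quoting, Zod's coercion helpers such as z.coerce.date() are an alternative worth looking at.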

Using the Content in Astro Pages

Finally, retrieve and render your content in Astro pages. Since we defined a schema, the data is type-safe and a joy to work with in TypeScript. Here is a sample of how I use it on my dictionary page:

---
import BaseLayout from "@/layouts/base-layout.astro";
import { markdown } from "@astropub/md";
import { getCollection } from "astro:content";

const dictionary = await getCollection("dictionary");
---

<BaseLayout
  title="Dictionary"
  header={{
    title: "Dictionary",
    description:
      "A collection of terms and concepts that I learned and found interesting."
  }}
>
  {
    dictionary
      .sort((a, b) => a.data.concept.localeCompare(b.data.concept))
      .map(async (word) => {
        const {
          data: { concept, source }
        } = word;
        const content = markdown(word.rendered?.html!);
        return (
          <article class="space-y-2">
            <a
              target="_blank"
              rel="noopener noreferrer"
              href={source}
              class="uppercase italic"
            >
              {concept}
            </a>
            <div class="flex gap-2 text-justify">
              <span class="text-5xl text-yellow">&OpenCurlyDoubleQuote;</span>
              {content}
            </div>
          </article>
        );
      })
  }
</BaseLayout>
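
A small aside on the sorting step: Array.prototype.sort reorders the array it is called on, so if you reuse the collection elsewhere on the page you may prefer to sort a copy. Here is a minimal sketch of that idea, outside of Astro. The DictionaryEntry type, the sortByConcept helper, the explicit "en" locale, and the sample entries are all my assumptions for illustration.

```typescript
// Minimal stand-in for the shape of a dictionary collection entry
interface DictionaryEntry {
  id: string;
  data: { concept: string; source: string };
}

// Returns a new, alphabetically sorted array and leaves the input untouched
function sortByConcept(entries: DictionaryEntry[]): DictionaryEntry[] {
  return entries
    .slice()
    .sort((a, b) =>
      a.data.concept.localeCompare(b.data.concept, "en", { sensitivity: "base" })
    );
}

// Hypothetical entries with mixed capitalization
const entries: DictionaryEntry[] = [
  { id: "zeitgeist", data: { concept: "Zeitgeist", source: "https://example.com/zeitgeist" } },
  { id: "apophenia", data: { concept: "apophenia", source: "https://example.com/apophenia" } },
  { id: "entropy", data: { concept: "Entropy", source: "https://example.com/entropy" } },
];

const sortedIds = sortByConcept(entries).map((entry) => entry.id);
```

Using localeCompare instead of the default comparison keeps lowercase and uppercase concepts interleaved alphabetically rather than grouping all capitalized terms first.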

A note: I did not want to implement a Markdown-to-HTML parser whose output matched what Astro itself renders, so instead of the render() function exported from "astro:content" I use @astropub/md, a library that integrates with Astro to generate HTML from Markdown.

Conclusion

And that’s it! I really enjoyed using this new API. It’s powerful, well thought out, and easy to implement.