Adam DJ Brett

Making Static Sites FAIR with metadata.json, Zotero, Eleventy, and Jekyll

I have been thinking a lot about what it means for independent scholarly websites to be readable by more than humans. A good static site should be fast, durable, and readable in the browser. But if the site is publishing scholarly writing, reviews, podcast episodes, transcripts, or resources that people may cite later, it also needs to be readable by Zotero, search engines, archives, citation managers, and whatever discovery systems come next.

That is the larger reason I started adding FAIR Signposting and per-page metadata.json files to my Eleventy sites.

The immediate spark was Martin Paul Eve's post, "FAIR and Square: making a static site support FAIR signposting". Martin shows how his static site emits a metadata.json file beside each post and advertises it through HTTP Link headers. The pattern is small, but powerful:

each page has a stable JSON-LD description;
the page says where that JSON-LD lives;
the server can advertise the same metadata before a machine agent even downloads the HTML;
cited sources can be represented as structured citation objects.

I started by implementing that pattern in three related but different Eleventy sites:

adamdjbrett.com, where the main unit is a blog post;
dominationchronicles.com, where the main unit is a podcast episode;
outcome.doctrineofdiscovery.org, where the archive mixes blog-style essays, scholarly articles, and podcast episodes.

Then I pushed the same pattern across a wider set of sites: personal blogs, organizational sites, Then and Now project sites, older Jekyll installations, Minimal Mistakes sites, publication archives, and small Eleventy sites. That second pass taught a different lesson. The pattern transfers cleanly, but the implementation should respect the generator and the content model. A blog post should emit BlogPosting. A podcast episode should emit PodcastEpisode. A journal article should emit ScholarlyArticle. A book-series page should probably emit Book. The shape of the metadata should follow the content, not force everything into one generic object.

What FAIR Signposting Adds

FAIR stands for Findable, Accessible, Interoperable, and Reusable. On the web, one practical way to move toward FAIR is to make explicit links between a human-facing resource and its machine-readable metadata.

For a post or episode, I want the page to say:

who authored it;
how it should be cited;
what kind of thing it is;
where its structured metadata lives;
what license applies, if a license has been explicitly set.

In HTML, that means links like this:

<link rel="author" href="https://orcid.org/0009-0004-6725-8425">
<link rel="cite-as" href="https://www.adamdjbrett.com/blog/example-post/">
<link rel="type" href="https://schema.org/BlogPosting">
<link rel="license" href="https://creativecommons.org/licenses/by-nc/4.0/deed.en">
<link
  rel="describedby"
  type="application/ld+json"
  profile="https://schema.org/"
  href="/blog/example-post/metadata.json"
>

For a podcast episode, the same pattern becomes:

<link rel="author" href="https://stevennewcomb.com/#person">
<link rel="author" href="https://peterderrico.substack.com/#person">
<link rel="cite-as" href="https://dominationchronicles.com/episodes/e008-words-meanings/">
<link rel="type" href="https://schema.org/PodcastEpisode">
<link
  rel="describedby"
  type="application/ld+json"
  profile="https://schema.org/"
  href="/episodes/e008-words-meanings/metadata.json"
>

Notice the absence of a license link in the podcast example. That is intentional. I do not want a template making rights claims that the site has not explicitly made. On adamdjbrett.com, the blog metadata already has a Creative Commons license configured. On dominationchronicles.com, I am holding that field back until the content license is explicit.
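
To check what a consuming agent would actually see, here is a minimal JavaScript sketch that pulls the signposting links out of a page's HTML. The function names extractSignposts and findRel are hypothetical, and the regular expressions are a simplification that assumes double-quoted attributes. A real harvester would use a proper HTML parser.

```javascript
// Hypothetical helper (not from any site's codebase): collect <link> elements
// from an HTML string as plain objects keyed by attribute name.
// Simplification: only double-quoted attribute values are recognized.
function extractSignposts(html) {
  const links = [];
  const tagPattern = /<link\b([^>]*)>/gi;
  const attrPattern = /([\w:-]+)\s*=\s*"([^"]*)"/g;
  for (const [, rawAttrs] of html.matchAll(tagPattern)) {
    const attrs = {};
    for (const [, name, value] of rawAttrs.matchAll(attrPattern)) {
      attrs[name.toLowerCase()] = value;
    }
    if (attrs.rel && attrs.href) links.push(attrs);
  }
  return links;
}

// Return the href of the first link with the given rel, or null.
function findRel(links, rel) {
  const match = links.find((link) => link.rel === rel);
  return match ? match.href : null;
}

const sampleHead = `
<link rel="cite-as" href="https://www.adamdjbrett.com/blog/example-post/">
<link rel="type" href="https://schema.org/BlogPosting">
<link
  rel="describedby"
  type="application/ld+json"
  profile="https://schema.org/"
  href="/blog/example-post/metadata.json"
>`;

const signposts = extractSignposts(sampleHead);
console.log(findRel(signposts, "describedby"));
console.log(findRel(signposts, "license"));
```

The second lookup returns null because the sample head carries no license link, which is exactly the case an agent has to handle for the podcast pages.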

The Blog Version: `BlogPosting`

On adamdjbrett.com, the existing site already had strong blog metadata. The head template emitted Zotero tags, Dublin Core fields, and inline Schema.org BlogPosting JSON-LD. The problem was duplication. If I created a separate metadata.json template by copying the inline JSON-LD, the two versions would drift.

So I made the JSON-LD a shared Nunjucks partial.

{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "@id": {{ (jsonld_url ~ "#article") | jsonify | safe }},
  "name": {{ jsonld_title | jsonify | safe }},
  "headline": {{ jsonld_title | jsonify | safe }},
  "description": {{ jsonld_description | jsonify | safe }},
  "url": {{ jsonld_url | jsonify | safe }},
  "inLanguage": {{ (metadata.blog.inLanguage or metadata.blog.inlanguage or metadata.language) | jsonify | safe }},
  "identifier": {{ jsonld_identifier | jsonify | safe }},
  "license": {{ (metadata.blog.license or metadata.footer.text1_link) | jsonify | safe }},
  "author": {
    "@type": "Person",
    "@id": {{ (metadata.url ~ "#person") | jsonify | safe }},
    "name": {{ metadata.author.name | jsonify | safe }},
    "url": {{ metadata.author.url | jsonify | safe }},
    "identifier": {{ metadata.blog.author.identifier | jsonify | safe }}
  },
  "isPartOf": {
    "@type": "Blog",
    "@id": {{ metadata.blog.url | jsonify | safe }},
    "name": {{ metadata.blog.name | jsonify | safe }},
    "url": {{ metadata.blog.url | jsonify | safe }}
  }{% if jsonld_citations and jsonld_citations.length %},
  "citation": {{ jsonld_citations | jsonify | safe }}{% endif %}
}

The @id for the blog itself is the clean blog URL:

"@id": "https://www.adamdjbrett.com/blog/"

I originally used https://www.adamdjbrett.com/blog/#blog, which works as a JSON-LD identifier, but looks awkward in the generated metadata. It was not necessary. The clean URL is easier to read and remains a stable identifier for the blog collection.

Then I created a Nunjucks template that paginates over collections.posts and writes one metadata.json file beside every post:

---
pagination:
  data: collections.posts
  size: 1
  alias: post
permalink: "{{ post.url }}metadata.json"
layout: null
eleventyExcludeFromCollections: true
---
{% set postData = post.data %}
{% set jsonld_title = postData.title %}
{% set jsonld_description = postData.description | metaDescription(postData.title or metadata.title, metadata.title, post.url) %}
{% set jsonld_url = postData.canonical_url or (metadata.url ~ post.url) %}
{% set jsonld_identifier = postData.doi or postData.canonical_url or (metadata.url ~ post.url) %}
{% set jsonld_image = ((postData.image or metadata.image) | url | absoluteUrl(metadata.url)) %}
{% set jsonld_date_published = post.date | htmlDateString %}
{% set jsonld_date_modified = post.date | htmlDateString %}
{% set jsonld_citations = postData.citations %}
{% include "widget/blog-post-jsonld.njk" %}

That produces URLs like:

/blog/zotero-harvestable-blog-posts/metadata.json
/blog/2026-06-29-on-being-in-the-middle/metadata.json

The same shared partial is also included inside the existing blog page head:

<script type="application/ld+json">
{% set jsonld_title = title %}
{% set jsonld_description = meta_description %}
{% set jsonld_url = page_url %}
{% set jsonld_identifier = doi or canonical_url or (metadata.url ~ page.url) %}
{% set jsonld_image = ((image or metadata.image) | url | absoluteUrl(metadata.url)) %}
{% set jsonld_date_published = page.date | htmlDateString %}
{% set jsonld_date_modified = page.date | htmlDateString %}
{% set jsonld_citations = citations %}
{% include "widget/blog-post-jsonld.njk" %}
</script>

That keeps the inline JSON-LD and the external metadata.json synchronized.
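
The underlying idea is one builder feeding two outputs. Here is a minimal sketch of that idea in plain JavaScript rather than Nunjucks. The function names and the shape of the post and site objects are assumptions made for illustration, and the object is trimmed to a few fields.

```javascript
// Hypothetical sketch of "one source, two outputs". The names buildBlogPosting,
// renderMetadataFile, and renderInlineScript are illustrative assumptions.
function buildBlogPosting(post, site) {
  const url = post.canonicalUrl || site.url + post.url;
  const data = {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    "@id": url + "#article",
    name: post.title,
    headline: post.title,
    url,
    identifier: post.doi || url,
    isPartOf: { "@type": "Blog", "@id": site.blogUrl, url: site.blogUrl },
  };
  // Mirror the partial: only emit "citation" when there are citations.
  if (Array.isArray(post.citations) && post.citations.length > 0) {
    data.citation = post.citations;
  }
  return data;
}

// Output 1: the body of the adjacent metadata.json file.
function renderMetadataFile(post, site) {
  return JSON.stringify(buildBlogPosting(post, site), null, 2);
}

// Output 2: the inline script element for the page head.
function renderInlineScript(post, site) {
  // Escape "<" so a title containing "</script>" cannot close the element.
  const json = JSON.stringify(buildBlogPosting(post, site)).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}

const site = {
  url: "https://www.adamdjbrett.com",
  blogUrl: "https://www.adamdjbrett.com/blog/",
};
const post = { url: "/blog/example-post/", title: 'A "quoted" title', citations: [] };

const fromFile = JSON.parse(renderMetadataFile(post, site));
const inline = renderInlineScript(post, site);
const fromInline = JSON.parse(inline.slice(inline.indexOf(">") + 1, inline.lastIndexOf("<")));
console.log(JSON.stringify(fromFile) === JSON.stringify(fromInline));
```

Because both renderers call the same builder, the two representations cannot drift apart. That is the property the shared partial provides on the real site.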

The Podcast Version: `PodcastEpisode`

dominationchronicles.com needed the same infrastructure, but not the same Schema.org type.

The site already had a PodcastSeries in its JSON-LD:

{
  "@type": "PodcastSeries",
  "@id": "https://dominationchronicles.com/#podcast",
  "name": "The Domination Chronicles",
  "url": "https://dominationchronicles.com/"
}

That is the correct parent object for episode metadata. Each episode should point back to that series with partOfSeries.

The episode JSON-LD partial looks like this:

{
  "@context": "https://schema.org",
  "@type": "PodcastEpisode",
  "@id": {{ episode_jsonld_url | jsonify | safe }},
  "name": {{ episode_jsonld_title | jsonify | safe }},
  "headline": {{ episode_jsonld_title | jsonify | safe }},
  "description": {{ episode_jsonld_description | jsonify | safe }},
  "url": {{ episode_jsonld_url | jsonify | safe }},
  "inLanguage": {{ (metadata.language or "en") | jsonify | safe }},
  "datePublished": {{ episode_jsonld_date | jsonify | safe }},
  "duration": {{ episode_jsonld_duration | jsonify | safe }},
  "image": {{ episode_jsonld_image | jsonify | safe }},
  "author": [
    {
      "@type": "Person",
      "@id": "https://stevennewcomb.com/#person",
      "name": "Steven T. Newcomb",
      "url": "https://stevennewcomb.com/"
    },
    {
      "@type": "Person",
      "@id": "https://peterderrico.substack.com/#person",
      "name": "Peter d'Errico",
      "url": "https://peterderrico.substack.com/"
    }
  ],
  "publisher": {
    "@type": "Organization",
    "@id": {{ ((metadata.url or "https://dominationchronicles.com") ~ "/#org") | jsonify | safe }},
    "name": "The Domination Chronicles",
    "url": {{ ((metadata.url or "https://dominationchronicles.com") ~ "/") | jsonify | safe }}
  },
  "partOfSeries": {
    "@type": "PodcastSeries",
    "@id": {{ ((metadata.url or "https://dominationchronicles.com") ~ "/#podcast") | jsonify | safe }},
    "name": "The Domination Chronicles",
    "url": {{ ((metadata.url or "https://dominationchronicles.com") ~ "/") | jsonify | safe }}
  }{% if episode_jsonld_citations and episode_jsonld_citations.length %},
  "citation": {{ episode_jsonld_citations | jsonify | safe }}{% endif %}
}

Then a template paginates over collections.episodes:

---
pagination:
  data: collections.episodes
  size: 1
  alias: episodePost
permalink: "{{ episodePost.url }}metadata.json"
layout: null
eleventyExcludeFromCollections: true
---
{% set episodeData = episodePost.data %}
{% set episode_jsonld_title = episodeData.title %}
{% set episode_jsonld_description = episodeData.description %}
{% set episode_jsonld_url = metadata.url ~ episodePost.url %}
{% set episode_jsonld_date = episodeData.publishDate | xmlDate %}
{% set episode_jsonld_duration = episodeData.duration %}
{% set episode_jsonld_image = (episodeData.image or metadata.image) | absoluteUrl(metadata.url) %}
{% set episode_jsonld_audio = episodeData.audioUrl or episodeData.audio %}
{% set episode_jsonld_video = episodeData.videoUrl or (episodeData.videoId and ("https://www.youtube.com/watch?v=" ~ episodeData.videoId)) %}
{% set episode_jsonld_transcript = episodeData.transcriptUrl %}
{% set episode_jsonld_citations = episodeData.citations %}
{% include "partials/episode-jsonld.njk" %}

That produces:

/episodes/e008-words-meanings/metadata.json
/episodes/e017-bruce-mcivor-legalized-lawlessness/metadata.json

Those files describe episodes as episodes. That distinction matters. If everything is CreativeWork, machines can still read it, but the metadata is less precise. If an episode is a podcast episode, say so.
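
Precision applies to values as well as types. Schema.org defines duration as an ISO 8601 duration, so a value like 01:02:03 should become PT1H2M3S before it reaches the JSON-LD. The sketch below assumes the front matter stores durations as HH:MM:SS or MM:SS strings, which is my assumption for illustration only. The helper name toIsoDuration is hypothetical.

```javascript
// Hypothetical helper: convert "HH:MM:SS" or "MM:SS" into an ISO 8601 duration.
// Returns null for anything that does not look like a clock-style duration.
function toIsoDuration(clock) {
  const parts = String(clock).trim().split(":").map(Number);
  const valid =
    parts.length >= 2 &&
    parts.length <= 3 &&
    parts.every((n) => Number.isInteger(n) && n >= 0);
  if (!valid) return null;

  // Read from the right so that "MM:SS" leaves hours at zero.
  const [seconds, minutes, hours = 0] = parts.reverse();
  let iso = "PT";
  if (hours) iso += `${hours}H`;
  if (minutes) iso += `${minutes}M`;
  // Always emit at least one component so "00:00" becomes "PT0S".
  if (seconds || iso === "PT") iso += `${seconds}S`;
  return iso;
}

console.log(toIsoDuration("01:02:03"));
console.log(toIsoDuration("45:30"));
```

If the episode data already stores ISO 8601 durations, no conversion is needed and the value can pass straight through.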

The Archive Version: Mixed Schema.org Types

outcome.doctrineofdiscovery.org needed the same basic pattern, but it could not be described by one Schema.org type. It is an archive site with several kinds of publication:

JCRT, JCREOR, IJR, CrossCurrents, Compass, and Ecozoic pieces that should usually be ScholarlyArticle;
podcast pages that should be PodcastEpisode;
Canopy, Featured, and other web-native pieces that can be BlogPosting.

The first implementation used the site's existing posting collection. That worked for many pages, but it missed older podcast entries whose tags came from a different taxonomy. The fix was to create a collection whose membership is based on the source path and URL, not a legacy tag.

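// Assumes Node's built-in path module is already imported in the Eleventy config,
// e.g. const path = require("node:path"); (or the equivalent import in an ESM config).
const fairEntryPrefixes = [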
  "/canopy/",
  "/compass/",
  "/crosscurrents/",
  "/ecozoic/",
  "/featured/",
  "/ijr/",
  "/jcreor/",
  "/jcrt/",
  "/podcast/",
];
const excludedFairEntryBasenames = new Set(["abstract", "index", "toc"]);
const isFairEntry = (item) => {
  const url = String(item?.url || "");
  const inputPath = String(item?.inputPath || "");
  const basename = path.basename(inputPath, path.extname(inputPath));

  return Boolean(
    url &&
    inputPath.endsWith(".md") &&
    !excludedFairEntryBasenames.has(basename) &&
    fairEntryPrefixes.some(prefix => url.startsWith(prefix) && url !== prefix)
  );
};

eleventyConfig.addCollection("fairEntries", (collectionApi) =>
  sortPostingRecentItems(collectionApi.getAll().filter(isFairEntry))
);

The generated metadata.json template paginates over that collection and maps URL families to Schema.org types:

---
pagination:
  data: collections.fairEntries
  size: 1
  alias: entryPost
permalink: "{{ entryPost.url }}metadata.json"
layout: null
eleventyExcludeFromCollections: true
---
{% set entryData = entryPost.data %}
{% set academicFolders = ['/compass/', '/crosscurrents/', '/ecozoic/', '/ijr/', '/jcrt/', '/jcreor/'] %}
{% set entry_jsonld_type = "BlogPosting" %}
{% for f in academicFolders %}
  {% if entryPost.url.startsWith(f) and entryPost.url != f %}
    {% set entry_jsonld_type = "ScholarlyArticle" %}
  {% endif %}
{% endfor %}
{% if entryPost.url.startsWith('/podcast/') and entryPost.url != '/podcast/' %}
  {% set entry_jsonld_type = "PodcastEpisode" %}
{% endif %}
{% set entry_jsonld_url = entryData.canonical or entryData.canoncial or entryData.doi or (metadata.url ~ entryPost.url) %}
{% set entry_jsonld_title = entryData.title or metadata.title %}
{% set entry_jsonld_description = entryData.description or entryData.abstract or metadata.description %}
{% set entry_jsonld_citations = entryData.citations %}
{% include "widgets/entry-jsonld.njk" %}

The shared entry-jsonld.njk partial then conditionally adds the parent relationship that fits the type:

{% if entry_jsonld_type == "PodcastEpisode" %}
"partOfSeries": {
  "@type": "PodcastSeries",
  "@id": {{ (metadata.url ~ "/podcast/") | jsonify | safe }},
  "name": "Mapping the Doctrine of Discovery",
  "url": {{ (metadata.url ~ "/podcast/") | jsonify | safe }}
}
{% elseif entry_jsonld_type == "BlogPosting" %}
"isPartOf": {
  "@type": "Blog",
  "@id": {{ metadata.url | jsonify | safe }},
  "name": {{ metadata.title | jsonify | safe }},
  "url": {{ metadata.url | jsonify | safe }}
}
{% endif %}

That gives Outcome adjacent files such as:

/jcrt/issue2/conclusion/metadata.json
/featured/essay2/documenting/metadata.json
/podcast/essay1/s2e6/metadata.json

The important difference is that this site has one FAIR infrastructure but several publication identities. The code decides the type from the archive structure, then the shared partial emits the right shape.
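
The folder-to-type decision can also live in one table, so the collection filter and the templates never disagree. This is a sketch of that idea, not the code Outcome runs: the prefixes and types come from the templates above, while the FAIR_SECTIONS table and the schemaTypeForUrl function are hypothetical names.

```javascript
// Hypothetical shared module: one table drives both membership and type.
const FAIR_SECTIONS = [
  { prefix: "/canopy/", type: "BlogPosting" },
  { prefix: "/compass/", type: "ScholarlyArticle" },
  { prefix: "/crosscurrents/", type: "ScholarlyArticle" },
  { prefix: "/ecozoic/", type: "ScholarlyArticle" },
  { prefix: "/featured/", type: "BlogPosting" },
  { prefix: "/ijr/", type: "ScholarlyArticle" },
  { prefix: "/jcreor/", type: "ScholarlyArticle" },
  { prefix: "/jcrt/", type: "ScholarlyArticle" },
  { prefix: "/podcast/", type: "PodcastEpisode" },
];

// Return the Schema.org type for a URL, or null when the URL is not a FAIR
// entry (outside every section, or a section landing page itself).
function schemaTypeForUrl(url) {
  const section = FAIR_SECTIONS.find(
    ({ prefix }) => url.startsWith(prefix) && url !== prefix
  );
  return section ? section.type : null;
}

console.log(schemaTypeForUrl("/jcrt/issue2/conclusion/"));
console.log(schemaTypeForUrl("/podcast/"));
```

In an Eleventy config, a function like this could be registered with eleventyConfig.addFilter so the Nunjucks templates call it instead of repeating the folder lists.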

The 11ty Pattern at Scale

Once the first three sites worked, the useful question became: how much of this can be repeated across ordinary 11ty sites without turning every repo into a custom metadata project?

The answer is: quite a lot.

For sites such as ebenezer, davidbrett.im, stevennewcomb.com, indigenouslawinstitute.com, and alfieaward.com, the durable content units are mostly posts, resource pages, or press items. These sites do not all have the same theme, but they share enough 11ty structure to use the same pattern:

add a shared JSON-LD partial;
add a paginated metadata.json template;
add a root /metadata.json;
add FAIR links and Zotero tags to the head template;
generate _headers where the host can use it.

The reusable partial is deliberately conservative. It does not try to infer everything. It emits the fields that are stable in the local content model and front matter:

{%- set postData = post.data -%}
{%- set postUrl = metadata.url + post.url -%}
{%- set postTitle = postData.title or metadata.title -%}
{%- set postDescription = postData.description or postData.summary or metadata.description -%}
{%- set postImage = postData.image or metadata.image -%}
{
  "@context": "https://schema.org",
  "@type": {{ (jsonldType or "BlogPosting") | dump | safe }},
  "@id": {{ (postUrl + "#metadata") | dump | safe }},
  "name": {{ postTitle | dump | safe }},
  "headline": {{ postTitle | dump | safe }},
  "description": {{ postDescription | dump | safe }},
  "url": {{ postUrl | dump | safe }},
  "identifier": {{ postUrl | dump | safe }},
  "inLanguage": {{ metadata.language | dump | safe }},
  "datePublished": {{ post.date | htmlDateString | dump | safe }},
  "dateModified": {{ (postData.updated or postData.modified or post.date) | htmlDateString | dump | safe }},
  "author": {
    "@type": "Person",
    "name": {{ metadata.author.name | dump | safe }},
    "url": {{ metadata.author.url | dump | safe }}
  },
  "publisher": {
    "@type": "Organization",
    "name": {{ metadata.title | dump | safe }},
    "url": {{ metadata.url | dump | safe }}
  },
  "isPartOf": {
    "@type": "Blog",
    "@id": {{ (metadata.url + "/#blog") | dump | safe }},
    "name": {{ metadata.title | dump | safe }},
    "url": {{ metadata.url | dump | safe }}
  }{% if postImage %},
  "image": {{ ((metadata.url + postImage) if "/" == postImage[0] else postImage) | dump | safe }}{% endif %}{% if postData.citations %},
  "citation": {{ postData.citations | dump | safe }}{% endif %}
}

The dump filter is useful here because it serializes strings, arrays, and objects into JSON safely. The point is not to hand-build quoted strings in a JSON file. Let the template engine serialize the value.
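
Nunjucks implements dump with JSON.stringify, so the difference is easy to demonstrate in plain JavaScript. The sample title is made up for the demonstration.

```javascript
// A made-up title containing double quotes, the classic way to break hand-built JSON.
const title = 'Reading "Discovery" Closely';

// Hand-built: the inner quotes end the JSON string early.
const handBuilt = `{"name": "${title}"}`;

// Serialized: quotes, backslashes, and newlines are escaped by JSON.stringify.
const serialized = `{"name": ${JSON.stringify(title)}}`;

// Report whether a string is valid JSON without throwing.
function parses(text) {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

console.log(parses(handBuilt));
console.log(parses(serialized));
```

The hand-built version fails to parse and the serialized version succeeds, which is why every value in the partial goes through the filter.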

The adjacent metadata template then becomes tiny:

---
pagination:
  data: collections.posts
  size: 1
  alias: post
permalink: "{{ post.url }}metadata.json"
layout: false
eleventyExcludeFromCollections: true
---
{% include "partials/post-jsonld.njk" %}

For a press section, the same partial can emit NewsArticle instead of BlogPosting:

---
pagination:
  data: collections.press
  size: 1
  alias: post
permalink: "{{ post.url }}metadata.json"
layout: false
eleventyExcludeFromCollections: true
---
{% set jsonldType = "NewsArticle" %}
{% include "partials/post-jsonld.njk" %}

This is what I used on the Alfie Award site, where the citable items include both blog posts and press releases. The site does not need an elaborate abstraction. It needs one partial and two paginated metadata templates.

The root site metadata is separate. It describes the site, not an individual post:

---
permalink: /metadata.json
layout: false
eleventyExcludeFromCollections: true
---
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "@id": {{ (metadata.url + "/#website") | dump | safe }},
  "name": {{ metadata.title | dump | safe }},
  "description": {{ metadata.description | dump | safe }},
  "url": {{ metadata.url | dump | safe }},
  "inLanguage": {{ metadata.language | dump | safe }},
  "publisher": {
    "@type": "Organization",
    "name": {{ metadata.title | dump | safe }},
    "url": {{ metadata.url | dump | safe }}
  }
}

The head template then advertises the adjacent metadata file and gives Zotero enough information to import the item correctly:

{% set isPost = layout == 'post' or "posts" in (tags or []) %}
{% if isPost %}
{% set pageCanonical = canonical_url or metadata.url + page.url %}
<link rel="cite-as" href="{{ pageCanonical }}">
<link rel="type" href="https://schema.org/BlogPosting">
<link rel="author" href="{{ metadata.author.url or metadata.url }}">
<link rel="describedby" type="application/ld+json" profile="https://schema.org/" href="{{ page.url }}metadata.json">
<meta property="zotero:itemType" content="blogPost">
<meta property="dc:title" content="{{ (title or metadata.title) | escape }}">
<meta property="dc:creator" content="{{ metadata.author.name | escape }}">
{% if date %}<meta property="dc:date" content="{{ date | htmlDateString }}">{% endif %}
<meta property="dc:language" content="{{ metadata.language }}">
<meta property="dc:identifier" content="{{ pageCanonical }}">
<meta name="citation_title" content="{{ (title or metadata.title) | escape }}">
<meta name="citation_author" content="{{ metadata.author.name | escape }}">
{% if date %}<meta name="citation_publication_date" content="{{ date | htmlDateString }}">{% endif %}
<meta name="citation_fulltext_html_url" content="{{ pageCanonical }}">
<meta name="citation_public_url" content="{{ pageCanonical }}">
<meta name="citation_language" content="{{ metadata.language }}">
<meta name="prism.genre" content="blogentry">
<meta name="prism.publicationName" content="{{ metadata.title | escape }}">
{% endif %}

One small bug surfaced while doing this on Steven Newcomb's site. The original condition checked only layout == 'post', but that site's posts declare their layout as post.njk, so the check never matched. The generated metadata files existed, but the HTML pages did not advertise them. The fix was to make the condition match the site's real layout and tag patterns:

{% set isPost = layout == 'post' or layout == 'post.njk' or "posts" in (tags or []) %}

This is a good example of the larger rule: do not assume all Eleventy sites use the same layout names. Check the local collection and layout conventions before deciding where the metadata should appear.
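
A mismatch like this is easy to catch mechanically. Here is a minimal sketch, assuming the built pages are available as records with a url and an html string. The function names are hypothetical and the tag matching assumes double-quoted attributes.

```javascript
// Hypothetical check: does this page's HTML carry a describedby link that
// points at its adjacent metadata.json? Assumes double-quoted attributes.
function advertisesMetadata(html, pageUrl) {
  const expectedHref = `href="${pageUrl}metadata.json"`;
  const linkTags = html.match(/<link\b[^>]*>/gi) || [];
  return linkTags.some(
    (tag) => /\brel\s*=\s*"describedby"/i.test(tag) && tag.includes(expectedHref)
  );
}

// Given { url, html } records, list the URLs that fail the check.
function unadvertisedPages(pages) {
  return pages
    .filter(({ url, html }) => !advertisesMetadata(html, url))
    .map(({ url }) => url);
}

const pages = [
  {
    url: "/blog/first/",
    html: '<link rel="describedby" type="application/ld+json" href="/blog/first/metadata.json">',
  },
  {
    url: "/blog/second/",
    html: '<link rel="stylesheet" href="/css/site.css">',
  },
];

console.log(unadvertisedPages(pages));
```

Here the second page is reported because its head has no describedby link, the same symptom the layout mismatch produced.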

The _headers template follows the same collection:

---
permalink: /_headers
layout: false
eleventyExcludeFromCollections: true
---
/metadata.json
  Content-Type: application/ld+json

{% for post in collections.posts -%}
{{ post.url }}
  Link: <{{ post.url }}metadata.json>; rel="describedby"; type="application/ld+json"; profile="https://schema.org/", <{{ post.data.canonical_url or metadata.url + post.url }}>; rel="cite-as", <https://schema.org/BlogPosting>; rel="type"

{{ post.url }}metadata.json
  Content-Type: application/ld+json

{% endfor -%}

Again, this is not the only discovery path. The HTML describedby link remains the guaranteed path. The Link header is an extra affordance for hosts that process a _headers file, such as Netlify and Cloudflare Pages.
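
To see what an agent gets from that header, here is a minimal sketch of a Link header parser. The function name parseLinkHeader is hypothetical, and the parsing is deliberately simple: it assumes no commas or semicolons appear inside URLs or quoted parameter values, which holds for the header generated above but not for every header in the wild.

```javascript
// Hypothetical parser for an HTTP Link header value.
// Simplification: splits on "," before "<" and on ";" between parameters.
function parseLinkHeader(value) {
  return value.split(/,\s*(?=<)/).map((part) => {
    const [target, ...params] = part.split(";").map((s) => s.trim());
    const entry = { href: target.slice(1, -1) }; // strip the surrounding < >
    for (const param of params) {
      const eq = param.indexOf("=");
      if (eq === -1) continue;
      const key = param.slice(0, eq).trim().toLowerCase();
      entry[key] = param.slice(eq + 1).trim().replace(/^"|"$/g, "");
    }
    return entry;
  });
}

const header =
  '</blog/example-post/metadata.json>; rel="describedby"; type="application/ld+json"; profile="https://schema.org/", ' +
  '<https://www.adamdjbrett.com/blog/example-post/>; rel="cite-as", ' +
  '<https://schema.org/BlogPosting>; rel="type"';

for (const link of parseLinkHeader(header)) {
  console.log(link.rel, link.href);
}
```

Against a live site, the input could come from a HEAD request, for example by reading the link header from the response returned by fetch.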

The 11ty implementation pattern is now clear enough that I can almost treat it as a checklist:

identify the citable collections;
make one shared JSON-LD partial per content family;
paginate over each collection to produce adjacent metadata.json;
advertise the file from the rendered page head;
emit Zotero tags in the same condition;
generate _headers from the same collection;
build and parse every generated metadata.json file.

That last step matters. For the latest pass, I parsed 149 generated Eleventy metadata.json files after the build. It is a cheap test and it catches malformed JSON immediately.
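
A check like this needs only Node's standard library. The script below is a minimal sketch rather than the exact script I ran: the function names are hypothetical, and it assumes Eleventy's default _site output directory unless another directory is passed as the first argument.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Recursively yield every file named metadata.json under a directory.
function* findMetadataFiles(dir) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      yield* findMetadataFiles(fullPath);
    } else if (entry.name === "metadata.json") {
      yield fullPath;
    }
  }
}

// Parse each file; collect failures instead of stopping at the first one.
function checkMetadataFiles(outputDir) {
  const result = { checked: 0, failures: [] };
  for (const file of findMetadataFiles(outputDir)) {
    result.checked += 1;
    try {
      JSON.parse(fs.readFileSync(file, "utf8"));
    } catch (error) {
      result.failures.push({ file, message: error.message });
    }
  }
  return result;
}

// Eleventy writes to _site by default; pass another directory as the first argument.
const outputDir = process.argv[2] || "_site";
if (fs.existsSync(outputDir)) {
  const { checked, failures } = checkMetadataFiles(outputDir);
  for (const { file, message } of failures) {
    console.error(`${file}: ${message}`);
  }
  console.log(`Parsed ${checked} metadata.json files, ${failures.length} failed.`);
  if (failures.length > 0) process.exitCode = 1;
}
```

Setting a non-zero exit code on failure lets the same script gate a deploy step.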

The Jekyll and Minimal Mistakes Version

Jekyll needs a different implementation. There is no Nunjucks pagination template. The cleanest approach is a small generator plugin that creates a metadata.json page beside each post.

For the Then and Now sites, Quotidian, NABPR, and WOCATI, the generator follows the same structure:

require "json"

module FairMetadataJson
  class MetadataPage < Jekyll::PageWithoutAFile
    def initialize(site, dir, doc, schema_type)
      @site = site
      @base = site.source
      @dir = dir
      @name = "metadata.json"
      process(@name)
      self.data = { "layout" => nil, "sitemap" => false }
      self.content = JSON.pretty_generate(metadata_for(site, doc, schema_type))
    end

    private

    def metadata_for(site, doc, schema_type)
      site_url = site.config["url"].to_s.sub(%r{/$}, "")
      title = doc.data["title"] || site.config["title"] || site.config["name"]
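      # Generators run before rendering, so doc.output is still nil at this point.
      # Fall back to the unrendered source content instead.
      description = doc.data["description"] ||
        doc.data["excerpt"] ||
        doc.content.to_s.gsub(/<[^>]+>/, " ").squeeze(" ").strip[0, 300]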
      url = absolute_url(site_url, doc.url)
      author = doc.data["author"] ||
        site.config.dig("author", "name") ||
        site.config["author"] ||
        site.config["name"] ||
        site.config["title"]
      language = (site.config["locale"] || site.config["language"] || "en").to_s[0, 2]
      image = doc.data["image"] ||
        doc.data.dig("header", "image") ||
        doc.data.dig("header", "teaser") ||
        site.config["teaser"] ||
        site.config["logo"]

      data = {
        "@context" => "https://schema.org",
        "@type" => schema_type,
        "@id" => "#{url}#metadata",
        "name" => title,
        "headline" => title,
        "description" => description,
        "url" => url,
        "identifier" => url,
        "inLanguage" => language,
        "author" => { "@type" => "Person", "name" => author.to_s },
        "publisher" => {
          "@type" => "Organization",
          "name" => (site.config["title"] || site.config["name"]).to_s,
          "url" => site_url
        },
        "isPartOf" => {
          "@type" => "WebSite",
          "@id" => "#{site_url}/#website",
          "name" => (site.config["title"] || site.config["name"]).to_s,
          "url" => site_url
        }
      }

      data["datePublished"] = doc.date.strftime("%Y-%m-%d") if doc.respond_to?(:date) && doc.date
      data["dateModified"] = doc.data["last_modified_at"].to_s if doc.data["last_modified_at"]
      data["image"] = absolute_url(site_url, image) if image
      data["citation"] = doc.data["citations"] if doc.data["citations"]
      data
    end

    def absolute_url(site_url, value)
      return nil unless value
      value = value.to_s
      return value if value.start_with?("http://", "https://")
      "#{site_url}#{value.start_with?("/") ? value : "/#{value}"}"
    end
  end

  class Generator < Jekyll::Generator
    safe true
    priority :low

    def generate(site)
      site.posts.docs.each do |doc|
        dir = doc.url.end_with?("/") ? doc.url : "#{doc.url}/"
        site.pages << MetadataPage.new(site, dir, doc, "BlogPosting")
      end
    end
  end
end

This is enough for ordinary Jekyll posts. If a post lives at:

/2015/07/religion-materiality-and-machines/

the plugin adds:

/2015/07/religion-materiality-and-machines/metadata.json

The PBI (Perspectives on Baptist Identities) NABPR book-series site needed one variation. It does not primarily publish posts. Its durable citation targets are the documents in its _books collection. So the generator also loops over the books collection and emits Book metadata:

Array(site.collections["books"]&.docs).each do |doc|
  dir = doc.url.end_with?("/") ? doc.url : "#{doc.url}/"
  site.pages << MetadataPage.new(site, dir, doc, "Book")
end

That is the same lesson again: the metadata type follows the content type. A book-series page should not pretend to be a blog post merely because that was the first pattern I implemented.

The Jekyll head include is simpler than the generator. In Minimal Mistakes sites, I used _includes/head/custom.html to include a Zotero/signposting partial:

{% include head/zotero.html %}

For non-Minimal Mistakes Jekyll sites, the include goes wherever the site already centralizes head metadata. Quotidian uses _includes/head-seo.html. NABPR uses _includes/_head.html.

The partial itself follows the same shape as the Eleventy head code, but in Liquid:

{% assign zotero_is_post = false %}
{% if page.collection == "posts" %}
  {% assign zotero_is_post = true %}
{% endif %}
{% if zotero_is_post and page.published != false %}
{% assign zotero_author_name = page.author | default: site.author.name | default: site.author | default: site.title %}
{% assign zotero_date = page.date | default: page.last_modified_at %}
{% assign zotero_metadata_href = page.url %}
{% assign zotero_metadata_last = zotero_metadata_href | slice: -1, 1 %}
{% unless zotero_metadata_last == "/" %}
  {% assign zotero_metadata_href = zotero_metadata_href | append: "/" %}
{% endunless %}
<link rel="cite-as" href="{{ page.url | absolute_url }}">
<link rel="type" href="https://schema.org/BlogPosting">
<link rel="author" href="{{ site.url }}">
<link rel="describedby" type="application/ld+json" profile="https://schema.org/" href="{{ zotero_metadata_href | append: 'metadata.json' | relative_url }}">
<meta property="zotero:itemType" content="blogPost">
<meta property="dc:title" content="{{ page.title | markdownify | strip_html | strip_newlines | escape_once }}">
<meta property="dc:creator" content="{{ zotero_author_name | strip | escape_once }}">
{% if zotero_date %}<meta property="dc:date" content="{{ zotero_date | date: '%Y-%m-%d' }}">{% endif %}
<meta property="dc:language" content="{{ site.locale | slice: 0,2 | default: 'en' }}">
<meta property="dc:identifier" content="{{ page.url | absolute_url }}">
<meta name="citation_title" content="{{ page.title | markdownify | strip_html | strip_newlines | escape_once }}">
<meta name="citation_author" content="{{ zotero_author_name | strip | escape_once }}">
{% if zotero_date %}<meta name="citation_publication_date" content="{{ zotero_date | date: '%Y-%m-%d' }}">{% endif %}
<meta name="citation_fulltext_html_url" content="{{ page.url | absolute_url }}">
<meta name="citation_public_url" content="{{ page.url | absolute_url }}">
<meta name="citation_language" content="{{ site.locale | slice: 0,2 | default: 'en' }}">
{% if page.doi %}<meta name="citation_doi" content="{{ page.doi | escape_once }}">{% endif %}
<meta name="prism.genre" content="blogentry">
<meta name="prism.publicationName" content="{{ site.title | escape_once }}">
{% endif %}

For PBI book pages, the same file can add a book branch:

{% if page.collection == "books" and page.published != false %}
{% assign zotero_author_name = page.author | default: page.book_author | default: site.name %}
{% assign zotero_metadata_href = page.url %}
{% assign zotero_metadata_last = zotero_metadata_href | slice: -1, 1 %}
{% unless zotero_metadata_last == "/" %}
  {% assign zotero_metadata_href = zotero_metadata_href | append: "/" %}
{% endunless %}
<link rel="cite-as" href="{{ page.url | absolute_url }}">
<link rel="type" href="https://schema.org/Book">
<link rel="author" href="{{ site.url }}">
<link rel="describedby" type="application/ld+json" profile="https://schema.org/" href="{{ zotero_metadata_href | append: 'metadata.json' | relative_url }}">
<meta property="zotero:itemType" content="book">
<meta property="dc:title" content="{{ page.title | markdownify | strip_html | strip_newlines | escape_once }}">
<meta property="dc:creator" content="{{ zotero_author_name | strip | escape_once }}">
<meta name="citation_title" content="{{ page.title | markdownify | strip_html | strip_newlines | escape_once }}">
<meta name="citation_author" content="{{ zotero_author_name | strip | escape_once }}">
<meta name="citation_fulltext_html_url" content="{{ page.url | absolute_url }}">
<meta name="citation_public_url" content="{{ page.url | absolute_url }}">
<meta name="prism.genre" content="book">
<meta name="prism.publicationName" content="Perspectives on Baptist Identities">
{% endif %}

I also added root-level metadata.json files to the Jekyll sites. These are Liquid templates, not generated pages:

---
layout: null
sitemap: false
---
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "@id": {{ site.url | append: "/#website" | jsonify }},
  "name": {{ site.title | default: site.name | jsonify }},
  "description": {{ site.description | strip_newlines | jsonify }},
  "url": {{ site.url | jsonify }},
  "inLanguage": {{ site.locale | slice: 0, 2 | default: "en" | jsonify }},
  "publisher": {
    "@type": "Organization",
    "name": {{ site.title | default: site.name | jsonify }},
    "url": {{ site.url | jsonify }}
  }
}

Jekyll verification was less smooth than Eleventy verification. The custom Ruby plugins all passed a syntax check:

ruby -c _plugins/fair_metadata_json.rb

But several full Jekyll builds were blocked locally by Ruby and Bundler state: missing Bundler 4.0.11, missing local Jekyll gems, and rake version conflicts. That is not unusual with older Jekyll repos, but it changes what can be honestly claimed. In those cases I verified plugin syntax and head-template wiring, and left full build verification for an environment with the repo's bundle installed cleanly.

Adding Signposting Links in the Page Head

For adamdjbrett.com, the blog SEO partial adds signposting only when the current page is a blog post:

{% if page.url and page.url.startsWith("/blog/") and page.url != "/blog/" %}
<link rel="author" href="{{ metadata.blog.author.identifier }}">
<link rel="cite-as" href="{{ page_url }}">
<link rel="type" href="https://schema.org/BlogPosting">
<link rel="license" href="{{ metadata.blog.license }}">
<link rel="describedby" type="application/ld+json" profile="https://schema.org/" href="{{ page.url }}metadata.json">
{% endif %}

For dominationchronicles.com, the SEO partial already knows when a page is an episode:

{% set isEpisode = "episodes" in (tags or []) %}

So the episode signposting lives inside that check:

{% if isEpisode %}
<link rel="author" href="https://stevennewcomb.com/#person" />
<link rel="author" href="https://peterderrico.substack.com/#person" />
<link rel="cite-as" href="{{ canonicalUrl }}" />
<link rel="type" href="https://schema.org/PodcastEpisode" />
{% if metadata.license %}
<link rel="license" href="{{ metadata.license }}" />
{% endif %}
<link rel="describedby" type="application/ld+json" profile="https://schema.org/" href="{{ page.url }}metadata.json" />
{% endif %}

For Outcome, the head template uses the same folder list as the collection so the HTML and generated files stay aligned:

{% set fairFolders = ['/canopy/', '/compass/', '/crosscurrents/', '/ecozoic/', '/featured/', '/ijr/', '/jcreor/', '/jcrt/', '/podcast/'] %}
{% set academicFolders = ['/compass/', '/crosscurrents/', '/ecozoic/', '/ijr/', '/jcrt/', '/jcreor/'] %}
{% set isFairEntry = false %}
{% for f in fairFolders %}
  {% if page.url.startsWith(f) and page.url != f %}
    {% set isFairEntry = true %}
  {% endif %}
{% endfor %}
{% set fairType = "BlogPosting" %}
{% for f in academicFolders %}
  {% if page.url.startsWith(f) and page.url != f %}
    {% set fairType = "ScholarlyArticle" %}
  {% endif %}
{% endfor %}
{% if page.url.startsWith('/podcast/') and page.url != '/podcast/' %}
  {% set fairType = "PodcastEpisode" %}
{% endif %}
{% if isFairEntry %}
<link rel="cite-as" href="{{ fairCanonical }}">
<link rel="type" href="https://schema.org/{{ fairType }}">
<link rel="license" href="{{ metadata.footer.licency.text2_link }}">
<link rel="describedby" type="application/ld+json" profile="https://schema.org/" href="{{ page.url }}metadata.json">
{% endif %}

This is the part that gives tools a path from the HTML page to the adjacent machine-readable file.
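
To make that path concrete, here is a minimal sketch of what a harvesting tool might do with the rendered HTML: find the rel="describedby" link and resolve its relative href against the page URL. The function name, the example.org URLs, and the sample markup are all assumptions for illustration, and the regex matching is a shortcut; a real harvester would use a proper HTML parser.

```javascript
// Minimal sketch: find rel="describedby" links in rendered HTML and resolve
// them against the page URL. Regex-based on purpose; not a full HTML parser.
function findDescribedBy(html, pageUrl) {
  const results = [];
  for (const match of html.matchAll(/<link\b[^>]*>/gi)) {
    const tag = match[0];
    const rel = /\brel="([^"]*)"/i.exec(tag);
    const href = /\bhref="([^"]*)"/i.exec(tag);
    if (!rel || !href) continue;
    // rel can hold several space-separated values
    if (!rel[1].split(/\s+/).includes("describedby")) continue;
    // Relative hrefs such as "/blog/sample-post/metadata.json" become absolute
    results.push(new URL(href[1], pageUrl).href);
  }
  return results;
}

// Hypothetical page URL and markup, shaped like the templates above.
const sampleHtml = `
<link rel="cite-as" href="https://example.org/blog/sample-post/">
<link rel="describedby" type="application/ld+json" profile="https://schema.org/" href="/blog/sample-post/metadata.json">
`;
console.log(findDescribedBy(sampleHtml, "https://example.org/blog/sample-post/"));
```

Because the templates write the href as a root-relative path, the tool has to resolve it against the page URL before fetching, which is what the URL constructor does here.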

Headers: Useful, but Host-Dependent #

Martin's post uses Apache .htaccess files to emit HTTP Link headers per post directory. That works well when Apache honors per-directory .htaccess.

These sites are deployed differently.

adamdjbrett.com deploys through Netlify, so I can generate a Netlify-style _headers file:

---
permalink: /_headers
layout: null
eleventyExcludeFromCollections: true
---
{% for post in collections.posts %}
{{ post.url }}
  Link: <{{ metadata.blog.author.identifier }}>; rel="author", <{{ post.data.canonical_url or (metadata.url ~ post.url) }}>; rel="cite-as", <https://schema.org/BlogPosting>; rel="type", <{{ metadata.blog.license }}>; rel="license", <{{ metadata.url }}{{ post.url }}metadata.json>; rel="describedby"; type="application/ld+json"; profile="https://schema.org/"

{{ post.url }}metadata.json
  Content-Type: application/ld+json

{% endfor %}

dominationchronicles.com deploys through XMIT. I do not want to assume XMIT supports Netlify's _headers format. Still, generating _headers is harmless as a best-effort artifact, and the HTML <link> elements remain the guaranteed signposting path.

The episode version is:

---
permalink: /_headers
layout: null
eleventyExcludeFromCollections: true
---
{% for episodePost in collections.episodes %}
{{ episodePost.url }}
  Link: <https://stevennewcomb.com/#person>; rel="author", <https://peterderrico.substack.com/#person>; rel="author", <{{ metadata.url }}{{ episodePost.url }}>; rel="cite-as", <https://schema.org/PodcastEpisode>; rel="type", <{{ metadata.url }}{{ episodePost.url }}metadata.json>; rel="describedby"; type="application/ld+json"; profile="https://schema.org/"

{{ episodePost.url }}metadata.json
  Content-Type: application/ld+json

{% endfor %}

If the host supports _headers, machines can discover the metadata from HTTP headers. If not, they can still discover it from the HTML.
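
On the consuming side, the header route can be sketched roughly like this. The parser below is a simplification that assumes no commas or semicolons appear inside quoted parameter values; that holds for the headers generated above but not for every valid Link header. The function names and the example.org URLs are assumptions for illustration.

```javascript
// Minimal sketch of a Link header parser. It assumes no commas or semicolons
// inside quoted parameter values, which is true for the headers shown above.
function parseLinkHeader(value) {
  const links = [];
  for (const match of value.matchAll(/<([^>]*)>((?:\s*;\s*[^;,]+)*)/g)) {
    const link = { href: match[1] };
    for (const param of match[2].split(";")) {
      const eq = param.indexOf("=");
      if (eq === -1) continue; // skips the empty piece before the first ";"
      const key = param.slice(0, eq).trim().toLowerCase();
      const raw = param.slice(eq + 1).trim();
      link[key] = raw.replace(/^"|"$/g, ""); // drop surrounding quotes
    }
    links.push(link);
  }
  return links;
}

// Return the targets of every link whose rel includes "describedby".
function describedByFromHeader(value) {
  return parseLinkHeader(value)
    .filter((link) => (link.rel || "").split(/\s+/).includes("describedby"))
    .map((link) => link.href);
}

// Hypothetical header value, shaped like the generated _headers entries.
const sampleHeader =
  '<https://example.org/blog/sample-post/>; rel="cite-as", ' +
  '<https://schema.org/BlogPosting>; rel="type", ' +
  '<https://example.org/blog/sample-post/metadata.json>; rel="describedby"; type="application/ld+json"; profile="https://schema.org/"';
console.log(describedByFromHeader(sampleHeader));
```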

Outcome deploys through Netlify, so the _headers file is part of the build output. The template loops over the same fairEntries collection used for metadata.json:

---
permalink: /_headers
layout: null
eleventyExcludeFromCollections: true
---
{% set academicFolders = ['/compass/', '/crosscurrents/', '/ecozoic/', '/ijr/', '/jcrt/', '/jcreor/'] %}
{% for entryPost in collections.fairEntries %}{% if entryPost.url %}
{% set entryType = "BlogPosting" %}
{% for f in academicFolders %}
  {% if entryPost.url.startsWith(f) and entryPost.url != f %}
    {% set entryType = "ScholarlyArticle" %}
  {% endif %}
{% endfor %}
{% if entryPost.url.startsWith('/podcast/') and entryPost.url != '/podcast/' %}
  {% set entryType = "PodcastEpisode" %}
{% endif %}
{% set citeAs = entryPost.data.canonical or entryPost.data.canoncial or entryPost.data.doi or (metadata.url ~ entryPost.url) %}
{{ entryPost.url }}
  Link: <{{ citeAs }}>; rel="cite-as", <https://schema.org/{{ entryType }}>; rel="type", <{{ metadata.footer.licency.text2_link }}>; rel="license", <{{ metadata.url }}{{ entryPost.url }}metadata.json>; rel="describedby"; type="application/ld+json"; profile="https://schema.org/"

{{ entryPost.url }}metadata.json
  Content-Type: application/ld+json

{% endif %}{% endfor %}
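
The folder-to-type rules appear in both the head template and the _headers template. One way to keep them from drifting apart would be a single shared helper, for example registered as an Eleventy filter. The sketch below is an assumption about how that could look, not what the site currently does; the folder names are copied from the templates above and the function names are hypothetical.

```javascript
// Sketch of the folder-to-type rules as one function.
// Folder names are copied from the templates; the helper names are hypothetical.
const ACADEMIC_FOLDERS = ["/compass/", "/crosscurrents/", "/ecozoic/", "/ijr/", "/jcrt/", "/jcreor/"];

// True for pages inside a folder, false for the folder index page itself.
function isEntryIn(url, folder) {
  return url.startsWith(folder) && url !== folder;
}

// Podcast entries win, then academic folders, then the BlogPosting default.
function fairTypeForUrl(url) {
  if (isEntryIn(url, "/podcast/")) return "PodcastEpisode";
  if (ACADEMIC_FOLDERS.some((folder) => isEntryIn(url, folder))) return "ScholarlyArticle";
  return "BlogPosting";
}

console.log(fairTypeForUrl("/jcrt/issue2/conclusion/"));
console.log(fairTypeForUrl("/featured/essay2/documenting/"));
console.log(fairTypeForUrl("/podcast/essay1/s2e6/"));
```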

Citation Automation #

The next layer is citation data.

Eve's post shows a useful pattern where front matter can include references and the generated metadata.json includes a Schema.org citation array. That matters because it turns a post from an isolated object into part of a citation network.

For adamdjbrett.com, the source field is:

references:
  - https://doi.org/10.59348/vrt01-f3b49
  - type: BlogPosting
    title: "Zotero-Nikola Harmony (One Simple Trick)"
    url: "https://paregorios.org/posts/2018/05/zotero_nikola_harmony/"
    author:
      name: Tom Elliott
      orcid: "https://orcid.org/0000-0002-4114-6677"
    date: 2018-05-08
    language: en

The generated field is:

citations:
  - '@type': ScholarlyArticle
    '@id': https://doi.org/10.59348/vrt01-f3b49
    name: Making blog posts harvestable by Zotero and preserving case in citation fields
    url: https://doi.org/10.59348/vrt01-f3b49

The helper script supports:

DOI URLs;
self-cites to local posts;
arbitrary web URLs;
manual override objects.

The basic resolver looks like this:

async function resolveReference(ref, localPosts) {
  if (ref && typeof ref === "object" && !Array.isArray(ref)) {
    return normalizeManualReference(ref);
  }

  if (typeof ref !== "string") {
    return null;
  }

  const url = ref.split(/\s+#/)[0].trim();
  if (!url) return null;

  if (doiFromUrl(url)) {
    return citationFromDoi(url);
  }

  const local = citationFromLocalPost(url, localPosts);
  if (local) {
    return local;
  }

  try {
    return await citationFromWebPage(url);
  } catch {
    return {
      "@type": "WebPage",
      "@id": url,
      name: url,
      url
    };
  }
}
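
The resolver calls a few helpers that are not shown. As a rough illustration only, two of them might look like the sketch below. The function bodies, the DOI pattern, the CreativeWork default, and the mapping from front matter keys to Schema.org properties are all assumptions, not the actual script.

```javascript
// Hypothetical helper: return the bare DOI from a doi.org URL, or null.
function doiFromUrl(url) {
  const match = /^https?:\/\/(?:dx\.)?doi\.org\/(10\.\d{4,9}\/\S+)$/i.exec(url);
  return match ? match[1] : null;
}

// YAML parsers usually turn an unquoted date into a Date object.
function toIsoDate(value) {
  return value instanceof Date ? value.toISOString().slice(0, 10) : String(value);
}

// Hypothetical helper: map a manual override object from front matter
// (type, title, url, author, date, language) to a Schema.org-style object.
function normalizeManualReference(ref) {
  const citation = { "@type": ref.type || "CreativeWork" };
  if (ref.url) {
    citation["@id"] = ref.url;
    citation.url = ref.url;
  }
  if (ref.title) citation.name = ref.title;
  if (ref.author && ref.author.name) {
    citation.author = { "@type": "Person", name: ref.author.name };
    if (ref.author.orcid) citation.author["@id"] = ref.author.orcid;
  }
  if (ref.date) citation.datePublished = toIsoDate(ref.date);
  if (ref.language) citation.inLanguage = ref.language;
  return citation;
}

console.log(doiFromUrl("https://doi.org/10.59348/vrt01-f3b49"));
console.log(normalizeManualReference({
  type: "BlogPosting",
  title: "Zotero-Nikola Harmony (One Simple Trick)",
  url: "https://paregorios.org/posts/2018/05/zotero_nikola_harmony/",
  author: { name: "Tom Elliott", orcid: "https://orcid.org/0000-0002-4114-6677" },
  date: new Date("2018-05-08"),
  language: "en"
}));
```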

For dominationchronicles.com, there was already a citation generator:

"citations": "node scripts/generate-citations.cjs"

That existing script writes RIS and CSL JSON files into public/citations. I did not want to replace it. Instead, the FAIR citation helper focuses only on structured citation data for JSON-LD.

It uses this input order:

use front matter references if present;
otherwise scan the episode body's Resources section for URLs;
write generated Schema.org objects back to citations only when explicitly run.

The resource scanner is intentionally simple:

function extractResourcesSection(body) {
  const start = body.search(/^##+\s+\*{0,2}Resources\*{0,2}\s*:?\s*$/im);
  if (start === -1) {
    return "";
  }

  const section = body.slice(start);
  const nextHeading = section.slice(1).search(/^##+\s+/m);
  return nextHeading === -1 ? section : section.slice(0, nextHeading + 1);
}

function extractUrlsFromResources(body) {
  const resources = extractResourcesSection(body);
  if (!resources) return [];

  const urls = [];
  const seen = new Set();

  for (const match of resources.matchAll(/https?:\/\/[^\s<>)\]]+/g)) {
    const url = match[0].replace(/[.,;:!?]+$/g, "");
    if (!seen.has(url)) {
      seen.add(url);
      urls.push(url);
    }
  }

  return urls;
}

That means existing episode resource lists can become structured citation metadata without rewriting every episode by hand.

Outcome uses the same idea, but broadens it to an archive. The helper is scripts/add-entry-citations.mjs, with this input order:

use front matter references when they exist;
otherwise use known citation-like front matter such as doi, canonical, canoncial, or pdf;
otherwise fall back to local how_to_cite text when available.

That last fallback is less rich than DOI or web metadata, but it is useful for an archive where many pages already contain human-readable citation guidance.
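
That input order can be sketched as a small selection function. The front matter field names come from the list above; the function name and the shape of the returned object are assumptions for illustration rather than the real contents of scripts/add-entry-citations.mjs.

```javascript
// Sketch of the Outcome input order: references first, then citation-like
// front matter, then how_to_cite text. The return shape is hypothetical.
function pickCitationInput(data) {
  if (Array.isArray(data.references) && data.references.length > 0) {
    return { source: "references", values: data.references };
  }

  // "canoncial" is included because the misspelled key exists in the archive.
  const fields = ["doi", "canonical", "canoncial", "pdf"];
  const values = fields
    .map((field) => data[field])
    .filter((value) => typeof value === "string" && value.trim() !== "");
  if (values.length > 0) {
    return { source: "front-matter", values };
  }

  if (typeof data.how_to_cite === "string" && data.how_to_cite.trim() !== "") {
    return { source: "how_to_cite", values: [data.how_to_cite.trim()] };
  }

  return { source: "none", values: [] };
}

console.log(pickCitationInput({ doi: "https://doi.org/10.59348/vrt01-f3b49" }));
```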

The npm scripts are explicit:

{
  "citations:fair": "node scripts/add-entry-citations.mjs",
  "citations:fair:check": "node scripts/add-entry-citations.mjs --dry-run"
}

The dry run is especially important on Outcome because the script can touch a lot of files. In the first check it found 171 entries with citation-like inputs that could be normalized into citations.

The dry-run command reports what it would update:

npm run citations:fair:check

And the write command is explicit:

npm run citations:fair

That separation is important. Metadata scripts should be inspectable before they rewrite editorial files.
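
One way to get that separation is to plan every change first and let a single flag decide whether anything is written. The sketch below is a generic illustration of the pattern, not the actual script; the function name, the shape of the update objects, and the injectable writeFile option are assumptions.

```javascript
const fs = require("node:fs");

// Sketch: collect the files that would change, and write only when
// dryRun is false. writeFile is injectable so the logic can be exercised
// without touching the file system.
function applyUpdates(updates, { dryRun, writeFile = fs.writeFileSync }) {
  const changed = [];
  for (const { file, before, after } of updates) {
    if (before === after) continue; // nothing to change for this file
    changed.push(file);
    if (!dryRun) writeFile(file, after, "utf8");
  }
  return changed;
}

// The flag name matches the npm script above.
const dryRun = process.argv.includes("--dry-run");
const changed = applyUpdates([], { dryRun });
console.log((dryRun ? "Would update " : "Updated ") + changed.length + " files");
```

Both modes return the same report, so the dry run shows exactly the set of files the write run would touch.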

Build Verification #

For adamdjbrett.com, I verified:

npm run build:fast

Then checked that representative files existed and parsed:

node -e "const fs=require('fs'); for (const f of ['_site/blog/zotero-harvestable-blog-posts/metadata.json','_site/blog/2026-06-29-on-being-in-the-middle/metadata.json']) { const d=JSON.parse(fs.readFileSync(f,'utf8')); console.log(f, d['@type'], d.name, d.isPartOf && d.isPartOf['@id']); }"

Expected output shape:

_site/blog/zotero-harvestable-blog-posts/metadata.json BlogPosting Making Blogs Easier to Cite with Zotero https://www.adamdjbrett.com/blog/
_site/blog/2026-06-29-on-being-in-the-middle/metadata.json BlogPosting Book Review: *On Being in the Middle* by W.J. de Kock https://www.adamdjbrett.com/blog/

For dominationchronicles.com, I verified:

npm run build

Then checked:

node -e "const fs=require('fs'); for (const f of ['_site/episodes/e008-words-meanings/metadata.json','_site/episodes/e017-bruce-mcivor-legalized-lawlessness/metadata.json']) { const d=JSON.parse(fs.readFileSync(f,'utf8')); console.log(f, d['@type'], d.url, d.partOfSeries && d.partOfSeries['@id']); }"

Expected output shape:

_site/episodes/e008-words-meanings/metadata.json PodcastEpisode https://dominationchronicles.com/episodes/e008-words-meanings/ https://dominationchronicles.com/#podcast
_site/episodes/e017-bruce-mcivor-legalized-lawlessness/metadata.json PodcastEpisode https://dominationchronicles.com/episodes/e017-bruce-mcivor-legalized-lawlessness/ https://dominationchronicles.com/#podcast

For Outcome, I verified:

npm run build-prod-fastest

Then checked one scholarly article, one blog-style entry, and one older podcast entry:

node -e "const fs=require('fs'); for (const p of ['_site/jcrt/issue2/conclusion/metadata.json','_site/featured/essay2/documenting/metadata.json','_site/podcast/essay1/s2e6/metadata.json']) { const j=JSON.parse(fs.readFileSync(p,'utf8')); console.log(p, j['@type'], j.url, Boolean(j.author), Boolean(j.citation), j.partOfSeries && j.partOfSeries['@id'], j.isPartOf && j.isPartOf['@id']); }"

The output confirmed the mixed-type model:

_site/jcrt/issue2/conclusion/metadata.json ScholarlyArticle https://jcrt.org/archives/25.1/conclusion/ true true undefined undefined
_site/featured/essay2/documenting/metadata.json BlogPosting https://www.mdpi.com/2077-1444/15/12/1493 true true undefined undefined
_site/podcast/essay1/s2e6/metadata.json PodcastEpisode https://podcast.doctrineofdiscovery.org/season2/episode-06/ true true https://outcome.doctrineofdiscovery.org/podcast/ undefined

I also checked the rendered HTML for signposting links and the generated _headers files.

For the broader 11ty pass, I ran the normal build command in each local repo:

cd /Users/abrett76/github/ebenezer && npm run build
cd /Users/abrett76/github/davidbrett.im && npm run build
cd /Users/abrett76/github/stevennewcomb && npm run build
cd /Users/abrett76/github/indigenouslawinstitute.com && npm run build
cd /Users/abrett76/github/alfieaward.com && npm run build

Then I parsed every generated metadata.json file from those five sites:

node -e "const fs=require('fs'), cp=require('child_process'); const files=cp.execSync('find ebenezer/_site davidbrett.im/_site stevennewcomb/_site indigenouslawinstitute.com/_site alfieaward.com/_site -path \\'*/metadata.json\\'', {encoding:'utf8'}).trim().split(/\\n/).filter(Boolean); for (const f of files) JSON.parse(fs.readFileSync(f,'utf8')); console.log('Parsed '+files.length+' Eleventy metadata.json files');"

That produced:

Parsed 149 Eleventy metadata.json files
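
The one-liner shells out to find, which ties it to a Unix-like shell. A portable variant could walk the build directories in Node and report the files that fail to parse instead of stopping at the first error. This is a sketch under those assumptions; the function names are hypothetical.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Sketch: collect every metadata.json under a build directory
// without shelling out to find.
function findMetadataFiles(dir) {
  const found = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) found.push(...findMetadataFiles(full));
    else if (entry.name === "metadata.json") found.push(full);
  }
  return found;
}

// Parse each file; count successes and record failures with their messages.
function checkMetadataFiles(dirs) {
  const failures = [];
  let parsed = 0;
  for (const dir of dirs) {
    for (const file of findMetadataFiles(dir)) {
      try {
        JSON.parse(fs.readFileSync(file, "utf8"));
        parsed += 1;
      } catch (error) {
        failures.push({ file, message: error.message });
      }
    }
  }
  return { parsed, failures };
}
```

Calling checkMetadataFiles with a list of _site directories returns the parsed count alongside any failures, so one broken file does not hide the state of the rest.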

I also checked representative rendered HTML for rel="describedby", rel="type", and citation_title. One useful catch came from stevennewcomb.com: the metadata files were generated, but the first head condition missed the rendered posts because the site used post.njk instead of post. That is why rendered HTML checks matter. File generation alone is not enough. The human-facing page has to advertise the machine-facing file.

For the Jekyll sites, I verified the generator plugins with Ruby syntax checks:

for f in \
  thenandnow-new/_plugins/fair_metadata_json.rb \
  derryveagh-new/_plugins/fair_metadata_json.rb \
  stakedplains-new/_plugins/fair_metadata_json.rb \
  quotidian/_plugins/fair_metadata_json.rb \
  nabpr/_plugins/fair_metadata_json.rb \
  nabpr-pbi2/_plugins/fair_metadata_json.rb \
  wocati-new/_plugins/fair_metadata_json.rb
do
  ruby -c "$f"
done

All seven returned Syntax OK. Full Jekyll builds were blocked in my local environment by dependency state rather than by the plugin syntax. The errors included missing Bundler 4.0.11, missing local Jekyll gems, and rake activation conflicts. That is the kind of infrastructure detail I want to preserve in the implementation notes because it tells the truth about old static-site repos: sometimes the metadata work is straightforward and the build environment is the hard part.

Why This Matters #

This work is not about SEO in the narrow sense. It might help search engines, but that is not the main point.

The real point is scholarly portability.

A blog post about Zotero metadata should itself be easy to cite. A podcast episode with a transcript and a resource list should be discoverable as a podcast episode, not just another web page. A book review, article announcement, or episode resource list should have enough structured metadata that a researcher or archive can understand what it is.

Static sites do not get this automatically. That is the tradeoff. They are durable and transparent, but the metadata affordances have to be built deliberately.

The pattern I want to keep using is:

write human-facing pages in Markdown;
keep stable site metadata in Eleventy data files;
generate page-adjacent metadata.json;
advertise that metadata from HTML and headers where possible;
keep citation data in front matter as structured JSON-LD-ready objects;
verify the generated files as part of the build workflow.

That is not a large system. It is a modest layer of structured metadata. But it makes independent scholarly publishing more legible to the rest of the web.

And that is the part I care about most: if we want small sites, independent publications, podcasts, and personal scholarly blogs to last, we have to make them easier to cite, easier to harvest, and easier to preserve.

Tags : eleventy jekyll metadata fair zotero digital-humanities web-development
