
2MB Lighter

Building my third Jekyll gem
a diary by niko

The mermaid.js chart library weighs in at ~2MB minified. I wanted diagrams in my blog posts, but not at that cost. The solution became my third Jekyll gem: jekyll-mermaid-prebuild.

The Setup

I had a blog post with several Mermaid diagrams explaining a firmware debugging story. The standard approach - include mermaid.js and let it render client-side - would add 2MB to every page load. For static diagrams that never change after publishing, that’s absurd.

The mermaid project provides mmdc, a CLI that renders diagrams to SVG using headless Chrome. A Jekyll plugin could intercept mermaid code blocks during build, shell out to mmdc, and replace them with static SVG references. No client-side JavaScript needed.
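The shell-out step might look roughly like the sketch below. It is not the gem's actual MmdcWrapper: mmdc_command and render_svg are hypothetical names, and the only mmdc options assumed are -i (input file) and -o (output file).

```ruby
require "open3"

# Hypothetical helper: builds the argument list for one render.
# -i is mmdc's input-file option, -o its output-file option.
def mmdc_command(input_path, output_path, executable: "mmdc")
  [executable, "-i", input_path, "-o", output_path]
end

# Runs the command without a shell and returns [success?, stderr].
def render_svg(input_path, output_path, executable: "mmdc")
  command = mmdc_command(input_path, output_path, executable: executable)
  _stdout, stderr, status = Open3.capture3(*command)
  [status.success?, stderr]
rescue Errno::ENOENT => e
  # Raised when the executable cannot be found on PATH.
  [false, e.message]
end
```

Passing the arguments as an array avoids shell quoting problems with file paths, and capturing stderr keeps the failure text available for the error handling described later.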

Prototype in _plugins/

Following the pattern from jekyll-auto-thumbnails, I started with a local plugin before extracting to a gem. Faster iteration, immediate feedback.

The first attempt operated on rendered HTML, scanning for <code class="language-mermaid"> blocks. This worked, but produced inline SVG data URIs - ugly in the output and not separately cacheable.

Better approach: operate on markdown during :pre_render. Find mermaid code blocks, convert to SVG files, replace with image references. The SVGs become proper static assets - cacheable by browsers, clickable for full-size viewing.

Puppeteer in WSL

First test run:

MermaidPrebuild: Initialized (mmdc 11.12.0)
MermaidPrebuild: mmdc failed: error while loading shared libraries: libgbm.so.1

The mermaid CLI uses Puppeteer (headless Chrome) internally. WSL - my local build environment - doesn’t ship with Chrome’s system library dependencies.

sudo apt-get install -y libgbm1 libasound2 libatk1.0-0 \
  libatk-bridge2.0-0 libcups2 libdrm2 libxcomposite1 \
  libxdamage1 libxfixes3 libxrandr2 libxkbcommon0 \
  libpango-1.0-0 libcairo2 libnss3 libnspr4

After installing the dependencies, mmdc worked. The plugin detects this failure mode and prints the apt-get command in the error message.
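Detecting that failure mode could be as simple as matching the dynamic loader's message in mmdc's stderr. This is a sketch under assumptions: missing_library_hint is a hypothetical name, the hint wording is invented for illustration, and the package list is simply the one from the apt-get command above.

```ruby
# Packages copied from the apt-get command above.
CHROME_DEPS = %w[
  libgbm1 libasound2 libatk1.0-0 libatk-bridge2.0-0 libcups2 libdrm2
  libxcomposite1 libxdamage1 libxfixes3 libxrandr2 libxkbcommon0
  libpango-1.0-0 libcairo2 libnss3 libnspr4
].freeze

# The loader names the library it could not find, e.g. libgbm.so.1.
MISSING_LIB = /error while loading shared libraries: ([^\s:]+)/

# Returns an install hint when stderr shows a missing shared library,
# or nil for any other kind of failure.
def missing_library_hint(stderr)
  match = MISSING_LIB.match(stderr)
  return nil unless match

  "Chrome could not load #{match[1]}. On Debian/Ubuntu, try:\n" \
    "  sudo apt-get install -y #{CHROME_DEPS.join(' ')}"
end
```

Returning nil for unrelated failures lets the caller fall back to printing mmdc's own error text, such as a diagram syntax error.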

The Hook Timing Bug

With Puppeteer working, the plugin… did nothing. No diagrams converted. Debug output showed the :pre_render hook firing, but site.data["mermaid_prebuild_enabled"] was nil.

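Jekyll’s :after_init hook seemed like the right place to check if mmdc exists and store the result. But site.data doesn’t persist between hooks the way I expected. As far as I can tell, :after_init fires before Jekyll reads the site, and the reset and read phases rebuild site.data, discarding anything stored earlier. By the time :pre_render runs on each document, the flag was gone.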

The fix: use :post_read instead of :after_init. At that point, site configuration and data are stable.

Jekyll::Hooks.register :site, :post_read do |site|
  next unless Configuration.enabled?(site)
  
  site.data["mermaid_prebuild"] = { "enabled" => true, "registry" => {} }
  Jekyll.logger.info "MermaidPrebuild:", "Initialized (mmdc #{MmdcWrapper.version})"
end

Code Fence Patterns

Markdown supports two fence styles. Backticks:

```mermaid
graph LR
  A --> B
```

Or tildes:

~~~mermaid
graph LR
  A --> B
~~~

Both can use 3+ fence characters. The plugin needed to handle either style and ensure the closing fence matches the opening fence in both character type and count.

The regex:

%r{
  ^(`{3,}|~{3,})mermaid\s*\n  # Opening: 3+ backticks or tildes
  (.*?)                        # Content (non-greedy)
  ^\1\s*$                      # Closing: must match opener
}mx

The \1 backreference ensures a block opened with four backticks closes with four backticks, not three tildes.
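A quick way to see the backreference at work is to run the regex over a small document. The constant name MERMAID_FENCE and the sample text are made up for this demonstration; the sample uses tilde fences only so that it can sit inside a code block itself.

```ruby
MERMAID_FENCE = %r{
  ^(`{3,}|~{3,})mermaid\s*\n  # Opening: 3+ backticks or tildes
  (.*?)                        # Content (non-greedy)
  ^\1\s*$                      # Closing: must match opener
}mx

doc = <<~MD
  Intro

  ~~~~mermaid
  graph LR
    A --> B
  ~~~
  ~~~~

  Outro
MD

# scan yields [fence, body] pairs; keep only the diagram source.
blocks = doc.scan(MERMAID_FENCE).map { |_fence, body| body }

# The three-tilde line does not close the four-tilde block,
# so it stays part of the captured body.
puts blocks.first
```

A block opened with four tildes and "closed" with only three produces no match at all, which is the behavior the backreference is there to enforce.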

To show these examples without converting them, the plugin respects fence nesting - mermaid blocks inside outer fences are preserved as code.
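The regex on its own cannot see nesting: it would match a mermaid block sitting inside a longer outer fence. The post does not show how the plugin handles this, so the following is only one possible approach, a line-by-line scan that tracks the currently open fence. The names FENCE_OPEN and top_level_mermaid_blocks are hypothetical, and the closing rule (exact same fence string) mirrors the regex above rather than the more permissive Markdown rules.

```ruby
# An opening fence: 3+ backticks or tildes, then an optional language tag.
FENCE_OPEN = /\A(`{3,}|~{3,})\s*(\S*)/

# Returns the bodies of mermaid blocks that are not nested in another fence.
def top_level_mermaid_blocks(markdown)
  blocks = []
  fence = nil # opening fence of the block we are currently inside
  lang = nil
  body = []

  markdown.each_line do |line|
    if fence.nil?
      match = line.match(FENCE_OPEN)
      fence, lang, body = match[1], match[2], [] if match
    elsif line.rstrip == fence
      # Closing fence: only mermaid blocks are collected.
      blocks << body.join if lang == "mermaid"
      fence = nil
    else
      # Everything inside an open fence is plain content,
      # including inner fences of a different length.
      body << line
    end
  end

  blocks
end

doc = <<~MD
  ~~~~markdown
  ~~~mermaid
  graph LR
  ~~~
  ~~~~

  ~~~mermaid
  graph TD
    A --> B
  ~~~
MD

found = top_level_mermaid_blocks(doc)
```

Here the first mermaid block lives inside a four-tilde markdown fence and is skipped, while the second one is at the top level and is returned.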

Gem Extraction

Once the local plugin worked, extraction followed the same TDD pattern as the previous gems. Six modules:

  1. Configuration - Parse _config.yml settings
  2. MmdcWrapper - Shell out to mmdc, handle errors
  3. DigestCalculator - MD5 for cache keys
  4. Processor - Find and replace mermaid blocks
  5. Generator - Copy SVGs to _site/
  6. Hooks - Wire into Jekyll lifecycle
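The cache-key idea behind DigestCalculator can be sketched in a few lines. This is an assumption about the shape of the approach, not the gem's code: cache_key and svg_filename are hypothetical names, and the .svg naming scheme is illustrative.

```ruby
require "digest"

# Identical diagram source always hashes to the same key,
# so an unchanged diagram can reuse its previously rendered SVG.
def cache_key(diagram_source)
  Digest::MD5.hexdigest(diagram_source)
end

# Hypothetical naming scheme: one SVG file per unique diagram.
def svg_filename(diagram_source)
  "#{cache_key(diagram_source)}.svg"
end
```

MD5 is adequate here because the digest only identifies content for caching; it is not being used for anything security-sensitive.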

42 tests covering the module interfaces. The tests mock mmdc execution since you can’t assume Puppeteer dependencies in CI.

CodeRabbit’s Review

Three valid concerns after the initial push:

1. Cross-platform which

Kernel.system("which mmdc > /dev/null 2>&1")

The which command doesn’t exist on Windows. Fixed with pure Ruby PATH scanning:

def command_exists?(cmd)
  cmd_name = Gem.win_platform? ? "#{cmd}.exe" : cmd
  path_dirs = ENV.fetch("PATH", "").split(File::PATH_SEPARATOR)
  
  path_dirs.any? do |dir|
    File.executable?(File.join(dir, cmd_name))
  end
end

2. Windows Tempfile locking

The output Tempfile wasn’t closed before mmdc tried to write to it. Windows locks open files, so mmdc would fail. Fixed by closing the tempfile before the subprocess runs.

3. Missing SVG handling

If a cached SVG somehow disappeared, FileUtils.cp would crash the build. Added existence check with a warning instead of hard failure.
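The second and third fixes can be sketched together. Both helpers are hypothetical stand-ins: render_via_subprocess takes a lambda that builds the child command so the example stays independent of mmdc, and copy_svg uses Kernel#warn where a plugin would more likely use Jekyll's logger.

```ruby
require "fileutils"
require "tempfile"

# Reserve an output path, release our own handle, then let the child write.
def render_via_subprocess(command_for)
  out = Tempfile.new(["diagram", ".svg"])
  out.close # closes the handle; the file itself stays on disk
  ok = system(*command_for.call(out.path))
  ok ? File.read(out.path) : nil
ensure
  out&.unlink # remove the temporary file once we are done with it
end

# Copy a cached SVG, warning instead of raising when it has gone missing.
def copy_svg(source, destination)
  unless File.exist?(source)
    warn "MermaidPrebuild: cached SVG missing, skipping: #{source}"
    return false
  end

  FileUtils.mkdir_p(File.dirname(destination))
  FileUtils.cp(source, destination)
  true
end
```

Closing the Tempfile without unlinking it keeps the reserved path valid for the subprocess, and returning false from copy_svg lets the build continue with one diagram missing rather than aborting.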

Examples in Action

The diagrams in my firmware debugging post now render as static SVGs. A flowchart that would have required 2MB of JavaScript:

[Image: Mermaid diagram]

The generated SVG is ~12KB. It is wrapped in a link to itself, so complex diagrams can be opened full-size.

Final Stats

  • 47 tests, all passing
  • 70%+ code coverage
  • Published to RubyGems: jekyll-mermaid-prebuild
  • Installation: gem install jekyll-mermaid-prebuild

Three gems now: jekyll-highlight-cards for styled link and image cards, jekyll-auto-thumbnails for automatic image optimization, and jekyll-mermaid-prebuild for build-time diagram rendering.

What I Learned

Jekyll’s site.data doesn’t persist the way you’d expect between hooks. Data set in :after_init may not be available in :pre_render. Use :post_read for site-wide state that needs to survive into document processing.

Puppeteer dependencies vary by platform. The error message when Chrome can’t launch is cryptic (cannot open shared object file). Detecting this failure mode and printing the exact apt-get command saves users time.

Regex backreferences match fence styles elegantly. ^\1\s*$ ensures opening and closing fences match both in character and count. No need to track state between matches.

Windows file locking affects tempfiles. On Unix, an open file descriptor doesn’t prevent other processes from writing. On Windows, it can, depending on the sharing mode the file was opened with. Close tempfiles before subprocesses write to them.

Pure Ruby PATH scanning beats shell commands for portability. The which command doesn’t exist on Windows, and where doesn’t exist on Unix. Scanning ENV["PATH"] with File.executable? works everywhere.
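One caveat worth sketching: on Windows, npm-installed CLIs are typically exposed through a .cmd shim rather than an .exe, so a scan that only tries one extension can miss them. The variant below is a hypothetical extension of the idea, not the gem's code. It consults PATHEXT, the Windows list of executable extensions, and takes the environment as a parameter purely to make it easy to test.

```ruby
# Hypothetical variant of the PATH scan that honors PATHEXT on Windows.
def executable_on_path?(cmd, env = ENV)
  # On Windows, try each extension from PATHEXT; elsewhere, the bare name.
  extensions =
    if Gem.win_platform?
      env.fetch("PATHEXT", ".COM;.EXE;.BAT;.CMD").split(";")
    else
      [""]
    end

  env.fetch("PATH", "").split(File::PATH_SEPARATOR).any? do |dir|
    extensions.any? do |ext|
      candidate = File.join(dir, "#{cmd}#{ext}")
      # File.file? excludes directories, which can also be "executable".
      File.file?(candidate) && File.executable?(candidate)
    end
  end
end
```

Calling executable_on_path?("mmdc") would then behave the same on Unix and find mmdc.cmd on a Windows machine where the CLI was installed through npm.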

The Repository

The code is at Texarkanine/jekyll-mermaid-prebuild with the full implementation history.

Three gems down, 2MB lighter per page.
