Performance · 8 min read

Cutting musicmod's CI from 12 minutes to 2

The receipts

Recent musicmod tag builds, taken straight from gh run list:

Tag        Build time   Notes
v1.0.29    24m 49s      the worst
v1.0.31    15m 42s
v1.0.30    10m 50s
v1.0.32    11m 38s      typical
v1.0.34    10m 27s      last build under the old workflow
v1.0.35    6m 22s       first build with the new workflow, cold cache
v1.0.36    2m 07s       first build with the cache warmed up
v1.0.41    1m 57s
v1.0.42    2m 03s
v1.0.43    1m 53s       a recent one

Median dropped from ~12 minutes to ~2 minutes. About a 6× speedup, and crucially, the per-tag cost stopped being a thing I had to budget for.
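For anyone reproducing the table: gh run list exposes timestamps rather than durations in its JSON output, so the numbers above come from something like the following — the workflow filename is an assumption, and the duration arithmetic is mine.

gh run list --workflow release.yml --limit 20 \
  --json displayTitle,startedAt,updatedAt |
  jq -r '.[] | "\(.displayTitle)  \((.updatedAt | fromdate) - (.startedAt | fromdate))s"'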

Why it mattered

musicmod — Tempo, the karaoke / music player — ships every change behind a tagged release. I push tags often: a feature, a fix, a UX tweak, a rendering polish. In April I tagged it ~13 times in two days during a UI sprint. Each tag = one CI run. At ~12 minutes a pop, that's ~2.5 hours of GitHub-hosted Linux runner time for a small frontend project.

I'm on the Pro plan: 3000 free minutes per month, then $0.008 per minute for Linux runners. After the second time GitHub Billing emailed to say I'd blown through the padding I'd budgeted, I went looking for what was actually expensive.

The forensics

The Enhanced Billing API has a per-repo breakdown the rolled-up "minutes used" page hides:

# each usage item reports quantity in the unit named by unitType ("Minutes" for Actions)
gh api '/users/awkto/settings/billing/usage?year=2026&month=4' \
  | jq '.usageItems[] | {repo: .repositoryName, minutes: .quantity}'
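The items come back per-SKU and per-day, so a per-repo total takes one more jq pass. A sketch — the field names are from the enhanced billing endpoint's documented schema; the "actions" product filter and the grouping are my assumptions:

gh api '/users/awkto/settings/billing/usage?year=2026&month=4' \
  | jq '[.usageItems[] | select(.product == "actions")]
        | group_by(.repositoryName)
        | map({repo: .[0].repositoryName, minutes: (map(.quantity) | add)})'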

Two repos accounted for 90% of my April spend. One was openbao-fork-old — a CI workflow on a fork I was no longer maintaining, building on every push because it had inherited the upstream's matrix. I disabled the workflow file outright. Done; that was easy.

The interesting one was musicmod, sitting at 1807 minutes for the month. Over the same period I'd cut ~30 tagged releases — roughly 60 billed minutes per tag. The mechanism: every run was two parallel jobs of ~10 minutes each, and GitHub bills each job's minutes separately even though they run concurrently, so even a median run cost ~20 billed minutes, with the slow outliers costing far more. The spend was real, and the question was: is any of that work actually load-bearing?

What was actually being built

The musicmod release workflow was a strategy.matrix of two image variants:

matrix:
  include:
    - name: allinone-cpu
      image: awkto/tempo
      dockerfile: ./Dockerfile
    - name: worker-cpu
      image: awkto/tempo-worker
      dockerfile: ./Dockerfile.worker
    # CUDA variant disabled previously — runners ran out of disk

The all-in-one image is the only thing actually deployed: a single container with nginx + FastAPI + ARQ + Redis + supervisord, packaged for hobbyist self-hosters who want one docker run line. The worker image was an artifact of an earlier scale-out plan that hadn't materialised. No one was running a split worker against this codebase, including me. Three or four minutes of build time per tag, on every tag, for an image with zero deployments.
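For scale, that "one docker run line" is roughly this — the port and data path are illustrative, not the project's documented flags:

# ports and the data path here are hypothetical, not from the repo's docs
docker run -d --name tempo \
  -p 8080:80 \
  -v "$PWD/data:/data" \
  awkto/tempo:latest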

The other suspicious thing was the cache. Both jobs used:

cache-from: type=gha,scope=${{ matrix.name }}
cache-to: type=gha,scope=${{ matrix.name }},mode=max

GitHub Actions cache (type=gha) is a per-repo blob store with a 10 GB hard ceiling and aggressive eviction — least-recently-used entries get nuked when the limit is hit. The musicmod base layer is enormous: torch + demucs + the Python venv runs about 5 GB on disk. Two matrix entries, each pushing 5 GB of cache, with a 10 GB ceiling. The cache was constantly evicting itself. Builds were paying the full no-cache cost on every run.
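You could watch the eviction happen, too: gh cache list shows every entry with its size, and after any two consecutive builds the other scope's entries were already gone. A sketch — the repo path is assumed; the sort flags are gh's documented ones:

gh cache list --repo awkto/musicmod --sort size_in_bytes --order desc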

So: I was building an image no one used, on a cache that didn't survive between builds, twice in parallel.

The three changes

1. Drop the unused matrix entry

The simplest win. Cut worker-cpu from the matrix:

matrix:
  include:
    - name: allinone-cpu
      image: awkto/tempo
      dockerfile: ./Dockerfile
      variant_suffix: ""
      short_description: "Tempo — self-hosted karaoke / music player (all-in-one)"

That's a whole job (and ~3-4 minutes of builder time) gone per tag. Re-adding it later is one matrix entry away.

2. Switch buildx cache to a registry-backed manifest

cache-from: type=registry,ref=${{ matrix.image }}:buildcache
cache-to: type=registry,ref=${{ matrix.image }}:buildcache,mode=max

The cache now lives as a tagged manifest in Docker Hub — awkto/tempo:buildcache — alongside the published image tags. Docker Hub doesn't have a 10 GB cap; it has a "use a reasonable amount of free storage" cap, and a 5 GB cache fits comfortably. Layers are deduplicated against the actual published image, so storing the cache costs roughly nothing extra.

The other property that matters: registry caches don't get evicted between builds. Every tagged release reuses the same cache key. The first build after this change had to populate it (the "cold" 6m22s run for v1.0.35); every subsequent build skipped the entire uv sync step because the resulting venv layer was already in the cache. That's where the bulk of the savings come from.
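In context, the build step ends up looking something like this — a sketch assuming docker/build-push-action with tags derived from the pushed ref; the two cache-* lines are the only actual change:

- name: Build and push
  uses: docker/build-push-action@v6
  with:
    context: .
    file: ${{ matrix.dockerfile }}
    push: true
    tags: ${{ matrix.image }}:${{ github.ref_name }}
    # registry cache export needs the docker-container driver,
    # i.e. a docker/setup-buildx-action step earlier in the job
    cache-from: type=registry,ref=${{ matrix.image }}:buildcache
    cache-to: type=registry,ref=${{ matrix.image }}:buildcache,mode=max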

3. Add a .dockerignore

Less dramatic but worth doing. Without one, docker build uploads the entire repo as build context — node_modules, the data dir with bind-mounted SQLite WAL, IDE state, the .git folder. For musicmod that was about 600 MB of pointless upload before any layer started.

.git
**/node_modules
frontend/dist
**/__pycache__
**/.venv
data
android
**/*.tsbuildinfo
.github
*.md

Two effects:

  1. The context tarball goes from ~600 MB to ~20 MB. A few seconds saved per build.
  2. Random local file changes don't cache-bust the build. Without .dockerignore, a change under frontend/node_modules (which I don't edit, but npm install sometimes touches things on its own schedule) would invalidate every layer after COPY frontend ./. With it, the build sees the repo's intended state rather than working-tree noise; the layer ordering at stake is sketched below.

This one's also defence-in-depth: it stops me from ever shipping something stupid in the build context, like a stray .env.
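For reference, the layer ordering all of this protects — a hypothetical excerpt, not the actual Dockerfile:

# Dependency layers first: rebuilt only when the lockfiles change.
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen        # the ~5 GB torch/demucs venv layer

# Application code last: edits here leave the expensive layers cached.
COPY frontend ./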

What I deliberately didn't change

A few things I considered and decided weren't worth it.

What this looks like in gh run list

Before, in early April:

v1.0.29   24m 49s
v1.0.31   15m 42s
v1.0.32   11m 38s
v1.0.30   10m 50s
v1.0.28   11m 15s

After, in the last week of April:

v1.0.43    1m 53s
v1.0.42    2m 03s
v1.0.41    1m 57s
v1.0.40    1m 51s
v1.0.39    1m 47s
v1.0.38    1m 52s
v1.0.37    1m 46s
v1.0.36    2m 07s

The transition is the v1.0.35 build — 6m 22s — where the registry cache had to populate from scratch. After that, every subsequent build paid only for the layers that actually changed.

The lesson

The win didn't come from a clever build trick. It came from two pieces of waste sitting in the workflow file that I'd never re-read:

  1. A matrix entry that was producing artefacts no one used.
  2. A cache backend that was structurally incapable of surviving between builds.

Both had been there since I scaffolded the repo. Both were obvious in hindsight. Neither would ever have bothered me if GitHub hadn't started billing me for the consequences.

CI is the kind of thing where a small permanent waste compounds into a real number you eventually notice. Worth re-reading the workflow file every few months and asking "is anything in here still doing what I think it's doing?"