· Infra · 9 min read
Why GitLab has a separate secrets file — and why Geo doesn't sync it
I did a planned Geo failover last week — primary on one box, secondary on another, flip a CNAME and the secondary becomes primary. The promotion itself worked. Then my API token stopped authenticating.
The cause turned out to be a small file I'd skipped during initial Geo setup: /etc/gitlab/gitlab-secrets.json. The previous post on the failover covered the operational fix in one paragraph. This post is about why that file exists at all, what it actually does, and why GitLab deliberately keeps it out of the database — which is also why no replication mechanism syncs it automatically.
I spent more time learning this than I expected. So here it is.
The two-store split
Mental model first, because it makes everything else click:
The database is the safe.
gitlab-secrets.json is the key to the safe.
GitLab's PostgreSQL database holds essentially everything users create — issues, merge requests, comments, project metadata, CI variables, webhook URLs, runner tokens, OAuth applications, 2FA seeds, the whole user-visible product. It also holds secrets: things you definitely don't want a database backup leak to expose, like the auth token a webhook uses to post to your private Slack.
To make that subset of fields safe to store, GitLab encrypts those specific columns before they hit Postgres, using keys that live outside the database. Those keys are in gitlab-secrets.json on every GitLab node.
So when you create a CI variable marked "Protected", here's what happens:
- Your browser sends MY_SECRET=hunter2 to Rails.
- Rails reads the master encryption key (db_key_base) from /etc/gitlab/gitlab-secrets.json.
- Rails encrypts hunter2 with that key.
- Rails writes the ciphertext into ci_variables.encrypted_value.
- When a CI job needs the value, Rails reads the ciphertext back out of PG and decrypts it with the same key.
If somebody walks off with your nightly PG dump but doesn't have the secrets file, every encrypted column is opaque to them. Without the keys, the safe is welded shut.
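To make the split concrete, here is a minimal sketch of the same flow in Python. Everything in it is an assumption for illustration: the cipher is a toy SHA-256 keystream XOR, not whatever Rails actually uses, and the names encrypt_column, decrypt_column, secrets_file and database are hypothetical. The only point is that the key and the ciphertext live in two different stores.

```python
import hashlib
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` bytes from key + nonce using SHA-256 in counter mode."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_column(key: bytes, plaintext: str) -> bytes:
    """Toy encryption (illustration only): random nonce followed by XOR-ed data."""
    nonce = os.urandom(16)
    data = plaintext.encode()
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def decrypt_column(key: bytes, blob: bytes) -> str:
    """Reverse encrypt_column; a wrong key yields garbage, not an error."""
    nonce, ciphertext = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(ciphertext))
    raw = bytes(a ^ b for a, b in zip(ciphertext, stream))
    return raw.decode("utf-8", errors="replace")


# Store 1: the secrets file (hypothetical, made-up key value).
secrets_file = {"gitlab_rails": {"db_key_base": "not-a-real-key"}}
# Store 2: the database, which only ever sees ciphertext.
database = {"ci_variables": []}

key = secrets_file["gitlab_rails"]["db_key_base"].encode()
database["ci_variables"].append(
    {"key": "MY_SECRET", "encrypted_value": encrypt_column(key, "hunter2")}
)

row = database["ci_variables"][0]
print(decrypt_column(key, row["encrypted_value"]))  # prints hunter2
# A dump of `database` alone, read with any other key, does not give hunter2 back.
print(decrypt_column(b"some-other-key", row["encrypted_value"]) == "hunter2")
```

A real implementation would use an authenticated cipher so that a wrong key fails loudly instead of returning garbage.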
What's actually in the file
Despite the name, it's not a single secret — it's a JSON document with keys belonging to half a dozen GitLab subsystems. A real (slightly redacted) example:
{
"gitlab_rails": {
"db_key_base": "873f54e2...",
"secret_key_base": "75e20d32...",
"otp_key_base": "3447fc6a...",
"encrypted_settings_key_base": "9a1d2f04...",
"openid_connect_signing_key": "-----BEGIN RSA PRIVATE KEY-----\n..."
},
"gitlab_shell": { "secret_token": "..." },
"gitlab_workhorse": { "secret_token": "..." },
"gitlab_kas": { "private_api_secret_key": "...",
"internal_api_secret_key": "..." },
"registry": { "http_secret": "..." }
}
Each key does a different job, with a different blast radius if compromised:
| Key | Used for | Why a separate key? |
|---|---|---|
| db_key_base | The heavy lifter: encrypts most sensitive DB columns | Rotating this requires re-encrypting touched data |
| secret_key_base | Signs Rails session cookies (MAC, not encryption) | Different operation (signing), can rotate independently |
| otp_key_base | Encrypts 2FA TOTP seeds | Isolated so a leak of db_key_base doesn't compromise 2FA |
| encrypted_settings_key_base | Newer-pattern encrypted columns in application_settings | Modern ActiveRecord::Encryption style |
| openid_connect_signing_key | RSA private key for OIDC token signing | This isn't a symmetric key at all; it's an identity key |
| gitlab_workhorse.secret_token | Service-to-service auth between Rails and Workhorse | Internal API auth, not user data |
| gitlab_shell.secret_token | Service-to-service auth Rails ↔ gitlab-shell | Same |
| gitlab_kas.private_api_secret_key | KAS (Kubernetes Agent Server) ↔ Rails | Same |
| registry.http_secret | Container registry shared secret | Registry trusts Rails-issued tokens via this |
The naming is fascinating archaeology. Some of these were added in 8.x, some in 14.x, some last year. GitLab has consciously split keys over time — the otp_key_base separation, for example, came after a security advisory. If a future bug exposes the db_key_base, your users' 2FA seeds are still safe behind their own key.
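If you want to see which keys your own file carries without printing any secret values, a small inventory script helps. The sketch below is hypothetical (key_fingerprints is my own helper, and the sample document is made up): it flattens the nested JSON into dotted paths and prints a truncated SHA-256 fingerprint per value, so two nodes can be compared key by key.

```python
import hashlib
import json


def key_fingerprints(doc: dict, prefix: str = "") -> dict[str, str]:
    """Map each dotted key path to a short SHA-256 fingerprint of its value."""
    result = {}
    for name, value in doc.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            # Recurse into subsystem sections such as gitlab_rails or registry.
            result.update(key_fingerprints(value, path))
        else:
            digest = hashlib.sha256(json.dumps(value).encode()).hexdigest()
            result[path] = digest[:12]  # truncated: enough to compare, not to recover
    return result


# Made-up sample with the same shape as the example above.
sample = json.loads("""
{
  "gitlab_rails": {"db_key_base": "aaa", "otp_key_base": "bbb"},
  "registry": {"http_secret": "ccc"}
}
""")

for path, fingerprint in sorted(key_fingerprints(sample).items()):
    print(path, fingerprint)
```

On a real node you would load /etc/gitlab/gitlab-secrets.json with json.load instead of the inline sample, and still treat the output as sensitive.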
So what's encrypted vs not?
This was the part I had to read several issues on the GitLab tracker to internalize. Most of what you see in the GitLab UI is plaintext in Postgres. Encryption only kicks in for specific columns of specific tables:
Encrypted columns (need gitlab-secrets.json to read):
- integrations.encrypted_properties: Slack/Jira/Mattermost webhook URLs and tokens
- ci_variables.encrypted_value, ci_group_variables.encrypted_value, ci_instance_variables.encrypted_value: masked/protected pipeline secrets
- ci_runners.encrypted_token: runner registration & authentication tokens
- web_hooks.encrypted_token, web_hooks.encrypted_url: outbound webhook auth + URL
- users.encrypted_otp_secret: 2FA TOTP seeds (uses otp_key_base)
- application_settings.encrypted_*: many fields, including SMTP password, registration tokens, etc.
- oauth_access_tokens.token: live OAuth tokens (in recent versions)
- personal_access_tokens.encrypted_token for some token classes, but not PATs themselves, see below
Plaintext (anyone with PG access can read directly):
- Issue and MR bodies, descriptions, comments
- Project, group, user names; emails; usernames
- Repo metadata (paths, default branches)
- Audit events
- Pipeline status, job logs, artifacts metadata
- Anything you'd consider "content" rather than "credential"
And one important exception:
personal_access_tokens.token_digest is a SHA-256 hash, not an encryption.
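PATs are stored as a one-way digest. When you send Authorization: Bearer glpat-xxx, GitLab hashes the value and compares it to the digest in the DB; there is no ciphertext to decrypt on the auth side. That does not make PAT lookup independent of the secrets file, though. My own PAT stopped authenticating after the failover precisely because the files were mismatched, which suggests the digest is computed with key material from gitlab-secrets.json mixed in (as far as I can tell, a salt derived from db_key_base; verify against the source for your version). So a PAT created on one Geo site only keeps working on a promoted site if the secrets files match. (The digest is also why a leaked DB dump can't be used to forge tokens directly: hashes go one way.)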
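The hash-and-compare lookup can be sketched in a few lines of Python. This is a generic illustration, not GitLab's code: token_digest and authenticate are hypothetical names, the hex encoding is my choice, and the optional salt argument only shows how a digest scheme could be bound to server-side key material. It is not a statement about how GitLab computes token_digest.

```python
import hashlib
import hmac


def token_digest(token: str, salt: str = "") -> str:
    """One-way digest of a token; `salt` is a hypothetical server-side value."""
    return hashlib.sha256((token + salt).encode()).hexdigest()


def authenticate(presented: str, stored_digests: set[str], salt: str = "") -> bool:
    """Hash the presented token and look for a matching stored digest."""
    candidate = token_digest(presented, salt)
    # compare_digest avoids leaking how many leading characters matched.
    return any(hmac.compare_digest(candidate, stored) for stored in stored_digests)


# The "database" holds digests only; the token itself is never stored.
stored = {token_digest("glpat-example-token", salt="site-a")}

print(authenticate("glpat-example-token", stored, salt="site-a"))  # True
print(authenticate("glpat-wrong-token", stored, salt="site-a"))    # False
# Same token, different server-side salt: the digests no longer line up.
print(authenticate("glpat-example-token", stored, salt="site-b"))  # False
```

Nothing in the stored set can be turned back into a usable token, which is the property the digest column is there for.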
So if the secrets file goes missing, you don't lose user content — you lose access to the credentials GitLab stores on behalf of users.
Why two stores instead of one big encrypted database?
You could imagine GitLab just encrypting the entire database transparently — Postgres TDE, encrypted-at-rest filesystem, that kind of thing. Why bother with this columns-only thing?
Three practical reasons I can think of:
1. Performance. Encrypting everything means every SELECT against issues with WHERE project_id = ? has to decrypt every issue body it returns. That's prohibitively expensive when the vast majority of data isn't actually sensitive. Column-level encryption lets you decrypt only the rare hot fields, and lets you index plaintext columns naturally.
2. Different secrets have different threat models. A db_key_base leak from a config-management bug shouldn't simultaneously compromise 2FA. By splitting into multiple keys, GitLab can reason about each separately and rotate them on different cadences.
3. Backups and ops. A plaintext-mostly DB can still be pg_dump'd, copied, restored to a dev environment, browsed with psql — everything works. Encrypted-everywhere makes routine ops painful, and people work around that pain by leaving keys lying next to backups, defeating the purpose.
This is roughly the same calculus that drives application-level "envelope encryption" in cloud providers: encrypt the few high-value bits, leave the bulk indexable.
Why doesn't Geo sync the file?
PostgreSQL streaming replication sees one thing: WAL records from one Postgres to another. WAL contains changes to the database. The file /etc/gitlab/gitlab-secrets.json lives in the OS filesystem, generated by gitlab-ctl reconfigure, owned by root. Postgres has no idea it exists. Geo's repository/file synchronization handles things like Git repos and LFS objects through GitLab's own machinery, but /etc/gitlab/ is not in scope.
So the file genuinely has to be copied manually, exactly once, before the secondary's first gitlab-ctl reconfigure. GitLab's Geo setup docs say this. I skipped it. Several days later, after a failover, my PAT broke and I had to retrofit the fix.
The retroactive fix: copy the canonical gitlab-secrets.json (the original primary's) over to the secondary, then run gitlab-ctl reconfigure. Any data the secondary wrote with its own bogus keys between setup and the fix becomes un-decryptable. For me that was zero data, because nothing had run on the secondary yet. The risk grows the longer you wait.
Practical consequences
These rules fall out of the model:
- Initial Geo setup: copy gitlab-secrets.json from primary to secondary before gitlab-ctl reconfigure on the secondary. This is the only step that the standard tutorials half-mention and that "feels" like it should be automated but isn't.
- Failover doesn't require copying anything new. The file is identical on both sides and stays that way through promotion/demotion.
- GitLab upgrades might add new keys. A version that introduces a new feature with a new encrypted field also adds a new top-level key to gitlab-secrets.json during reconfigure. Upgrade procedure for Geo is therefore: upgrade primary, copy gitlab-secrets.json to secondary, upgrade secondary, hold both sides at the new version. If you upgrade the secondary first or skip the copy, you can end up with two differently shaped files and bizarre decryption errors after the next failover.
- Manual key rotation requires manual file replication. If you ever run gitlab-rake gitlab:db:rotate_encrypted_columns on the primary, the new keys land in the primary's file. Copy to secondary, reconfigure.
- A drift detector is enough; bidirectional sync is overkill. Cron a daily sha256 comparison between the two files and alarm on mismatch. Trying to keep them rsync'd in real time introduces a failure mode where corruption on one side propagates instantly.
- Backups: include this file. A perfect database backup with no gitlab-secrets.json is a useless backup: you'll have an undecryptable husk of CI variables and webhook URLs. Put it in whatever backs up /etc/gitlab/.
- Disk encryption is orthogonal. GitLab's column encryption protects the high-value secrets from a DB dump leak. It does not protect repo contents (those live in plain git data on disk via Gitaly), comments, or issue bodies. For true encryption-at-rest across the board you stack column encryption on top of LUKS/ZFS-encrypted volumes and SSE-enabled object storage. GitLab handles the credentials layer; the rest is your responsibility.
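The drift detector really can be this small. A minimal sketch, assuming a cron job that can read both copies of the file (how the secondary's copy or its hash reaches the checking host, for example over SSH, is not shown); file_sha256 and secrets_drifted are hypothetical names, and the demo uses temporary files rather than real secrets.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hex SHA-256 of the file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def secrets_drifted(primary_copy: Path, secondary_copy: Path) -> bool:
    """True when the two files differ by even a single byte."""
    return file_sha256(primary_copy) != file_sha256(secondary_copy)


# Demo with stand-in files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    primary = Path(tmp, "primary-secrets.json")
    secondary = Path(tmp, "secondary-secrets.json")
    primary.write_text('{"gitlab_rails": {"db_key_base": "aaa"}}')
    secondary.write_text('{"gitlab_rails": {"db_key_base": "aaa"}}')
    print(secrets_drifted(primary, secondary))  # False

    # Simulate a reconfigure that generated a different key on the secondary.
    secondary.write_text('{"gitlab_rails": {"db_key_base": "bbb"}}')
    print(secrets_drifted(primary, secondary))  # True
```

Comparing hashes means only a digest has to leave each node, so the check never moves the keys themselves around. Note that a byte-level hash also flags harmless differences such as whitespace, which is the safe direction for an alarm to err in.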
TL;DR
gitlab-secrets.json holds the master keys for application-level column encryption in GitLab's database, plus inter-service auth tokens. The bulk of the database is plaintext; only specific high-value columns (CI variables, webhook tokens, runner tokens, 2FA seeds, encrypted application settings) are encrypted, and those columns can only be decrypted with this file's keys.
It lives outside the database for performance, blast-radius isolation, and operational convenience — and that's precisely why PostgreSQL replication does not and cannot sync it. For any Geo or HA setup, you copy it from primary to secondary once, before the secondary's first reconfigure, and then occasionally after GitLab upgrades that add new key fields. The rest of the time it sits there inert, doing exactly one job perfectly.
If you've ever set up GitLab Geo and your secondary's PATs and CI variables seemed to "just work for a while" and then mysteriously not — check the sha256 of gitlab-secrets.json on both sides first. Nine times out of ten, that's it.