Debugging devenv Services: When the Process Manager Lies

devenv up is green. Four processes, all ready, a tidy dashboard. And your app still doesn't work.

This gap between what the process manager reports and what is actually true is where most of a day disappears. I spent an afternoon building a sandbox (a Node API, a Python worker, Postgres, and Redis, all wired through devenv up) and then breaking it in five different ways, to map out where the truth lives when devenv lies to you. This is the guide I wish I'd had.

It starts with a way to categorize what went wrong, because the fix you reach for depends entirely on which kind of broken you're looking at.

The Four Kinds of Drift

Almost every "it worked yesterday, I swear I didn't change anything" failure is drift: the running reality has wandered away from the declared configuration. There are four flavors, and naming them is half the work.

Data drift. The config is correct, the process is healthy, but the state is wrong. In my sandbox: a database got renamed out from under the app, and every request returned database "app" does not exist. The service was up. The data was not where the service expected it. devenv's initialDatabases only runs on first cluster init — it is initial, not ensured — so nothing recreated it.
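One way to confirm this kind of drift is to ask Postgres which databases exist instead of trusting the dashboard. The sketch below is a minimal check, assuming psql can already reach the devenv-managed cluster from your shell; db_exists is a hypothetical helper name, and app is the database name from the error above.

```shell
# Hypothetical helper: succeed if a database named $1 appears in
# `psql -lqt`-style output on stdin (pipe-separated, name in column 1).
db_exists() {
  cut -d '|' -f 1 | tr -d ' ' | grep -qx -- "$1"
}

# Usage (assumes psql can connect to the devenv-managed cluster):
#   psql -lqt | db_exists app || echo 'database "app" is missing'
```

The helper only reads text, so it can be pointed at saved psql output as well as a live cluster.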

Environment drift. The declared toolchain and the installed one disagree. I uninstalled a Python package by hand; requirements.txt still listed it, but the venv no longer had it. The worker crash-looped on ModuleNotFoundError. The declaration was honest; the reality had rotted.
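A rough way to catch this before the crash loop does is to diff the declared package names against what pip reports as installed. This is a sketch under simplifying assumptions: it only understands plain lines such as name, name==x, or name>=x, and both helper names are hypothetical.

```shell
# Hypothetical helper: print normalized package names from a
# requirements-style file ($1). Comments and version pins are dropped;
# case and "_" vs "-" are normalized.
req_names() {
  sed -e 's/#.*//' -e 's/[][<>=!~; ].*//' "$1" | grep -v '^$' | tr 'A-Z_' 'a-z-' | sort -u
}

# Hypothetical helper: print names declared in $1 (requirements.txt)
# that are absent from $2 (saved `pip freeze` output).
missing_requirements() {
  declared=$(mktemp) && installed=$(mktemp) || return 1
  req_names "$1" > "$declared"
  req_names "$2" > "$installed"
  # comm -23 keeps lines that appear only in the first (declared) list.
  comm -23 "$declared" "$installed"
  rm -f "$declared" "$installed"
}

# Usage (run inside the environment whose venv you want to inspect):
#   pip freeze > /tmp/freeze.txt
#   missing_requirements requirements.txt /tmp/freeze.txt
```

Any name it prints is declared but not installed, which is exactly the mismatch behind a ModuleNotFoundError like the one above.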

Config drift. The config you're reading isn't the config that's running. A devenv.local.nix — auto-imported, git-ignored by default — had env.PORT = lib.mkForce "3001" in it. git diff was clean. The file was invisible to every tool that respects version control, and it silently won the merge.
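Since version control will not show you this file, a plain text search will. The sketch below greps every top-level .nix file in the project for forced overrides, whether git tracks the file or not; find_forced_overrides is a hypothetical helper name.

```shell
# Hypothetical helper: list every forced override in the top-level .nix files
# of a project directory ($1, default: current directory), including files
# that git ignores and therefore never shows in `git diff`.
find_forced_overrides() {
  # /dev/null is added so grep always prefixes matches with the file name.
  grep -n -E 'mkForce|mkOverride' "${1:-.}"/*.nix /dev/null
}

# Usage:
#   find_forced_overrides .
```

To see which .nix files git is hiding from you in the first place, git ls-files --others --ignored --exclude-standard -- '*.nix' lists them.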

Cache drift. The nastiest. The config evaluates correctly and the running process disagrees with it, because a stale evaluation got baked into a derivation and cached. More on that one in a separate post — it turned out to be a real devenv bug. For now, just know the category exists, because if you don't, you'll delete your entire state directory and the problem will still be there.

Once you can name the drift, you reach for the matching tool.

The Toolkit

Process logs live on disk. The TUI scrolls. Your crash does not wait for you to be looking. Every process writes here:

.devenv/run/processes/logs/<name>.stdout.log
.devenv/run/processes/logs/<name>.stderr.log

This is the single most valuable path in the whole .devenv tree. When a process is in gave_up and the TUI has moved on, the death certificate is still in <name>.stderr.log. tail -f it for a live stream — devenv processes logs <name> has no --follow, so the raw file is your friend.
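A small wrapper saves retyping that path. This is a sketch that assumes you run it from the project root; last_stderr is a hypothetical helper name, and worker in the usage line stands in for whatever your process is called.

```shell
# Hypothetical helper: show the last lines of a process's stderr log.
# $1 = process name, $2 = line count (default 50). Run from the project root.
last_stderr() {
  log=".devenv/run/processes/logs/$1.stderr.log"
  if [ -f "$log" ]; then
    tail -n "${2:-50}" "$log"
  else
    echo "no log at $log" >&2
    return 1
  fi
}

# Usage:
#   last_stderr worker 20
#   tail -f .devenv/run/processes/logs/worker.stderr.log   # live stream
```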

The devenv processes CLI mirrors the TUI. list, status <name>, logs <name> --stderr -n 500, start, stop, restart, down. All three control surfaces — TUI, CLI, and (see the next post) the MCP server — talk to the same manager over .devenv/run/processes/native.sock. Three doors, one house. The one to remember for scripts and CI is devenv processes wait: it blocks until every probe is green, which retires sleep 10 from your test setup forever.
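In a CI script, the wait command might slot in roughly like this. It is a sketch, not a tested pipeline: run_with_services is a hypothetical wrapper, and it assumes devenv up -d starts the processes in the background in your devenv version.

```shell
# Hypothetical wrapper: start services, wait for readiness, run a command,
# then shut everything down and pass the command's exit code through.
# Assumption: `devenv up -d` starts the processes in the background.
run_with_services() {
  devenv up -d || return 1
  if ! devenv processes wait; then
    # Readiness never arrived: clean up and fail.
    devenv processes down
    return 1
  fi
  "$@"
  status=$?
  devenv processes down
  return "$status"
}

# Usage (the test command is only an example):
#   run_with_services npm test
```

The point of the shape is that the test command never starts until every probe is green, and the services are stopped whether the tests pass or fail.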

devenv eval is an X-ray of the final config. This is the tool that ends config-drift arguments. It prints the fully merged, fully evaluated value of any option, after every import and override:

$ devenv eval env.PORT
{ "env.PORT": "3001" }

If devenv.nix says 3000 and this says 3001, you don't have a runtime problem — you have a config problem, and now you go looking for the file that wrote 3001. (First move when a config mystery appears: ls *.nix. The shadow file is usually right there.)
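That comparison can be scripted so it fails loudly. The sketch below assumes devenv eval prints the one-line JSON shape shown above and that the value is a string; eval_value and check_option are hypothetical helper names.

```shell
# Hypothetical helper: read `devenv eval <option>` output on stdin and print
# the string value for option $1. Assumes the one-line JSON shape shown above.
eval_value() {
  # Escape dots so the option name matches literally in the sed pattern.
  key=$(printf '%s' "$1" | sed 's/\./\\./g')
  sed -n "s/.*\"$key\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p"
}

# Hypothetical helper: compare the evaluated value with the one you expect.
# $1 = option, $2 = expected value, stdin = devenv eval output.
check_option() {
  actual=$(eval_value "$1")
  if [ "$actual" = "$2" ]; then
    echo "ok: $1 = $actual"
  else
    echo "drift: $1 is $actual, expected $2"
    return 1
  fi
}

# Usage:
#   devenv eval env.PORT | check_option env.PORT 3000
```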

lsof answers "who owns this port?" When a service quietly relocates — devenv reallocates to the next free port by default if one is taken — lsof is how you catch it:

$ lsof -i :6379 -i :5432 -i :3000

I've made it a discipline: before devenv up, check the ports are actually free. Restarting on reflex destroys evidence; a clean port list taken before startup is evidence.
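The pre-flight check can be a few lines of shell. This is a minimal sketch using the ports from the lsof example above; ports_free is a hypothetical helper name.

```shell
# Hypothetical pre-flight check: report which of the given TCP ports already
# have a listener. Returns non-zero if any port is taken.
ports_free() {
  busy=0
  for port in "$@"; do
    # -t prints bare PIDs; -sTCP:LISTEN restricts matches to listening sockets.
    pids=$(lsof -t -iTCP:"$port" -sTCP:LISTEN 2>/dev/null | tr '\n' ' ')
    if [ -n "$pids" ]; then
      echo "port $port in use by pid(s): ${pids% }"
      busy=1
    fi
  done
  return "$busy"
}

# Usage:
#   ports_free 6379 5432 3000 && devenv up
```

Chaining it in front of devenv up means a taken port stops the startup instead of letting a service quietly move.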

Read Nix error walls from the bottom. A failed evaluation prints a screen of … while evaluating … context. It's a stack trace; the actual message is at the very bottom:

$ devenv shell 2>&1 | tail -25

devenv's own errors frequently hand you the exact fix. The one that scrolled past me said, in full: To use 'languages.python.version', run: devenv inputs add nixpkgs-python …. The answer was in the error the whole time.

The Cheat Sheet

When something is broken, match the symptom to the first place to look:

Symptom                          | Drift       | First look
---------------------------------+-------------+------------------------------------------
Service up, wrong/missing state  | Data        | psql/redis-cli directly into the service
Crash-loop on startup            | Environment | processes logs <name> --stderr
Config doesn't match the file    | Config      | devenv eval <option>, then ls *.nix
eval is right, runtime is wrong  | Cache       | compare store paths (next-level)
Service moved ports silently     | -           | lsof -i :PORT

None of this requires deep Nix knowledge. It requires knowing that the process manager's ready is a claim, not a proof — and knowing the five places where the proof actually lives.

The discipline underneath all of it is one rule: no fix before root cause. A guess that happens to work teaches you nothing and leaves complexity behind. Gathering evidence first is almost always cheaper than three wrong guesses. The tools above are just the fastest paths to that evidence.

Next: handing this entire toolkit to an AI assistant, so it can pull the logs and poke the processes itself.
