Use Cases as the Single Source of Truth — How Agentic Coding Helped Me Modernize a SaaS Platform

Tags: Agentic Coding, AI, Testing, Developer Experience

Earlier this year, I found myself in a situation that many solo developers and small teams will recognize: a B2B SaaS platform with a codebase that had grown organically over the years. Java backend, multiple frontend clients, classic PaaS hosting. Technical debt had piled up. Backend and frontends lived in separate repositories. There wasn't a single end-to-end test. Every developer had to manually cobble together their local environment. The infrastructure was expensive and operationally complex.

The modernization was overdue. But time was scarce — all of this happened alongside a client project, in evenings and on weekends. No team I could split the work across. No quarter I could block off.

The question wasn't what needed to be modernized — that was obvious. The question was: How do you use limited time effectively enough to end up with a solid, tested, deployable platform instead of yet another half-finished rewrite?

The answer wasn't a tool. It was a realization.

Agentic Coding Is Not "AI Writes Code"

When most developers think about AI-assisted coding, they think about autocomplete on steroids. A copilot that generates boilerplate or completes functions. That's useful, but it's not the real leverage.

The actual productivity gain of agentic coding — working with an AI agent that autonomously reads files, writes code, runs tests, and verifies results — lies somewhere else entirely: it forces you to articulate clearly what you actually want to build.

When you have to explain to an agent what it should implement, you have to understand it yourself first. Not vaguely, not "I'll figure it out while coding," but concretely: What are the preconditions? What happens step by step? What's the expected result? What side effects are there — like email notifications?

My project didn't have use cases when I started. Like most projects, requirements lived in a ticket board. Once a ticket was moved to "Done," the knowledge was gone — buried in a backlog no one reads, or scattered across comments and conversations. The first thing I did was write use cases and put them in the repository. Not in a wiki. Not in a project management tool. In the source code, version-controlled, right next to the code they describe.

Nothing overly formal — simple Markdown documents with an ID, the steps, and the expected outcome. But once they were there, they became the linchpin of everything.

A use case like "Create Task" describes: What role does the user have? What do they see? What do they fill in? What happens after submission — is the task saved, does it appear in the list, is a notification sent? From this single document, everything follows:

  • The agent gets a spec describing what to build
  • From the spec comes a plan with concrete implementation steps
  • The plan becomes code — API endpoint, service logic, frontend component
  • From the steps and expected results comes an E2E test
  • The test verifies that the use case works
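To make that chain concrete, here is a minimal sketch of how a use case like "Create Task" could be modeled so that a checklist for the spec, the plan, and the E2E test falls out of it. The project itself keeps use cases as Markdown documents; the field names, the `UC-001` ID format, and the `toTestChecklist` helper are assumptions made for illustration.

```typescript
// Hypothetical shape of a use case; the real project stores these as Markdown.
interface UseCase {
  id: string; // assumed ID format, e.g. "UC-001"
  title: string;
  role: string; // who performs the use case
  preconditions: string[];
  steps: string[]; // what the user does, in order
  expected: string[]; // observable outcomes, including side effects
}

const createTask: UseCase = {
  id: "UC-001",
  title: "Create Task",
  role: "project member",
  preconditions: ["User is logged in"],
  steps: ["Open the task list", "Fill in the task form", "Submit the form"],
  expected: [
    "Task is saved",
    "Task appears in the list",
    "Notification email is sent",
  ],
};

// Derive a flat checklist that a spec, a plan, and an E2E test can all follow.
function toTestChecklist(uc: UseCase): string[] {
  return [
    ...uc.preconditions.map((p) => `[${uc.id}] given: ${p}`),
    ...uc.steps.map((s, i) => `[${uc.id}] step ${i + 1}: ${s}`),
    ...uc.expected.map((e) => `[${uc.id}] expect: ${e}`),
  ];
}

console.log(toTestChecklist(createTask).join("\n"));
```

Every line of the checklist carries the use case ID, so each later artifact can be traced back to the document it came from.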

The use cases became the single source of truth. They drive what the agent implements. They define what the tests verify. And they are the acceptance criteria: a feature is done when the E2E test passes. Not when the code compiles. Not when it works on my machine. When the automated test plays through the entire use case and confirms it.

This eliminates the most expensive mistake you can make when time is scarce: starting to build before you know what you're building.

The Workflow: From Requirement to Verified Software

The pipeline that emerged:

Use Case → Spec → Plan → Code → E2E Test → Verification

Sounds like overhead. Feels like it at first, too. But every hour I invested in a spec saved me a day of debugging. The individual building blocks:

An agent briefing instead of prompts. The project has a CLAUDE.md — a structured document that tells the agent everything it needs to know: architecture, domain model, conventions, directory structure, development workflow. No repeated explanations every session. Written once, always available. Plus a memory system where the agent retains learnings, feedback, and project context across sessions. (→ Deep dive: devenv + CLAUDE.md)

A reproducible dev environment. A single command — devenv up — starts the database, the backend with migrations and seed data, all frontend clients, and a mail interceptor. Every start is identical. The agent can run tests and gets a deterministic result. No "works on my machine" moments. (→ Deep dive: devenv + CLAUDE.md)

Specs and plans before code. The agent doesn't write a single line of code before a plan is in place. This feels slow — but it prevents the agent from heading in the wrong direction while you only notice two hours later.
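The "no code before a plan" rule can be expressed as a simple ordering check. The following is only a sketch of the idea, not tooling from the project; the stage names mirror the pipeline above, and `canStart` and `nextStage` are hypothetical helpers.

```typescript
// The pipeline stages from the article, in order.
const STAGES = ["use-case", "spec", "plan", "code", "e2e-test", "verification"] as const;
type Stage = (typeof STAGES)[number];

// Returns the earliest stage that has not been completed yet.
function nextStage(done: ReadonlySet<Stage>): Stage | undefined {
  return STAGES.find((s) => !done.has(s));
}

// A stage may start only if every earlier stage is already complete.
function canStart(stage: Stage, done: ReadonlySet<Stage>): boolean {
  const index = STAGES.indexOf(stage);
  return STAGES.slice(0, index).every((s) => done.has(s));
}

const completed = new Set<Stage>(["use-case", "spec"]);
console.log(canStart("code", completed)); // false, because the plan is missing
console.log(nextStage(completed)); // "plan"
```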

E2E tests as proof. A single command starts the entire stack and runs all tests. Each test plays through a use case — including mail verification via a local SMTP interceptor. No manual clicking, no "I quickly checked it in the browser." (→ Deep dive coming soon: From 0 to 50+ E2E Tests.)
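Mail verification in a test usually comes down to polling the interceptor until the expected message arrives. The sketch below assumes a hypothetical `fetchInbox` function that returns whatever the interceptor has captured so far; how that function talks to the interceptor depends on the tool, so it is injected rather than spelled out here.

```typescript
// Minimal shape of a captured message; real interceptors expose more fields.
interface CapturedMail {
  to: string;
  subject: string;
}

// Hypothetical: returns everything the interceptor has captured so far.
type FetchInbox = () => Promise<CapturedMail[]>;

// Poll the inbox until a matching mail shows up or the attempts run out.
async function waitForMail(
  fetchInbox: FetchInbox,
  matches: (mail: CapturedMail) => boolean,
  attempts = 10,
  delayMs = 200,
): Promise<CapturedMail> {
  for (let i = 0; i < attempts; i++) {
    const hit = (await fetchInbox()).find(matches);
    if (hit) return hit;
    // Wait between polls, but not after the final attempt.
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`No matching mail after ${attempts} attempts`);
}
```

In an E2E test, the last step of a use case like "Create Task" would then be something like `await waitForMail(fetchInbox, (m) => m.subject.includes("Task"))`, with the subject text taken from the use case's expected results.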

What Got Built

A sober list of what happened during this period:

Monorepo consolidation. Five separate repositories merged into a single monorepo with pnpm workspaces. A shared TypeScript API client generated from the OpenAPI spec. A shared UI component library. Atomic commits: a backend API change and the frontend adaptation in a single commit, not scattered across three repos.
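To illustrate why a shared, typed client pairs well with atomic commits, here is a hand-written stand-in for what a generated client might look like. The operations, the `Task` fields, and `createClient` are hypothetical and do not reflect the project's actual OpenAPI spec or generator output.

```typescript
// Hand-written stand-in for types a generator might derive from an OpenAPI spec.
// The operations and fields below are hypothetical.
interface Task {
  id: string;
  title: string;
}

interface Operations {
  "POST /tasks": { body: { title: string }; response: Task };
  "GET /tasks": { body: undefined; response: Task[] };
}

// The transport is injected so every frontend can share the same client.
type Transport = (method: string, path: string, body: unknown) => Promise<unknown>;

function createClient(transport: Transport) {
  return async function call<K extends keyof Operations>(
    operation: K,
    body: Operations[K]["body"],
  ): Promise<Operations[K]["response"]> {
    const space = operation.indexOf(" ");
    const method = operation.slice(0, space);
    const path = operation.slice(space + 1);
    // The cast is where this sketch simply trusts the declared types.
    return (await transport(method, path, body)) as Operations[K]["response"];
  };
}

const client = createClient(async (method, path, body) => {
  console.log(method, path, body);
  return { id: "1", title: "Write spec" };
});
client("POST /tasks", { title: "Write spec" }).then((task) => console.log(task.id));
```

When a backend change renames a field, regenerating the types makes every affected frontend call fail to compile, so the fix can land in the same commit.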

E2E testing from zero to over 50 tests. Each test has an ID that maps directly to a use case. Mail verification via a local SMTP interceptor — no real mail server needed. A single command starts the stack and runs the full suite. (→ Deep dive coming soon: From 0 to 50+ E2E Tests.)
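Because test IDs map to use cases, a gap in coverage can be detected mechanically. The sketch below assumes a naming convention in which every test title starts with the use case ID (for example `UC-001: ...`); that convention and the `uncoveredUseCases` helper are illustrative assumptions, not the project's actual scheme.

```typescript
// Assumed convention: every test title starts with the use case ID, e.g. "UC-001: ...".
const ID_PATTERN = /^(UC-\d+)\b/;

// Returns the IDs of use cases that no test title refers to.
function uncoveredUseCases(useCaseIds: string[], testTitles: string[]): string[] {
  const covered = new Set<string>();
  for (const title of testTitles) {
    const id = ID_PATTERN.exec(title)?.[1];
    if (id) covered.add(id);
  }
  return useCaseIds.filter((id) => !covered.has(id));
}

const missing = uncoveredUseCases(
  ["UC-001", "UC-002", "UC-003"],
  ["UC-001: create task", "UC-003: delete task", "smoke test"],
);
console.log(missing); // ["UC-002"]
```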

Three infrastructure migrations with zero downtime. From classic PaaS through a container service to the current setup. Hosting costs reduced by roughly 85%. All infrastructure is now defined as code with Pulumi in TypeScript, replacing hand-maintained CloudFormation templates. Structured logging and a monitoring dashboard were added along the way. (→ Deep dive coming soon: 3 Cloud Platforms in 6 Weeks.)

New clients. An admin interface with React and Vite — spec written in the morning, a working client by the end of the day. A rewrite of the legacy frontend. Everything in the monorepo with shared dependencies.

Developer experience. One command starts the full stack. One command runs all E2E tests. Conventional commits with automatic changelog generation. No more setup documentation that's outdated after two weeks.
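Changelog generation works because Conventional Commits headers follow a fixed shape: `type(optional scope)!: description`. The following is a minimal sketch of the idea rather than the tooling the project uses; the section headings and the decision to list only `feat` and `fix` are assumptions.

```typescript
// Conventional Commits header: type(optional scope)!: description
const HEADER = /^(\w+)(?:\(([^)]+)\))?(!)?: (.+)$/;

interface ParsedCommit {
  type: string;
  scope?: string;
  breaking: boolean;
  description: string;
}

// Returns undefined for headers that do not follow the convention.
function parseCommit(header: string): ParsedCommit | undefined {
  const m = HEADER.exec(header);
  if (!m) return undefined;
  const [, type = "", scope, bang, description = ""] = m;
  return { type, scope, breaking: bang === "!", description };
}

// Group features and fixes into a plain-text changelog; other types are skipped.
function changelog(headers: string[]): string {
  const sections: Record<string, string> = { feat: "Features", fix: "Bug Fixes" };
  const lines: string[] = [];
  for (const [type, heading] of Object.entries(sections)) {
    const entries = headers
      .map((h) => parseCommit(h))
      .filter((c): c is ParsedCommit => c !== undefined && c.type === type);
    if (entries.length === 0) continue;
    lines.push(`${heading}:`);
    for (const c of entries) {
      const scope = c.scope ? `${c.scope}: ` : "";
      lines.push(`  - ${scope}${c.description}${c.breaking ? " (BREAKING)" : ""}`);
    }
  }
  return lines.join("\n");
}

console.log(changelog(["feat(api): add task endpoint", "fix: handle empty title", "chore: bump deps"]));
```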

What I Learned

Documentation is not overhead — it's a productivity tool. The agent briefing is the most valuable file in the repo. The better the briefing, the better the output. The worse it is, the more time you spend on corrections and feeding context after the fact. The same applies to use cases: the more precisely they're written, the fewer iterations the implementation needs.

Specs before code feels slow but is faster. I had to learn this the hard way. Twice I sent the agent off without a spec because "this one's pretty straightforward." Both times I spent more time on cleanup than the spec would have cost.

Use cases belong in the repo, not in a ticket board. Tickets describe work packages. They get moved to "Done" and disappear. Use cases describe behavior, and that behavior doesn't stop being relevant once the feature ships. Kept in the source code, use cases act as the project's long-term memory: living documentation that drives implementation, tests, and future changes. They're version-controlled, they evolve with the code, and they're always there when you or the agent need to understand what the system is supposed to do.

AI has limits. Architecture decisions — which hosting platform, how the roles and permissions system works, when a refactoring is necessary — stay with the developer. The agent is a multiplier for implementation speed, not a replacement for technical judgment. It makes you faster at what you already know. It doesn't make you better at what you don't yet understand.

Who This Works For

This approach isn't a silver bullet. But it works particularly well in three situations:

  • Solo developers with their own product who need to make the most of limited time
  • Small teams with a large backlog who want to ship more without hiring more people
  • Projects with technical debt that need to be modernized alongside the day-to-day

If you're in one of these situations and want to know how to set up an agentic coding workflow for your project — from agent briefing through dev environment to E2E pipeline — reach out.


Over the coming weeks, I'll publish three deep dives into the individual building blocks:

  1. devenv + CLAUDE.md: A Reproducible Stack for Agentic Coding
  2. From 0 to 50+ E2E Tests With a Single Command (coming soon)
  3. 3 Cloud Platforms in 6 Weeks — Infrastructure Migration With AI Support (coming soon)
