CodeLeash

Your coding agent, on a leash

An opinionated full-stack scaffold with strong guardrails for AI-assisted development. TDD enforcement, architectural constraints, and code quality automation — because good constraints produce good code.

initial log red red_intent test fails red log green green_intent tests pass initial
01 What's Inside

TDD Guard

State machine forces Red-Green-Refactor cycle. Blocks file edits until tests fail first. Per-agent isolation via transcript-hashed log files.

hooks + state machine

10ms Timeout

Unit tests enforce a 10ms limit. Forces pure business logic — no I/O, no accidental imports. Auto-retry for transients, flamegraph SVG on failure.

pytest hook

Code Quality Checks

Custom Python scripts walking ASTs, running as pre-commit hooks. Brand colors, unused routes, soft deletes, dynamic imports, and more.

ast + pre-commit

Worker System

PostgreSQL job queue using FOR UPDATE SKIP LOCKED. Exponential backoff retries, handler registration, hot reload in dev. No external broker.

postgresql + asyncio

Worktree Isolation

Parallel development with deterministic port hashing. Each git worktree gets its own FastAPI port, Vite port, and Supabase instance.

init.sh + cksum

Full-Stack Integration

Vite + FastAPI + React with a server-to-client initial data bridge. render_page() → data-initial → useInitialData(). Type-safe from Pydantic to TypeScript.

vite + fastapi + react

Two commands.
That's it.

init.sh installs dependencies, starts Supabase, configures your environment, and installs the pre-commit hook. Then dev starts Vite, FastAPI, and the worker with hot reload.

./init.sh && npm run dev# → http://localhost:8000
02 Stack
Python FastAPI React TypeScript Supabase Vite Tailwind CSS Playwright Prometheus OpenTelemetry

Why
guardrails?

Agents need constraints, not freedom.

An unconstrained agent skips tests, makes sweeping changes, and produces code that works in isolation but breaks in context. The TDD guard exists because freedom doesn't scale.

Tests are the specification.

The 10ms timeout forces unit tests to be pure logic. The e2e harness ensures full integration. The pre-commit hook runs everything. If it isn't tested, it doesn't exist.

Lint rules should be code.

Instead of configuring complex YAML, write a Python script that walks an AST. A script is easier to write, debug, and explain than a configuration.

The monorepo is the product.

Backend, frontend, database migrations, lint rules, tests, and CI all live together. Changes that cross boundaries are normal, not exceptional.