APOPHIS CLI Execution Guide

1. Purpose

This file defines the CLI redesign contract. It is written for parallel implementers. Each stream owns an end-to-end command. The orchestrator owns specs, fixtures, and golden outputs. Merge gates are strict and minimal.

2. Philosophy

  • Vertical slices, not horizontal layers. Each stream builds one complete command end to end.
  • Acceptance tests first. Every stream starts with failing top-level tests, then implements until green.
  • No premature extraction. Shared helpers are extracted only after two or more streams prove the same seam.
  • Fast local feedback. Every stream should be runnable and testable in isolation.
  • Authoritative merge gates only: spec compliance, golden snapshots, fixture end-to-end runs, and latency budgets.

3. Frozen Contracts (Orchestrator-Owned)

These must not change without orchestrator approval. All streams code against them.

3.1 Command Vocabulary

Command          Purpose
apophis init     Scaffold config, scripts, and example usage
apophis verify   Run deterministic contract verification
apophis observe  Validate runtime observe configuration and reporting setup
apophis qualify  Run scenario, stateful, protocol, or chaos-driven qualification
apophis replay   Replay a failure using seed and stored trace
apophis doctor   Validate config, environment safety, docs/example correctness
apophis migrate  Check and rewrite deprecated config or API usage

3.2 Global Flags

Every command must accept:

  • --config <path>
  • --profile <name>
  • --cwd <path>
  • --format human|json|ndjson
  • --color auto|always|never
  • --quiet
  • --verbose
  • --artifact-dir <path>

3.3 Exit Codes

Code  Meaning
0     Success
1     Behavioral / qualification failure
2     Usage, config, or environment safety violation
3     Internal APOPHIS error
130   Interrupted (SIGINT)
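
As a sketch, src/cli/core/exit-codes.ts could expose the table above as constants. The constant names and the Outcome type are illustrative assumptions; only the numeric values come from the contract. The value 130 follows the shell convention of 128 plus the signal number (SIGINT is 2).

```typescript
// Exit code constants mirroring the table above (constant names are illustrative).
const ExitCode = {
  Success: 0,
  BehavioralFailure: 1,
  UsageOrSafetyViolation: 2,
  InternalError: 3,
  Interrupted: 130,
} as const;

type ExitCodeValue = (typeof ExitCode)[keyof typeof ExitCode];

// Hypothetical run outcome; the real CLI may model this differently.
type Outcome =
  | { kind: 'passed' }
  | { kind: 'behavioral_failure' }
  | { kind: 'usage_error' }
  | { kind: 'internal_error' }
  | { kind: 'interrupted' };

// Maps an outcome to its process exit code; the switch is exhaustive over Outcome.
function exitCodeFor(outcome: Outcome): ExitCodeValue {
  switch (outcome.kind) {
    case 'passed':
      return ExitCode.Success;
    case 'behavioral_failure':
      return ExitCode.BehavioralFailure;
    case 'usage_error':
      return ExitCode.UsageOrSafetyViolation;
    case 'internal_error':
      return ExitCode.InternalError;
    case 'interrupted':
      return ExitCode.Interrupted;
  }
}
```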

3.4 Config Schema (TypeBox + Ajv)

Config must be validated with strict unknown-key rejection. Use TypeBox to define the schema so JSON Schema output is available for docs and IDE support.

Key schema requirements:

  • mode?: 'verify' | 'observe' | 'qualify'
  • profile?: string
  • preset?: string
  • routes?: string[]
  • seed?: number
  • artifactDir?: string
  • environments?: Record<string, EnvironmentPolicy>
  • profiles?: Record<string, ProfileDefinition>
  • presets?: Record<string, PresetDefinition>

Unknown keys at any depth must produce a hard failure with exact key path.
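
To illustrate the "exact key path" requirement without pulling in dependencies, here is a plain TypeScript sketch of a recursive unknown-key check. In the real loader this would more likely come from TypeBox objects declared with additionalProperties: false and Ajv's error reports; the KeyTree type, the findUnknownKeys function, and the inner fields listed under profiles are hypothetical.

```typescript
// Allowed-key tree: `true` marks a leaf, '*' matches any key name (record types).
type KeyTree = { [key: string]: KeyTree | true | undefined };

const CONFIG_KEYS: KeyTree = {
  mode: true,
  profile: true,
  preset: true,
  routes: true,
  seed: true,
  artifactDir: true,
  // The spec does not list the inner fields of these definitions.
  // environments and presets are left unchecked here; the profile fields
  // below are placeholders for illustration only.
  environments: { '*': true },
  profiles: { '*': { routes: true, seed: true } },
  presets: { '*': true },
};

// Returns the dotted path of every key that the tree does not allow.
function findUnknownKeys(value: unknown, allowed: KeyTree | true, path = ''): string[] {
  if (allowed === true || typeof value !== 'object' || value === null || Array.isArray(value)) {
    return [];
  }
  const unknown: string[] = [];
  for (const [key, child] of Object.entries(value as Record<string, unknown>)) {
    const rule = allowed[key] ?? allowed['*'];
    const childPath = path ? `${path}.${key}` : key;
    if (rule === undefined) {
      unknown.push(childPath);
    } else {
      unknown.push(...findUnknownKeys(child, rule, childPath));
    }
  }
  return unknown;
}
```

For example, a config containing profiles.quick.retries and a top-level typo such as sed would report both paths, so the failure message can name each offending key.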

3.5 Artifact Schema

Every verify, observe, and qualify run must produce an artifact document:

{
  "version": "apophis-artifact/1",
  "command": "verify",
  "mode": "verify",
  "cwd": "/path/to/project",
  "configPath": "apophis.config.js",
  "profile": "quick",
  "preset": "safe-ci",
  "env": "local",
  "seed": 42,
  "startedAt": "2026-04-28T12:30:00Z",
  "durationMs": 1234,
  "summary": {
    "total": 10,
    "passed": 9,
    "failed": 1
  },
  "failures": [
    {
      "route": "POST /users",
      "contract": "response_code(GET /users/{response_body(this).id}) == 200",
      "expected": "200",
      "observed": "404",
      "seed": 42,
      "replayCommand": "apophis replay --artifact reports/apophis/failure-2026-04-28T12-30-22Z.json"
    }
  ],
  "artifacts": [
    "reports/apophis/failure-2026-04-28T12-30-22Z.json"
  ],
  "warnings": [],
  "exitReason": "behavioral_failure"
}
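
The example document can be mirrored as TypeScript types, sketched below. Which fields are optional, the element type of warnings, and the set of exitReason values are not stated above, so the types here are assumptions. The consistency check is also illustrative: the passed + failed = total rule is inferred from the example numbers, while the replay command requirement follows from the replay contract.

```typescript
interface ArtifactFailure {
  route: string;
  contract: string;
  expected: string;
  observed: string;
  seed: number;
  replayCommand: string;
}

interface ArtifactV1 {
  version: 'apophis-artifact/1';
  command: string;
  mode: string;
  cwd: string;
  configPath: string;
  profile: string;
  preset: string;
  env: string;
  seed: number;
  startedAt: string; // ISO 8601 timestamp, as in the example
  durationMs: number;
  summary: { total: number; passed: number; failed: number };
  failures: ArtifactFailure[];
  artifacts: string[];
  warnings: string[]; // assumed element type; the example shows an empty array
  exitReason: string; // only "behavioral_failure" appears in the example
}

// Returns human-readable problems; an empty array means the checks passed.
function checkArtifactConsistency(artifact: ArtifactV1): string[] {
  const problems: string[] = [];
  const { total, passed, failed } = artifact.summary;
  if (passed + failed !== total) {
    problems.push(`summary: passed (${passed}) + failed (${failed}) != total (${total})`);
  }
  artifact.failures.forEach((failure, index) => {
    if (failure.replayCommand.trim() === '') {
      problems.push(`failures[${index}]: missing replayCommand`);
    }
  });
  return problems;
}
```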

3.6 Human Output Grammar

For --format human, every failure must follow this exact shape:

Contract violation
POST /users
Profile: quick
Seed: 42

Expected
  response_code(GET /users/{response_body(this).id}) == 200

Observed
  GET /users/usr-123 returned 404

Why this matters
  The resource created by POST /users is not retrievable.

Replay
  apophis replay --artifact reports/apophis/failure-2026-04-28T12-30-22Z.json

Next
  Check the create/read consistency for POST /users and GET /users/{id}.

This is the canonical human failure format. Do not deviate without orchestrator approval.
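
A sketch of a renderer that produces this shape from a structured failure. The HumanFailure interface and its field names are hypothetical; in particular, the "Why this matters" and "Next" texts do not appear in the artifact schema, so they are passed in explicitly here.

```typescript
// Hypothetical input shape for the human renderer.
interface HumanFailure {
  route: string;
  profile: string;
  seed: number;
  expected: string;
  observed: string;
  why: string;
  replayCommand: string;
  next: string;
}

// Renders one failure in the canonical section order, with two-space indented bodies.
function renderHumanFailure(failure: HumanFailure): string {
  return [
    'Contract violation',
    failure.route,
    `Profile: ${failure.profile}`,
    `Seed: ${failure.seed}`,
    '',
    'Expected',
    `  ${failure.expected}`,
    '',
    'Observed',
    `  ${failure.observed}`,
    '',
    'Why this matters',
    `  ${failure.why}`,
    '',
    'Replay',
    `  ${failure.replayCommand}`,
    '',
    'Next',
    `  ${failure.next}`,
  ].join('\n');
}
```

Because the output is built from a fixed list of lines, a golden snapshot comparison can be an exact string match.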

3.7 Machine Output Schema

--format json must emit a single stable document matching the artifact schema.

--format ndjson must emit step events:

{"type":"run.started","command":"verify","seed":42,"timestamp":"2026-04-28T12:30:00Z"}
{"type":"route.started","route":"POST /users","timestamp":"2026-04-28T12:30:01Z"}
{"type":"route.passed","route":"POST /users","durationMs":123,"timestamp":"2026-04-28T12:30:01Z"}
{"type":"route.failed","route":"POST /users","failure":{...},"timestamp":"2026-04-28T12:30:02Z"}
{"type":"run.completed","summary":{...},"timestamp":"2026-04-28T12:30:03Z"}
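
The event stream above can be typed as a discriminated union, sketched below. The RunEvent name and the toNdjsonLine helper are illustrative; the failure and summary payloads are elided in the examples, so they are kept opaque here rather than guessed.

```typescript
// One variant per event type shown above; elided payloads are typed loosely.
type RunEvent =
  | { type: 'run.started'; command: string; seed: number; timestamp: string }
  | { type: 'route.started'; route: string; timestamp: string }
  | { type: 'route.passed'; route: string; durationMs: number; timestamp: string }
  | { type: 'route.failed'; route: string; failure: Record<string, unknown>; timestamp: string }
  | { type: 'run.completed'; summary: Record<string, unknown>; timestamp: string };

// NDJSON is one compact JSON document per line, each terminated by a newline.
function toNdjsonLine(event: RunEvent): string {
  return JSON.stringify(event) + '\n';
}

// Emits events through a caller-supplied writer (for example process.stdout.write).
function emitAll(events: readonly RunEvent[], write: (chunk: string) => void): void {
  for (const event of events) {
    write(toNdjsonLine(event));
  }
}
```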

4. Tooling

Concern                     Tool                             Why
Command parser              cac                              Fast, small, zero ceremony
Config/artifact validation  TypeBox + Ajv                    Fast, strict, JSON Schema output
Interactive setup           @clack/prompts (lazy-loaded)     Polished init, zero startup tax elsewhere
Color/styling               picocolors                       Tiny, sufficient
Output layout               Custom renderer                  Better than heavy task/spinner frameworks
CLI bundling                tsup                             Fast cold start, single bin
Tests                       node:test + golden fixtures      Already aligned with repo
Filesystem/glob             Node built-ins + minimal helper  Lean startup

Avoid: yargs, commander, heavy spinner UIs, ad hoc config validation.

5. Directory Ownership

Each stream owns its directory. No stream touches another stream's directory without orchestrator-mediated extraction.

src/
  cli/
    core/
      index.ts          # S1: entrypoint, command registration
      context.ts        # S1: cwd, env, TTY detection
      config-loader.ts  # S2: config resolution, profile/preset resolution
      policy-engine.ts  # S2: env gating, safety checks
      exit-codes.ts     # S0: exit code constants
      types.ts          # S0: shared CLI types
    commands/
      init/
        index.ts        # S3
        scaffolds/      # S3: preset templates
      verify/
        index.ts        # S4
        runner.ts       # S4: deterministic run logic
      observe/
        index.ts        # S5
        validator.ts    # S5: observe config validation
      qualify/
        index.ts        # S6
        runner.ts       # S6: scenario/stateful/chaos runner
      replay/
        index.ts        # S7
        loader.ts       # S7: artifact loading, version checks
      doctor/
        index.ts        # S8
        checks/         # S8: individual diagnostic checks
      migrate/
        index.ts        # S9
        rewriters/      # S9: config rewriters
    renderers/
      human.ts          # S10
      json.ts           # S10
      ndjson.ts         # S10
      shared.ts         # S10
    __fixtures__/       # S12: fixture apps
    __goldens__/        # S12: golden output snapshots
  test/
    cli/                # S12: CLI acceptance tests

6. Workstreams

S0: Spec Authority (Orchestrator)

Owner: Orchestrator thread only.

Responsibilities:

  • Own src/cli/core/types.ts and src/cli/core/exit-codes.ts
  • Own src/cli/__goldens__/*
  • Own fixture app definitions in src/cli/__fixtures__/*
  • Approve or reject contract changes requested by implementation streams
  • Merge arbitration: resolve conflicts, enforce golden compliance

Done when:

  • All other streams can import from src/cli/core/types.ts and src/cli/core/exit-codes.ts
  • Golden snapshots exist for every command's --help and canonical failure output
  • Fixture apps cover: tiny-fastify, broken-behavior, monorepo, protocol-lab, observe-config, legacy-config

S1: CLI Kernel

Owner: One LLM thread.

Directory: src/cli/core/ (except the S0-owned types.ts and exit-codes.ts, and the S2-owned config-loader.ts and policy-engine.ts)

Responsibilities:

  • Entrypoint: src/cli/core/index.ts
  • Command registration with cac
  • Global flag parsing and normalization
  • Context loading: cwd, env vars, TTY/CI detection
  • Error boundary: catch unexpected errors, print internal error banner, write debug artifact
  • Help text generation

Acceptance tests (start here, all failing):

  1. apophis --help matches golden snapshot
  2. apophis verify --help matches golden snapshot
  3. apophis --version prints version
  4. apophis unknown-cmd exits 2 with clear message
  5. apophis verify --unknown-flag exits 2 with exact flag name
  6. Non-TTY shell disables prompts and spinners
  7. CI env disables spinners and fancy rendering

Done when: All acceptance tests pass and other commands can register cleanly.
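
Acceptance tests 6 and 7 can be driven by one small pure function, sketched here. The TerminalInfo and Interactivity names are hypothetical, and treating an empty, "0", or "false" CI variable as "not CI" is an assumption. Whether prompts stay enabled in CI on a real TTY is not specified above; this sketch leaves them tied to the TTY check only.

```typescript
// Inputs are injected so the function is testable without a real terminal.
interface TerminalInfo {
  isTTY: boolean;
  env: Record<string, string | undefined>;
}

interface Interactivity {
  prompts: boolean;
  spinners: boolean;
  fancyRendering: boolean;
}

function detectInteractivity(info: TerminalInfo): Interactivity {
  const ciValue = info.env.CI;
  // Assumption: any non-empty CI value other than "0" or "false" means CI.
  const inCi =
    ciValue !== undefined && ciValue !== '' && ciValue !== '0' && ciValue.toLowerCase() !== 'false';
  return {
    prompts: info.isTTY, // a non-TTY shell disables prompts
    spinners: info.isTTY && !inCi, // non-TTY or CI disables spinners
    fancyRendering: info.isTTY && !inCi, // CI gets stable, plain output
  };
}
```

The kernel could call this once with `Boolean(process.stdout.isTTY)` and `process.env`, then pass the result to commands through the context object.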

S2: Config + Policy

Owner: One LLM thread.

Directory: src/cli/core/config-loader.ts, src/cli/core/policy-engine.ts

Responsibilities:

  • Config file discovery (.js, .ts, .json, package.json field)
  • Config loading with tsx for .ts files
  • Profile resolution from config
  • Preset resolution and application
  • Environment policy enforcement
  • Unknown-key hard failure with exact path
  • Monorepo boundary detection

Acceptance tests (start here, all failing):

  1. Loads apophis.config.js from cwd
  2. Loads config from --config override
  3. Rejects unknown key with exact path
  4. Resolves profile from config
  5. Applies preset correctly
  6. Blocks qualify in production env by default
  7. Detects monorepo package boundary
  8. Suggests apophis init when no config found

Done when: Every command resolves config identically and policy gates are authoritative.
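
The discovery rules (tests 1, 2, and 8, plus the "multiple config candidates" edge case) can be sketched as a pure decision function. The candidate file names other than apophis.config.js are assumptions, the package.json field is left out for brevity, and the --config override is not checked for existence here.

```typescript
type ConfigResolution =
  | { kind: 'found'; path: string }
  | { kind: 'none'; hint: string }
  | { kind: 'ambiguous'; candidates: string[]; hint: string };

// Only apophis.config.js is named in this guide; the other two names are assumed.
const CANDIDATE_NAMES = ['apophis.config.js', 'apophis.config.ts', 'apophis.config.json'];

// `filesInCwd` is the directory listing, injected so the logic stays testable.
function resolveConfig(filesInCwd: readonly string[], override?: string): ConfigResolution {
  if (override !== undefined) {
    return { kind: 'found', path: override };
  }
  const candidates = CANDIDATE_NAMES.filter((name) => filesInCwd.indexOf(name) !== -1);
  if (candidates.length === 0) {
    return { kind: 'none', hint: 'No config found. Run `apophis init` to create one.' };
  }
  if (candidates.length > 1) {
    return {
      kind: 'ambiguous',
      candidates,
      hint: 'Multiple config files found. Choose one with --config <path>.',
    };
  }
  return { kind: 'found', path: candidates[0] };
}
```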

S3: Init

Owner: One LLM thread.

Directory: src/cli/commands/init/

Responsibilities:

  • apophis init --preset <name>
  • Detect Fastify app structure
  • Write scaffold files (config, example route guidance, package script)
  • Support --force for overwrite
  • Noninteractive mode with explicit flags
  • Idempotent rerun behavior
  • Print exact next command after init

Acceptance tests (start here, all failing):

  1. apophis init --preset safe-ci writes correct files in empty repo
  2. Detects existing Fastify entrypoint
  3. Refuses overwrite without --force
  4. Merges package scripts without clobbering
  5. Noninteractive mode works with all required flags
  6. Missing @fastify/swagger produces clear guidance
  7. Idempotent rerun updates only changed scaffold parts
  8. Prints exact next command: apophis verify --profile quick --routes "POST /users"

Done when: Fresh repo gets to first verify in one pass.
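
For acceptance test 4, a non-clobbering script merge might look like the sketch below. The mergeScripts name, its return shape, and the script name used in the usage note are hypothetical.

```typescript
interface MergeResult {
  scripts: Record<string, string>;
  skipped: string[]; // names that already exist with a different command
}

// Adds new scripts but never overwrites an existing script with a different body.
function mergeScripts(existing: Record<string, string>, additions: Record<string, string>): MergeResult {
  const scripts: Record<string, string> = { ...existing };
  const skipped: string[] = [];
  for (const name of Object.keys(additions)) {
    const command = additions[name];
    if (name in scripts && scripts[name] !== command) {
      skipped.push(name);
    } else {
      scripts[name] = command;
    }
  }
  return { scripts, skipped };
}
```

Calling it twice with the same additions yields the same result, which also covers the idempotent rerun case: `mergeScripts(pkg.scripts, { 'apophis:verify': 'apophis verify --profile quick' })`. Skipped names can be reported to the user instead of silently dropped.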

S4: Verify

Owner: One LLM thread.

Directory: src/cli/commands/verify/

Responsibilities:

  • apophis verify --profile <name> --routes <filter>
  • Route selection and filtering
  • Deterministic contract verification
  • Seed generation and emission
  • Failure reporting with canonical human output
  • Artifact emission
  • Replay command generation
  • --changed support for git-based route filtering

Acceptance tests (start here, all failing):

  1. apophis verify --profile quick runs all routes with behavioral contracts
  2. --routes "POST /users" filters correctly
  3. Finds the canonical behavioral failure: POST /users creates an unretrievable resource
  4. Failure output matches golden snapshot exactly
  5. Emits artifact with correct schema
  6. Prints replay command
  7. Seed is generated and printed when omitted
  8. --changed filters to modified routes
  9. No routes matched produces clear failure with available matches
  10. No behavioral contracts found explains schema-only is not enough

Done when: The first behavioral failure is reliable and replay works.
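
Acceptance test 7 (seed generated and printed when omitted) can be sketched as follows. The resolveSeed name, the 31-bit seed range, and the injectable random source are assumptions; the spec only requires that a seed always exists and is always printed.

```typescript
interface ResolvedSeed {
  seed: number;
  generated: boolean; // true when the user did not pass a seed
}

// The random source is injectable so tests can stay deterministic.
function resolveSeed(flag: number | undefined, random: () => number = Math.random): ResolvedSeed {
  if (flag !== undefined) {
    return { seed: flag, generated: false };
  }
  // Assumption: a non-negative 31-bit integer is a convenient seed range.
  return { seed: Math.floor(random() * 0x7fffffff), generated: true };
}

// The seed line is printed whether or not the seed was generated.
function seedLine(resolved: ResolvedSeed): string {
  return `Seed: ${resolved.seed}`;
}
```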

S5: Observe

Owner: One LLM thread.

Directory: src/cli/commands/observe/

Responsibilities:

  • apophis observe --profile <name> --check-config
  • Validate observe configuration
  • Check reporting sink setup
  • Validate non-blocking semantics
  • Environment safety checks
  • Explain what would be checked and why it is safe

Acceptance tests (start here, all failing):

  1. apophis observe --profile staging-observe validates config
  2. Blocking behavior in prod is blocked by default
  3. Invalid sampling rate fails with exact bounds
  4. Missing sink config tells user what is required
  5. Observe profile referencing qualify-only feature is blocked
  6. --check-config only validates, does not activate
  7. Output explains safety boundaries clearly

Done when: Staging/prod safety checks are crisp and trustworthy.

S6: Qualify

Owner: One LLM thread.

Directory: src/cli/commands/qualify/

Responsibilities:

  • apophis qualify --profile <name> --seed <n>
  • Scenario execution
  • Stateful execution
  • Chaos execution
  • Profile gating
  • Rich artifact emission
  • Non-prod boundary enforcement

Acceptance tests (start here, all failing):

  1. apophis qualify --profile oauth-nightly --seed 42 runs OAuth scenario
  2. Prod run is blocked by default
  3. Chaos on protected routes is blocked without allowlist
  4. Scenario with outbound mocks not allowed in env is blocked
  5. Cleanup failure is reported separately without hiding primary failure
  6. Emits rich artifact with step traces
  7. Seed is generated and printed when omitted

Done when: Deeper realism works without contaminating normal CI.

S7: Replay

Owner: One LLM thread.

Directory: src/cli/commands/replay/

Responsibilities:

  • apophis replay --artifact <path>
  • Artifact loading and validation
  • Version compatibility checks
  • Seed replay
  • Degraded replay guidance when source changed
  • Fast startup (p95 under 500 ms on the CLI fixture environment)

Acceptance tests (start here, all failing):

  1. apophis replay --artifact <path> reproduces exact failure
  2. Missing artifact fails with exact path
  3. Corrupted artifact explains parse/validation failure
  4. Source code changed since artifact warns but attempts replay
  5. Referenced route no longer exists explains drift
  6. CLI version mismatch shows compatibility message
  7. Startup p95 is under 500 ms on the CLI fixture environment

Done when: Every verify/qualify failure is reproducible with one command.
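
A sketch of the version compatibility check (test 6), based on the "apophis-artifact/1" version string from the artifact schema. The function name, the result shape, and the message wording are hypothetical; the point is that a mismatch yields a readable message rather than a stack trace.

```typescript
const SUPPORTED_ARTIFACT_MAJOR = 1;

type VersionCheck = { ok: true; major: number } | { ok: false; message: string };

// Accepts the raw "version" field of a parsed artifact document.
function checkArtifactVersion(version: unknown): VersionCheck {
  if (typeof version !== 'string') {
    return { ok: false, message: 'Artifact has no "version" field; it may be corrupted.' };
  }
  const match = /^apophis-artifact\/(\d+)$/.exec(version);
  if (match === null) {
    return { ok: false, message: `Unrecognized artifact version "${version}".` };
  }
  const major = Number(match[1]);
  if (major !== SUPPORTED_ARTIFACT_MAJOR) {
    return {
      ok: false,
      message: `Artifact schema v${major} is not supported by this CLI (expects v${SUPPORTED_ARTIFACT_MAJOR}).`,
    };
  }
  return { ok: true, major };
}
```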

S8: Doctor

Owner: One LLM thread.

Directory: src/cli/commands/doctor/

Responsibilities:

  • apophis doctor
  • Dependency checks (Fastify, swagger, Node version)
  • Config validation
  • Route discovery checks
  • Docs/example smoke checks
  • Legacy config detection
  • Mixed config style detection

Acceptance tests (start here, all failing):

  1. apophis doctor passes on healthy project
  2. Unknown config key is caught
  3. Missing @fastify/swagger is reported with install command
  4. Mixed legacy and new config shows both and recommends migrate
  5. Qualify enabled in unsafe env is caught
  6. Docs examples drift from reality fails in CI mode
  7. Monorepo with different config styles reports per package

Done when: Malformed setups fail fast and clearly.

S9: Migrate

Owner: One LLM thread.

Directory: src/cli/commands/migrate/

Responsibilities:

  • apophis migrate --check
  • apophis migrate --dry-run
  • apophis migrate --write
  • Legacy config detection
  • Exact replacement guidance
  • Comment/formatting preservation where feasible
  • Partial migration reporting

Acceptance tests (start here, all failing):

  1. apophis migrate --check detects legacy config
  2. --dry-run shows exact rewrites without writing
  3. --write performs rewrites correctly
  4. Ambiguous rewrite stops and requires manual choice
  5. Legacy field with no direct equivalent emits human guidance
  6. Partial migration reports completed and remaining items
  7. Preserves comments/formatting where feasible

Done when: The old public-facing contract (legacy config and API usage) upgrades cleanly.

S10: Renderers

Owner: One LLM thread.

Directory: src/cli/renderers/

Responsibilities:

  • Human renderer: canonical failure output, progress, summaries
  • JSON renderer: stable artifact schema
  • NDJSON renderer: step events
  • Truncation rules for large payloads
  • Color/styling with picocolors
  • No spinners in CI
  • No ANSI in --format json

Acceptance tests (start here, all failing):

  1. Human failure output matches golden snapshot exactly
  2. JSON output validates against artifact schema
  3. NDJSON emits correct event sequence
  4. Large payloads are truncated in terminal, full in artifact
  5. No ANSI in --format json
  6. No spinners when CI=true
  7. Color respects --color flag

Done when: Every command looks consistent and machine-readable.
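
Two of the renderer rules are small enough to sketch as pure functions: the color decision (tests 5 and 7) and terminal truncation (test 4). The function names and the truncation suffix are hypothetical, and disabling color for ndjson as well as json is an assumption, since only json is named above.

```typescript
type OutputFormat = 'human' | 'json' | 'ndjson';
type ColorMode = 'auto' | 'always' | 'never';

// Decides whether ANSI color codes may be written at all.
function useColor(color: ColorMode, format: OutputFormat, isTTY: boolean): boolean {
  if (format !== 'human') return false; // machine formats never carry ANSI codes
  if (color === 'always') return true;
  if (color === 'never') return false;
  return isTTY; // auto: color only when attached to a terminal
}

// Shortens a payload for terminal display; the artifact keeps the full text.
function truncateForTerminal(text: string, maxChars: number): string {
  if (text.length <= maxChars) return text;
  const omitted = text.length - maxChars;
  return `${text.slice(0, maxChars)}... (${omitted} more characters, see artifact)`;
}
```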

S11: Docs + Site

Owner: One LLM thread.

Directory: docs/

Responsibilities:

  • docs/cli.md: command reference
  • docs/verify.md, docs/observe.md, docs/qualify.md: mode guides
  • docs/getting-started.md: first-signal quickstart
  • docs/llm-safe-adoption.md: scaffold and CI policy
  • Homepage behavior examples and first-signal funnel copy
  • All examples must be smoke-tested against real CLI

Acceptance tests (start here, all failing):

  1. Every code block in docs/getting-started.md runs successfully
  2. Homepage behavior example produces exact golden output
  3. All apophis commands in docs exist and have correct flags
  4. All examples use current config schema
  5. No stale legacy syntax in docs

Done when: Docs match shipped CLI exactly.

S12: Acceptance Matrix

Owner: One LLM thread.

Directory: src/test/cli/, src/cli/__fixtures__/, src/cli/__goldens__/

Responsibilities:

  • Top-level fixture apps
  • End-to-end command smoke suite
  • Latency budget checks
  • Regression harness
  • Golden snapshot management

Fixture apps required:

  1. tiny-fastify: minimal app with one route, one behavioral contract
  2. broken-behavior: app with known behavioral bug
  3. monorepo: multiple packages with different configs
  4. protocol-lab: OAuth-like multi-step flow
  5. observe-config: observe-ready app with sink config
  6. legacy-config: old-style config for migration tests

Acceptance tests (start here, all failing):

  1. All commands run against all fixture apps
  2. Golden snapshots match
  3. Latency budgets met:
    • apophis --help: < 100ms
    • apophis doctor config-only: < 3s
    • apophis init after prompts: < 500ms
    • apophis verify first progress: < 2s
    • apophis replay startup: < 500ms
  4. Regression: no command breaks another command's fixtures
  5. Exit codes are correct for every scenario

Done when: Merge gate is authoritative.

7. Red-Green-Refactor Per Stream

For every stream, follow this exact loop:

  1. Red: Write all acceptance tests. They must fail.
  2. Green: Implement the vertical slice until all tests pass.
  3. Refactor: Only after green, extract shared code if another stream needs it. Request orchestrator mediation for cross-stream extraction.

Example for S4 (Verify):

// Step 1: Red - write the failing acceptance test first
import { test } from 'node:test';
import assert from 'node:assert';
import { runCli } from '../helpers/run-cli.js';

test('verify finds the canonical behavioral failure', async () => {
  const result = await runCli({
    cwd: 'src/cli/__fixtures__/broken-behavior',
    args: ['verify', '--profile', 'quick', '--routes', 'POST /users']
  });

  // Exit code 1 means behavioral / qualification failure (section 3.3)
  assert.strictEqual(result.exitCode, 1);
  // The output must follow the canonical human failure format (section 3.6)
  assert.match(result.stdout, /Contract violation/);
  assert.match(result.stdout, /POST \/users/);
  assert.match(result.stdout, /Replay/);
  assert.match(result.stdout, /apophis replay --artifact/);
});

// Step 2: Green - implement until it passes (separate file)
// src/cli/commands/verify/index.ts
import { cac } from 'cac';
// ... implementation

// Step 3: Refactor - only if S6 also needs route filtering
// Request orchestrator to extract route-filter to src/cli/core/
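
The test above imports a runCli helper that this guide does not define. One possible shape, sketched with Node's built-in child_process module, is shown below. The option names, the result shape, and the default command name (an apophis binary on PATH) are assumptions inferred from how the test uses the helper.

```typescript
import { spawn } from 'node:child_process';

interface RunCliOptions {
  args: string[];
  cwd?: string;
  // Executable to launch; 'apophis' on PATH is an assumption about the test setup.
  command?: string;
}

interface RunCliResult {
  exitCode: number | null; // null when the process was terminated by a signal
  stdout: string;
  stderr: string;
}

// Runs the CLI as a child process and resolves once its output streams close.
export function runCli({ args, cwd, command = 'apophis' }: RunCliOptions): Promise<RunCliResult> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { cwd, stdio: ['ignore', 'pipe', 'pipe'] });
    let stdout = '';
    let stderr = '';
    child.stdout.setEncoding('utf8');
    child.stderr.setEncoding('utf8');
    child.stdout.on('data', (chunk: string) => { stdout += chunk; });
    child.stderr.on('data', (chunk: string) => { stderr += chunk; });
    child.on('error', reject); // for example, the executable was not found
    child.on('close', (exitCode) => resolve({ exitCode, stdout, stderr }));
  });
}
```

Spawning the real binary keeps acceptance tests honest about exit codes and stream separation, at the cost of process startup time per test.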

8. Merge Policy

8.1 What streams can merge independently

  • Any stream can merge if:
    1. All its acceptance tests pass
    2. It does not modify orchestrator-owned files
    3. It does not modify another stream's directory
    4. It passes npm run build and npm run test:src

8.2 What requires orchestrator approval

  • Changes to src/cli/core/types.ts
  • Changes to src/cli/core/exit-codes.ts
  • Changes to src/cli/__goldens__/
  • Changes to src/cli/__fixtures__/
  • New shared extraction requests
  • Golden snapshot updates

8.3 Merge gate commands

Every PR must pass:

npm run build
npm run test:src
npm run test:cli        # S12 acceptance matrix
npm run test:cli:goldens # golden snapshot comparison
npm run test:cli:latency # latency budget checks
npm run test:docs        # docs smoke tests

9. Edge Cases Reference

Global

Edge case                                  Expected behavior
No config found                            Suggest apophis init, do not crash
Multiple config candidates                 Print choices and exact override flag
Monorepo root vs package root              Detect package boundary and say which one was chosen
Unknown config keys                        Hard fail with exact key path
Invalid profile name                       List available profiles
Preset/profile mismatch                    Explain mismatch, do not silently coerce
Unsupported Node/runtime                   Fail immediately with exact version requirement
Missing peer dependencies                  Report package names and install command
Non-TTY shell                              Disable prompts and fancy rendering automatically
CI environment                             No spinners, stable deterministic output
--format json with warnings                Warnings go into structured fields, never stderr noise
Unwritable artifact dir                    Fail before run if artifacts are required
SIGINT                                     Write partial artifact if safe, print interruption summary
Internal exception                         Show internal error banner plus artifact/debug path
Very large failure payload                 Concise terminal summary, full detail in artifact
Route path contains spaces or weird chars  Always quote safely in printed commands
Dirty git tree                             Never block, unless command explicitly needs git diff semantics
--changed outside git repo                 Degrade cleanly and tell user how
Stale artifact version                     Explain incompatibility and fallback options
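
The "always quote safely in printed commands" rule can be sketched with POSIX single-quote escaping. This is one possible approach and targets POSIX shells only; Windows shells would need different rules. It also prints single quotes where the examples in this guide show double quotes, which is equivalent for the shell. The function names are hypothetical.

```typescript
// Quotes one argument so that a POSIX shell passes it through unchanged.
function shellQuote(arg: string): string {
  // Arguments made only of safe characters are printed as-is.
  if (/^[A-Za-z0-9_\/.:=@%+-]+$/.test(arg)) return arg;
  // Otherwise wrap in single quotes and escape embedded single quotes as '\''.
  return `'${arg.replace(/'/g, "'\\''")}'`;
}

// Builds a copy-pasteable command line from an argument vector.
function formatCommand(argv: readonly string[]): string {
  return argv.map(shellQuote).join(' ');
}
```

Building printed commands from an argument vector, rather than by string concatenation, keeps replay commands copy-pasteable even when a route or path contains spaces.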

Init

Edge case                              Expected behavior
Existing config file                   Refuse overwrite unless --force, show diff or dry-run
Existing package scripts               Merge carefully, do not clobber
Multiple Fastify entrypoints detected  Ask or require explicit selection
Noninteractive shell with ambiguity    Fail with explicit flags needed
Missing @fastify/swagger               Tell user why it matters and how to add it
Package manager unknown                Avoid assumptions, print generic install commands
Rerun init                             Idempotent or clearly update-only

Verify

Edge case                       Expected behavior
No routes matched               Fail with route filter echo and available matches summary
No behavioral contracts found   Explain that schema-only routes do not provide behavioral contracts for verify
Contract parse failure          Show route, clause index, expression, migration guidance
Seed omitted                    Generate one and print it always
Multiple failures               Stable order, compact summary, artifact for full detail
Changed-files selection empty   Say no relevant routes changed
Flaky endpoint behavior         Call out nondeterminism if replay diverges
Timeout                         Route-specific timeout in summary
Artifact write fails after run  Still print failure summary and note artifact problem

Observe

Edge case                                        Expected behavior
Blocking behavior requested in prod              Hard fail unless explicit break-glass policy allows it
Invalid sampling rate                            Fail with exact bounds
Missing sink config                              Tell user what sink is required
Config would generate outage risk                Fail before activation
Observe profile references qualify-only feature  Hard fail

Qualify

Edge case                                        Expected behavior
Run in prod by default                           Hard block
Scenario uses outbound mocks not allowed in env  Hard block
Scenario form flow requires missing app support  Clear diagnostic
Chaos requested on protected routes              Hard block unless allowlisted
Cleanup fails after stateful run                 Report separately without hiding primary failure
Seed omitted                                     Generate and print it
Too many artifacts                               Summarize and index them cleanly

Replay

Edge case                                     Expected behavior
Artifact missing                              Fail with exact path
Artifact corrupted                            Explain parse/validation failure
Source code changed since artifact            Warn but still attempt replay
Referenced route no longer exists             Explain drift clearly
CLI version newer/older than artifact schema  Compatibility message, not stack trace

Doctor

Edge case                                                    Expected behavior
Mixed legacy and new config                                  Show both and recommend migrate
Docs examples drift from reality                             Fail in CI mode
Missing swagger registration                                 Tell user whether APOPHIS can still proceed and what is degraded
Qualify enabled in unsafe env                                Hard fail
Multiple packages in monorepo using different config styles  Report per package

Migrate

Edge case                                       Expected behavior
Ambiguous rewrite                               Stop and require manual choice
Comments/formatting preservation                Preserve where feasible, otherwise warn
Dry-run mode                                    Default for safety
Legacy field removed with no direct equivalent  Emit exact human guidance
Partial migration                               Report completed and remaining items separately

10. Latency Budgets

Command                        Target
apophis --help                 < 100ms
apophis doctor config-only     < 3s
apophis init after prompts     < 500ms
apophis verify first progress  < 2s
apophis replay startup         < 500ms

These are enforced by S12. A command that exceeds its budget fails CI.
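
A sketch of how a latency gate could evaluate these budgets from repeated timing samples. The replay budget is stated elsewhere in this guide as a p95; applying p95 to every budget, the nearest-rank percentile method, and all names below are assumptions.

```typescript
// Budgets in milliseconds, taken from the table above.
const BUDGETS_MS: Record<string, number> = {
  'apophis --help': 100,
  'apophis doctor config-only': 3000,
  'apophis init after prompts': 500,
  'apophis verify first progress': 2000,
  'apophis replay startup': 500,
};

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) throw new Error('percentile requires at least one sample');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

interface BudgetResult {
  name: string;
  p95: number;
  budgetMs: number;
  ok: boolean;
}

// The targets are strict upper bounds ("< 100ms"), so equality fails the gate.
function checkBudget(name: string, samplesMs: readonly number[]): BudgetResult {
  const budgetMs = BUDGETS_MS[name];
  if (budgetMs === undefined) throw new Error(`No latency budget defined for "${name}"`);
  const p95 = percentile(samplesMs, 95);
  return { name, p95, budgetMs, ok: p95 < budgetMs };
}
```

With 20 samples, nearest-rank p95 picks the 19th smallest value, so a single slow outlier does not fail the gate but two do.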

11. First Signal Checklist

For the CLI to deliver the first useful signal, every stream must satisfy:

  • Install to first signal: under 10 minutes for a typical Fastify service
  • --help clarity: user can infer product model from help text alone
  • First init: writes correct scaffold without blocking on unnecessary prompts
  • First verify: checks cross-operation behavior, not only shape
  • First failure: route, formula, observed reality, seed, replay command, artifact path
  • First replay: one copy-paste command reproduces same result
  • Trust signal: CLI explicitly shows environment gating and deterministic seed
  • Expansion path: output tells user whether to add more verify, turn on observe, or create qualify profile

12. Final Notes for Implementers

  1. Do not over-engineer shared code. Each stream should be self-contained until proven otherwise.
  2. Do not add features not in the spec. The spec is intentionally minimal.
  3. Do not optimize for polish over correctness. The useful signal is in the failure message, not the spinner.
  4. Do not skip acceptance tests. They are the contract.
  5. Do not modify orchestrator files. Request changes through the orchestrator.
  6. Do not assume another stream's timeline. Code against the spec, not against another stream's partial implementation.
  7. Do ask for clarification. The orchestrator exists to resolve ambiguity.

This document is versioned. The orchestrator will update it if contracts change. Implementation streams should pin to a version and request updates explicitly.