APOPHIS Quality Features Plan v1.2

Chaos, Flake Detection, and Mutation Testing

Status: Chaos Implemented in v1.2 | Target: Flake + Mutation in v1.3 | Priority: P0


1. Executive Summary

This plan adds three first-class quality assurance features to APOPHIS:

  1. Chaos Mode — Inject controlled failures during contract execution to validate resilience guarantees
  2. Flake Detection — Automatically rerun failing tests with varied seeds to identify non-deterministic contracts
  3. Mutation Testing — Introduce synthetic bugs into route handlers and verify contracts catch them

These features transform APOPHIS from a contract validator into a comprehensive API quality platform. They leverage APOPHIS's unique architecture: formal contract ASTs, seeded property-based generation, extension hooks, and programmatic route access.

Chaos Mode is implemented in v1.2. Flake Detection and Mutation Testing are planned for v1.3 alongside Protocol Extensions (see docs/protocol-extensions-spec.md).


2. Design Principles

2.1 Environment Guardrails

All three features run ONLY in NODE_ENV=test. This is non-negotiable. Implementation:

// src/quality/env-guard.ts (NEW FILE)
export const assertTestEnv = (feature: string): void => {
  if (process.env.NODE_ENV !== 'test') {
    throw new Error(
      `${feature} is only available in test environment. ` +
      `Set NODE_ENV=test to enable quality features.`
    )
  }
}

Cited from: src/extension/registry.ts:26-33 (handleHookError pattern for fatal vs. warn severity).

2.2 Config Flags on contract()

All features are opt-in via TestConfig:

// src/types.ts:218-223 (CURRENT)
export interface TestConfig {
  readonly depth?: TestDepth
  readonly scope?: string
  readonly seed?: number
  readonly timeout?: number
}

Extended to:

// src/types.ts:218-230 (PLANNED)
export interface TestConfig {
  readonly depth?: TestDepth
  readonly scope?: string
  readonly seed?: number
  readonly timeout?: number
  readonly chaos?: ChaosConfig        // Phase 1
  readonly mutation?: MutationConfig  // Phase 3
}

Flake does NOT appear in config — it is automatic on any test failure.

2.3 Red-Green-Refactor per Phase

Each phase follows strict RGR:

  • Red: Write failing test that exercises the feature
  • Green: Implement minimal code to pass
  • Refactor: Extract patterns, deduplicate, add types

Parallelization strategy: Phase 1 (Chaos) and test infrastructure can be developed in parallel. Phase 2 (Flake) depends on Phase 1's runner modifications. Phase 3 (Mutation) is independent after Phase 1.


3. Current Architecture Analysis

3.1 Test Runner Entry Points

File: src/plugin/index.ts:48-69

const buildContract = (fastify, scope, extensionRegistry) => async (opts = {}) => {
  const config = {
    depth: opts.depth ?? 'standard',
    scope: opts.scope,
    seed: opts.seed,
  }
  const suite = await runPetitTests(injectInstance, config, scope, extensionRegistry)
  // ... empty discovery check ...
  return suite
}

This is the primary entry point for all contract testing. It delegates to runPetitTests.

3.2 Core Runner Loop

File: src/test/petit-runner.ts:166-360

Key sections:

  • Line 166: runPetitTests() signature — accepts TestConfig, ScopeRegistry, ExtensionRegistry
  • Line 176-178: Extension suite start hooks
  • Line 248-261: Request building with extension hooks
  • Line 263-274: runBeforeRequestHooks (Chaos injection point)
  • Line 276-278: executeHttp() call — Chaos delay/error injection point
  • Line 282-290: runAfterRequestHooks
  • Line 301-306: validatePostconditions() (Flake rerun trigger point)
  • Line 307-333: Failure handling — Flake auto-rerun entry
  • Line 361-362: Cache flush

3.3 HTTP Execution

File: src/infrastructure/http-executor.ts:63-145

export const executeHttp = async (
  fastify: FastifyInjectInstance,
  route: RouteContract,
  request: RequestStructure,
  previous?: EvalContext,
  timeoutMs?: number
): Promise<EvalContext> => {
  // Line 85: const startTime = Date.now()
  // Line 86: let timedOut = false
  // Line 103-116: Promise.race with timeout
  // Line 117-144: Error handling (including timeout context return)
}

Critical for Chaos: The timeout mechanism (lines 104-113) uses Promise.race. Chaos delays must be injected BEFORE this race, or they must extend the timeout window.
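
To make the placement question concrete, here is a minimal sketch of a Promise.race timeout. The names withTimeout, sleep, fakeHandler, delayBeforeRace, and delayInsideRace are hypothetical stand-ins, not the actual executeHttp internals. It shows that a delay awaited before the race does not count against the timeout, while a delay inside the raced promise does.

```typescript
// Hypothetical stand-ins; the real executeHttp internals may differ.
const sleep = (ms: number): Promise<void> =>
  new Promise(resolve => setTimeout(resolve, ms))

interface RaceResult {
  readonly timedOut: boolean
  readonly value?: string
}

// Race a task against a timer, mirroring the Promise.race pattern described above.
async function withTimeout(
  task: () => Promise<string>,
  timeoutMs: number
): Promise<RaceResult> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<RaceResult>(resolve => {
    timer = setTimeout(() => resolve({ timedOut: true }), timeoutMs)
  })
  const work = task().then((value): RaceResult => ({ timedOut: false, value }))
  try {
    return await Promise.race([work, timeout])
  } finally {
    if (timer !== undefined) clearTimeout(timer)
  }
}

const fakeHandler = async (): Promise<string> => 'ok'

// Delay OUTSIDE the race: the timer starts only after the sleep, so no timeout fires.
async function delayBeforeRace(delayMs: number, timeoutMs: number): Promise<RaceResult> {
  await sleep(delayMs)
  return withTimeout(fakeHandler, timeoutMs)
}

// Delay INSIDE the race: the sleep competes with the timer and can trigger a timeout.
async function delayInsideRace(delayMs: number, timeoutMs: number): Promise<RaceResult> {
  return withTimeout(async () => {
    await sleep(delayMs)
    return fakeHandler()
  }, timeoutMs)
}
```

Which of the two behaviours is wanted depends on whether chaos delays are supposed to be able to trip timeout contracts.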

3.4 Extension Hook System

File: src/extension/types.ts:145-174

readonly onBuildRequest?: (context: RequestBuildContext) => RequestStructure | Promise<RequestStructure | undefined> | undefined
readonly onBeforeRequest?: (context: ExecutionContext) => Promise<void>
readonly onAfterRequest?: (context: ExecutionContext) => Promise<void>
readonly onSuiteStart?: (config: TestConfig) => Promise<Record<string, unknown> | undefined> | Record<string, unknown> | undefined
readonly onSuiteEnd?: (suite: TestSuite, extensionState: Record<string, unknown>) => Promise<void>

File: src/extension/registry.ts:268-294

async runBuildRequestHooks(context: RequestBuildContext): Promise<RequestStructure> {
  let request = context.request
  for (const ext of this._buildRequestExts) {
    // Line 278: const result = await withTimeout(Promise.resolve(hook(hookContext)), ...)
    // Line 285: if (result !== undefined) request = result
  }
  return request
}
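
The chaining rule in runBuildRequestHooks (a hook that returns undefined leaves the request unchanged) can be sketched as follows. SimpleRequest, BuildHook, applyBuildHooks, and the two sample hooks are simplified, hypothetical stand-ins. The sketch also assumes each hook receives the request produced so far, and it omits the per-hook timeout and error handling that the real registry applies.

```typescript
// Simplified, hypothetical types; not the actual APOPHIS definitions.
interface SimpleRequest {
  readonly url: string
  readonly headers: Readonly<Record<string, string>>
}

type BuildHook = (
  request: SimpleRequest
) => SimpleRequest | undefined | Promise<SimpleRequest | undefined>

// Run hooks in order. Assumption: each hook sees the output of the previous one.
async function applyBuildHooks(
  initial: SimpleRequest,
  hooks: ReadonlyArray<BuildHook>
): Promise<SimpleRequest> {
  let request = initial
  for (const hook of hooks) {
    const result = await hook(request)
    // undefined means "no change", so keep the current request
    if (result !== undefined) request = result
  }
  return request
}

// A hook that rewrites the request by adding a header.
const addAuth: BuildHook = req => ({
  ...req,
  headers: { ...req.headers, authorization: 'Bearer test-token' },
})

// A hook that only observes and therefore returns undefined.
const observeOnly: BuildHook = () => undefined
```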

3.5 Result Types

File: src/types.ts:254-291

export interface TestResult {
  readonly ok: boolean
  readonly name: string
  readonly id: number
  readonly directive?: string
  readonly diagnostics?: TestDiagnostics
}

export interface TestDiagnostics {
  readonly error?: string
  readonly violation?: ContractViolation
  readonly suggestion?: string
  readonly formula?: string
  readonly counterexample?: string
}

export interface TestSuite {
  readonly tests: ReadonlyArray<TestResult>
  readonly summary: TestSummary
  readonly routes: ReadonlyArray<RouteDisposition>
}

4. Phase 1: Chaos Mode

4.1 Overview

Chaos mode injects controlled failures during contract execution:

  • Delays: Add latency to requests (test timeout contracts)
  • Errors: Force specific HTTP status codes (test error-handling contracts)
  • Dropouts: Simulate network failures (test retry/timeout contracts)

Key insight: Chaos is NOT an extension. It is a runner mode that wraps executeHttp. This distinction matters because:

  • Extensions are user-provided; Chaos is built-in
  • Extensions run in dependency order; Chaos must run at a specific point in the HTTP lifecycle
  • Extensions can be disabled via health checks; Chaos is controlled purely by config

4.2 Config Type

File: src/types.ts (NEW SECTION after line 223)

// Line 224+ (NEW)
export interface ChaosConfig {
  /** Probability of injecting any chaos event (0.0 - 1.0) */
  readonly probability: number
  /** Delay injection: add artificial latency */
  readonly delay?: {
    readonly probability: number  // Conditional on chaos.probability
    readonly minMs: number
    readonly maxMs: number
  }
  /** Error injection: force HTTP error responses */
  readonly error?: {
    readonly probability: number
    readonly statusCode: number   // e.g., 503
    readonly body?: unknown
  }
  /** Dropout injection: simulate network failure */
  readonly dropout?: {
    readonly probability: number
  }
}
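
Because the per-event probabilities are conditional on the global gate, the effective per-request rates are products. The sketch below works through that arithmetic. effectiveRates is a hypothetical helper, and it assumes the evaluation order used by the ChaosEngine in section 4.3: global gate, then delay, then dropout, then error injection only when no dropout occurred.

```typescript
// Local copy of the relevant ChaosConfig fields so the sketch is self-contained.
interface ChaosRates {
  readonly probability: number
  readonly delay?: { readonly probability: number }
  readonly error?: { readonly probability: number }
  readonly dropout?: { readonly probability: number }
}

interface EffectiveRates {
  readonly delay: number
  readonly dropout: number
  readonly error: number
}

// Hypothetical helper: per-request chance of each event, assuming independent draws.
function effectiveRates(config: ChaosRates): EffectiveRates {
  const gate = config.probability
  const delay = config.delay?.probability ?? 0
  const dropout = config.dropout?.probability ?? 0
  const error = config.error?.probability ?? 0
  return {
    delay: gate * delay,
    dropout: gate * dropout,
    // Error injection is only reached when the request was not dropped.
    error: gate * (1 - dropout) * error,
  }
}

const rates = effectiveRates({
  probability: 0.5,
  delay: { probability: 0.5 },
  dropout: { probability: 0.2 },
  error: { probability: 0.25 },
})
// delay:   0.5 * 0.5        = 0.25
// dropout: 0.5 * 0.2        = 0.1
// error:   0.5 * 0.8 * 0.25 = 0.1
```

So a configuration that reads as "20% dropouts" produces dropouts on roughly 10% of requests once the 0.5 gate is applied.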

4.3 Chaos Engine Implementation

File: src/quality/chaos.ts (NEW FILE)

/**
 * Chaos Engineering Engine for APOPHIS
 * 
 * Injects controlled failures into the HTTP execution pipeline.
 * Uses a seeded RNG for reproducible chaos events.
 * 
 * Architecture: Chaos is a runner concern, not an extension.
 * It wraps executeHttp at the call site in petit-runner.ts.
 */

import type { ChaosConfig, EvalContext, RouteContract } from '../types.js'
import type { RequestStructure } from '../domain/request-builder.js'
import { SeededRng } from '../infrastructure/seeded-rng.js'
import { getErrorMessage } from '../infrastructure/security.js'

export interface ChaosEvent {
  readonly type: 'delay' | 'error' | 'dropout'
  readonly injected: boolean
  readonly details?: {
    readonly delayMs?: number
    readonly statusCode?: number
    readonly reason: string
  }
}

export class ChaosEngine {
  private rng: SeededRng
  private events: ChaosEvent[] = []
  private config: ChaosConfig

  constructor(config: ChaosConfig, seed?: number) {
    this.config = config
    // Derive chaos seed from test seed for reproducibility
    this.rng = new SeededRng(seed !== undefined ? seed + 0xCA05 : Date.now())
  }

  /**
   * Wrap executeHttp with chaos injection.
   * Returns the original EvalContext or a modified one based on chaos events.
   */
  async executeWithChaos(
    executeHttp: () => Promise<EvalContext>,
    route: RouteContract,
    request: RequestStructure
  ): Promise<{ ctx: EvalContext; events: ChaosEvent[] }> {
    this.events = []

    // Global probability gate
    if (!this.shouldInject(this.config.probability)) {
      const ctx = await executeHttp()
      return { ctx, events: this.events }
    }

    // Delay injection (before HTTP call)
    if (this.config.delay && this.shouldInject(this.config.delay.probability)) {
      const delayMs = this.randomDelay()
      this.events.push({
        type: 'delay',
        injected: true,
        details: { delayMs, reason: `Chaos delay: ${delayMs}ms` },
      })
      await this.sleep(delayMs)
    }

    // Dropout injection (skip HTTP call entirely)
    if (this.config.dropout && this.shouldInject(this.config.dropout.probability)) {
      this.events.push({
        type: 'dropout',
        injected: true,
        details: { reason: 'Chaos dropout: network failure simulated' },
      })
      return {
        ctx: this.buildDropoutContext(route, request),
        events: this.events,
      }
    }

    // Execute the actual HTTP call
    let ctx: EvalContext
    try {
      ctx = await executeHttp()
    } catch (err) {
      // Error injection: wrap actual errors in chaos context
      if (this.config.error && this.shouldInject(this.config.error.probability)) {
        this.events.push({
          type: 'error',
          injected: true,
          details: {
            statusCode: this.config.error.statusCode,
            reason: `Chaos error: forced ${this.config.error.statusCode}`,
          },
        })
        return {
          ctx: this.buildErrorContext(route, request, this.config.error.statusCode, this.config.error.body),
          events: this.events,
        }
      }
      throw err
    }

    // Error injection: override successful responses
    if (this.config.error && this.shouldInject(this.config.error.probability)) {
      this.events.push({
        type: 'error',
        injected: true,
        details: {
          statusCode: this.config.error.statusCode,
          reason: `Chaos error: overridden ${ctx.response.statusCode} with ${this.config.error.statusCode}`,
        },
      })
      return {
        ctx: this.buildErrorContext(route, request, this.config.error.statusCode, this.config.error.body, ctx),
        events: this.events,
      }
    }

    return { ctx, events: this.events }
  }

  private shouldInject(probability: number): boolean {
    return this.rng.next() < probability
  }

  private randomDelay(): number {
    if (!this.config.delay) return 0
    const min = this.config.delay.minMs
    const max = this.config.delay.maxMs
    return min + Math.floor(this.rng.next() * (max - min + 1))
  }

  private sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms))
  }

  private buildDropoutContext(route: RouteContract, request: RequestStructure): EvalContext {
    return {
      request: {
        body: request.body,
        headers: request.headers,
        query: request.query || {},
        params: {},
        multipart: request.multipart,
      },
      response: {
        body: { error: 'Chaos dropout: network failure simulated' },
        headers: {},
        statusCode: 0,
        responseTime: 0,
      },
      previous: undefined,
      timedOut: false,
      timeoutMs: undefined,
      redirects: [],
    }
  }

  private buildErrorContext(
    route: RouteContract,
    request: RequestStructure,
    statusCode: number,
    body?: unknown,
    originalCtx?: EvalContext
  ): EvalContext {
    return {
      request: {
        body: request.body,
        headers: request.headers,
        query: request.query || {},
        params: originalCtx?.request.params ?? {},
        multipart: request.multipart,
      },
      response: {
        body: body ?? { error: `Chaos error: forced ${statusCode}` },
        headers: originalCtx?.response.headers ?? {},
        statusCode,
        responseTime: originalCtx?.response.responseTime ?? 0,
      },
      previous: originalCtx?.previous,
      timedOut: false,
      timeoutMs: originalCtx?.timeoutMs,
      redirects: originalCtx?.redirects ?? [],
    }
  }
}
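
The reproducibility claim rests on the seeded RNG. The source of SeededRng is not shown in this plan, so the sketch below uses the mulberry32 generator as a stand-in with the same next() shape (an assumption). StandInRng and decisions are hypothetical names. The sketch shows that the same seed replays the same sequence of inject/skip decisions.

```typescript
// Stand-in for SeededRng (assumption): mulberry32, returning floats in [0, 1).
class StandInRng {
  private state: number

  constructor(seed: number) {
    this.state = seed >>> 0
  }

  next(): number {
    this.state = (this.state + 0x6d2b79f5) >>> 0
    let t = this.state
    t = Math.imul(t ^ (t >>> 15), t | 1)
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61)
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

// Same gate logic as shouldInject(): inject when the draw falls below the probability.
function decisions(seed: number, probability: number, count: number): boolean[] {
  const rng = new StandInRng(seed + 0xca05) // same seed offset as the engine
  return Array.from({ length: count }, () => rng.next() < probability)
}

const first = decisions(42, 0.3, 10)
const replay = decisions(42, 0.3, 10)
// The two runs agree draw for draw because they start from the same seed.
const sameSequence = first.every((d, i) => d === replay[i])
```

A consequence worth keeping in mind: two engines built from the same seed make identical draws, so an engine that is constructed fresh for every request with a fixed seed would repeat its first decision on every request.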

4.4 Runner Integration

File: src/test/petit-runner.ts (MODIFICATIONS)

Line 30 (after existing imports):

import { ChaosEngine, type ChaosEvent } from '../quality/chaos.js'
import { assertTestEnv } from '../quality/env-guard.js'

Line 276-278 (CURRENT):

const timeoutMs = command.route.timeout ?? config.timeout
ctx = await executeHttp(fastify, command.route, request, previousCtx, timeoutMs)

Line 276-290 (MODIFIED):

const timeoutMs = command.route.timeout ?? config.timeout

// Phase 1: Chaos Mode
let chaosEvents: ChaosEvent[] = []
if (config.chaos) {
  assertTestEnv('Chaos mode')
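  // NOTE: built per request here; with a fixed seed every engine replays the same
  // RNG sequence, so consider constructing it once above the request loop.
  const chaos = new ChaosEngine(config.chaos, config.seed)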
  const result = await chaos.executeWithChaos(
    () => executeHttp(fastify, command.route, request, previousCtx, timeoutMs),
    command.route,
    request
  )
  ctx = result.ctx
  chaosEvents = result.events
} else {
  ctx = await executeHttp(fastify, command.route, request, previousCtx, timeoutMs)
}

Line 307-333 (failure handling, ADD chaosEvents to diagnostics):

if (!post.success) {
  const diagnostics: Record<string, unknown> = {
    statusCode: ctx.response.statusCode,
    error: post.error,
  }
  
  // Phase 1: Include chaos events in failure diagnostics
  if (chaosEvents.length > 0) {
    diagnostics.chaosEvents = chaosEvents
  }
  
  if (post.violation) {
    // ... existing violation handling ...
  }
  
  results.push({
    ok: false,
    name,
    id: testId,
    diagnostics,
  })
}

4.5 Plugin Integration

File: src/plugin/index.ts:48-55 (MODIFIED)

const buildContract = (fastify, scope, extensionRegistry) => async (opts = {}) => {
  const config = {
    depth: opts.depth ?? 'standard',
    scope: opts.scope,
    seed: opts.seed,
    chaos: opts.chaos,           // Phase 1: Pass through chaos config
  }
  // ... rest unchanged ...
}

4.6 Test Plan

File: src/test/chaos.test.ts (NEW FILE)

Test cases (red-green-refactor):

// RED (fails without implementation)
test('chaos: delay injection adds latency', async () => {
  const fastify = await withApophisApp()
  // Setup route with timeout contract
  // Run with chaos.delay
  // Assert timeout_occurred(this) == true passes
})

// GREEN (minimal implementation)
// Implement ChaosEngine.sleep and delay injection

// RED
test('chaos: error injection forces status code', async () => {
  // Setup route with contract status:200
  // Run with chaos.error: { statusCode: 503 }
  // Assert contract fails with status 503
})

// GREEN
// Implement error injection in ChaosEngine

// RED
test('chaos: dropout returns status 0', async () => {
  // Run with chaos.dropout
  // Assert statusCode is 0
})

// GREEN
// Implement dropout in ChaosEngine

// RED
test('chaos: respects probability gate', async () => {
  // Run 100 times with probability 0
  // Assert no chaos events
})

// GREEN
// Implement probability gate

// REFACTOR
// Extract shared test setup, add property-based tests for probability distribution

4.7 Parallelization

Chaos Phase can be parallelized with:

  • Test infrastructure setup (src/test/chaos.test.ts scaffold)
  • Documentation (docs/chaos.md)
  • Type definitions (src/types.ts updates)

5. Phase 2: Flake Detection

5.1 Overview

Flake detection automatically reruns failing tests with varied seeds to confirm the failure is deterministic. It also detects "shrinks that go green" — where a property test finds a counterexample but the simplified version passes.

Key insight: Flake is NOT a config flag. It is an automatic behavior triggered by:

  1. Any test result with ok: false
  2. Any contract violation

This makes it impossible to accidentally ship a flaky contract.

5.2 Flake Rerun Strategy

When a test fails:

  1. Immediate rerun: Run the exact same command with the same seed
    • If it passes → FLAKE DETECTED (likely order-dependent or time-dependent)
    • If it fails → proceed to step 2
  2. Seed variation: Run with seed + 1, seed + 2, ..., seed + N
    • If any pass → FLAKE DETECTED (seed-dependent non-determinism)
    • If all fail → CONFIRMED FAILURE (deterministic bug)

Default: N = 3 seed variations, plus 1 same-seed rerun (5 total runs including the original)
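
The rerun schedule and the resulting verdict can be written down as two small pure functions. rerunSeeds and classify are hypothetical helpers used only for illustration; they are not part of the planned API.

```typescript
// Hypothetical helper: seeds for the reruns that follow a failing run.
// Step 1 reuses the original seed; step 2 uses seed + 1 ... seed + N.
function rerunSeeds(seed: number, seedVariations = 3, sameSeedReruns = 1): number[] {
  const sameSeed = Array.from({ length: sameSeedReruns }, () => seed)
  const varied = Array.from({ length: seedVariations }, (_, i) => seed + i + 1)
  return [...sameSeed, ...varied]
}

type Verdict = 'flake' | 'confirmed-failure'

// Any passing rerun means the original failure was not deterministic.
function classify(rerunPassed: ReadonlyArray<boolean>): Verdict {
  return rerunPassed.some(Boolean) ? 'flake' : 'confirmed-failure'
}

const seeds = rerunSeeds(100) // [100, 101, 102, 103]
const allFail = classify([false, false, false, false]) // 'confirmed-failure'
const onePass = classify([false, true, false, false]) // 'flake'
```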

5.3 Flake Engine Implementation

File: src/quality/flake.ts (NEW FILE)

/**
 * Flake Detection Engine for APOPHIS
 * 
 * Automatically reruns failing tests with varied seeds to detect
 * non-deterministic contracts. Flake detection is automatic — no config required.
 * 
 * Triggered by: any test result with ok: false
 * Strategy: same-seed rerun + seed-variation runs
 */

import type { TestResult, TestConfig, EvalContext, RouteContract } from '../types.js'
import { assertTestEnv } from './env-guard.js'

export interface FlakeReport {
  readonly originalResult: TestResult
  readonly reruns: FlakeRerun[]
  readonly isFlaky: boolean
  readonly confidence: 'high' | 'medium' | 'low'
}

export interface FlakeRerun {
  readonly seed: number
  readonly passed: boolean
  readonly ctx?: EvalContext
}

export interface FlakeOptions {
  /** Number of additional seeds to try (default: 3) */
  readonly seedVariations?: number
  /** Number of same-seed reruns (default: 1) */
  readonly sameSeedReruns?: number
}

const DEFAULT_OPTIONS: Required<FlakeOptions> = {
  seedVariations: 3,
  sameSeedReruns: 1,
}

export class FlakeDetector {
  private options: Required<FlakeOptions>

  constructor(options: FlakeOptions = {}) {
    this.options = { ...DEFAULT_OPTIONS, ...options }
  }

  /**
   * Analyze a failing test by rerunning it.
   * Returns a FlakeReport indicating whether the failure is deterministic.
   */
  async detectFlake(
    originalResult: TestResult,
    rerunFn: (seed?: number) => Promise<{ passed: boolean; ctx?: EvalContext }>,
    originalSeed?: number
  ): Promise<FlakeReport> {
    assertTestEnv('Flake detection')

    const reruns: FlakeRerun[] = []
    let isFlaky = false

    // Same-seed reruns
    for (let i = 0; i < this.options.sameSeedReruns; i++) {
      const result = await rerunFn(originalSeed)
      reruns.push({ seed: originalSeed ?? 0, passed: result.passed, ctx: result.ctx })
      if (result.passed) {
        isFlaky = true
      }
    }

    // Seed-variation reruns
    const baseSeed = originalSeed ?? Date.now()
    for (let i = 1; i <= this.options.seedVariations; i++) {
      const variedSeed = baseSeed + i
      const result = await rerunFn(variedSeed)
      reruns.push({ seed: variedSeed, passed: result.passed, ctx: result.ctx })
      if (result.passed) {
        isFlaky = true
      }
    }

    // Confidence scoring
    const passCount = reruns.filter(r => r.passed).length
    const confidence = passCount === 0 ? 'high' : passCount >= reruns.length / 2 ? 'low' : 'medium'

    return {
      originalResult,
      reruns,
      isFlaky,
      confidence,
    }
  }
}
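
The confidence expression is compact enough to misread, so here is the same rule as a standalone function with the outcomes for four reruns spelled out. scoreConfidence is a hypothetical name; the thresholds are copied from detectFlake above. In this scheme 'high' appears to mean high confidence that the original failure is deterministic.

```typescript
type Confidence = 'high' | 'medium' | 'low'

// Same thresholds as detectFlake:
// no rerun passes -> high, at least half pass -> low, otherwise medium.
function scoreConfidence(passCount: number, rerunCount: number): Confidence {
  if (passCount === 0) return 'high'
  return passCount >= rerunCount / 2 ? 'low' : 'medium'
}

// With the default 4 reruns (1 same-seed + 3 seed variations):
const table = [0, 1, 2, 3, 4].map(passes => scoreConfidence(passes, 4))
// table: ['high', 'medium', 'low', 'low', 'low']
```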

5.4 Runner Integration

File: src/test/petit-runner.ts (MODIFICATIONS)

Line 307-333 (failure handling, CURRENT):

if (!post.success) {
  const diagnostics: Record<string, unknown> = {
    statusCode: ctx.response.statusCode,
    error: post.error,
  }
  // ... violation handling ...
  results.push({ ok: false, name, id: testId, diagnostics })
}

Line 307-360 (MODIFIED with Flake):

if (!post.success) {
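  // Phase 2: Flake Detection (rerun failing test)
  // runPetitTests is a plain function, so there is no `this`; use a FlakeDetector
  // instance (imported from '../quality/flake.js') instead.
  const flakeReport = await new FlakeDetector().detectFlake(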
    { ok: false, name, id: testId, diagnostics: { error: post.error } },
    async (seed) => {
      // Rebuild request with new seed
      const rerunRequest = buildRequest(command.route, command.params, scopeHeaders, state, seed !== undefined ? new SeededRng(seed) : undefined)
      const rerunCtx = await executeHttp(fastify, command.route, rerunRequest, previousCtx, timeoutMs)
      const rerunPost = validatePostconditions(command.route.ensures, rerunCtx, command.route, extensionRegistry)
      return { passed: rerunPost.success, ctx: rerunCtx }
    },
    config.seed
  )

  const diagnostics: Record<string, unknown> = {
    statusCode: ctx.response.statusCode,
    error: post.error,
  }

  // Include flake report if flaky
  if (flakeReport.isFlaky) {
    diagnostics.flake = {
      isFlaky: true,
      confidence: flakeReport.confidence,
      reruns: flakeReport.reruns.map(r => ({
        seed: r.seed,
        passed: r.passed,
        statusCode: r.ctx?.response.statusCode,
      })),
      suggestion: 'This contract failure is non-deterministic. Check for: time-dependent values, uninitialized state, race conditions, or external service calls.',
    }
  }

  // ... existing violation handling ...

  results.push({
    ok: false,
    name: flakeReport.isFlaky ? `${name} [FLAKY]` : name,
    id: testId,
    diagnostics,
  })
}

5.5 Plugin Integration

No plugin changes needed. Flake is fully automatic.

5.6 Test Plan

File: src/test/flake.test.ts (NEW FILE)

// RED
test('flake: detects time-dependent contract', async () => {
  // Route that returns { timestamp: Date.now() }
  // Contract: a time-sensitive assertion on response_body(this).timestamp that can fail intermittently
  // First run fails (flake detection only triggers on failure); a same-seed rerun may pass
  // Assert flake report shows isFlaky: true
})

// GREEN
// Implement basic FlakeDetector

// RED
test('flake: confirms deterministic failure', async () => {
  // Route that always returns 500
  // Contract: status:200
  // Assert all reruns fail, isFlaky: false
})

// GREEN
// Implement rerun logic

// RED
test('flake: varies seed for reruns', async () => {
  // Route with seeded random response
  // Assert reruns use different seeds
})

// GREEN
// Implement seed variation

// REFACTOR
// Extract shared setup, add tests for confidence scoring

5.7 Parallelization

Flake Phase can be developed in parallel with:

  • Chaos Phase (if runner modifications are coordinated)
  • Mutation Phase (type definitions)
  • Documentation

6. Phase 3: Mutation Testing

6.1 Overview

Mutation testing introduces synthetic bugs into route handlers and verifies that contracts catch them. It answers: "Are my contracts strong enough to detect real bugs?"

Key insight: Mutation is an ASSESSMENT tool, not a test mode. It runs AFTER normal contract testing and reports a score. It requires AST parsing of route handlers.

6.2 Config Type

File: src/types.ts (EXTENDED from Phase 1)

export interface MutationConfig {
  /** Mutators to apply */
  readonly mutators?: MutationType[]
  /** Max mutations per route (default: 5) */
  readonly maxPerRoute?: number
  /** Stop after first surviving mutation per route (default: false) */
  readonly stopOnSurvivor?: boolean
}

export type MutationType = 
  | 'status-code'        // Change status(201) to status(200)
  | 'operator-swap'      // Change > to <, === to !==
  | 'field-delete'       // Remove field from return object
  | 'null-response'      // Return null instead of object
  | 'boolean-flip'       // Flip boolean values
  | 'string-corrupt'     // Corrupt string responses

export interface MutationReport {
  readonly score: number           // 0.0 - 1.0 (percentage killed)
  readonly total: number
  readonly killed: number
  readonly survived: number
  readonly mutants: MutantResult[]
  readonly weakRoutes: string[]    // Routes with <50% kill rate
}

export interface MutantResult {
  readonly route: string
  readonly method: string
  readonly type: MutationType
  readonly description: string
  readonly killed: boolean
  readonly killedBy?: string      // Formula that caught it
  readonly error?: string
}
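
As a worked example of how score and weakRoutes relate to individual mutant results, the sketch below computes both from a small hand-written list. summarize, MutantOutcome, and ScoreSummary are hypothetical names; the formulas follow the field comments above (score is killed divided by total, weak means a kill rate strictly below 50%).

```typescript
// Minimal local copy of the fields needed for scoring.
interface MutantOutcome {
  readonly route: string
  readonly method: string
  readonly killed: boolean
}

interface ScoreSummary {
  readonly score: number
  readonly weakRoutes: string[]
}

// Hypothetical helper: overall kill ratio plus routes whose kill rate is below 50%.
function summarize(mutants: ReadonlyArray<MutantOutcome>): ScoreSummary {
  const perRoute = new Map<string, { killed: number; total: number }>()
  for (const m of mutants) {
    const key = `${m.method} ${m.route}`
    const entry = perRoute.get(key) ?? { killed: 0, total: 0 }
    entry.total += 1
    if (m.killed) entry.killed += 1
    perRoute.set(key, entry)
  }
  const killed = mutants.filter(m => m.killed).length
  const weakRoutes = Array.from(perRoute.entries())
    .filter(([, s]) => s.killed / s.total < 0.5)
    .map(([route]) => route)
  // An empty mutant list scores 1 by convention, since nothing survived.
  return { score: mutants.length === 0 ? 1 : killed / mutants.length, weakRoutes }
}

const summary = summarize([
  { method: 'POST', route: '/users', killed: true },
  { method: 'POST', route: '/users', killed: true },
  { method: 'GET', route: '/users/:id', killed: true },
  { method: 'GET', route: '/users/:id', killed: false },
  { method: 'DELETE', route: '/users/:id', killed: false },
])
// summary.score is 3 / 5 = 0.6. Only 'DELETE /users/:id' is weak:
// GET sits exactly at 50%, and the threshold is strictly below 50%.
```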

6.3 Mutation Engine Implementation

File: src/quality/mutation.ts (NEW FILE)

/**
 * Mutation Testing Engine for APOPHIS
 * 
 * Introduces synthetic bugs into route handlers and verifies
 * that x-ensures contracts catch them.
 * 
 * Architecture: Mutator transforms handler AST, runner executes
 * contracts against mutated handler, original is restored.
 */

import type { FastifyInstance } from 'fastify'
import type { 
  MutationConfig, 
  MutationType, 
  MutationReport, 
  MutantResult,
  TestConfig,
  TestSuite,
  RouteContract 
} from '../types.js'
import { assertTestEnv } from './env-guard.js'
import { runPetitTests } from '../test/petit-runner.js'

export class MutationEngine {
  private config: Required<MutationConfig>
  private mutants: MutantResult[] = []

  constructor(config: MutationConfig = {}) {
    this.config = {
      mutators: config.mutators ?? ['status-code', 'operator-swap', 'field-delete'],
      maxPerRoute: config.maxPerRoute ?? 5,
      stopOnSurvivor: config.stopOnSurvivor ?? false,
    }
  }

  async run(
    fastify: FastifyInstance,
    routes: RouteContract[],
    baseConfig: TestConfig
  ): Promise<MutationReport> {
    assertTestEnv('Mutation testing')
    this.mutants = []

    for (const route of routes) {
      await this.mutateRoute(fastify, route, baseConfig)
    }

    const killed = this.mutants.filter(m => m.killed).length
    const total = this.mutants.length

    return {
      score: total === 0 ? 1.0 : killed / total,
      total,
      killed,
      survived: total - killed,
      mutants: this.mutants,
      weakRoutes: this.findWeakRoutes(),
    }
  }

  private async mutateRoute(
    fastify: FastifyInstance,
    route: RouteContract,
    baseConfig: TestConfig
  ): Promise<void> {
    const handler = this.extractHandler(fastify, route)
    if (!handler) return

    let mutationCount = 0

    for (const mutatorType of this.config.mutators) {
      if (mutationCount >= this.config.maxPerRoute) break

      const mutant = await this.applyMutation(fastify, route, handler, mutatorType, baseConfig)
      if (mutant) {
        this.mutants.push(mutant)
        mutationCount++

        if (this.config.stopOnSurvivor && !mutant.killed) {
          break // Stop after first survivor for this route
        }
      }
    }
  }

  private extractHandler(fastify: FastifyInstance, route: RouteContract): Function | null {
    // Access Fastify's internal route handler.
    // Assumption (unverified): routes are stored under Symbol.for('fastify.routes').
    // This is implementation-dependent and may require reflection
    const routes = (fastify as unknown as { [key: symbol]: unknown })[Symbol.for('fastify.routes')]
    // ... handler extraction logic ...
    return null // Placeholder
  }

  private async applyMutation(
    fastify: FastifyInstance,
    route: RouteContract,
    originalHandler: Function,
    type: MutationType,
    baseConfig: TestConfig
  ): Promise<MutantResult | null> {
    // Create mutated handler
    const mutatedHandler = this.createMutant(originalHandler, type)
    if (!mutatedHandler) return null

    // Replace handler temporarily
    this.replaceHandler(fastify, route, mutatedHandler)

    try {
      // Run contract test on mutated handler
      const suite = await runPetitTests(
        fastify as unknown as import('../types.js').FastifyInjectInstance,
        { ...baseConfig, depth: 'quick' }, // Quick depth for speed
        undefined,
        undefined
      )

      const failed = suite.tests.some(t => !t.ok)

      return {
        route: route.path,
        method: route.method,
        type,
        description: this.describeMutation(type),
        killed: failed,
        killedBy: failed ? this.findKillingFormula(suite) : undefined,
      }
    } finally {
      // Restore original handler
      this.replaceHandler(fastify, route, originalHandler)
    }
  }

  private createMutant(handler: Function, type: MutationType): Function | null {
    // AST-based mutation using meriyah or acorn
    // This is the complex part — parse handler, mutate AST, regenerate
    switch (type) {
      case 'status-code':
        return this.mutateStatusCode(handler)
      case 'operator-swap':
        return this.mutateOperators(handler)
      case 'field-delete':
        return this.mutateFieldDelete(handler)
      // ... etc
    }
    return null
  }

  private mutateStatusCode(handler: Function): Function | null {
    // Parse handler to string
    const source = handler.toString()
    // Find status(code) calls and decrement by 1
    const mutated = source.replace(/status\((\d+)\)/g, (_, code) => `status(${Number(code) - 1})`)
    if (mutated === source) return null
    // Evaluate mutated function
    return new Function('return ' + mutated)()
  }

  private describeMutation(type: MutationType): string {
    const descriptions: Record<MutationType, string> = {
      'status-code': 'Decremented HTTP status code by 1',
      'operator-swap': 'Swapped comparison operator',
      'field-delete': 'Removed field from response object',
      'null-response': 'Replaced response with null',
      'boolean-flip': 'Flipped boolean value',
      'string-corrupt': 'Corrupted string response',
    }
    return descriptions[type]
  }

  private findKillingFormula(suite: TestSuite): string | undefined {
    const failure = suite.tests.find(t => !t.ok && t.diagnostics?.formula)
    return failure?.diagnostics?.formula
  }

  private findWeakRoutes(): string[] {
    const routeScores = new Map<string, { killed: number; total: number }>()
    
    for (const mutant of this.mutants) {
      const key = `${mutant.method} ${mutant.route}`
      const current = routeScores.get(key) ?? { killed: 0, total: 0 }
      current.total++
      if (mutant.killed) current.killed++
      routeScores.set(key, current)
    }

    return Array.from(routeScores.entries())
      .filter(([_, scores]) => scores.killed / scores.total < 0.5)
      .map(([route]) => route)
  }

  private replaceHandler(fastify: FastifyInstance, route: RouteContract, handler: Function): void {
    // Fastify handler replacement logic
    // This requires internal Fastify APIs and is implementation-dependent
  }
}
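
The status-code mutator can be exercised on its own as a pure string transform. The sketch below repeats the regex from mutateStatusCode; mutateStatusCodeSource and the sample handler sources are hypothetical. It does not rebuild the function, which sidesteps one caveat of the engine's approach: a function recreated with new Function is evaluated in the global scope and loses any closure variables the original handler captured.

```typescript
// Hypothetical pure version of the status-code mutation:
// decrement every literal status(N) call by 1.
function mutateStatusCodeSource(source: string): string | null {
  const mutated = source.replace(
    /status\((\d+)\)/g,
    (_match: string, code: string) => `status(${Number(code) - 1})`
  )
  // null signals "nothing to mutate", matching the engine's convention
  return mutated === source ? null : mutated
}

const handlerSource = 'async (request, reply) => reply.status(201).send({ id: "123" })'
const mutant = mutateStatusCodeSource(handlerSource)
// mutant: 'async (request, reply) => reply.status(200).send({ id: "123" })'

// Handlers that set the code another way (for example reply.code(201)) are not matched.
const untouched = mutateStatusCodeSource('(request, reply) => reply.code(201).send()')
// untouched: null
```

The second case suggests why an AST-based mutator is the longer-term goal: a regex only sees the spellings it was written for.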

6.4 Plugin Integration

File: src/plugin/index.ts (NEW METHOD)

Line 130+ (after existing methods):

const buildMutate = (fastify: FastifyInstance, scope: ScopeRegistry, extensionRegistry: ExtensionRegistry) => async (opts: TestConfig & { mutation?: MutationConfig } = {}): Promise<MutationReport> => {
  assertTestEnv('Mutation testing')
  
  const config = {
    depth: opts.depth ?? 'standard',
    scope: opts.scope,
    seed: opts.seed,
  }
  
  const routes = discoverRoutes(fastify as unknown as { routes?: Array<{ method: string; url: string; schema?: Record<string, unknown> }> })
    .filter(r => r.category !== 'utility')
  
  const engine = new MutationEngine(opts.mutation)
  return engine.run(fastify, routes, config)
}

Line 160+ (attach to fastify.apophis):

fastify.decorate('apophis', {
  contract: buildContract(fastify, scopeRegistry, extensionRegistry),
  stateful: buildStateful(fastify, scopeRegistry, cleanupManager, extensionRegistry),
  check: buildCheck(fastify, scopeRegistry, extensionRegistry),
  cleanup: buildCleanup(cleanupManager),
  spec: buildSpec(fastify),
  scope: scopeRegistry,
  mutate: buildMutate(fastify, scopeRegistry, extensionRegistry), // Phase 3
})

6.5 Test Plan

File: src/test/mutation.test.ts (NEW FILE)

// RED
test('mutation: kills status code mutation', async () => {
  // Route returns status 201
  // Contract: status:201
  // Mutate status to 200
  // Assert mutation is killed (contract fails)
})

// GREEN
// Implement basic status-code mutator

// RED
test('mutation: survives weak contract', async () => {
  // Route returns { id: '123' }
  // Contract: status:201 (doesn't check body)
  // Mutate to return {} (id field deleted)
  // Assert mutation survives (contract passes)
})

// GREEN
// Implement field-delete mutator

// RED
test('mutation: reports weak routes', async () => {
  // Route with no body checks
  // Assert weakRoutes includes the route
})

// GREEN
// Implement weak route detection

// REFACTOR
// Extract AST parsing utilities, add operator-swap mutator

6.6 AST Parsing Strategy

Mutation requires parsing route handler source code. Options:

  1. meriyah (recommended): ES2020 parser, lightweight, no dependencies
  2. acorn: Mature, plugin ecosystem
  3. Babel: Heavy but powerful

Implementation (src/quality/ast-mutator.ts):

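import { parseModule } from 'meriyah'
import { generate } from 'astring'

export class AstMutator {
  // Returns the mutated source, or null when the handler cannot be mutated.
  mutate(source: string, type: MutationType): string | null {
    try {
      // Parse as a module so handlers may use import/export syntax
      const ast = parseModule(source)
      // Walk AST, apply the mutation selected by `type`
      // Return regenerated source
      return generate(ast)
    } catch {
      // Handler source could not be parsed; skip this mutation
      return null
    }
  }
}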
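
The AST walk left as a comment in `AstMutator.mutate` could look roughly like the sketch below for the status-code mutator. It assumes ESTree-shaped nodes (`CallExpression`, `MemberExpression`, `Identifier`, `Literal`), which is the format meriyah emits, and it assumes handlers set the status through a call such as `reply.code(201)` or `reply.status(201)`. The names `AstNode`, `walk`, and `mutateStatusCode` are hypothetical, and the sketch works on a plain object so it does not depend on any parser.

```typescript
// Minimal ESTree-shaped node type; real parsers emit richer nodes.
interface AstNode {
  type: string
  [key: string]: unknown
}

const isNode = (value: unknown): value is AstNode =>
  typeof value === 'object' && value !== null && typeof (value as AstNode).type === 'string'

// Visit every node reachable from `node`, depth first.
function walk(node: AstNode, visit: (n: AstNode) => void): void {
  visit(node)
  for (const child of Object.values(node)) {
    if (Array.isArray(child)) {
      for (const item of child) if (isNode(item)) walk(item, visit)
    } else if (isNode(child)) {
      walk(child, visit)
    }
  }
}

// Rewrite the first `<obj>.code(N)` or `<obj>.status(N)` call so that N becomes
// `replacement`. Returns true if a mutation was applied, false otherwise.
function mutateStatusCode(ast: AstNode, replacement: number): boolean {
  let mutated = false
  walk(ast, (n) => {
    if (mutated || n.type !== 'CallExpression') return
    const callee = n.callee
    if (!isNode(callee) || callee.type !== 'MemberExpression') return
    const prop = callee.property
    if (!isNode(prop) || prop.type !== 'Identifier') return
    if (prop.name !== 'code' && prop.name !== 'status') return
    const first: unknown = Array.isArray(n.arguments) ? n.arguments[0] : undefined
    if (!isNode(first) || first.type !== 'Literal') return
    if (typeof first.value !== 'number' || first.value === replacement) return
    first.value = replacement
    // Keep `raw` in sync, since a code generator may print it in preference to `value`.
    first.raw = String(replacement)
    mutated = true
  })
  return mutated
}
```

A `false` return maps naturally onto the `null` result of `mutate`: the handler contains nothing this mutator can change, so no mutant is produced for it.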

6.7 Parallelization

Work on the mutation phase can proceed in parallel with:

  • Documentation
  • AST parser selection and setup
  • CLI reporter design

7. File Structure

src/
  quality/                          # NEW DIRECTORY
    env-guard.ts                    # Environment assertions
    chaos.ts                        # Phase 1: ChaosEngine
    corruption.ts                   # Phase 1: Content-type aware corruption
    flake.ts                        # Phase 2: FlakeDetector
    mutation.ts                     # Phase 3: MutationEngine
    ast-mutator.ts                  # Phase 3: AST parsing
    index.ts                        # Public exports
  test/
    chaos.test.ts                   # Phase 1 tests
    flake.test.ts                   # Phase 2 tests
    mutation.test.ts                # Phase 3 tests
    quality-integration.test.ts     # Cross-feature integration tests
    helpers.ts                      # Shared test utilities (exists)
    petit-runner.ts                 # Chaos + Flake integration
  types.ts                          # Extended TestConfig
  plugin/index.ts                   # New .mutate() method

8. Test Strategy

8.1 Unit Tests (per module)

Chaos Engine (src/test/chaos.test.ts):

  • Probability gate (deterministic with seed)
  • Delay injection timing
  • Error injection overrides
  • Dropout context structure
  • Event recording

Flake Detector (src/test/flake.test.ts):

  • Same-seed rerun detection
  • Seed variation logic
  • Confidence scoring
  • Integration with runner diagnostics

Mutation Engine (src/test/mutation.test.ts):

  • Status code mutation
  • Operator swap mutation
  • Field deletion mutation
  • Handler restoration
  • Score calculation

8.2 Integration Tests

File: src/test/quality-integration.test.ts (NEW)

test('quality: chaos + flake combo', async () => {
  // Run contract with chaos
  // Failing test triggers flake detection
  // Assert flake report includes chaos events
})

test('quality: mutation score reflects contract strength', async () => {
  // Route with strong contracts (checks body + status)
  // Run mutation
  // Assert score > 0.8
  
  // Route with weak contracts (status only)
  // Run mutation
  // Assert score < 0.5
})
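
To make the thresholds in the second test concrete, here is a small worked example of how a kill ratio separates a strong contract from a weak one. It is a simplification: mutants are modelled as transformations of the response rather than of handler source, and contracts as plain predicates rather than APOPHIS contract expressions. All names (`ApiResponse`, `Contract`, `Mutant`, `mutationScore`) are hypothetical.

```typescript
interface ApiResponse {
  status: number
  body: Record<string, unknown>
}

type Contract = (res: ApiResponse) => boolean   // true means the contract passes
type Mutant = (res: ApiResponse) => ApiResponse // a synthetic bug

const original: ApiResponse = { status: 201, body: { id: '123' } }

const mutants: Mutant[] = [
  (res) => ({ ...res, status: 200 }),                      // status-code mutation
  (res) => ({ ...res, body: {} }),                         // field deletion
  (res) => ({ ...res, body: { ...res.body, id: null } }),  // null substitution
]

// Strong: checks status and body. Weak: checks status only.
const strongContract: Contract = (res) =>
  res.status === 201 && typeof res.body.id === 'string'
const weakContract: Contract = (res) => res.status === 201

// A mutant is "killed" when the contract fails on the mutated response.
// Assumption: a route with no mutants scores 0 so that it is surfaced, not hidden.
function mutationScore(contract: Contract, base: ApiResponse, candidates: Mutant[]): number {
  if (candidates.length === 0) return 0
  const killed = candidates.filter((mutate) => !contract(mutate(base))).length
  return killed / candidates.length
}

const strongScore = mutationScore(strongContract, original, mutants) // 3 of 3 killed
const weakScore = mutationScore(weakContract, original, mutants)     // 1 of 3 killed
```

With these three mutants the strong contract scores 1.0 and the weak one about 0.33, which is consistent with the `> 0.8` and `< 0.5` assertions above.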

8.3 Regression Tests

All existing 482 tests must pass unchanged. Quality features are:

  • Opt-in via config (Chaos, Mutation)
  • Automatic in test env only (Flake)
  • Backward compatible: TestConfig gains optional fields

9. Implementation Schedule

v1.2: Chaos (Phase 1) — COMPLETED

  • src/types.ts: Add ChaosConfig
  • src/quality/env-guard.ts: Environment assertions
  • src/quality/chaos.ts: ChaosEngine class
  • src/quality/corruption.ts: Content-type aware corruption
  • src/test/petit-runner.ts: Wrap executeHttp with Chaos
  • src/plugin/index.ts: Pass chaos config through
  • src/test/chaos.test.ts: 21 tests
  • Documentation: docs/chaos.md

v1.3: Flake (Phase 2) — PLANNED

Day 1-2: Core

  • src/quality/flake.ts: FlakeDetector class (2 hours)

Day 3: Runner Integration

  • src/test/petit-runner.ts: Auto-rerun on failure (1.5 hours)

Day 4: Tests

  • src/test/flake.test.ts: Full test suite (2 hours)

Day 5: Refactor

  • Performance optimization (parallel reruns)
  • Documentation: docs/flake.md

v1.3: Mutation (Phase 3) — PLANNED

Day 1-2: AST Infrastructure

  • src/quality/ast-mutator.ts: Parser + mutators (3 hours)
  • src/quality/mutation.ts: MutationEngine (2 hours)

Day 3: Plugin Integration

  • src/plugin/index.ts: .mutate() method (1 hour)

Day 4: Tests

  • src/test/mutation.test.ts: Full test suite (3 hours)

Day 5: Integration + Polish

  • src/test/quality-integration.test.ts (1 hour)
  • Documentation: docs/mutation.md

Parallel Workstreams

  • Protocol Extensions: JWT, Time Control, Stateful predicates (see docs/protocol-extensions-spec.md)
  • Documentation: Can be written alongside implementation
  • CLI reporter: Enhance error output for quality features
  • Performance: Benchmark mutation runs (target: <5s per route)

10. Risk Analysis

10.1 Technical Risks

| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| AST parsing breaks on complex handlers | Medium | High | Start with simple mutators, fallback to regex |
| Fastify internal API changes | Low | High | Use public APIs where possible, version pin |
| Flake reruns slow CI significantly | Medium | Medium | Make reruns configurable, default to 1 |
| Chaos delays cause false timeouts | Medium | Medium | Adjust timeout calculation for delays |
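
The timeout mitigation in the last row could be as simple as widening the request timeout by the largest delay chaos is allowed to inject. The sketch below assumes the `{ minMs, maxMs }` delay shape used in the chaos config and that injected delay counts against the request timeout; `effectiveTimeoutMs` is a hypothetical helper, not an existing APOPHIS function.

```typescript
// Mirrors the `delay` shape used in the chaos config examples.
interface DelayRange {
  minMs: number
  maxMs: number
}

// Hypothetical helper: add the worst-case injected delay to the base timeout,
// so that a delayed but otherwise healthy request is not reported as a timeout.
function effectiveTimeoutMs(baseTimeoutMs: number, delay?: DelayRange): number {
  if (!delay) return baseTimeoutMs
  return baseTimeoutMs + Math.max(0, delay.maxMs)
}
```

Using `maxMs` rather than the sampled delay keeps the timeout independent of the RNG draw, at the cost of a looser bound.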

10.2 Design Decisions

Decision: Chaos as runner mode vs extension

  • Chosen: Runner mode (direct executeHttp wrapper)
  • Rationale: More control over injection timing, no dependency on extension health checks

Decision: Flake auto-run vs config flag

  • Chosen: Auto-run on failures
  • Rationale: Zero-config, prevents shipping flaky contracts

Decision: Mutation as separate method vs config flag

  • Chosen: Separate fastify.apophis.mutate() method
  • Rationale: Different mental model (assessment vs testing), different result type

11. Acceptance Criteria

11.1 Chaos

  • contract({ chaos: { probability: 0.1, delay: { minMs: 100, maxMs: 500 } } }) runs successfully
  • Chaos events appear in test diagnostics
  • Delays, errors, and dropouts all inject correctly
  • Seeded RNG makes chaos reproducible
  • Only runs in NODE_ENV=test
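
The reproducibility criterion depends on every chaos decision being drawn from a seeded generator rather than `Math.random()`. A minimal sketch follows, using the widely known mulberry32 PRNG as a stand-in; the generator APOPHIS actually uses is not shown in this section, and `createChaosGate` and `pickDelayMs` are hypothetical names.

```typescript
// mulberry32: a small 32-bit seeded PRNG that returns values in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0
  return () => {
    state = (state + 0x6d2b79f5) >>> 0
    let t = state
    t = Math.imul(t ^ (t >>> 15), t | 1)
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61)
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

// Hypothetical gate: returns a function that decides, per request,
// whether chaos should be injected. Same seed, same sequence of decisions.
function createChaosGate(probability: number, seed: number): () => boolean {
  const next = mulberry32(seed)
  return () => next() < probability
}

// Hypothetical delay picker: an integer in [minMs, maxMs], inclusive.
function pickDelayMs(next: () => number, minMs: number, maxMs: number): number {
  return minMs + Math.floor(next() * (maxMs - minMs + 1))
}
```

Because the generator never returns 1, a probability of 0 never fires and a probability of 1 always fires, which gives the unit tests two exact boundary cases alongside the same-seed comparison.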

11.2 Flake

  • Failing tests automatically rerun with same seed
  • Seed variations detect non-determinism
  • Flake report includes confidence score
  • No config required (always on in test)
  • Does not affect passing tests
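
Once the reruns have been collected, the classification step can be a pure function. The sketch below separates two cases that are easy to conflate: a same-seed rerun that passes (the same inputs gave a different outcome, so something is non-deterministic) and a varied-seed rerun that passes (the failure depends on the generated inputs and may be a genuine bug). The `FlakeVerdict` shape and the confidence formula are assumptions for illustration, not the planned `FlakeDetector` API.

```typescript
type Outcome = 'pass' | 'fail'

interface FlakeVerdict {
  nonDeterministic: boolean // same seed, different outcome
  seedSensitive: boolean    // fails consistently for this seed, passes for others
  confidence: number        // hypothetical: share of same-seed reruns that passed
}

// `sameSeed` holds the outcomes of reruns using the failing seed;
// `variedSeeds` holds the outcomes of reruns using different seeds.
function classifyReruns(sameSeed: Outcome[], variedSeeds: Outcome[]): FlakeVerdict {
  const sameSeedPasses = sameSeed.filter((o) => o === 'pass').length
  const variedPasses = variedSeeds.filter((o) => o === 'pass').length
  return {
    nonDeterministic: sameSeedPasses > 0,
    seedSensitive: sameSeedPasses === 0 && variedPasses > 0,
    confidence: sameSeed.length === 0 ? 0 : sameSeedPasses / sameSeed.length,
  }
}
```

Keeping classification separate from execution means the runner integration only has to gather outcomes, and the scoring logic can be unit-tested without a Fastify instance.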

11.3 Mutation

  • mutate() returns score 0.0-1.0
  • Status-code mutator works on simple handlers
  • Field-delete mutator removes response fields
  • Original handlers are restored after mutation
  • Weak routes identified correctly
  • Only runs in NODE_ENV=test

12. Documentation Plan

12.1 User Documentation

File: docs/chaos.md

  • Chaos mode overview
  • Config examples
  • Interpreting chaos events
  • Best practices (gradual probability increase)

File: docs/flake.md

  • Flake detection overview
  • Understanding flake reports
  • Fixing flaky contracts

File: docs/mutation.md

  • Mutation testing overview
  • Interpreting mutation scores
  • Writing contracts that catch bugs
  • Weak route remediation

12.2 API Documentation

Update README.md:

  • Add Chaos section with example
  • Add Flake section (auto-run behavior)
  • Add Mutation section with score example

13. Success Metrics

| Metric | Target | Measurement |
|--------|--------|-------------|
| Chaos injection accuracy | >95% | Unit tests |
| Flake detection rate | >90% | Synthetic flaky tests |
| Mutation kill rate (example API) | >80% | Demo project |
| Test suite runtime (with flake) | <2x baseline | Benchmark |
| Code coverage (quality/) | >90% | npx c8 |

14. References

Codebase Citations

  1. TestConfig: src/types.ts:218-223
  2. TestResult/TestSuite: src/types.ts:254-291
  3. runPetitTests: src/test/petit-runner.ts:166-428
  4. executeHttp: src/infrastructure/http-executor.ts:63-145
  5. Extension hooks: src/extension/types.ts:145-174
  6. Extension registry: src/extension/registry.ts:268-294
  7. Plugin entry: src/plugin/index.ts:48-69
  8. Environment: process.env.NODE_ENV (standard Node.js)

External Dependencies (Potential)

  • meriyah: ES2020 parser for AST mutation
  • astring: AST to code generator
  • acorn: Alternative parser (if meriyah insufficient)

15. Approval Checklist

  • Architecture reviewed
  • File structure agreed
  • API surface approved (config vs methods)
  • Test strategy accepted
  • Schedule realistic
  • Risk analysis complete
  • Documentation plan approved

Document Version: 1.0 | Author: APOPHIS Architecture Team | Date: 2026-04-25