# APOPHIS Quality Features Plan v1.2

## Chaos, Flake Detection, and Mutation Testing

**Status**: Chaos Implemented in v1.2 | **Target**: Flake + Mutation in v1.3 | **Priority**: P0

---

## 1. Executive Summary

This plan adds three first-class quality assurance features to APOPHIS:

1. **Chaos Mode** — Inject controlled failures during contract execution to validate resilience guarantees
2. **Flake Detection** — Automatically rerun failing tests with varied seeds to identify non-deterministic contracts
3. **Mutation Testing** — Introduce synthetic bugs into route handlers and verify contracts catch them

These features transform APOPHIS from a contract validator into a comprehensive API quality platform. They leverage APOPHIS's unique architecture: formal contract ASTs, seeded property-based generation, extension hooks, and programmatic route access.

**Chaos Mode is implemented in v1.2.** Flake Detection and Mutation Testing are planned for v1.3 alongside Protocol Extensions (see `docs/protocol-extensions-spec.md`).

---

## 2. Design Principles

### 2.1 Environment Guardrails

**All three features run ONLY in `NODE_ENV=test`.** This is non-negotiable.

Implementation:

```typescript
// src/quality/env-guard.ts (NEW FILE)
export const assertTestEnv = (feature: string): void => {
  if (process.env.NODE_ENV !== 'test') {
    throw new Error(
      `${feature} is only available in test environment. ` +
      `Set NODE_ENV=test to enable quality features.`
    )
  }
}
```

Cited from: `src/extension/registry.ts:26-33` — `handleHookError` pattern for fatal vs warn severity.

### 2.2 Config Flags on `contract()`

Chaos and Mutation are opt-in via `TestConfig`:

```typescript
// src/types.ts:218-223 (CURRENT)
export interface TestConfig {
  readonly depth?: TestDepth
  readonly scope?: string
  readonly seed?: number
  readonly timeout?: number
}
```

Extended to:

```typescript
// src/types.ts:218-230 (PLANNED)
export interface TestConfig {
  readonly depth?: TestDepth
  readonly scope?: string
  readonly seed?: number
  readonly timeout?: number
  readonly chaos?: ChaosConfig        // Phase 1
  readonly mutation?: MutationConfig  // Phase 3
}
```

Flake does NOT appear in config — it is **automatic** on any test failure.

### 2.3 Red-Green-Refactor per Phase

Each phase follows strict RGR:

- **Red**: Write failing test that exercises the feature
- **Green**: Implement minimal code to pass
- **Refactor**: Extract patterns, deduplicate, add types

Parallelization strategy: Phase 1 (Chaos) and test infrastructure can be developed in parallel. Phase 2 (Flake) depends on Phase 1's runner modifications. Phase 3 (Mutation) is independent after Phase 1.

---

## 3. Current Architecture Analysis

### 3.1 Test Runner Entry Points

**File**: `src/plugin/index.ts:48-69`

```typescript
const buildContract = (fastify, scope, extensionRegistry) => async (opts = {}) => {
  const config = {
    depth: opts.depth ?? 'standard',
    scope: opts.scope,
    seed: opts.seed,
  }
  const suite = await runPetitTests(injectInstance, config, scope, extensionRegistry)
  // ... empty discovery check ...
  return suite
}
```

This is the primary entry point for all contract testing. It delegates to `runPetitTests`.

### 3.2 Core Runner Loop

**File**: `src/test/petit-runner.ts:166-362`

Key sections:

- **Line 166**: `runPetitTests()` signature — accepts `TestConfig`, `ScopeRegistry`, `ExtensionRegistry`
- **Line 176-178**: Extension suite start hooks
- **Line 248-261**: Request building with extension hooks
- **Line 263-274**: `runBeforeRequestHooks` — **Chaos injection point**
- **Line 276-278**: `executeHttp()` call — **Chaos delay/error injection point**
- **Line 282-290**: `runAfterRequestHooks`
- **Line 301-306**: `validatePostconditions()` — **Flake rerun trigger point**
- **Line 307-333**: Failure handling — **Flake auto-rerun entry**
- **Line 361-362**: Cache flush

### 3.3 HTTP Execution

**File**: `src/infrastructure/http-executor.ts:63-145`

```typescript
export const executeHttp = async (
  fastify: FastifyInjectInstance,
  route: RouteContract,
  request: RequestStructure,
  previous?: EvalContext,
  timeoutMs?: number
): Promise<EvalContext> => {
  // Line 85: const startTime = Date.now()
  // Line 86: let timedOut = false
  // Line 103-116: Promise.race with timeout
  // Line 117-144: Error handling (including timeout context return)
}
```

**Critical for Chaos**: The timeout mechanism (lines 104-113) uses `Promise.race`. Chaos delays must be injected BEFORE this race, or they must extend the timeout window.
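
The interaction between an injected delay and the `Promise.race` timeout can be reasoned about with plain arithmetic. The sketch below is an illustrative model only, not code from `http-executor.ts`; the names `DelayPlacement` and `isReportedTimedOut` are hypothetical, and it assumes the timeout clock starts when the race starts.

```typescript
// Illustrative model only: these names are hypothetical, not APOPHIS APIs.
type DelayPlacement = 'before-race' | 'inside-race'

/**
 * Predicts whether a request would be reported as timed out.
 * - 'before-race': the chaos delay elapses before the timeout clock starts,
 *   so only the handler time counts against timeoutMs.
 * - 'inside-race': the delay is part of the raced work, so delay plus
 *   handler time counts against timeoutMs.
 */
const isReportedTimedOut = (
  handlerMs: number,
  chaosDelayMs: number,
  timeoutMs: number,
  placement: DelayPlacement
): boolean => {
  const racedMs = placement === 'inside-race' ? chaosDelayMs + handlerMs : handlerMs
  return racedMs > timeoutMs
}

// A 50 ms handler, a 300 ms chaos delay, and a 200 ms timeout:
const delayBeforeRace = isReportedTimedOut(50, 300, 200, 'before-race') // false
const delayInsideRace = isReportedTimedOut(50, 300, 200, 'inside-race') // true
console.log({ delayBeforeRace, delayInsideRace })
```

Under this model, a delay that runs before the race adds latency but never produces a timeout on its own, while a delay inside the race can turn a fast handler into a reported timeout. Which behavior a timeout contract should observe is the design question this section raises.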

### 3.4 Extension Hook System

**File**: `src/extension/types.ts:145-174`

```typescript
readonly onBuildRequest?: (context: RequestBuildContext) => RequestStructure | Promise<RequestStructure | undefined> | undefined
readonly onBeforeRequest?: (context: ExecutionContext) => Promise<void>
readonly onAfterRequest?: (context: ExecutionContext) => Promise<void>
readonly onSuiteStart?: (config: TestConfig) => Promise<Record<string, unknown> | undefined> | Record<string, unknown> | undefined
readonly onSuiteEnd?: (suite: TestSuite, extensionState: Record<string, unknown>) => Promise<void>
```

**File**: `src/extension/registry.ts:268-294`

```typescript
async runBuildRequestHooks(context: RequestBuildContext): Promise<RequestStructure> {
  let request = context.request
  for (const ext of this._buildRequestExts) {
    // Line 278: const result = await withTimeout(Promise.resolve(hook(hookContext)), ...)
    // Line 285: if (result !== undefined) request = result
  }
  return request
}
```

### 3.5 Result Types

**File**: `src/types.ts:254-291`

```typescript
export interface TestResult {
  readonly ok: boolean
  readonly name: string
  readonly id: number
  readonly directive?: string
  readonly diagnostics?: TestDiagnostics
}

export interface TestDiagnostics {
  readonly error?: string
  readonly violation?: ContractViolation
  readonly suggestion?: string
  readonly formula?: string
  readonly counterexample?: string
}

export interface TestSuite {
  readonly tests: ReadonlyArray<TestResult>
  readonly summary: TestSummary
  readonly routes: ReadonlyArray
}
```

---

## 4. Phase 1: Chaos Mode

### 4.1 Overview

Chaos mode injects controlled failures during contract execution:

- **Delays**: Add latency to requests (test timeout contracts)
- **Errors**: Force specific HTTP status codes (test error-handling contracts)
- **Dropouts**: Simulate network failures (test retry/timeout contracts)

**Key insight**: Chaos is NOT an extension. It is a **runner mode** that wraps `executeHttp`.
This distinction matters because:

- Extensions are user-provided; Chaos is built-in
- Extensions run in dependency order; Chaos must run at a specific point in the HTTP lifecycle
- Extensions can be disabled via health checks; Chaos is controlled purely by config

### 4.2 Config Type

**File**: `src/types.ts` (NEW SECTION after line 223)

```typescript
// Line 224+ (NEW)
export interface ChaosConfig {
  /** Probability of injecting any chaos event (0.0 - 1.0) */
  readonly probability: number
  /** Delay injection: add artificial latency */
  readonly delay?: {
    readonly probability: number // Conditional on chaos.probability
    readonly minMs: number
    readonly maxMs: number
  }
  /** Error injection: force HTTP error responses */
  readonly error?: {
    readonly probability: number
    readonly statusCode: number // e.g., 503
    readonly body?: unknown
  }
  /** Dropout injection: simulate network failure */
  readonly dropout?: {
    readonly probability: number
  }
}
```

### 4.3 Chaos Engine Implementation

**File**: `src/quality/chaos.ts` (NEW FILE)

```typescript
/**
 * Chaos Engineering Engine for APOPHIS
 *
 * Injects controlled failures into the HTTP execution pipeline.
 * Uses a seeded RNG for reproducible chaos events.
 *
 * Architecture: Chaos is a runner concern, not an extension.
 * It wraps executeHttp at the call site in petit-runner.ts.
 */
import type { ChaosConfig, EvalContext, RouteContract } from '../types.js'
import type { RequestStructure } from '../domain/request-builder.js'
import { SeededRng } from '../infrastructure/seeded-rng.js'
import { getErrorMessage } from '../infrastructure/security.js'

export interface ChaosEvent {
  readonly type: 'delay' | 'error' | 'dropout'
  readonly injected: boolean
  readonly details?: {
    readonly delayMs?: number
    readonly statusCode?: number
    readonly reason: string
  }
}

export class ChaosEngine {
  private rng: SeededRng
  private events: ChaosEvent[] = []
  private config: ChaosConfig

  constructor(config: ChaosConfig, seed?: number) {
    this.config = config
    // Derive chaos seed from test seed for reproducibility
    this.rng = new SeededRng(seed !== undefined ? seed + 0xCA05 : Date.now())
  }

  /**
   * Wrap executeHttp with chaos injection.
   * Returns the original EvalContext or a modified one based on chaos events.
   */
  async executeWithChaos(
    executeHttp: () => Promise<EvalContext>,
    route: RouteContract,
    request: RequestStructure
  ): Promise<{ ctx: EvalContext; events: ChaosEvent[] }> {
    this.events = []

    // Global probability gate
    if (!this.shouldInject(this.config.probability)) {
      const ctx = await executeHttp()
      return { ctx, events: this.events }
    }

    // Delay injection (before HTTP call)
    if (this.config.delay && this.shouldInject(this.config.delay.probability)) {
      const delayMs = this.randomDelay()
      this.events.push({
        type: 'delay',
        injected: true,
        details: { delayMs, reason: `Chaos delay: ${delayMs}ms` },
      })
      await this.sleep(delayMs)
    }

    // Dropout injection (skip HTTP call entirely)
    if (this.config.dropout && this.shouldInject(this.config.dropout.probability)) {
      this.events.push({
        type: 'dropout',
        injected: true,
        details: { reason: 'Chaos dropout: network failure simulated' },
      })
      return {
        ctx: this.buildDropoutContext(route, request),
        events: this.events,
      }
    }

    // Execute the actual HTTP call
    let ctx: EvalContext
    try {
      ctx = await executeHttp()
    } catch (err) {
      // Error injection: wrap actual errors in chaos context
      if (this.config.error && this.shouldInject(this.config.error.probability)) {
        this.events.push({
          type: 'error',
          injected: true,
          details: {
            statusCode: this.config.error.statusCode,
            reason: `Chaos error: forced ${this.config.error.statusCode}`,
          },
        })
        return {
          ctx: this.buildErrorContext(route, request, this.config.error.statusCode, this.config.error.body),
          events: this.events,
        }
      }
      throw err
    }

    // Error injection: override successful responses
    if (this.config.error && this.shouldInject(this.config.error.probability)) {
      this.events.push({
        type: 'error',
        injected: true,
        details: {
          statusCode: this.config.error.statusCode,
          reason: `Chaos error: overridden ${ctx.response.statusCode} with ${this.config.error.statusCode}`,
        },
      })
      return {
        ctx: this.buildErrorContext(route, request, this.config.error.statusCode, this.config.error.body, ctx),
        events: this.events,
      }
    }

    return { ctx, events: this.events }
  }

  private shouldInject(probability: number): boolean {
    return this.rng.next() < probability
  }

  private randomDelay(): number {
    if (!this.config.delay) return 0
    const min = this.config.delay.minMs
    const max = this.config.delay.maxMs
    return min + Math.floor(this.rng.next() * (max - min + 1))
  }

  private sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms))
  }

  private buildDropoutContext(route: RouteContract, request: RequestStructure): EvalContext {
    return {
      request: {
        body: request.body,
        headers: request.headers,
        query: request.query || {},
        params: {},
        multipart: request.multipart,
      },
      response: {
        body: { error: 'Chaos dropout: network failure simulated' },
        headers: {},
        statusCode: 0,
        responseTime: 0,
      },
      previous: undefined,
      timedOut: false,
      timeoutMs: undefined,
      redirects: [],
    }
  }

  private buildErrorContext(
    route: RouteContract,
    request: RequestStructure,
    statusCode: number,
    body?: unknown,
    originalCtx?: EvalContext
  ): EvalContext {
    return {
      request: {
        body: request.body,
        headers: request.headers,
        query: request.query || {},
        params: originalCtx?.request.params ?? {},
        multipart: request.multipart,
      },
      response: {
        body: body ?? { error: `Chaos error: forced ${statusCode}` },
        headers: originalCtx?.response.headers ?? {},
        statusCode,
        responseTime: originalCtx?.response.responseTime ?? 0,
      },
      previous: originalCtx?.previous,
      timedOut: false,
      timeoutMs: originalCtx?.timeoutMs,
      redirects: originalCtx?.redirects ?? [],
    }
  }
}
```

### 4.4 Runner Integration

**File**: `src/test/petit-runner.ts` (MODIFICATIONS)

**Line 30** (after existing imports):

```typescript
import { ChaosEngine, type ChaosEvent } from '../quality/chaos.js'
import { assertTestEnv } from '../quality/env-guard.js'
```

**Line 276-278** (CURRENT):

```typescript
const timeoutMs = command.route.timeout ?? config.timeout
ctx = await executeHttp(fastify, command.route, request, previousCtx, timeoutMs)
```

**Line 276-290** (MODIFIED):

```typescript
// Before the command loop: build the engine once per suite so the seeded RNG
// advances across requests. A fresh engine per request with the same seed
// would replay the same draws and make the same chaos decision every time.
const chaos = config.chaos ? new ChaosEngine(config.chaos, config.seed) : undefined
if (chaos) assertTestEnv('Chaos mode')

// Inside the loop, replacing the direct executeHttp call:
const timeoutMs = command.route.timeout ?? config.timeout

// Phase 1: Chaos Mode
let chaosEvents: ChaosEvent[] = []
if (chaos) {
  const result = await chaos.executeWithChaos(
    () => executeHttp(fastify, command.route, request, previousCtx, timeoutMs),
    command.route,
    request
  )
  ctx = result.ctx
  chaosEvents = result.events
} else {
  ctx = await executeHttp(fastify, command.route, request, previousCtx, timeoutMs)
}
```

**Line 307-333** (failure handling, ADD chaosEvents to diagnostics):

```typescript
if (!post.success) {
  const diagnostics: Record<string, unknown> = {
    statusCode: ctx.response.statusCode,
    error: post.error,
  }

  // Phase 1: Include chaos events in failure diagnostics
  if (chaosEvents.length > 0) {
    diagnostics.chaosEvents = chaosEvents
  }

  if (post.violation) {
    // ... existing violation handling ...
  }

  results.push({
    ok: false,
    name,
    id: testId,
    diagnostics,
  })
}
```

### 4.5 Plugin Integration

**File**: `src/plugin/index.ts:48-55` (MODIFIED)

```typescript
const buildContract = (fastify, scope, extensionRegistry) => async (opts = {}) => {
  const config = {
    depth: opts.depth ?? 'standard',
    scope: opts.scope,
    seed: opts.seed,
    chaos: opts.chaos, // Phase 1: Pass through chaos config
  }
  // ... rest unchanged ...
}
```

### 4.6 Test Plan

**File**: `src/test/chaos.test.ts` (NEW FILE)

Test cases (red-green-refactor):

```typescript
// RED (fails without implementation)
test('chaos: delay injection adds latency', async () => {
  const fastify = await withApophisApp()
  // Setup route with timeout contract
  // Run with chaos.delay
  // Assert timeout_occurred(this) == true passes
})
// GREEN (minimal implementation)
// Implement ChaosEngine.sleep and delay injection

// RED
test('chaos: error injection forces status code', async () => {
  // Setup route with contract status:200
  // Run with chaos.error: { statusCode: 503 }
  // Assert contract fails with status 503
})
// GREEN
// Implement error injection in ChaosEngine

// RED
test('chaos: dropout returns status 0', async () => {
  // Run with chaos.dropout
  // Assert statusCode is 0
})
// GREEN
// Implement dropout in ChaosEngine

// RED
test('chaos: respects probability gate', async () => {
  // Run 100 times with probability 0
  // Assert no chaos events
})
// GREEN
// Implement probability gate

// REFACTOR
// Extract shared test setup, add property-based tests for probability distribution
```

### 4.7 Parallelization

Chaos Phase can be parallelized with:

- **Test infrastructure setup** (`src/test/chaos.test.ts` scaffold)
- **Documentation** (`docs/chaos.md`)
- **Type definitions** (`src/types.ts` updates)

---

## 5. Phase 2: Flake Detection

### 5.1 Overview

Flake detection automatically reruns failing tests with varied seeds to confirm the failure is deterministic.
It also detects "shrinks that go green" — where a property test finds a counterexample but the simplified version passes.

**Key insight**: Flake is NOT a config flag. It is an **automatic behavior** triggered by:

1. Any test result with `ok: false`
2. Any contract violation

This makes it impossible to accidentally ship a flaky contract.

### 5.2 Flake Rerun Strategy

When a test fails:

1. **Immediate rerun**: Run the exact same command with the same seed
   - If it passes → **FLAKE DETECTED** (likely order-dependent or time-dependent)
   - If it fails → proceed to step 2
2. **Seed variation**: Run with `seed + 1`, `seed + 2`, ..., `seed + N`
   - If any pass → **FLAKE DETECTED** (seed-dependent non-determinism)
   - If all fail → **CONFIRMED FAILURE** (deterministic bug)

**Default**: `N = 3` seed-variation reruns after the same-seed rerun

### 5.3 Flake Engine Implementation

**File**: `src/quality/flake.ts` (NEW FILE)

```typescript
/**
 * Flake Detection Engine for APOPHIS
 *
 * Automatically reruns failing tests with varied seeds to detect
 * non-deterministic contracts. Flake detection is automatic — no config required.
 *
 * Triggered by: any test result with ok: false
 * Strategy: same-seed rerun + seed-variation runs
 */
import type { TestResult, TestConfig, EvalContext, RouteContract } from '../types.js'
import { assertTestEnv } from './env-guard.js'

export interface FlakeReport {
  readonly originalResult: TestResult
  readonly reruns: FlakeRerun[]
  readonly isFlaky: boolean
  readonly confidence: 'high' | 'medium' | 'low'
}

export interface FlakeRerun {
  readonly seed: number
  readonly passed: boolean
  readonly ctx?: EvalContext
}

export interface FlakeOptions {
  /** Number of additional seeds to try (default: 3) */
  readonly seedVariations?: number
  /** Number of same-seed reruns (default: 1) */
  readonly sameSeedReruns?: number
}

const DEFAULT_OPTIONS: Required<FlakeOptions> = {
  seedVariations: 3,
  sameSeedReruns: 1,
}

export class FlakeDetector {
  private options: Required<FlakeOptions>

  constructor(options: FlakeOptions = {}) {
    this.options = { ...DEFAULT_OPTIONS, ...options }
  }

  /**
   * Analyze a failing test by rerunning it.
   * Returns a FlakeReport indicating whether the failure is deterministic.
   */
  async detectFlake(
    originalResult: TestResult,
    rerunFn: (seed?: number) => Promise<{ passed: boolean; ctx?: EvalContext }>,
    originalSeed?: number
  ): Promise<FlakeReport> {
    assertTestEnv('Flake detection')

    const reruns: FlakeRerun[] = []
    let isFlaky = false

    // Same-seed reruns
    for (let i = 0; i < this.options.sameSeedReruns; i++) {
      const result = await rerunFn(originalSeed)
      reruns.push({ seed: originalSeed ?? 0, passed: result.passed, ctx: result.ctx })
      if (result.passed) {
        isFlaky = true
      }
    }

    // Seed-variation reruns
    const baseSeed = originalSeed ?? Date.now()
    for (let i = 1; i <= this.options.seedVariations; i++) {
      const variedSeed = baseSeed + i
      const result = await rerunFn(variedSeed)
      reruns.push({ seed: variedSeed, passed: result.passed, ctx: result.ctx })
      if (result.passed) {
        isFlaky = true
      }
    }

    // Confidence that the original failure is deterministic:
    // no rerun passed -> 'high'; half or more passed -> 'low'; otherwise 'medium'
    const passCount = reruns.filter(r => r.passed).length
    const confidence = passCount === 0 ? 'high' : passCount >= reruns.length / 2 ? 'low' : 'medium'

    return {
      originalResult,
      reruns,
      isFlaky,
      confidence,
    }
  }
}
```

### 5.4 Runner Integration

**File**: `src/test/petit-runner.ts` (MODIFICATIONS)

**Line 307-333** (failure handling, CURRENT):

```typescript
if (!post.success) {
  const diagnostics: Record<string, unknown> = {
    statusCode: ctx.response.statusCode,
    error: post.error,
  }
  // ... violation handling ...
  results.push({ ok: false, name, id: testId, diagnostics })
}
```

**Line 307-360** (MODIFIED with Flake):

```typescript
if (!post.success) {
  // Phase 2: Flake Detection — rerun failing test
  // flakeDetector is a FlakeDetector instance created once before the command loop
  // (runPetitTests is a plain function, so there is no `this` to call it on)
  const flakeReport = await flakeDetector.detectFlake(
    { ok: false, name, id: testId, diagnostics: { error: post.error } },
    async (seed) => {
      // Rebuild request with new seed
      const rerunRequest = buildRequest(
        command.route,
        command.params,
        scopeHeaders,
        state,
        seed !== undefined ? new SeededRng(seed) : undefined
      )
      const rerunCtx = await executeHttp(fastify, command.route, rerunRequest, previousCtx, timeoutMs)
      const rerunPost = validatePostconditions(command.route.ensures, rerunCtx, command.route, extensionRegistry)
      return { passed: rerunPost.success, ctx: rerunCtx }
    },
    config.seed
  )

  const diagnostics: Record<string, unknown> = {
    statusCode: ctx.response.statusCode,
    error: post.error,
  }

  // Include flake report if flaky
  if (flakeReport.isFlaky) {
    diagnostics.flake = {
      isFlaky: true,
      confidence: flakeReport.confidence,
      reruns: flakeReport.reruns.map(r => ({
        seed: r.seed,
        passed: r.passed,
        statusCode: r.ctx?.response.statusCode,
      })),
      suggestion: 'This contract failure is non-deterministic. Check for: time-dependent values, uninitialized state, race conditions, or external service calls.',
    }
  }

  // ... existing violation handling ...

  results.push({
    ok: false,
    name: flakeReport.isFlaky ? `${name} [FLAKY]` : name,
    id: testId,
    diagnostics,
  })
}
```

### 5.5 Plugin Integration

No plugin changes needed. Flake is fully automatic.
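
As a worked example of the classification rules in sections 5.2 and 5.3, the sketch below applies the same pass-count thresholds to a few rerun outcomes. `classifyReruns` is a hypothetical helper written for illustration; it assumes the reruns have already been executed and only their pass/fail flags matter.

```typescript
// Hypothetical helper mirroring the thresholds used by FlakeDetector.detectFlake.
type Confidence = 'high' | 'medium' | 'low'

interface Classification {
  readonly isFlaky: boolean
  readonly confidence: Confidence
}

const classifyReruns = (passed: ReadonlyArray<boolean>): Classification => {
  const passCount = passed.filter(Boolean).length
  // No rerun passed -> 'high'; half or more passed -> 'low'; otherwise 'medium'
  const confidence: Confidence =
    passCount === 0 ? 'high' : passCount >= passed.length / 2 ? 'low' : 'medium'
  return { isFlaky: passCount > 0, confidence }
}

// Default settings: 1 same-seed rerun followed by 3 seed variations.
const confirmed = classifyReruns([false, false, false, false]) // all fail: confirmed failure
const oneGreen = classifyReruns([false, true, false, false]) // one pass: flaky, medium
const mostlyGreen = classifyReruns([true, true, false, false]) // half pass: flaky, low
console.log(confirmed, oneGreen, mostlyGreen)
```

With the defaults, a single passing rerun out of four is enough to mark the test `[FLAKY]`; the confidence value only indicates how strongly the reruns support the original failure.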

### 5.6 Test Plan

**File**: `src/test/flake.test.ts` (NEW FILE)

```typescript
// RED
test('flake: detects time-dependent contract', async () => {
  // Route that returns { timestamp: Date.now() }
  // Contract: response_body(this).timestamp > 0
  // First run passes, rerun might get different timestamp
  // Assert flake report shows isFlaky: true
})
// GREEN
// Implement basic FlakeDetector

// RED
test('flake: confirms deterministic failure', async () => {
  // Route that always returns 500
  // Contract: status:200
  // Assert all reruns fail, isFlaky: false
})
// GREEN
// Implement rerun logic

// RED
test('flake: varies seed for reruns', async () => {
  // Route with seeded random response
  // Assert reruns use different seeds
})
// GREEN
// Implement seed variation

// REFACTOR
// Extract shared setup, add tests for confidence scoring
```

### 5.7 Parallelization

Flake Phase can be developed in parallel with:

- Chaos Phase (if runner modifications are coordinated)
- Mutation Phase (type definitions)
- Documentation

---

## 6. Phase 3: Mutation Testing

### 6.1 Overview

Mutation testing introduces synthetic bugs into route handlers and verifies that contracts catch them. It answers: "Are my contracts strong enough to detect real bugs?"

**Key insight**: Mutation is an ASSESSMENT tool, not a test mode. It runs AFTER normal contract testing and reports a score. It requires AST parsing of route handlers.
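
To make "synthetic bug" and "score" concrete, here is a minimal sketch that mutates handler source text and computes a kill score. It uses a plain regex on the source string as a stand-in for AST-based mutation, and the helper names `mutateStatusCodeSource` and `killScore` are hypothetical, chosen for illustration only.

```typescript
// Hypothetical sketch: a text-level status-code mutator and a kill score.
// A regex stands in for the AST transformation the plan calls for.
const mutateStatusCodeSource = (source: string): string | null => {
  // Decrement every literal status(<code>) by 1, e.g. status(201) -> status(200)
  const mutated = source.replace(
    /status\((\d+)\)/g,
    (_match: string, code: string) => `status(${Number(code) - 1})`
  )
  // null signals that the mutator found nothing to change
  return mutated === source ? null : mutated
}

// Fraction of mutants that at least one contract caught; 1.0 when there are none.
const killScore = (killed: number, total: number): number =>
  total === 0 ? 1.0 : killed / total

const handlerSource = 'async (req, reply) => reply.status(201).send({ id: "123" })'
const mutant = mutateStatusCodeSource(handlerSource)
const untouched = mutateStatusCodeSource('async () => ({ ok: true })')
console.log(mutant, untouched, killScore(3, 4))
```

A contract that asserts `status:201` would fail against the mutant and therefore kill it, while a handler with no literal `status(...)` call yields no mutant for this mutator. Three killed mutants out of four gives a score of 0.75.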

### 6.2 Config Type

**File**: `src/types.ts` (EXTENDED from Phase 1)

```typescript
export interface MutationConfig {
  /** Mutators to apply */
  readonly mutators?: MutationType[]
  /** Max mutations per route (default: 5) */
  readonly maxPerRoute?: number
  /** Stop after first surviving mutation per route (default: false) */
  readonly stopOnSurvivor?: boolean
}

export type MutationType =
  | 'status-code'    // Change status(201) to status(200)
  | 'operator-swap'  // Change > to <, === to !==
  | 'field-delete'   // Remove field from return object
  | 'null-response'  // Return null instead of object
  | 'boolean-flip'   // Flip boolean values
  | 'string-corrupt' // Corrupt string responses

export interface MutationReport {
  readonly score: number // 0.0 - 1.0 (fraction killed)
  readonly total: number
  readonly killed: number
  readonly survived: number
  readonly mutants: MutantResult[]
  readonly weakRoutes: string[] // Routes with <50% kill rate
}

export interface MutantResult {
  readonly route: string
  readonly method: string
  readonly type: MutationType
  readonly description: string
  readonly killed: boolean
  readonly killedBy?: string // Formula that caught it
  readonly error?: string
}
```

### 6.3 Mutation Engine Implementation

**File**: `src/quality/mutation.ts` (NEW FILE)

```typescript
/**
 * Mutation Testing Engine for APOPHIS
 *
 * Introduces synthetic bugs into route handlers and verifies
 * that x-ensures contracts catch them.
 *
 * Architecture: Mutator transforms handler AST, runner executes
 * contracts against mutated handler, original is restored.
 */
import type { FastifyInstance } from 'fastify'
import type { MutationConfig, MutationType, MutationReport, MutantResult, TestConfig, TestSuite, RouteContract } from '../types.js'
import { assertTestEnv } from './env-guard.js'
import { runPetitTests } from '../test/petit-runner.js'

export class MutationEngine {
  private config: Required<MutationConfig>
  private mutants: MutantResult[] = []

  constructor(config: MutationConfig = {}) {
    this.config = {
      mutators: config.mutators ?? ['status-code', 'operator-swap', 'field-delete'],
      maxPerRoute: config.maxPerRoute ?? 5,
      stopOnSurvivor: config.stopOnSurvivor ?? false,
    }
  }

  async run(
    fastify: FastifyInstance,
    routes: RouteContract[],
    baseConfig: TestConfig
  ): Promise<MutationReport> {
    assertTestEnv('Mutation testing')
    this.mutants = []

    for (const route of routes) {
      await this.mutateRoute(fastify, route, baseConfig)
    }

    const killed = this.mutants.filter(m => m.killed).length
    const total = this.mutants.length
    return {
      score: total === 0 ? 1.0 : killed / total,
      total,
      killed,
      survived: total - killed,
      mutants: this.mutants,
      weakRoutes: this.findWeakRoutes(),
    }
  }

  private async mutateRoute(
    fastify: FastifyInstance,
    route: RouteContract,
    baseConfig: TestConfig
  ): Promise<void> {
    const handler = this.extractHandler(fastify, route)
    if (!handler) return

    let mutationCount = 0
    for (const mutatorType of this.config.mutators) {
      if (mutationCount >= this.config.maxPerRoute) break

      const mutant = await this.applyMutation(fastify, route, handler, mutatorType, baseConfig)
      if (mutant) {
        this.mutants.push(mutant)
        mutationCount++
        if (this.config.stopOnSurvivor && !mutant.killed) {
          break // Stop after first survivor for this route
        }
      }
    }
  }

  private extractHandler(fastify: FastifyInstance, route: RouteContract): Function | null {
    // Access Fastify's internal route handler
    // Assumes Fastify stores routes under fastify[Symbol.for('fastify.routes')]
    // This is implementation-dependent and may require reflection
    const routes = (fastify as unknown as { [key: symbol]: unknown })[Symbol.for('fastify.routes')]
    // ... handler extraction logic ...
    return null // Placeholder
  }

  private async applyMutation(
    fastify: FastifyInstance,
    route: RouteContract,
    originalHandler: Function,
    type: MutationType,
    baseConfig: TestConfig
  ): Promise<MutantResult | null> {
    // Create mutated handler
    const mutatedHandler = this.createMutant(originalHandler, type)
    if (!mutatedHandler) return null

    // Replace handler temporarily
    this.replaceHandler(fastify, route, mutatedHandler)
    try {
      // Run contract test on mutated handler
      const suite = await runPetitTests(
        fastify as unknown as import('../types.js').FastifyInjectInstance,
        { ...baseConfig, depth: 'quick' }, // Quick depth for speed
        undefined,
        undefined
      )
      const failed = suite.tests.some(t => !t.ok)
      return {
        route: route.path,
        method: route.method,
        type,
        description: this.describeMutation(type),
        killed: failed,
        killedBy: failed ? this.findKillingFormula(suite) : undefined,
      }
    } finally {
      // Restore original handler
      this.replaceHandler(fastify, route, originalHandler)
    }
  }

  private createMutant(handler: Function, type: MutationType): Function | null {
    // AST-based mutation using meriyah or acorn
    // This is the complex part — parse handler, mutate AST, regenerate
    switch (type) {
      case 'status-code':
        return this.mutateStatusCode(handler)
      case 'operator-swap':
        return this.mutateOperators(handler)
      case 'field-delete':
        return this.mutateFieldDelete(handler)
      // ... etc
    }
    return null
  }

  private mutateStatusCode(handler: Function): Function | null {
    // Parse handler to string
    const source = handler.toString()
    // Find status(code) calls and decrement by 1
    const mutated = source.replace(/status\((\d+)\)/g, (_, code) => `status(${Number(code) - 1})`)
    if (mutated === source) return null
    // Evaluate mutated function. new Function runs in the global scope, so
    // variables the original handler closed over are not available to the mutant.
    return new Function('return ' + mutated)()
  }

  private describeMutation(type: MutationType): string {
    const descriptions: Record<MutationType, string> = {
      'status-code': 'Decremented HTTP status code by 1',
      'operator-swap': 'Swapped comparison operator',
      'field-delete': 'Removed field from response object',
      'null-response': 'Replaced response with null',
      'boolean-flip': 'Flipped boolean value',
      'string-corrupt': 'Corrupted string response',
    }
    return descriptions[type]
  }

  private findKillingFormula(suite: TestSuite): string | undefined {
    const failure = suite.tests.find(t => !t.ok && t.diagnostics?.formula)
    return failure?.diagnostics?.formula
  }

  private findWeakRoutes(): string[] {
    const routeScores = new Map<string, { killed: number; total: number }>()
    for (const mutant of this.mutants) {
      const key = `${mutant.method} ${mutant.route}`
      const current = routeScores.get(key) ?? { killed: 0, total: 0 }
      current.total++
      if (mutant.killed) current.killed++
      routeScores.set(key, current)
    }
    return Array.from(routeScores.entries())
      .filter(([_, scores]) => scores.killed / scores.total < 0.5)
      .map(([route]) => route)
  }

  private replaceHandler(fastify: FastifyInstance, route: RouteContract, handler: Function): void {
    // Fastify handler replacement logic
    // This requires internal Fastify APIs and is implementation-dependent
  }
}
```

### 6.4 Plugin Integration

**File**: `src/plugin/index.ts` (NEW METHOD)

**Line 130+** (after existing methods):

```typescript
const buildMutate = (fastify: FastifyInstance, scope: ScopeRegistry, extensionRegistry: ExtensionRegistry) =>
  async (opts: TestConfig & { mutation?: MutationConfig } = {}): Promise<MutationReport> => {
    assertTestEnv('Mutation testing')
    const config = {
      depth: opts.depth ?? 'standard',
      scope: opts.scope,
      seed: opts.seed,
    }
    const routes = discoverRoutes(fastify as unknown as { routes?: Array<{ method: string; url: string; schema?: Record<string, unknown> }> })
      .filter(r => r.category !== 'utility')
    const engine = new MutationEngine(opts.mutation)
    return engine.run(fastify, routes, config)
  }
```

**Line 160+** (attach to fastify.apophis):

```typescript
fastify.decorate('apophis', {
  contract: buildContract(fastify, scopeRegistry, extensionRegistry),
  stateful: buildStateful(fastify, scopeRegistry, cleanupManager, extensionRegistry),
  check: buildCheck(fastify, scopeRegistry, extensionRegistry),
  cleanup: buildCleanup(cleanupManager),
  spec: buildSpec(fastify),
  scope: scopeRegistry,
  mutate: buildMutate(fastify, scopeRegistry, extensionRegistry), // Phase 3
})
```

### 6.5 Test Plan

**File**: `src/test/mutation.test.ts` (NEW FILE)

```typescript
// RED
test('mutation: kills status code mutation', async () => {
  // Route returns status 201
  // Contract: status:201
  // Mutate status to 200
  // Assert mutation is killed (contract fails)
})
// GREEN
// Implement basic status-code mutator

// RED
test('mutation: survives weak contract', async () => {
  // Route returns { id: '123' }
  // Contract: status:201 (doesn't check body)
  // Mutate to return { id: null }
  // Assert mutation survives (contract passes)
})
// GREEN
// Implement field-delete mutator

// RED
test('mutation: reports weak routes', async () => {
  // Route with no body checks
  // Assert weakRoutes includes the route
})
// GREEN
// Implement weak route detection

// REFACTOR
// Extract AST parsing utilities, add operator-swap mutator
```

### 6.6 AST Parsing Strategy

Mutation requires parsing route handler source code. Options:

1. **meriyah** (recommended): ES2020 parser, lightweight, no dependencies
2. **acorn**: Mature, plugin ecosystem
3. **Babel**: Heavy but powerful

**Implementation** (`src/quality/ast-mutator.ts`):

```typescript
import { parseScript } from 'meriyah'
import { generate } from 'astring'
import type { MutationType } from '../types.js'

export class AstMutator {
  mutate(source: string, type: MutationType): string | null {
    const ast = parseScript(source, { module: true })
    // Walk AST, apply mutation
    // Return regenerated source
    return generate(ast)
  }
}
```

### 6.7 Parallelization

Mutation Phase can be parallelized with:

- Documentation
- AST parser selection and setup
- CLI reporter design

---

## 7. File Structure

```
src/
  quality/                 # NEW DIRECTORY
    env-guard.ts           # Environment assertions
    chaos.ts               # Phase 1: ChaosEngine
    flake.ts               # Phase 2: FlakeDetector
    mutation.ts            # Phase 3: MutationEngine
    ast-mutator.ts         # Phase 3: AST parsing
    index.ts               # Public exports
  test/
    chaos.test.ts          # Phase 1 tests
    flake.test.ts          # Phase 2 tests
    mutation.test.ts       # Phase 3 tests
    helpers.ts             # Shared test utilities (exists)
  types.ts                 # Extended TestConfig
  plugin/index.ts          # New .mutate() method
  test/petit-runner.ts     # Chaos + Flake integration
```

---

## 8. 
Test Strategy

### 8.1 Unit Tests (per module)

**Chaos Engine** (`src/test/chaos.test.ts`):

- Probability gate (deterministic with seed)
- Delay injection timing
- Error injection overrides
- Dropout context structure
- Event recording

**Flake Detector** (`src/test/flake.test.ts`):

- Same-seed rerun detection
- Seed variation logic
- Confidence scoring
- Integration with runner diagnostics

**Mutation Engine** (`src/test/mutation.test.ts`):

- Status code mutation
- Operator swap mutation
- Field deletion mutation
- Handler restoration
- Score calculation

### 8.2 Integration Tests

**File**: `src/test/quality-integration.test.ts` (NEW)

```typescript
test('quality: chaos + flake combo', async () => {
  // Run contract with chaos
  // Failing test triggers flake detection
  // Assert flake report includes chaos events
})

test('quality: mutation score reflects contract strength', async () => {
  // Route with strong contracts (checks body + status)
  // Run mutation
  // Assert score > 0.8

  // Route with weak contracts (status only)
  // Run mutation
  // Assert score < 0.5
})
```

### 8.3 Regression Tests

All existing 482 tests must pass unchanged. Quality features are:

- Opt-in via config (Chaos, Mutation)
- Automatic in test env only (Flake)
- Backward compatible: `TestConfig` gains optional fields

---

## 9. Implementation Schedule

### v1.2: Chaos (Phase 1) — COMPLETED

- `src/types.ts`: Add `ChaosConfig` ✅
- `src/quality/env-guard.ts`: Environment assertions ✅
- `src/quality/chaos.ts`: `ChaosEngine` class ✅
- `src/quality/corruption.ts`: Content-type aware corruption ✅
- `src/test/petit-runner.ts`: Wrap `executeHttp` with Chaos ✅
- `src/plugin/index.ts`: Pass `chaos` config through ✅
- `src/test/chaos.test.ts`: 21 tests ✅
- Documentation: `docs/chaos.md` ✅

### v1.3: Flake (Phase 2) — PLANNED

**Day 1-2: Core**

- `src/quality/flake.ts`: `FlakeDetector` class (2 hours)

**Day 3: Runner Integration**

- `src/test/petit-runner.ts`: Auto-rerun on failure (1.5 hours)

**Day 4: Tests**

- `src/test/flake.test.ts`: Full test suite (2 hours)

**Day 5: Refactor**

- Performance optimization (parallel reruns)
- Documentation: `docs/flake.md`

### v1.3: Mutation (Phase 3) — PLANNED

**Day 1-2: AST Infrastructure**

- `src/quality/ast-mutator.ts`: Parser + mutators (3 hours)
- `src/quality/mutation.ts`: `MutationEngine` (2 hours)

**Day 3: Plugin Integration**

- `src/plugin/index.ts`: `.mutate()` method (1 hour)

**Day 4: Tests**

- `src/test/mutation.test.ts`: Full test suite (3 hours)

**Day 5: Integration + Polish**

- `src/test/quality-integration.test.ts` (1 hour)
- Documentation: `docs/mutation.md`

### Parallel Workstreams

- **Protocol Extensions**: JWT, Time Control, Stateful predicates (see `docs/protocol-extensions-spec.md`)
- **Documentation**: Can be written alongside implementation
- **CLI reporter**: Enhance error output for quality features
- **Performance**: Benchmark mutation runs (target: <5s per route)

---

## 10. Risk Analysis

### 10.1 Technical Risks

| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| AST parsing breaks on complex handlers | Medium | High | Start with simple mutators, fallback to regex |
| Fastify internal API changes | Low | High | Use public APIs where possible, version pin |
| Flake reruns slow CI significantly | Medium | Medium | Keep rerun counts adjustable via `FlakeOptions` (defaults: 1 same-seed rerun, 3 seed variations) |
| Chaos delays cause false timeouts | Medium | Medium | Adjust timeout calculation for delays |

### 10.2 Design Decisions

**Decision**: Chaos as runner mode vs extension

- **Chosen**: Runner mode (direct `executeHttp` wrapper)
- **Rationale**: More control over injection timing, no dependency on extension health checks

**Decision**: Flake auto-run vs config flag

- **Chosen**: Auto-run on failures
- **Rationale**: Zero-config, prevents shipping flaky contracts

**Decision**: Mutation as separate method vs config flag

- **Chosen**: Separate `fastify.apophis.mutate()` method
- **Rationale**: Different mental model (assessment vs testing), different result type

---

## 11. Acceptance Criteria

### 11.1 Chaos

- [ ] `contract({ chaos: { probability: 0.1, delay: { probability: 0.5, minMs: 100, maxMs: 500 } } })` runs successfully
- [ ] Chaos events appear in test diagnostics
- [ ] Delays, errors, and dropouts all inject correctly
- [ ] Seeded RNG makes chaos reproducible
- [ ] Only runs in `NODE_ENV=test`

### 11.2 Flake

- [ ] Failing tests automatically rerun with same seed
- [ ] Seed variations detect non-determinism
- [ ] Flake report includes confidence score
- [ ] No config required (always on in test)
- [ ] Does not affect passing tests

### 11.3 Mutation

- [ ] `mutate()` returns score 0.0-1.0
- [ ] Status-code mutator works on simple handlers
- [ ] Field-delete mutator removes response fields
- [ ] Original handlers are restored after mutation
- [ ] Weak routes identified correctly
- [ ] Only runs in `NODE_ENV=test`

---

## 12. 
Documentation Plan ### 12.1 User Documentation **File**: `docs/chaos.md` - Chaos mode overview - Config examples - Interpreting chaos events - Best practices (gradual probability increase) **File**: `docs/flake.md` - Flake detection overview - Understanding flake reports - Fixing flaky contracts **File**: `docs/mutation.md` - Mutation testing overview - Interpreting mutation scores - Writing contracts that catch bugs - Weak route remediation ### 12.2 API Documentation Update `README.md`: - Add Chaos section with example - Add Flake section (auto-run behavior) - Add Mutation section with score example --- ## 13. Success Metrics | Metric | Target | Measurement | |--------|--------|-------------| | Chaos injection accuracy | >95% | Unit tests | | Flake detection rate | >90% | Synthetic flaky tests | | Mutation kill rate (example API) | >80% | Demo project | | Test suite runtime (with flake) | <2x baseline | Benchmark | | Code coverage (quality/) | >90% | npx c8 | --- ## 14. References ### Codebase Citations 1. **TestConfig**: `src/types.ts:218-223` 2. **TestResult/TestSuite**: `src/types.ts:254-291` 3. **runPetitTests**: `src/test/petit-runner.ts:166-428` 4. **executeHttp**: `src/infrastructure/http-executor.ts:63-145` 5. **Extension hooks**: `src/extension/types.ts:145-174` 6. **Extension registry**: `src/extension/registry.ts:268-294` 7. **Plugin entry**: `src/plugin/index.ts:48-69` 8. **Environment**: `process.env.NODE_ENV` (standard Node.js) ### External Dependencies (Potential) - `meriyah`: ES2020 parser for AST mutation - `astring`: AST to code generator - `acorn`: Alternative parser (if meriyah insufficient) --- ## 15. Approval Checklist - [ ] Architecture reviewed - [ ] File structure agreed - [ ] API surface approved (config vs methods) - [ ] Test strategy accepted - [ ] Schedule realistic - [ ] Risk analysis complete - [ ] Documentation plan approved --- *Document Version: 1.0* *Author: APOPHIS Architecture Team* *Date: 2026-04-25*