# Arbiter → Apophis Feedback Report

**Date:** 2026-04-27
**Reporter:** Arbiter Engineering Team
**Context:** Integration of Apophis v2.2 into Arbiter Platform for behavioral contract testing

---

## Executive Summary

Apophis provides genuinely valuable capabilities for behavioral contract testing that go beyond traditional unit/integration tests. The schema-to-contract inference, cross-operation verification, and chaos testing infrastructure are compelling. However, we encountered 3 bugs in core infrastructure and several design friction points that should be addressed for wider adoption.

**Overall Assessment:** Strong value proposition for teams willing to invest in schema-driven testing. Needs polish on edge cases and configurability.

---

## Part 1: How Chaos Injection Would Help Arbiter

### Current State

Arbiter is a multi-tenant SaaS platform with:

- 500+ API endpoints across 15 route families
- Billing, graph storage, auth, sessions, webhooks, etc.
- Mock Stripe integration for payment processing
- In-memory and persistent storage backends
- Complex middleware chain: auth → tenant boundary → permissions → preflight → handler

### Where Chaos Testing Adds Value

**1. Middleware Resilience Verification**

Our middleware chain has implicit dependencies:

```
Transport → AuthN → Scope → AuthZ → Challenge → Preflight → Handler
```

Chaos testing would verify:

- What happens when `preflight()` times out? Does the handler still execute?
- If auth middleware fails with 503, do we get proper retry headers?
- Does a slow tenant boundary check cascade to response timeouts?

**Concrete scenario:** If the billing preflight gate (budget check) is slow, does the subscription creation handler wait or fail? Our contracts say `response_time < 2000ms`; chaos would tell us if that's actually enforced.

**2. Mock Service Degradation**

We use `MockStripeService` for payment processing.
In production, Stripe can:

- Return 429 (rate limit)
- Time out on `paymentIntents.create`
- Return network errors

Chaos testing would inject:

```
if chaos:stripe-timeout then response_code == 503
if chaos:stripe-rate-limit then retry-after header != null
```

This validates our fallback logic, which is currently untested because mocks always succeed.

**3. Resource Leak Detection**

Our `BillingApplicationService` uses in-memory Maps. Chaos scenarios:

- Create 1000 plans, delete 500, verify GET on deleted returns 404
- Cancel subscriptions mid-renewal cycle
- Concurrent PATCH operations on same plan

Cross-operation contracts catch this for single requests, but chaos tests concurrent state corruption.

**4. Entitlement Boundary Testing**

We have credit-based preflight gates. Chaos could:

- Exhaust credits mid-test
- Verify 402 (Payment Required) is returned
- Ensure no partial mutations occur when budget is depleted

This is business-critical: we cannot bill customers for operations that fail.

**5. Auth Token Expiry**

JWT tokens expire.
Chaos could:

- Expire tokens between POST and follow-up GET
- Verify 401 with proper `WWW-Authenticate` header
- Test refresh token flow under load

### Proposed Chaos Scenarios for Arbiter

```yaml
billing_chaos:
  - name: stripe-timeout
    target: POST /billing/invoices/:id/pay
    inject: { stripe_delay_ms: 5000 }
    expected: { status: 503, retry_after: "> 0" }
  - name: storage-corruption
    target: DELETE /billing/plans/:id
    inject: { skip_deletion: true }
    expected: { status: 200, follow_up_get: 404 }
  - name: rate-limit
    target: POST /billing/plans
    inject: { rate_limit: 10 }
    expected: { status: 429, x_retry_after: "> 0" }
  - name: auth-expiry
    target: PATCH /billing/plans/:id
    inject: { expire_token_after_ms: 100 }
    expected: { status: 401, www_authenticate: "Bearer" }
```

---

## Part 2: Bugs Found

### Bug 1: Scope Registry Ignores Configured Default Scope

**Severity:** High (breaks auth in cross-operation tests)
**File:** `dist/infrastructure/scope-registry.js`
**Line:** 60, 76-77

**Problem:**

```javascript
const scope = scopeName !== null ? this.scopes.get(scopeName) : undefined;
const base = scope ?? this.defaultScope; // Always uses empty DEFAULT_SCOPE
```

When `getHeaders(null)` is called, it uses `this.defaultScope`, which is initialized to `{ headers: {}, metadata: {} }` on line 60, ignoring any "default" scope passed in the constructor.

**Impact:** Cross-operation requests (e.g., `response_code(GET /users/{id})`) don't inherit auth headers from the configured scope, causing 401 failures on protected routes.

**Fix:**

```javascript
const base = scope ?? this.scopes.get('default') ?? this.defaultScope;
```

**Reproduction:**

```javascript
await app.register(apophis, {
  scopes: {
    default: { headers: { 'authorization': 'Bearer token' } }
  }
});
// Cross-operation GET /users/123 gets 401 because auth header is not passed
```

### Bug 2: Contract Builder Drops Routes Option

**Severity:** High (route filtering doesn't work)
**File:** `dist/plugin/contract-builder.js`
**Line:** 8-15

**Problem:**

```javascript
const config = {
  depth: opts.depth ?? 'standard',
  scope: opts.scope,
  seed: opts.seed,
  timeout: opts.timeout,
  chaos: opts.chaos,
  // Missing: routes: opts.routes
};
```

The `routes` option is documented but never passed to `runPetitTests`, causing all routes to be tested regardless of the `routes` filter.

**Impact:** Tests run against all 500+ routes instead of the 4 specified, making debugging impossible and inflating CI times.

**Fix:**

```javascript
const config = {
  depth: opts.depth ?? 'standard',
  scope: opts.scope,
  seed: opts.seed,
  timeout: opts.timeout,
  chaos: opts.chaos,
  routes: opts.routes, // Add this
};
```

**Reproduction:**

```javascript
await app.apophis.contract({
  routes: ['POST /billing/plans'] // Tests ALL routes instead
});
```

### Bug 3: Invariant Checking Not Configurable

**Severity:** Medium (false failures for non-hierarchical APIs)
**File:** `dist/test/petit-runner.js`
**Line:** 386-398

**Problem:** Built-in invariants (`no-orphaned-resources`, `parent-reference-integrity`, `resource-integrity`) run unconditionally for all routes. These assume parent-child resource hierarchies (e.g., `/workspaces/:id/projects/:id`).

**Impact:** For flat resource models (like our billing plans), routes with `x-category: 'constructor'` trigger invariant failures because resources don't have `parentType`/`parentId`.

**Workaround:** We set `x-category: 'observer'` to avoid resource tracking, but this loses the semantic meaning of the route.
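
For reference, the workaround amounts to relabeling the route in its schema. The following is a minimal sketch, assuming `x-category` sits at the top level of the Fastify route schema; the route path, response shape, handler, and the `createPlanRoute` name are illustrative and not taken from Arbiter's code.

```javascript
// Hypothetical route options for a flat (non-hierarchical) resource.
// Assumption: Apophis reads `x-category` from the top level of the route schema.
const createPlanRoute = {
  method: 'POST',
  url: '/billing/plans',
  schema: {
    // Semantically this route is a 'constructor', but that label triggers the
    // built-in parent/child invariants, so it is marked 'observer' instead.
    'x-category': 'observer',
    response: {
      201: { type: 'object', properties: { data: { type: 'object' } } },
    },
  },
  // Placeholder handler; the real one would create the plan.
  handler: async (request, reply) => reply.code(201).send({ data: {} }),
};
```

The cost is visible in the comment: the label no longer describes what the route does, which is why a configuration switch is preferable to relabeling.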

**Suggested Fix:**

```javascript
// In config
invariants: ['resource-integrity'] // Opt-in per test
// Or
invariants: false // Disable all
// Or per-route
schema: { 'x-invariants': ['custom-only'] }
```

---

## Part 3: Design Feedback

### 1. Schema Inference is Too Aggressive

**Issue:** `const` values in JSON Schema generate unconditional contracts.

Example:

```json
{
  "response": {
    "200": {
      "properties": {
        "fragment_type": { "const": "Action" }
      }
    }
  }
}
```

Generates: `response_body(this).fragment_type == "Action"` (checked for ALL responses)

This fails when the route returns 404 with `fragment_type: "Error"`.

**Suggestion:** Infer conditional contracts based on status code:

```
if status:200 then response_body(this).fragment_type == "Action" else true
```

Or add an option to disable inference: `inferContracts: false`.

### 2. Cross-Operation Headers Not Documented

The `scope.headers` behavior for cross-operation requests is not documented. We had to read source code to discover that:

- `createOperationResolver(fastify, request.headers)` passes request headers
- But `request.headers` comes from `scope.getHeaders(null)`
- Which had bug #1 above

**Suggestion:** Document that cross-operation requests inherit the scope headers of the original request.

### 3. Missing 400 Response Handling

When Fastify schema validation fails (e.g., enum mismatch), it returns 400 with a validation error object. Apophis treats this as a contract failure unless:

- The schema has a 400 response documented
- The contract explicitly accepts 400

Most developers won't document 400 responses. Apophis should either:

- Auto-generate 400 contracts from validation rules
- Or provide a global 400 handler pattern

### 4. HEAD Routes Cause Noise

Fastify auto-generates HEAD routes for every GET. These have no response body, causing `response_body(this).id != null` failures.

**Suggestion:** Auto-skip HEAD routes in contract tests, or provide `skipMethods: ['HEAD']` option.

### 5. Error Suggestions Need Context

When a contract fails, the error is:

```
Field 'fragment_type' does not match expected value 'Error'.
```

But it doesn't say:

- What the actual status code was
- What the actual response body was
- Which route generated the request

**Suggestion:** Include actual vs expected in violation objects.

---

## Part 4: What We Love

### 1. Cross-Operation Contracts

```
if status:201 then response_code(GET /billing/plans/{response_body(this).data.plan_id}) == 200 else true
```

This is genuinely hard to test manually. Apophis makes it declarative and automatic.

### 2. Property-Based Generation

Fast-check found edge cases we missed:

- Empty string `name` (schema allowed it, service rejected it)
- Invalid `billing_interval` values
- Missing required fields

### 3. Schema as Single Source of Truth

Once schemas are correct, contracts are free. The `x-ensures` array supplements rather than replaces schema validation.

### 4. Fast Feedback Loop

Contract tests run in ~1.5s for 4 routes. Much faster than spinning up a full test environment.

---

## Part 5: Feature Requests

### 1. Hypermedia Contract Support

Arbiter returns LDF (Linked Data Fragment) responses with `controls` and `actions`. We'd love to verify:

```
if status:200 then response_body(this).controls.self == request_url(this) else true
if status:200 then response_body(this).actions.create.method == "POST" else true
if status:200 then response_body(this).actions.update.target == "/billing/plans/{response_body(this).data.id}" else true
```

Currently we have to write these manually. Could Apophis infer hypermedia controls from route registration?

### 2. Conditional Schema Contracts

Instead of removing `const` from schemas, allow:

```json
{
  "response": {
    "200": {
      "properties": {
        "fragment_type": {
          "const": "Action",
          "x-apophis-conditional": "status:200"
        }
      }
    }
  }
}
```

This preserves schema expressiveness while generating correct contracts.

### 3. Middleware Contract Verification

Our middleware chain is critical. We'd like to verify:

```
if request_headers(this).authorization == null then status:401 else true
if request_headers(this).x-tenant-id == null then status:400 else true
```

Apophis already supports `request_headers`; making this a first-class feature (e.g., `x-requires`) would be powerful.

### 4. State Cleanup Hooks

After destructive tests (DELETE), we need to clean up:

```javascript
await app.apophis.contract({
  routes: ['DELETE /billing/plans/:id'],
  cleanup: async (state) => {
    // Remove created plans from database
    await db.plans.deleteMany({ id: { $in: state.createdPlans } });
  }
});
```

This would enable stateful testing without polluting the test environment.

### 5. Contract Coverage Report

After running tests, we'd like:

```
Contract Coverage:
  POST /billing/plans:
    - 201 response: ✓ tested (42 cases)
    - 400 response: ✓ tested (8 cases)
    - 503 response: ✗ not tested
    - Cross-op GET: ✓ tested (42 cases)
```

This helps identify gaps in contract coverage.

---

## Conclusion

Apophis is a powerful tool that fills a gap in API testing: behavioral contracts and chaos testing. The core concepts are solid, but the implementation needs hardening for production use:

- **Must-fix:** Bugs #1 and #2 (scope registry, route filtering)
- **Should-fix:** Bug #3 (configurable invariants), inference aggressiveness
- **Nice-to-have:** Hypermedia support, middleware contracts, coverage reports

We're committed to using Apophis for Arbiter's contract testing and will contribute fixes upstream. The value of cross-operation verification alone justifies the investment.

---

**Contact:** Arbiter Engineering Team
**Repository:** https://github.com/anomalyco/apophis (we'll open issues for each bug)

# Critical Feedback: Why Current Chaos Injection is Insufficient for Production APIs

**To:** Apophis Engineering Team
**From:** Arbiter Platform Engineering
**Date:** 2026-04-27
**Context:** Production SaaS platform with 500+ endpoints, Stripe integration, complex middleware chains

---

## The Core Problem

Current chaos injection operates exclusively at the **HTTP transport layer** (`executeHttp()` wrapper). This tests:

- ✅ Response schemas under forced errors
- ✅ Timeout contracts with artificial delays
- ✅ Response validation with corrupted bodies

But **production APIs fail at the dependency layer**, not the transport layer:

- Stripe API returns 429 rate limit
- Database connection pool exhausted
- Redis cache timeout
- Third-party webhook delivery fails
- Message queue backlog

**Current chaos cannot simulate these.** It can force a 503 response, but it cannot simulate "Stripe returned 429, so we need to propagate retry-after header" because the handler never sees the Stripe error.

---

## Specific Pain Points

### 1. Error Injection is Backwards

**Current behavior:**

```
Handler runs → creates side effects → response overridden to 503
```

**What we need:**

```
Handler runs → Stripe call fails with 429 → handler catches error → returns 503 with retry-after
```

The current approach tests "what does our 503 response look like" but not "does our handler correctly handle Stripe errors." These are different:

- Current: Tests schema compliance for hardcoded error responses
- Needed: Tests business logic for dependency failures

**Impact:** We have 503 contracts that pass, but our handler might not actually set the retry-after header when Stripe fails. The contract gives false confidence.

### 2. Chaos Events Are Invisible

When chaos injects, the test result shows:

```
POST /billing/plans (#1): FAIL
Error: Contract violation: if status:503 then response_body(this).data.error != null else true
```

But there's no indication that:

- Chaos was the cause (not a real bug)
- What type of chaos was injected (error? corruption? delay?)
- What the original response was before override

**Impact:** Debugging chaos failures is impossible. We can't tell if our contract is wrong or if chaos mutated the response unexpectedly.

### 3. Resilience Verification is Dangerous for Stateful APIs

When `resilience: { enabled: true }`, Apophis retries the same request up to `maxRetries` times. For `POST /billing/plans`:

- Attempt 1: Creates plan A → gets 503 → retries
- Attempt 2: Creates plan B → gets 503 → retries
- Attempt 3: Creates plan C → gets 503 → retries
- Attempt 4: Creates plan D → succeeds

**Result: 4 plans created, 1 expected.** This pollutes state and makes follow-up tests (GET, PATCH, DELETE) behave unpredictably.

**Impact:** Can't use resilience testing on stateful routes without idempotency. Most real APIs are stateful.

### 4. Dropout Returns Status Code 0

Network failures in production don't return status code 0. They:

- Time out (status undefined, error "ETIMEDOUT")
- Reset connection (error "ECONNRESET")
- Return 503 from load balancer

Status 0 is a browser-specific artifact. Node.js HTTP clients don't produce status 0.

**Impact:** Contracts can't match status 0.
We have to either:

- Add `status:0` to all contracts (meaningless)
- Or ignore dropout failures (makes dropout useless)

---

## What Would Make Chaos Useful for Arbiter

### Option A: Outbound Request Contracts (Preferred)

Apophis intercepts outbound HTTP requests from the handler:

```javascript
// In Apophis config
chaos: {
  outbound: {
    'api.stripe.com': {
      delay: { probability: 0.1, minMs: 1000, maxMs: 5000 },
      error: {
        probability: 0.05,
        responses: [
          { statusCode: 429, headers: { 'retry-after': '60' } },
          { statusCode: 503, body: { error: 'stripe_unavailable' } }
        ]
      }
    }
  }
}
```

**Benefits:**

- Handler sees real dependency failures
- Tests actual error handling logic
- Side effects only occur when handler succeeds
- No state pollution from retries

### Option B: Service Method Wrapping

Apophis wraps methods on decorated services:

```javascript
// Fastify decorator
app.decorate('stripe', new StripeService());

// Apophis wraps it
apophis.chaos.wrap(app.stripe, {
  'paymentIntents.create': {
    delay: { probability: 0.1, ms: 5000 },
    error: { probability: 0.05, throws: new StripeTimeoutError() }
  }
});
```

**Benefits:**

- Works with any service pattern (HTTP, DB, queue)
- Tests business logic directly
- Minimal changes to existing code

### Option C: Event-Driven Chaos

For async architectures:

```javascript
chaos: {
  events: {
    'webhook.received': {
      drop: { probability: 0.1 }, // Simulate webhook loss
      delay: { probability: 0.2, ms: 30000 } // Simulate queue delay
    }
  }
}
```

---

## Recommended Priority Order

### P0 (Critical): Fix Event Reporting

Every chaos injection should be visible:

```javascript
// In test results
test.diagnostics.chaos = {
  injected: true,
  type: 'error',
  details: {
    statusCode: 503,
    originalStatusCode: 201,
    strategy: 'override'
  }
}
```

Without this, chaos failures are indistinguishable from real bugs.

### P1 (High): Add Dependency-Aware Chaos

Implement outbound request interception or service wrapping.
Current HTTP-layer chaos is too superficial for production APIs.

### P2 (Medium): Fix Dropout Semantics

Return proper status codes:

- `504 Gateway Timeout` for timeouts
- `503 Service Unavailable` for network failures
- Or make it configurable: `dropout: { statusCode: 503 }`

### P3 (Low): Stateful Retry Safety

Either:

- Make retries use unique IDs (prevent duplicate creation)
- Or document that resilience requires idempotent handlers
- Or skip resilience for non-idempotent routes

---

## What We're Doing Instead

Since current chaos doesn't serve our needs, we're writing application-layer failure tests:

```javascript
test('Stripe rate limit handling', async () => {
  // Mock Stripe to return 429
  app.stripe.paymentIntents.create = async () => {
    const err = new Error('Rate limit exceeded');
    err.statusCode = 429;
    err.headers = { 'retry-after': '60' };
    throw err;
  };

  const res = await payInvoice({ invoiceId: 'test' });

  assert.strictEqual(res.statusCode, 429);
  assert.strictEqual(res.json().data.error, 'stripe_rate_limit');
  assert.strictEqual(res.headers['retry-after'], '60');
});
```

This tests what we actually need: **handler behavior when dependencies fail.**

---

## Conclusion

Apophis chaos is a good start for HTTP-layer resilience testing, but it's insufficient for production APIs with external dependencies. The framework needs to evolve from "HTTP response mutator" to "dependency failure simulator" to be truly valuable.

We want Apophis to succeed. The schema-driven contract approach is innovative and valuable. But chaos testing needs to be dependency-aware to be useful for real-world APIs.

**Happy to collaborate** on designing the outbound interception API or service wrapping approach.
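
As a starting point for that discussion, here is a minimal sketch of what service-method wrapping could look like. `wrapWithChaos`, its config shape, and the injectable `random` parameter are hypothetical and not part of Apophis; the `delay` and `error.throws` fields mirror the Option B config above.

```javascript
// Hypothetical helper: wraps methods on a service object, addressed by dotted
// path (e.g. 'paymentIntents.create'), so that a call may be delayed or may
// fail before the real method runs.
function wrapWithChaos(service, config, random = Math.random) {
  for (const [path, rule] of Object.entries(config)) {
    const keys = path.split('.');
    const methodName = keys.pop();
    // Walk to the object that owns the method.
    const owner = keys.reduce((obj, key) => obj[key], service);
    const original = owner[methodName].bind(owner);

    owner[methodName] = async (...args) => {
      if (rule.delay && random() < rule.delay.probability) {
        await new Promise((resolve) => setTimeout(resolve, rule.delay.ms));
      }
      if (rule.error && random() < rule.error.probability) {
        // Fail before calling through, so the dependency sees no side effects.
        throw rule.error.throws;
      }
      return original(...args);
    };
  }
  return service;
}
```

Because the error is thrown from inside the dependency call, the handler's own `catch` logic runs, which is the behavior the HTTP-layer override cannot exercise. Passing `random` explicitly keeps a run reproducible when it is driven by a seeded generator.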

---

# Appendix: Concrete Proposals for Apophis Improvements

## Proposal 1: Conditional Schema Inference

Instead of removing `const` from schemas, generate conditional contracts:

```typescript
// Current behavior (WRONG):
// Schema: { properties: { fragment_type: { const: "Action" } } }
// Generates: response_body(this).fragment_type == "Action"
// Applies to ALL responses

// Proposed behavior:
// Generates: if status:200 then response_body(this).fragment_type == "Action" else true
```

Implementation:

```typescript
function inferContractsFromResponseSchema(responseSchema, statusCode) {
  const formulas = [];
  // ... existing inference logic ...

  // Wrap in conditional if status code is 2xx
  if (statusCode >= 200 && statusCode < 300) {
    return formulas.map(f => `if status:${statusCode} then ${f} else true`);
  }
  return formulas;
}
```

## Proposal 2: Configurable Invariants

```typescript
// In test config
const result = await app.apophis.contract({
  invariants: ['resource-integrity'], // Opt-in specific invariants
  // Or
  invariants: false, // Disable all
});

// Or per-route in schema
schema: {
  'x-invariants': ['resource-integrity'],
  'x-invariants-exclude': ['no-orphaned-resources']
}
```

## Proposal 3: Outbound Request Interception

```typescript
// Apophis provides fetch/http client wrapper
const stripeClient = apophis.createChaosAwareClient({
  name: 'stripe',
  baseURL: 'https://api.stripe.com',
  defaults: {
    headers: { 'Authorization': `Bearer ${process.env.STRIPE_KEY}` }
  }
});

// In chaos config
chaos: {
  outbound: {
    'stripe': {
      delay: { probability: 0.1, minMs: 1000, maxMs: 5000 },
      error: {
        probability: 0.05,
        responses: [
          { statusCode: 429, headers: { 'retry-after': '60' } },
          { statusCode: 503, body: { error: 'stripe_unavailable' } }
        ]
      }
    }
  }
}
```

Implementation approach:

- Monkey-patch `fetch` or `http.request` at module level
- Track outbound requests by hostname
- Match against chaos config
- Inject delays/errors before request reaches network

## Proposal 4: Service Method Wrapping

```typescript
// After Fastify ready
app.addHook('onReady', () => {
  apophis.chaos.wrap(app.billingService, {
    'createPricingPlan': {
      delay: { probability: 0.1, ms: 100 },
      error: { probability: 0.05, throws: new ServiceUnavailableError('stripe_timeout') }
    }
  });
});
```

## Proposal 5: Chaos Event Reporting

```typescript
// In petit-runner, after chaos execution
const chaosEvents = result.events || [];
for (const event of chaosEvents) {
  results.push({
    ok: true, // Chaos events are informational, not failures
    name: `${route.method} ${route.path} (chaos: ${event.type})`,
    diagnostics: {
      chaos: {
        injected: true,
        type: event.type,
        details: event.details
      }
    }
  });
}
```

## Proposal 6: Dropout Semantics

```typescript
// Configurable dropout behavior
chaos: {
  dropout: {
    probability: 0.1,
    statusCode: 503, // Default: 503 instead of 0
    body: { error: 'network_failure' }
  }
}
```

## Proposal 7: Hypermedia Contract Support

```
// New APOSTL operation headers
response_body(this).controls.self == request_url(this)
response_body(this).actions.update.method == "PATCH"
response_body(this).actions.update.target == "/billing/plans/{response_body(this).data.id}"
```

Or schema annotation:

```json
{
  "x-apophis-hypermedia": {
    "controls": ["self", "next", "prev"],
    "actions": ["create", "update", "delete"]
  }
}
```
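
To make the annotation concrete, here is a minimal sketch of how it could expand into APOSTL formulas. The `hypermediaContracts` name, the presence-only checks, and the `status:200` guard are assumptions for illustration; Apophis does not currently provide this.

```javascript
// Hypothetical expansion of an `x-apophis-hypermedia` annotation into contract
// formulas. It only checks that each named control or action is present.
function hypermediaContracts(annotation, status = 200) {
  const controls = annotation.controls ?? [];
  const actions = annotation.actions ?? [];
  // Guard each formula by status so error responses are not held to it.
  const guard = (formula) => `if status:${status} then ${formula} else true`;
  return [
    ...controls.map((name) => guard(`response_body(this).controls.${name} != null`)),
    ...actions.map((name) => guard(`response_body(this).actions.${name} != null`)),
  ];
}

const formulas = hypermediaContracts({ controls: ['self'], actions: ['create'] });
console.log(formulas.join('\n'));
// if status:200 then response_body(this).controls.self != null else true
// if status:200 then response_body(this).actions.create != null else true
```

Presence checks are the conservative choice here; stricter formulas such as matching `actions.update.method` would need route registration data that the annotation alone does not carry.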