Arbiter → Apophis Feedback Report

Date: 2026-04-27
Reporter: Arbiter Engineering Team
Context: Integration of Apophis v2.2 into Arbiter Platform for behavioral contract testing


Executive Summary

Apophis provides valuable capabilities for behavioral contract testing that go beyond traditional unit and integration tests. The schema-to-contract inference, cross-operation verification, and chaos testing infrastructure are compelling. However, we encountered three bugs in core infrastructure and several design friction points that should be addressed for wider adoption.

Overall Assessment: Strong value proposition for teams willing to invest in schema-driven testing. Needs polish on edge cases and configurability.


Part 1: How Chaos Injection Would Help Arbiter

Current State

Arbiter is a multi-tenant SaaS platform with:

  • 500+ API endpoints across 15 route families
  • Billing, graph storage, auth, sessions, webhooks, etc.
  • Mock Stripe integration for payment processing
  • In-memory and persistent storage backends
  • Complex middleware chain: auth → tenant boundary → permissions → preflight → handler

Where Chaos Testing Adds Value

1. Middleware Resilience Verification

Our middleware chain has implicit dependencies:

Transport → AuthN → Scope → AuthZ → Challenge → Preflight → Handler

Chaos testing would verify:

  • What happens when preflight() times out? Does the handler still execute?
  • If auth middleware fails with 503, do we get proper retry headers?
  • Does a slow tenant boundary check cascade to response timeouts?

Concrete scenario: If the billing preflight gate (budget check) is slow, does the subscription creation handler wait or fail? Our contracts say response_time < 2000ms — chaos would tell us if that's actually enforced.
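A minimal sketch of the behavior we would want chaos testing to confirm: each middleware stage gets a time budget, and a stage that exceeds it short-circuits the chain so the handler never runs. The names `withTimeout` and `runChain`, and the choice of 503 as the failure status, are our assumptions for illustration; they are not Apophis or Arbiter APIs.

```javascript
// Hypothetical middleware runner: a timed-out stage stops the chain.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('stage timed out')), ms);
  });
  // Clear the timer whichever promise settles first.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

async function runChain(stages, handler, budgetMs) {
  for (const stage of stages) {
    try {
      await withTimeout(stage(), budgetMs);
    } catch (err) {
      // Assumed policy: fail closed with 503 and never reach the handler.
      return { status: 503, handlerRan: false, error: err.message };
    }
  }
  return { status: 200, handlerRan: true, body: await handler() };
}
```

A chaos run would replace one stage with a deliberately slow one and assert that `handlerRan` is false.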

2. Mock Service Degradation

We use MockStripeService for payment processing. In production, Stripe can:

  • Return 429 (rate limit)
  • Time out on paymentIntents.create
  • Return network errors

Chaos testing would inject:

if chaos:stripe-timeout then response_code == 503
if chaos:stripe-rate-limit then retry-after header != null

This validates our fallback logic — currently untested because mocks always succeed.
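One way to make the mock fail on demand is a fault-injecting wrapper. The sketch below is illustrative only: `withChaos`, the `createPaymentIntent` method, the fault names, and the fallback policy (503 with a Retry-After value for both faults) are assumptions, not the real MockStripeService or any Apophis API.

```javascript
// Hypothetical mock and fault-injecting wrapper (names are illustrative).
class MockStripe {
  async createPaymentIntent(amount) {
    return { id: 'pi_test', amount, status: 'succeeded' };
  }
}

function withChaos(service, fault) {
  return {
    async createPaymentIntent(amount) {
      if (fault === 'stripe-timeout') {
        throw Object.assign(new Error('upstream timeout'), { code: 'ETIMEDOUT' });
      }
      if (fault === 'stripe-rate-limit') {
        throw Object.assign(new Error('rate limited'), { statusCode: 429, retryAfterSeconds: 30 });
      }
      return service.createPaymentIntent(amount);
    },
  };
}

// Handler-side fallback that the chaos contracts would exercise.
async function payInvoice(stripe, amount) {
  try {
    const intent = await stripe.createPaymentIntent(amount);
    return { status: 200, headers: {}, body: intent };
  } catch (err) {
    // Assumed default of 5 seconds when the upstream gives no hint.
    const retryAfter = String(err.retryAfterSeconds ?? 5);
    return { status: 503, headers: { 'retry-after': retryAfter }, body: { error: err.message } };
  }
}
```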

3. Resource Leak Detection

Our BillingApplicationService uses in-memory Maps. Chaos scenarios:

  • Create 1000 plans, delete 500, verify GET on deleted returns 404
  • Cancel subscriptions mid-renewal cycle
  • Concurrent PATCH operations on same plan

Cross-operation contracts already catch these problems for sequential single requests; chaos testing would additionally expose state corruption under concurrent access.
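The first scenario can be sketched against a plain in-memory store. `PlanStore` is a hypothetical stand-in for BillingApplicationService, and the status codes it returns are assumptions made for the example.

```javascript
// Hypothetical Map-backed store standing in for the billing service.
class PlanStore {
  constructor() {
    this.plans = new Map();
  }
  create(id, data) {
    this.plans.set(id, { id, ...data });
    return { status: 201 };
  }
  remove(id) {
    return { status: this.plans.delete(id) ? 204 : 404 };
  }
  get(id) {
    return this.plans.has(id) ? { status: 200, body: this.plans.get(id) } : { status: 404 };
  }
}

// Create `total` plans, delete every even-numbered one, then verify each GET.
function runLeakScenario(store, total = 1000) {
  for (let i = 0; i < total; i++) store.create(`plan_${i}`, { name: `Plan ${i}` });
  for (let i = 0; i < total; i += 2) store.remove(`plan_${i}`);
  const violations = [];
  for (let i = 0; i < total; i++) {
    const expected = i % 2 === 0 ? 404 : 200;
    const actual = store.get(`plan_${i}`).status;
    if (actual !== expected) violations.push({ id: `plan_${i}`, expected, actual });
  }
  return { remaining: store.plans.size, violations };
}
```

Checking `remaining` alongside the per-plan status codes is what distinguishes a leak (entries that linger in the Map) from a plain routing error.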

4. Entitlement Boundary Testing

We have credit-based preflight gates. Chaos could:

  • Exhaust credits mid-test
  • Verify 402 (Payment Required) is returned
  • Ensure no partial mutations occur when budget is depleted

This is business-critical: we cannot bill customers for operations that fail.
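The "no partial mutations" requirement can be expressed as a before/after snapshot check. This is a sketch under assumptions: `createSubscription`, the state shape, and the error body are hypothetical, not Arbiter code.

```javascript
// Hypothetical credit-gated operation: the budget check precedes any mutation.
function createSubscription(state, tenantId, cost) {
  const balance = state.credits.get(tenantId) ?? 0;
  if (balance < cost) {
    return { status: 402, body: { error: 'insufficient_credits' } };
  }
  state.credits.set(tenantId, balance - cost);
  state.subscriptions.push({ tenantId, cost });
  return { status: 201 };
}

// Invariant: a 402 response must leave credits and subscriptions untouched.
function checkNoPartialMutation(state, tenantId, cost) {
  const snapshot = () => ({
    credits: state.credits.get(tenantId) ?? 0,
    subscriptions: state.subscriptions.length,
  });
  const before = snapshot();
  const res = createSubscription(state, tenantId, cost);
  const after = snapshot();
  const unchanged =
    before.credits === after.credits && before.subscriptions === after.subscriptions;
  return { status: res.status, ok: res.status !== 402 || unchanged };
}
```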

5. Auth Token Expiry

JWT tokens expire. Chaos could:

  • Expire tokens between POST and follow-up GET
  • Verify 401 with proper WWW-Authenticate header
  • Test refresh token flow under load
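Expiring a token between two requests does not require sleeping if the clock is injectable. The sketch below assumes a hypothetical `authenticate` function and token shape; the `WWW-Authenticate` value follows the Bearer scheme's `invalid_token` error form.

```javascript
// Hypothetical auth check that takes the current time as a parameter.
function authenticate(token, nowMs) {
  if (!token || nowMs >= token.expiresAtMs) {
    return {
      status: 401,
      headers: { 'www-authenticate': 'Bearer error="invalid_token"' },
    };
  }
  return { status: 200, headers: {} };
}

let clockMs = 0;
const token = { sub: 'user-1', expiresAtMs: 100 };

// POST while the token is still valid.
const postResult = authenticate(token, clockMs);

// Chaos step: advance the clock past expiry before the follow-up GET.
clockMs += 150;
const getResult = authenticate(token, clockMs);
```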

Proposed Chaos Scenarios for Arbiter

billing_chaos:
  - name: stripe-timeout
    target: POST /billing/invoices/:id/pay
    inject: { stripe_delay_ms: 5000 }
    expected: { status: 503, retry_after: "> 0" }
  
  - name: storage-corruption
    target: DELETE /billing/plans/:id
    inject: { skip_deletion: true }
    expected: { status: 200, follow_up_get: 404 }
  
  - name: rate-limit
    target: POST /billing/plans
    inject: { rate_limit: 10 }
    expected: { status: 429, retry_after: "> 0" }
  
  - name: auth-expiry
    target: PATCH /billing/plans/:id
    inject: { expire_token_after_ms: 100 }
    expected: { status: 401, www_authenticate: "Bearer" }

Part 2: Bugs Found

Bug 1: Scope Registry Ignores Configured Default Scope

Severity: High (breaks auth in cross-operation tests)
File: dist/infrastructure/scope-registry.js
Lines: 60, 76-77

Problem:

const scope = scopeName !== null ? this.scopes.get(scopeName) : undefined;
const base = scope ?? this.defaultScope;  // Always uses empty DEFAULT_SCOPE

When getHeaders(null) is called, it uses this.defaultScope which is initialized to { headers: {}, metadata: {} } on line 60, ignoring any "default" scope passed in the constructor.

Impact: Cross-operation requests (e.g., response_code(GET /users/{id})) don't inherit auth headers from the configured scope, causing 401 failures on protected routes.

Fix:

const base = scope ?? this.scopes.get('default') ?? this.defaultScope;

Reproduction:

await app.register(apophis, {
  scopes: {
    default: { headers: { 'authorization': 'Bearer token' } }
  }
});
// Cross-operation GET /users/123 gets 401 because auth header is not passed
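To show the intended fallback order in isolation, here is a simplified stand-in for the registry. It is a reconstruction for illustration, not the actual Apophis source; only the two lines from the Problem and Fix snippets above are taken from the report.

```javascript
// Simplified stand-in for the scope registry (not the real implementation).
const DEFAULT_SCOPE = { headers: {}, metadata: {} };

class ScopeRegistry {
  constructor(scopes = {}) {
    this.scopes = new Map(Object.entries(scopes));
    this.defaultScope = DEFAULT_SCOPE;
  }

  getHeaders(scopeName) {
    const scope = scopeName !== null ? this.scopes.get(scopeName) : undefined;
    // Proposed fix: prefer a configured "default" scope over the empty one.
    const base = scope ?? this.scopes.get('default') ?? this.defaultScope;
    return { ...base.headers };
  }
}

const registry = new ScopeRegistry({
  default: { headers: { authorization: 'Bearer token' } },
});
```

With the fallback in place, `registry.getHeaders(null)` returns the configured authorization header, and a registry created without scopes still returns an empty object.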

Bug 2: Contract Builder Drops Routes Option

Severity: High (route filtering doesn't work)
File: dist/plugin/contract-builder.js
Lines: 8-15

Problem:

const config = {
    depth: opts.depth ?? 'standard',
    scope: opts.scope,
    seed: opts.seed,
    timeout: opts.timeout,
    chaos: opts.chaos,
    // Missing: routes: opts.routes
};

The routes option is documented but never passed to runPetitTests, causing all routes to be tested regardless of the routes filter.

Impact: Tests run against all 500+ routes instead of the 4 specified, which makes debugging impractical and inflates CI times.

Fix:

const config = {
    depth: opts.depth ?? 'standard',
    scope: opts.scope,
    seed: opts.seed,
    timeout: opts.timeout,
    chaos: opts.chaos,
    routes: opts.routes,  // Add this
};

Reproduction:

await app.apophis.contract({
  routes: ['POST /billing/plans']  // Tests ALL routes instead
});
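A regression check for this bug could look like the sketch below. `buildConfig` mirrors the fixed snippet above; `selectRoutes` and its "METHOD /path" matching are our assumption about how the filter is meant to behave, not the actual runPetitTests logic.

```javascript
// Reconstruction of the config builder with the routes option forwarded.
function buildConfig(opts = {}) {
  return {
    depth: opts.depth ?? 'standard',
    scope: opts.scope,
    seed: opts.seed,
    timeout: opts.timeout,
    chaos: opts.chaos,
    routes: opts.routes,
  };
}

// Assumed filter semantics: entries are "METHOD /path"; no filter means all routes.
function selectRoutes(allRoutes, config) {
  if (!Array.isArray(config.routes) || config.routes.length === 0) return allRoutes;
  const wanted = new Set(config.routes);
  return allRoutes.filter((route) => wanted.has(`${route.method} ${route.url}`));
}

const allRoutes = [
  { method: 'POST', url: '/billing/plans' },
  { method: 'GET', url: '/billing/plans/:id' },
  { method: 'DELETE', url: '/billing/plans/:id' },
];
const selected = selectRoutes(allRoutes, buildConfig({ routes: ['POST /billing/plans'] }));
```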

Bug 3: Invariant Checking Not Configurable

Severity: Medium (false failures for non-hierarchical APIs)
File: dist/test/petit-runner.js
Lines: 386-398

Problem: Built-in invariants (no-orphaned-resources, parent-reference-integrity, resource-integrity) run unconditionally for all routes. These assume parent-child resource hierarchies (e.g., /workspaces/:id/projects/:id).

Impact: For flat resource models (like our billing plans), routes with x-category: 'constructor' trigger invariant failures because resources don't have parentType/parentId.

Workaround: We set x-category: 'observer' to avoid resource tracking, but this loses the semantic meaning of the route.

Suggested Fix:

// In config
invariants: ['resource-integrity']  // Opt-in per test
// Or
invariants: false  // Disable all
// Or per-route
schema: {
  'x-invariants': ['custom-only']
}
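One possible resolution order for these options is sketched here: a per-route `x-invariants` value wins over the config-level `invariants` value, and the absence of both keeps today's behavior. The function name and the precedence rule are assumptions; the three invariant names come from the Problem description above.

```javascript
const BUILT_IN_INVARIANTS = [
  'no-orphaned-resources',
  'parent-reference-integrity',
  'resource-integrity',
];

// Hypothetical precedence: route schema, then config, then all built-ins.
function resolveInvariants(config = {}, routeSchema = {}) {
  const setting = routeSchema['x-invariants'] ?? config.invariants;
  if (setting === false) return [];
  if (Array.isArray(setting)) {
    // Names that are not built-ins (e.g. 'custom-only') select none of them.
    return BUILT_IN_INVARIANTS.filter((name) => setting.includes(name));
  }
  return [...BUILT_IN_INVARIANTS];
}
```

Keeping "run everything" as the default means existing users see no change unless they opt out.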

Part 3: Design Feedback

1. Schema Inference is Too Aggressive

Issue: const values in JSON Schema generate unconditional contracts.

Example:

{
  "response": {
    "200": {
      "properties": {
        "fragment_type": { "const": "Action" }
      }
    }
  }
}

Generates: response_body(this).fragment_type == "Action" (checked for ALL responses)

This fails when the route returns 404 with fragment_type: "Error".

Suggestion: Infer conditional contracts based on status code:

if status:200 then response_body(this).fragment_type == "Action" else true

Or add an option to disable inference: inferContracts: false.
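The conditional form can be derived mechanically because each const already sits under a status code in the response schema. The sketch below is a hypothetical inference pass, not Apophis internals; it only handles top-level properties.

```javascript
// Hypothetical inference: one guarded contract per (status code, const property).
function inferConstContracts(responseSchemas) {
  const contracts = [];
  for (const [status, schema] of Object.entries(responseSchemas)) {
    for (const [field, def] of Object.entries(schema.properties ?? {})) {
      if (def !== null && typeof def === 'object' && 'const' in def) {
        const literal = JSON.stringify(def.const);
        contracts.push(
          `if status:${status} then response_body(this).${field} == ${literal} else true`
        );
      }
    }
  }
  return contracts;
}

const inferred = inferConstContracts({
  200: { properties: { fragment_type: { const: 'Action' } } },
  404: { properties: { fragment_type: { const: 'Error' } } },
});
```

Because each contract is guarded by its own status code, the 200 and 404 variants no longer contradict each other.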

2. Cross-Operation Headers Not Documented

The scope.headers behavior for cross-operation requests is not documented. We had to read source code to discover that:

  • createOperationResolver(fastify, request.headers) forwards the request headers to cross-operation calls
  • request.headers is populated from scope.getHeaders(null)
  • getHeaders(null) is affected by Bug 1 above, so the configured headers are dropped

Suggestion: Document that cross-operation requests inherit the scope headers of the original request.

3. Missing 400 Response Handling

When Fastify schema validation fails (e.g., enum mismatch), it returns 400 with a validation error object. Apophis treats this as a contract failure unless:

  • The schema has a 400 response documented
  • The contract explicitly accepts 400

Most developers won't document 400 responses. Apophis should either:

  • Auto-generate 400 contracts from validation rules
  • Or provide a global 400 handler pattern

4. HEAD Routes Cause Noise

Fastify auto-generates HEAD routes for every GET. These have no response body, causing response_body(this).id != null failures.

Suggestion: Auto-skip HEAD routes in contract tests, or provide skipMethods: ['HEAD'] option.

5. Error Suggestions Need Context

When a contract fails, the error is:

Field 'fragment_type' does not match expected value 'Error'.

But it doesn't say:

  • What the actual status code was
  • What the actual response body was
  • Which route generated the request

Suggestion: Include the actual and expected values in violation objects, along with the status code, response body, and originating route.
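As a starting point for discussion, a violation object carrying that context might look like the sketch below. The field names and message format are our proposal, not an existing Apophis structure.

```javascript
// Hypothetical violation shape with route, status, and body context.
function buildViolation({ route, contract, field, expected, actual, response }) {
  return {
    route,
    contract,
    field,
    expected,
    actual,
    statusCode: response.statusCode,
    responseBody: response.body,
    message:
      `${route} returned ${response.statusCode}: expected ${field} to be ` +
      `${JSON.stringify(expected)}, got ${JSON.stringify(actual)}`,
  };
}

const violation = buildViolation({
  route: 'GET /billing/plans/:id',
  contract: 'response_body(this).fragment_type == "Action"',
  field: 'fragment_type',
  expected: 'Action',
  actual: 'Error',
  response: { statusCode: 404, body: { fragment_type: 'Error' } },
});
```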


Part 4: What We Love

1. Cross-Operation Contracts

if status:201 then response_code(GET /billing/plans/{response_body(this).data.plan_id}) == 200 else true

This is genuinely hard to test manually. Apophis makes it declarative and automatic.

2. Property-Based Generation

Fast-check found edge cases we missed:

  • Empty string name (schema allowed it, service rejected it)
  • Invalid billing_interval values
  • Missing required fields

3. Schema as Single Source of Truth

Once schemas are correct, contracts are free. The x-ensures array supplements rather than replaces schema validation.

4. Fast Feedback Loop

Contract tests run in ~1.5s for 4 routes. Much faster than spinning up a full test environment.


Part 5: Feature Requests

1. Hypermedia Contract Support

Arbiter returns LDF (Linked Data Fragment) responses with controls and actions. We'd love to verify:

if status:200 then response_body(this).controls.self == request_url(this) else true
if status:200 then response_body(this).actions.create.method == "POST" else true
if status:200 then response_body(this).actions.update.target == "/billing/plans/{response_body(this).data.id}" else true

Currently we have to write these manually. Could Apophis infer hypermedia controls from route registration?

2. Conditional Schema Contracts

Instead of removing const from schemas, allow:

{
  "response": {
    "200": {
      "properties": {
        "fragment_type": { "const": "Action", "x-apophis-conditional": "status:200" }
      }
    }
  }
}

This preserves schema expressiveness while generating correct contracts.

3. Middleware Contract Verification

Our middleware chain is critical. We'd like to verify:

if request_headers(this).authorization == null then status:401 else true
if request_headers(this).x-tenant-id == null then status:400 else true

Apophis already supports request_headers — making this a first-class feature (e.g., x-requires) would be powerful.

4. State Cleanup Hooks

After destructive tests (DELETE), we need to clean up:

await app.apophis.contract({
  routes: ['DELETE /billing/plans/:id'],
  cleanup: async (state) => {
    // Remove created plans from database
    await db.plans.deleteMany({ id: { $in: state.createdPlans } });
  }
});

This would enable stateful testing without polluting the test environment.
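On the runner side, the key property is that cleanup executes even when a contract fails. A minimal sketch, assuming a hypothetical `runWithCleanup` wrapper and a `state` object that tests populate as they create resources:

```javascript
// Hypothetical wrapper: cleanup always runs, even if the test run throws.
async function runWithCleanup(runTests, cleanup) {
  const state = { createdPlans: [] };
  try {
    return await runTests(state);
  } finally {
    if (typeof cleanup === 'function') {
      await cleanup(state);
    }
  }
}
```

Using `finally` preserves the original failure for the caller while still handing the collected state to the cleanup hook.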

5. Contract Coverage Report

After running tests, we'd like:

Contract Coverage:
  POST /billing/plans:
    - 201 response: ✓ tested (42 cases)
    - 400 response: ✓ tested (8 cases)
    - 503 response: ✗ not tested
    - Cross-op GET: ✓ tested (42 cases)

This helps identify gaps in contract coverage.
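Such a report could be derived from data the runner already has: the status codes documented in the schema and the status of each executed case. The function below is a sketch with hypothetical names and input shapes.

```javascript
// Hypothetical aggregation: count executed cases per documented status code.
function summarizeCoverage(documentedStatuses, cases) {
  const counts = new Map(documentedStatuses.map((status) => [status, 0]));
  for (const testCase of cases) {
    if (counts.has(testCase.status)) {
      counts.set(testCase.status, counts.get(testCase.status) + 1);
    }
  }
  return [...counts].map(([status, n]) =>
    n > 0
      ? `- ${status} response: ✓ tested (${n} cases)`
      : `- ${status} response: ✗ not tested`
  );
}

const coverageLines = summarizeCoverage(
  [201, 400, 503],
  [{ status: 201 }, { status: 201 }, { status: 400 }, { status: 400 }, { status: 400 }]
);
```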


Conclusion

Apophis is a powerful tool that fills a gap in API testing — behavioral contracts and chaos testing. The core concepts are solid, but the implementation needs hardening for production use:

  • Must-fix: Bugs #1 and #2 (scope registry, route filtering)
  • Should-fix: Bug #3 (configurable invariants), inference aggressiveness
  • Nice-to-have: Hypermedia support, middleware contracts, coverage reports

We're committed to using Apophis for Arbiter's contract testing and will contribute fixes upstream. The value of cross-operation verification alone justifies the investment.


Contact: Arbiter Engineering Team
Repository: https://github.com/anomalyco/apophis (we'll open issues for each bug)