APOPHIS Assessment: Arbiter Integration Readiness

Executive Summary

APOPHIS is a contract-driven API testing plugin for Fastify. This document assesses its readiness for integration with the Arbiter repository (~11,389 routes, multi-tenant authorization server).

What Is In Place

Core Infrastructure (100% Complete)

  • Route Discovery: Extracts contracts from Fastify route schemas via discoverRoutes()
  • Category Inference: Auto-categorizes routes as constructor/mutator/observer/utility
  • Contract Extraction: Parses x-requires, x-ensures, x-invariants, x-regex, x-category
  • Formula Parser: Full APOSTL grammar with charCodeAt optimization (94% faster)
  • Formula Evaluator: Pure function with type coercion, regex matching, quantifiers
  • Hook Validator: Runtime precondition/postcondition validation via preHandler/onResponse
  • Scope Registry: Auto-discovers from APOPHIS_SCOPE_* env vars
  • Cleanup Manager: LIFO deletion with callback-based batching
  • TAP Formatter: CI/CD compatible test output
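
As an illustration of the category inference step listed above, here is a minimal sketch. This document does not describe the actual inference rules, so the HTTP-method heuristic, the `RouteLike` type, and the `inferCategory` name are all assumptions; only the four category names and the `x-category` annotation come from the list above.

```typescript
// Hypothetical sketch: the real inference rules are not described in this document.
type Category = "constructor" | "mutator" | "observer" | "utility";

// Assumed minimal route shape; a real Fastify route carries many more fields.
interface RouteLike {
  method: string;
  url: string;
  schema?: { "x-category"?: Category };
}

function inferCategory(route: RouteLike): Category {
  // An explicit x-category annotation always wins over the heuristic.
  const explicit = route.schema?.["x-category"];
  if (explicit) return explicit;

  // Assumed fallback: guess the category from the HTTP method alone.
  switch (route.method.toUpperCase()) {
    case "POST":
      return "constructor";
    case "PUT":
    case "PATCH":
    case "DELETE":
      return "mutator";
    case "GET":
    case "HEAD":
      return "observer";
    default:
      return "utility";
  }
}
```

Letting the annotation override the heuristic keeps routes such as a POST-based search endpoint from being misclassified as constructors.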

Test Framework (80% Complete)

  • PETIT Runner: Property-based test execution with fast-check arbitraries
  • Schema-to-Arbitrary: JSON Schema -> fast-check conversion (strings, integers, objects, arrays, enums, formats)
  • Incremental Cache: SHA-256 schema hashing with file-based persistence (13-20x speedup)
  • Model State Tracking: Basic resource tracking for constructor routes
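
The incremental cache idea above can be sketched as follows. This is an in-memory approximation under stated assumptions: `stableStringify`, `schemaHash`, and `IncrementalCache` are hypothetical names, and the file-based persistence mentioned above is left out. Only the use of SHA-256 over the route schema is taken from the list.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with sorted object keys so that key order does not change the hash.
function stableStringify(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(",")}]`;
  const keys = Object.keys(value).sort();
  return `{${keys.map((k) => `${JSON.stringify(k)}:${stableStringify(value[k])}`).join(",")}}`;
}

function schemaHash(schema: Json): string {
  return createHash("sha256").update(stableStringify(schema)).digest("hex");
}

// Hypothetical in-memory cache keyed by route id (assumption: the real cache
// persists its entries to a file between runs).
class IncrementalCache {
  private readonly hashes = new Map<string, string>();

  // Returns true when the route is new or its schema changed since the last call,
  // and records the new hash as a side effect.
  needsRun(routeId: string, schema: Json): boolean {
    const next = schemaHash(schema);
    if (this.hashes.get(routeId) === next) return false;
    this.hashes.set(routeId, next);
    return true;
  }
}
```

Sorting keys before hashing means a schema that is only reordered, not changed, still counts as a cache hit.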

Performance (Complete)

  • Route discovery: ~0.5µs/route
  • Formula parsing: ~5µs/formula
  • Category inference: ~15ns/route
  • Contract extraction: 58% faster with WeakMap cache
  • Incremental cache: 13-20x speedup for unchanged routes
  • Estimated 11K route overhead: ~1.4s total

What Is NOT In Place

1. Stateful Testing (0% - Architecture Only)

Current State: runPetitTests runs commands sequentially but without true stateful/model-based testing. The state machine only tracks created resources for cleanup.

What's Missing:

  • Command sequence generation: fast-check's commands() arbitrary for generating valid command sequences
  • Model-based state machine: Formal model that tracks expected vs actual state
  • Precondition-aware sequencing: Smart generation that respects x-requires dependencies
  • Cross-route state transitions: Understanding that POST /users creates a resource that GET /users/:id can observe
  • Invariant checking across sequences: Ensuring state remains consistent after mutations

Arbiter-Specific Value: Arbiter has complex multi-tenant state:

  • Tenant creation -> Application creation -> User creation -> Permission assignment
  • OAuth flows: authorization -> token -> refresh -> revocation
  • Graph mutations: node creation -> relation creation -> authorization evaluation

Stateful testing would catch:

  • Race conditions in tenant isolation
  • Invalid state transitions (e.g., deleting a tenant with active applications)
  • Authorization leaks across state changes
  • Resource lifecycle violations

Implementation Effort: Medium (2-3 days)

  • Create Model class tracking expected state
  • Implement Command arbitrary using fast-check's commands()
  • Add checkInvariants() for cross-route consistency
  • Implement shrink() for minimal failing sequences
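
The sketch below shows the model-versus-real idea behind the steps above without any library dependency. The `Command` shape (`check`, `run`, `toString`) mirrors the command interface used by fast-check's model-based testing, but the sequence is hand-written here: generation and shrinking, which fast-check's commands() would provide, are omitted. `Model`, `Real`, `makeFake`, `runSequence`, and the status codes are hypothetical and not taken from APOPHIS or Arbiter.

```typescript
// Model = expected state; Real = system under test (an in-memory fake below).
interface Model { tenants: Set<string>; apps: Map<string, string> } // appId -> tenantId
interface Real {
  createTenant(id: string): number; // each call returns an HTTP-like status code
  createApp(id: string, tenantId: string): number;
  deleteTenant(id: string): number;
}
interface Command {
  check(m: Readonly<Model>): boolean; // precondition, in the spirit of x-requires
  run(m: Model, r: Real): void;       // call the real system, assert, update the model
  toString(): string;
}

function expectStatus(actual: number, expected: number, label: string): void {
  if (actual !== expected) throw new Error(`${label}: expected ${expected}, got ${actual}`);
}

const hasApps = (m: Readonly<Model>, tenantId: string): boolean =>
  Array.from(m.apps.values()).some((t) => t === tenantId);

const createTenant = (id: string): Command => ({
  check: (m) => !m.tenants.has(id),
  run: (m, r) => { expectStatus(r.createTenant(id), 201, `createTenant(${id})`); m.tenants.add(id); },
  toString: () => `createTenant(${id})`,
});

const createApp = (id: string, tenantId: string): Command => ({
  check: (m) => m.tenants.has(tenantId) && !m.apps.has(id),
  run: (m, r) => { expectStatus(r.createApp(id, tenantId), 201, `createApp(${id})`); m.apps.set(id, tenantId); },
  toString: () => `createApp(${id},${tenantId})`,
});

// Assumed rule: deleting a tenant that still has applications must be rejected (409).
const deleteTenant = (id: string): Command => ({
  check: (m) => m.tenants.has(id),
  run: (m, r) => {
    const blocked = hasApps(m, id);
    expectStatus(r.deleteTenant(id), blocked ? 409 : 204, `deleteTenant(${id})`);
    if (!blocked) m.tenants.delete(id);
  },
  toString: () => `deleteTenant(${id})`,
});

// Runs commands in order, skipping those whose precondition fails in the model.
function runSequence(commands: Command[], real: Real): string[] {
  const model: Model = { tenants: new Set(), apps: new Map() };
  const executed: string[] = [];
  for (const cmd of commands) {
    if (!cmd.check(model)) continue;
    executed.push(cmd.toString());
    cmd.run(model, real);
  }
  return executed;
}

// In-memory stand-in for the server, used only to make the sketch runnable.
function makeFake(): Real {
  const tenants = new Set<string>();
  const apps = new Map<string, string>();
  return {
    createTenant(id) { if (tenants.has(id)) return 409; tenants.add(id); return 201; },
    createApp(id, tenantId) {
      if (!tenants.has(tenantId) || apps.has(id)) return 409;
      apps.set(id, tenantId);
      return 201;
    },
    deleteTenant(id) {
      if (!tenants.has(id)) return 404;
      if (Array.from(apps.values()).some((t) => t === id)) return 409;
      tenants.delete(id);
      return 204;
    },
  };
}
```

A real system that wrongly answers 204 when deleting a tenant with active applications makes `runSequence` throw at that command, which is the class of invalid state transition described above.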

2. Object Inference from Schemas (40%)

Current State: updateState() infers resources by looking for id/uuid/_id fields in the response body. This heuristic ignores the response schema and any parent scoping.

What's Missing:

  • Schema-driven object extraction: Using JSON Schema properties to know what fields constitute an object identity
  • Relationship inference: Understanding that POST /tenants/:id/applications creates an application scoped to a tenant
  • Nested resource tracking: Tracking sub-resources (e.g., application configs within tenants)
  • Path parameter correlation: Linking POST /users response id to GET /users/:id path parameter

Arbiter Example:

// POST /tenant/applications
// Response: { id: 'app-123', tenantId: 'tenant-456', name: 'My App' }
// Should infer: resourceType='application', parentType='tenant', parentId='tenant-456'

// Current code only captures: resourceType='applications', id='app-123'
// Missing the tenant scoping which is critical for Arbiter's authorization model

Implementation Effort: Low-Medium (1-2 days)

  • Enhance updateState() to parse response schema for identity fields
  • Add parent-child relationship tracking to ModelState
  • Implement path parameter extraction for route correlation
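
A minimal sketch of parent-aware inference for the Arbiter example above. Note the swap: instead of parsing the response schema as proposed, it relies on two naming conventions that are assumptions, namely that the last static path segment names the collection and that a `<name>Id` field refers to the parent. `InferredResource` and `inferResource` are hypothetical names.

```typescript
interface InferredResource {
  resourceType: string;
  id: string;
  parentType?: string;
  parentId?: string;
}

// Naive singularization, for illustration only ("applications" -> "application").
const singular = (s: string): string => (s.endsWith("s") ? s.slice(0, -1) : s);

function inferResource(
  routePath: string,
  body: Record<string, unknown>,
): InferredResource | undefined {
  // Same identity fields the current heuristic looks at.
  const id = body.id ?? body.uuid ?? body._id;
  if (typeof id !== "string") return undefined;

  // Assumption: the last static path segment names the collection.
  const segments = routePath.split("/").filter((s) => s !== "" && !s.startsWith(":"));
  const collection = segments[segments.length - 1];
  if (!collection) return undefined;

  const resource: InferredResource = { resourceType: singular(collection), id };

  // Assumption: the first "<name>Id" string field is a reference to the parent.
  for (const [key, value] of Object.entries(body)) {
    if (key.endsWith("Id") && typeof value === "string") {
      resource.parentType = key.slice(0, -2);
      resource.parentId = value;
      break;
    }
  }
  return resource;
}
```

Applied to the response in the example above, this yields resourceType 'application' with parentType 'tenant' and parentId 'tenant-456'. A schema-driven version would replace both conventions with information read from the route's response schema.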

3. Request Structure Inference (30%)

Current State: executeCommand() sends all generated params as either the body or the query string, chosen by HTTP method alone. It has no understanding of route-specific parameter structure.

What's Missing:

  • Path parameter extraction: Identifying :id, :tenantId from route paths and correlating with generated data
  • Body vs query discrimination: Using Fastify schema to know which params go where
  • Header injection: Automatic x-tenant-id, authorization header injection based on route requirements
  • Nested body structures: Handling body.properties.nested.field schemas
  • Content-Type negotiation: Form-encoded vs JSON based on route configuration

Arbiter Example:

// Route: POST /tenant/applications/:appId/rules
// Body schema: { type: 'object', properties: { dsl: { type: 'string' }, priority: { type: 'integer' } } }
// Path params: { appId: '...' }
// Headers: { 'x-tenant-id': '...', 'authorization': 'Bearer ...' }

// Current code would send: { appId: 'generated', dsl: 'generated', priority: 1 } all as body
// Should send: appId in path, { dsl, priority } in body, auth headers automatically

Implementation Effort: Medium (2-3 days)

  • Parse route path for parameter placeholders
  • Match generated data to path vs body vs query
  • Implement header injection based on scope/auth requirements
  • Handle nested schema structures
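
The first three steps above could look roughly like this. It is a sketch, not the APOPHIS implementation: `buildRequest`, `RouteSchema`, and `BuiltRequest` are hypothetical, the schema shape follows Fastify's `body` and `querystring` keys, only top-level properties are handled, and headers are passed in by the caller rather than derived from scope or auth requirements.

```typescript
// Assumed subset of a Fastify route schema; only top-level properties are used.
interface RouteSchema {
  body?: { properties?: Record<string, unknown> };
  querystring?: { properties?: Record<string, unknown> };
}

interface BuiltRequest {
  url: string;
  query: Record<string, unknown>;
  body: Record<string, unknown>;
  headers: Record<string, string>;
}

function buildRequest(
  routePath: string,
  schema: RouteSchema,
  params: Record<string, unknown>,
  headers: Record<string, string> = {},
): BuiltRequest {
  const usedInPath = new Set<string>();

  // Substitute each ":name" placeholder with the matching generated value.
  const url = routePath.replace(/:([A-Za-z_][A-Za-z0-9_]*)/g, (_match, name: string) => {
    if (!(name in params)) throw new Error(`missing path parameter: ${name}`);
    usedInPath.add(name);
    return encodeURIComponent(String(params[name]));
  });

  // Keep only the generated values that the given schema section declares.
  const pick = (props?: Record<string, unknown>): Record<string, unknown> => {
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(props ?? {})) {
      if (key in params && !usedInPath.has(key)) out[key] = params[key];
    }
    return out;
  };

  return {
    url,
    query: pick(schema.querystring?.properties),
    body: pick(schema.body?.properties),
    headers,
  };
}
```

For the route in the example above, `appId` ends up in the URL and only `dsl` and `priority` are placed in the body.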

4. Logic/Invariant Analysis (20%)

Current State: checkPostconditions() only validates status:### patterns. It does not evaluate complex invariants.

What's Missing:

  • Cross-route invariant checking: "After POST /users, GET /users/:id should return the same user"
  • State consistency checks: "Total user count should increase by 1 after creation"
  • Authorization boundary checks: "Tenant A's admin cannot access Tenant B's resources"
  • Temporal logic: "After DELETE /users/:id, subsequent GET should return 404"
  • Mathematical invariants: Budget constraints, quota limits, rate limiting

Arbiter-Specific Value: Arbiter's authorization graph has rich invariants:

  • If user U has permission P on resource R, then checking P for U on R must return true
  • If node N is a child of node M, then M's permissions apply to N (inheritance, which applies transitively through ancestors)
  • If relation R is revoked, all derived permissions via R must be invalidated
  • Tenant isolation: resources in tenant T1 must never be accessible from T2
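
To make the four invariants above concrete, here is a toy authorization graph on which they can be asserted. It is not Arbiter's model: `AuthGraph`, `GraphNode`, `Grant`, and the rule that a caller's tenant must match every node on the path are assumptions made for illustration.

```typescript
interface GraphNode { id: string; tenant: string; parent?: string }
interface Grant { user: string; permission: string; node: string }

class AuthGraph {
  private readonly nodes = new Map<string, GraphNode>();
  private grants: Grant[] = [];

  addNode(node: GraphNode): void { this.nodes.set(node.id, node); }
  grant(g: Grant): void { this.grants.push(g); }

  // Removing a grant also removes everything derived from it, because derived
  // permissions are computed on demand in check() rather than stored.
  revoke(g: Grant): void {
    this.grants = this.grants.filter(
      (x) => !(x.user === g.user && x.permission === g.permission && x.node === g.node),
    );
  }

  // Walks from the node up through its ancestors; a grant on any of them applies.
  check(user: string, callerTenant: string, permission: string, nodeId: string): boolean {
    let current: GraphNode | undefined = this.nodes.get(nodeId);
    const seen = new Set<string>(); // guards against cycles in a malformed graph
    while (current !== undefined && !seen.has(current.id)) {
      if (current.tenant !== callerTenant) return false; // tenant isolation
      seen.add(current.id);
      const id = current.id;
      if (this.grants.some((g) => g.user === user && g.permission === permission && g.node === id)) {
        return true;
      }
      current = current.parent === undefined ? undefined : this.nodes.get(current.parent);
    }
    return false;
  }
}
```

With a node "doc" whose parent is "org" in tenant T1, and a grant on "org", each invariant becomes a one-line assertion: `check` is true for "org" and "doc" from T1, false from T2, and false for both once the grant is revoked.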

Implementation Effort: High (1 week)

  • Implement invariant registry for cross-route assertions
  • Add temporal operators (eventually, always, until) to APOSTL
  • Create graph-aware consistency checker for Arbiter's authorization model
  • Implement property-based invariant generation from schema constraints
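
A small sketch of what an invariant registry for cross-route assertions might look like, using the temporal example from the list above ("After DELETE /users/:id, subsequent GET should return 404"). The `TraceEntry`, `Invariant`, and `InvariantRegistry` names are hypothetical, and the invariant is written directly in TypeScript rather than in APOSTL, whose temporal operators do not exist yet.

```typescript
// One recorded request/response pair from a test sequence.
interface TraceEntry { method: string; url: string; status: number }

interface Invariant {
  name: string;
  holds(trace: readonly TraceEntry[]): boolean;
}

class InvariantRegistry {
  private readonly invariants: Invariant[] = [];

  register(invariant: Invariant): void { this.invariants.push(invariant); }

  // Returns the names of all registered invariants that the trace violates.
  violations(trace: readonly TraceEntry[]): string[] {
    return this.invariants.filter((inv) => !inv.holds(trace)).map((inv) => inv.name);
  }
}

// "After a successful DELETE <url>, every later GET <url> returns 404."
// Simplification: a resource re-created at the same URL is not accounted for.
const deletedStaysDeleted: Invariant = {
  name: "deleted-stays-deleted",
  holds(trace) {
    const deleted = new Set<string>();
    for (const entry of trace) {
      const ok = entry.status >= 200 && entry.status < 300;
      if (entry.method === "DELETE" && ok) {
        deleted.add(entry.url);
      } else if (entry.method === "GET" && deleted.has(entry.url) && entry.status !== 404) {
        return false;
      }
    }
    return true;
  },
};
```

Checking invariants against a recorded trace keeps them independent of how the sequence was produced, so the same registry could serve both hand-written and generated sequences.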

5. Documentation (70%)

In Place:

  • README.md with quick start, features, API reference
  • Architecture document (ARCHITECTURE, 2656 lines)
  • Performance analysis (PERF_ANALYSIS.md)
  • Inline code comments

Missing:

  • skills.md: LLM-friendly documentation for AI-assisted development
  • Advanced guides: Stateful testing setup, custom invariant authoring
  • Arbiter-specific examples: Multi-tenant testing patterns, OAuth flow validation
  • Troubleshooting guide: Common failures, debugging techniques
  • Migration guide: From manual testing to contract-driven testing

Do We Gain from Logic?

Short Answer: YES, Significantly

Without logic/stateful testing, APOPHIS is essentially a smart fuzzer with runtime assertions. With logic:

  1. State Space Coverage:

    • Stateless: Tests each route in isolation (~200 tests for 200 routes)
    • Stateful: Tests route sequences (200 routes ^ 5 depth = 3.2 × 10^11, roughly 320 billion possible sequences)
    • Expected gain: an estimated 10-100x more bugs found in stateful interactions
  2. Arbiter-Specific Bugs Caught:

    • Authorization escalation after role changes
    • Resource leaks across tenant boundaries
    • Invalid state transitions (e.g., modifying revoked tokens)
    • Cache invalidation failures after mutations
    • Graph inconsistency after node deletion
  3. Regression Prevention:

    • Stateless: Catches route-level regressions
    • Stateful: Catches system-level regressions (e.g., "deleting user breaks their sessions")
  4. Cost-Benefit:

    • Implementation: ~1 week
    • Value: Prevents production incidents that could take days to debug
    • ROI: 10x+ for a system like Arbiter

Recommendations

Phase 1: Immediate (This Week)

  1. Implement object inference from schemas (1-2 days)
  2. Fix request structure handling (path/body/query discrimination) (2-3 days)
  3. Create skills.md for LLM assistance (1 day)

Phase 2: Short-term (Next 2 Weeks)

  1. Implement stateful test runner with model-based testing (1 week)
  2. Add cross-route invariant checking (1 week)
  3. Create Arbiter-specific example suite

Phase 3: Medium-term (Next Month)

  1. Graph-aware consistency checker for Arbiter
  2. Automatic contract generation from existing tests
  3. Performance optimization for 11K routes
  4. Integration with Arbiter's CI/CD pipeline

Conclusion

APOPHIS has a solid foundation for contract-driven testing. The current implementation provides immediate value for:

  • Runtime contract validation (preconditions/postconditions)
  • Property-based testing of individual routes
  • Incremental test execution for CI/CD

However, to fully realize value for Arbiter, we need:

  1. Stateful testing: Critical for catching multi-route interaction bugs
  2. Better object inference: Essential for Arbiter's complex resource hierarchies
  3. Request structure handling: Required for realistic test execution
  4. Logic/invariant analysis: Needed for authorization-specific testing

The highest ROI item is stateful testing with proper object inference, which would catch the class of bugs most likely to cause production incidents in Arbiter.