# APOPHIS Assessment: Arbiter Integration Readiness

## Executive Summary

APOPHIS is a contract-driven API testing plugin for Fastify. This document assesses its readiness for integration with the Arbiter repository (~11,389 routes, multi-tenant authorization server).

## What Is In Place

### Core Infrastructure (100% Complete)
- **Route Discovery**: Extracts contracts from Fastify route schemas via `discoverRoutes()`
- **Category Inference**: Auto-categorizes routes as constructor/mutator/observer/utility
- **Contract Extraction**: Parses `x-requires`, `x-ensures`, `x-invariants`, `x-regex`, `x-category`
- **Formula Parser**: Full APOSTL grammar with `charCodeAt` optimization (94% faster)
- **Formula Evaluator**: Pure function with type coercion, regex matching, quantifiers
- **Hook Validator**: Runtime precondition/postcondition validation via the `preHandler`/`onResponse` hooks
- **Scope Registry**: Auto-discovers scopes from `APOPHIS_SCOPE_*` env vars
- **Cleanup Manager**: LIFO deletion with callback-based batching
- **TAP Formatter**: CI/CD-compatible test output

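
To make the contract annotations concrete, here is a minimal sketch of an annotated route schema and an extractor for it. The `x-*` keys are the ones listed above, and the `status:201` formula follows the `status:###` pattern this assessment describes for postconditions. The helper names (`inferCategory`, `extractContract`), the array shape of the annotations, and the method-based inference rule are assumptions for illustration, not APOPHIS's actual implementation.

```javascript
// Hypothetical route schema carrying APOPHIS-style contract annotations.
const createUserSchema = {
  'x-category': 'constructor',
  'x-ensures': ['status:201'],
  body: {
    type: 'object',
    properties: { email: { type: 'string' } },
    required: ['email'],
  },
};

// Assumed inference rule: prefer an explicit x-category, otherwise
// fall back to the HTTP method.
function inferCategory(method, schema = {}) {
  if (schema['x-category']) return schema['x-category'];
  switch (method.toUpperCase()) {
    case 'POST': return 'constructor';
    case 'PUT':
    case 'PATCH':
    case 'DELETE': return 'mutator';
    case 'GET':
    case 'HEAD': return 'observer';
    default: return 'utility';
  }
}

// Collect the contract-related keys, defaulting missing lists to [].
function extractContract(method, url, schema = {}) {
  return {
    method: method.toUpperCase(),
    url,
    category: inferCategory(method, schema),
    requires: schema['x-requires'] ?? [],
    ensures: schema['x-ensures'] ?? [],
    invariants: schema['x-invariants'] ?? [],
  };
}

const contract = extractContract('post', '/users', createUserSchema);
console.log(contract.category, contract.ensures);
```

Keeping extraction a pure function of `(method, url, schema)` is what makes results cacheable per route, which matters at Arbiter's route count.
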
### Test Framework (80% Complete)
- **PETIT Runner**: Property-based test execution with fast-check arbitraries
- **Schema-to-Arbitrary**: JSON Schema -> fast-check conversion (strings, integers, objects, arrays, enums, formats)
- **Incremental Cache**: SHA-256 schema hashing with file-based persistence (13-20x speedup)
- **Model State Tracking**: Basic resource tracking for constructor routes

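
The incremental cache idea can be sketched as follows: hash each route schema with SHA-256 and skip routes whose hash is unchanged. This is a minimal illustration under stated assumptions. The names `stableStringify`, `hashSchema`, and `IncrementalCache` are hypothetical, and an in-memory `Map` stands in for the file-based persistence mentioned above.

```javascript
const { createHash } = require('node:crypto');

// Serialize with sorted object keys so key order does not change the hash.
function stableStringify(value) {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const entries = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(value[key])}`);
    return `{${entries.join(',')}}`;
  }
  return JSON.stringify(value);
}

// SHA-256 digest of the canonical serialization, as a hex string.
function hashSchema(schema) {
  return createHash('sha256').update(stableStringify(schema)).digest('hex');
}

// In-memory stand-in for the file-backed cache: route key -> last seen hash.
class IncrementalCache {
  constructor() {
    this.hashes = new Map();
  }

  // Returns true when the route is new or its schema changed since the last call.
  needsRun(routeKey, schema) {
    const hash = hashSchema(schema);
    if (this.hashes.get(routeKey) === hash) return false;
    this.hashes.set(routeKey, hash);
    return true;
  }
}

const cache = new IncrementalCache();
const schema = { body: { type: 'object', properties: { name: { type: 'string' } } } };
console.log(cache.needsRun('POST /users', schema)); // first sighting of the route
console.log(cache.needsRun('POST /users', schema)); // same schema again
```

Canonical serialization is the important design choice here: hashing plain `JSON.stringify` output would invalidate the cache whenever a schema's keys were merely reordered.
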
### Performance (Complete)
- Route discovery: ~0.5µs/route
- Formula parsing: ~5µs/formula
- Category inference: ~15ns/route
- Contract extraction: 58% faster with WeakMap cache
- Incremental cache: 13-20x speedup for unchanged routes
- **Estimated 11K route overhead: ~1.4s total**

## What Is NOT In Place

### 1. Stateful Testing (0% - Architecture Only)

**Current State**: `runPetitTests` runs commands sequentially but performs no true stateful or model-based testing. The state machine only tracks created resources for cleanup.

**What's Missing**:
- **Command sequence generation**: fast-check's `commands()` arbitrary for generating valid command sequences
- **Model-based state machine**: Formal model that tracks expected vs actual state
- **Precondition-aware sequencing**: Smart generation that respects `x-requires` dependencies
- **Cross-route state transitions**: Understanding that POST /users creates a resource that GET /users/:id can observe
- **Invariant checking across sequences**: Ensuring state remains consistent after mutations

**Arbiter-Specific Value**:
Arbiter has complex multi-tenant state:
- Tenant creation -> Application creation -> User creation -> Permission assignment
- OAuth flows: authorization -> token -> refresh -> revocation
- Graph mutations: node creation -> relation creation -> authorization evaluation

Stateful testing would catch:
- Race conditions in tenant isolation
- Invalid state transitions (e.g., deleting a tenant with active applications)
- Authorization leaks across state changes
- Resource lifecycle violations

**Implementation Effort**: Medium (2-3 days)
- Create `Model` class tracking expected state
- Implement `Command` arbitrary using fast-check's `commands()`
- Add `checkInvariants()` for cross-route consistency
- Implement `shrink()` for minimal failing sequences

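
The command/model pattern behind these steps can be sketched without any dependency. The example below is an illustration under stated assumptions: `FakeTenantApi` is a hypothetical in-memory stand-in for Arbiter, the status codes are invented for the example, and a hand-rolled seeded generator replaces fast-check so that the block stays self-contained.

```javascript
// In-memory stand-in for the system under test (hypothetical; a real run
// would send HTTP requests to the Fastify instance instead).
class FakeTenantApi {
  constructor() { this.tenants = new Map(); }
  createTenant(id) { this.tenants.set(id, new Set()); return 201; }
  createApp(tenantId, appId) {
    if (!this.tenants.has(tenantId)) return 404;
    this.tenants.get(tenantId).add(appId);
    return 201;
  }
  deleteTenant(id) {
    if (!this.tenants.has(id)) return 404;
    if (this.tenants.get(id).size > 0) return 409; // still has applications
    this.tenants.delete(id);
    return 204;
  }
}

// Each command pairs a precondition on the model (check) with a step that
// updates the model, calls the real system, and compares the outcome (run).
const commands = [
  { name: 'createTenant',
    check: () => true,
    run(model, real) {
      const id = `t${model.nextId++}`;
      model.tenants.set(id, 0); // tenant id -> expected application count
      return real.createTenant(id) === 201;
    } },
  { name: 'createApp',
    check: (model) => model.tenants.size > 0,
    run(model, real) {
      const tenantId = model.tenants.keys().next().value;
      model.tenants.set(tenantId, model.tenants.get(tenantId) + 1);
      return real.createApp(tenantId, `a${model.nextId++}`) === 201;
    } },
  { name: 'deleteTenant',
    check: (model) => model.tenants.size > 0,
    run(model, real) {
      const tenantId = model.tenants.keys().next().value;
      const expected = model.tenants.get(tenantId) > 0 ? 409 : 204;
      if (expected === 204) model.tenants.delete(tenantId);
      return real.deleteTenant(tenantId) === expected;
    } },
];

// Deterministic linear congruential generator so any failure is reproducible.
function lcg(seed) {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 2 ** 32;
  };
}

// Run one random sequence, choosing only commands whose precondition holds.
function runSequence(seed, length) {
  const rand = lcg(seed);
  const model = { tenants: new Map(), nextId: 1 };
  const real = new FakeTenantApi();
  const trace = [];
  for (let i = 0; i < length; i += 1) {
    const enabled = commands.filter((command) => command.check(model));
    const command = enabled[Math.floor(rand() * enabled.length)];
    trace.push(command.name);
    const stepOk = command.run(model, real);
    // Cross-route invariant: model and system agree on the tenant count.
    if (!stepOk || model.tenants.size !== real.tenants.size) {
      return { ok: false, trace };
    }
  }
  return { ok: true, trace };
}

console.log(runSequence(42, 10));
```

With fast-check, the same `check`/`run` pairs would be wrapped as command arbitraries, passed to `fc.commands()`, and executed with `fc.modelRun()`, which also shrinks a failing sequence instead of requiring a hand-written shrinker.
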
### 2. Object Inference from Schemas (40%)

**Current State**: `updateState()` infers resources from the response body by looking for `id`/`uuid`/`_id` fields. This heuristic is naive: it ignores the response schema and any parent scoping.

**What's Missing**:
- **Schema-driven object extraction**: Using JSON Schema `properties` to know which fields constitute an object's identity
- **Relationship inference**: Understanding that `POST /tenants/:id/applications` creates an application scoped to a tenant
- **Nested resource tracking**: Tracking sub-resources (e.g., application configs within tenants)
- **Path parameter correlation**: Linking the `id` in a `POST /users` response to the `GET /users/:id` path parameter

**Arbiter Example**:
```javascript
// POST /tenant/applications
// Response: { id: 'app-123', tenantId: 'tenant-456', name: 'My App' }
// Should infer: resourceType='application', parentType='tenant', parentId='tenant-456'

// Current code only captures: resourceType='applications', id='app-123'
// Missing the tenant scoping which is critical for Arbiter's authorization model
```

**Implementation Effort**: Low-Medium (1-2 days)
- Enhance `updateState()` to parse response schema for identity fields
- Add parent-child relationship tracking to `ModelState`
- Implement path parameter extraction for route correlation

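
A minimal sketch of the inference described in the Arbiter example above. It assumes a naming convention in which a `<type>Id` field references the parent resource. The helpers `inferResource` and `bindPath` are hypothetical and are not part of the existing `updateState()` code.

```javascript
// Naive singularization, sufficient for this sketch only.
const singular = (segment) => segment.replace(/s$/, '');

// Infer the resource type from the last static path segment and the parent
// scope from a `<parentType>Id` field in the response body.
function inferResource(routePath, responseBody) {
  const staticSegments = routePath
    .split('/')
    .filter((segment) => segment && !segment.startsWith(':'));
  const resourceType = singular(staticSegments[staticSegments.length - 1]);

  // Assumed convention: a key ending in `Id` points at the parent resource.
  const parentKey = Object.keys(responseBody).find((key) => /Id$/.test(key));

  return {
    resourceType,
    id: responseBody.id ?? responseBody.uuid ?? responseBody._id,
    parentType: parentKey ? parentKey.slice(0, -2) : null,
    parentId: parentKey ? responseBody[parentKey] : null,
  };
}

// Substitute tracked ids into an observer route such as GET /users/:id.
// Placeholders without a matching value are left untouched.
function bindPath(routePath, params) {
  return routePath.replace(/:([A-Za-z_]\w*)/g, (match, name) =>
    (name in params ? encodeURIComponent(params[name]) : match));
}

const inferred = inferResource('/tenant/applications', {
  id: 'app-123',
  tenantId: 'tenant-456',
  name: 'My App',
});
console.log(inferred);
console.log(bindPath('/applications/:id', { id: inferred.id }));
```

A schema-driven version would read the identity and parent fields from the route's response schema rather than guessing from key names, so the naming convention here should be treated as a placeholder.
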
### 3. Request Structure Inference (30%)

**Current State**: `executeCommand()` blindly sends all generated params as either body or query params, depending on the HTTP method. It has no understanding of route-specific parameter structure.

**What's Missing**:
- **Path parameter extraction**: Identifying `:id`, `:tenantId` from route paths and correlating with generated data
- **Body vs query discrimination**: Using Fastify schema to know which params go where
- **Header injection**: Automatic `x-tenant-id`, `authorization` header injection based on route requirements
- **Nested body structures**: Handling `body.properties.nested.field` schemas
- **Content-Type negotiation**: Form-encoded vs JSON based on route configuration

**Arbiter Example**:
```javascript
// Route: POST /tenant/applications/:appId/rules
// Body schema: { type: 'object', properties: { dsl: { type: 'string' }, priority: { type: 'integer' } } }
// Path params: { appId: '...' }
// Headers: { 'x-tenant-id': '...', 'authorization': 'Bearer ...' }

// Current code would send: { appId: 'generated', dsl: 'generated', priority: 1 } all as body
// Should send: appId in path, { dsl, priority } in body, auth headers automatically
```

**Implementation Effort**: Medium (2-3 days)
- Parse route path for parameter placeholders
- Match generated data to path vs body vs query
- Implement header injection based on scope/auth requirements
- Handle nested schema structures

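
The splitting logic from the Arbiter example above could look roughly like this. It is a sketch under assumptions: `buildRequest` is a hypothetical helper, the schema uses Fastify's `body`/`querystring`/`headers` sections, only top-level properties are handled, and header values come from a caller-supplied test context rather than from generated data.

```javascript
// Split flat generated data into path, query, body, and headers using the
// route path and a Fastify-style route schema.
function buildRequest(method, routePath, schema, generated, context = {}) {
  const queryKeys = Object.keys(schema.querystring?.properties ?? {});
  const bodyKeys = Object.keys(schema.body?.properties ?? {});
  const headerKeys = Object.keys(schema.headers?.properties ?? {});

  // Fill every `:name` placeholder from the generated data.
  const url = routePath.replace(/:([A-Za-z_]\w*)/g, (_, name) =>
    encodeURIComponent(generated[name]));

  // Keep only the listed keys that are actually present in the source.
  const pick = (keys, source) =>
    Object.fromEntries(keys.filter((k) => k in source).map((k) => [k, source[k]]));

  return {
    method,
    url,
    query: pick(queryKeys, generated),
    body: pick(bodyKeys, generated),
    // Headers come from the test context (tenant, token), not from fuzzing.
    headers: pick(headerKeys, context),
  };
}

const request = buildRequest(
  'POST',
  '/tenant/applications/:appId/rules',
  {
    body: {
      type: 'object',
      properties: { dsl: { type: 'string' }, priority: { type: 'integer' } },
    },
    headers: {
      type: 'object',
      properties: { 'x-tenant-id': { type: 'string' }, authorization: { type: 'string' } },
    },
  },
  { appId: 'app-123', dsl: 'allow all', priority: 1 },
  { 'x-tenant-id': 'tenant-456', authorization: 'Bearer test-token' },
);
console.log(request);
```

Here `appId` ends up only in the URL, `dsl` and `priority` only in the body, and both headers are injected from the context, which is the behaviour the example asks for.
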
### 4. Logic/Invariant Analysis (20%)

**Current State**: `checkPostconditions()` only validates `status:###` patterns. Complex invariants are not evaluated.

**What's Missing**:
- **Cross-route invariant checking**: "After POST /users, GET /users/:id should return the same user"
- **State consistency checks**: "Total user count should increase by 1 after creation"
- **Authorization boundary checks**: "Tenant A's admin cannot access Tenant B's resources"
- **Temporal logic**: "After DELETE /users/:id, subsequent GET should return 404"
- **Mathematical invariants**: Budget constraints, quota limits, rate limiting

**Arbiter-Specific Value**:
Arbiter's authorization graph has rich invariants:
- If user U has permission P on resource R, then checking P for U on R must return true
- If node N is a child of node M, then M's permissions apply to N (transitivity)
- If relation R is revoked, all derived permissions via R must be invalidated
- Tenant isolation: resources in tenant T1 must never be accessible from T2

**Implementation Effort**: High (1 week)
- Implement invariant registry for cross-route assertions
- Add temporal operators (eventually, always, until) to APOSTL
- Create graph-aware consistency checker for Arbiter's authorization model
- Implement property-based invariant generation from schema constraints

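
An invariant registry for cross-route assertions could be as small as the sketch below. Everything here is an assumption for illustration: the `InvariantRegistry` class, the shape of a history entry (`tenant`, `method`, `url`, `status`, `body`), and the status codes. The invariants are written as plain JavaScript predicates, not as APOSTL formulas.

```javascript
// Registry of cross-route invariants. Each predicate is evaluated against
// the recorded history of request/response pairs.
class InvariantRegistry {
  constructor() {
    this.invariants = [];
  }

  register(name, predicate) {
    this.invariants.push({ name, predicate });
  }

  // Returns the names of all invariants that the history violates.
  check(history) {
    return this.invariants
      .filter(({ predicate }) => !predicate(history))
      .map(({ name }) => name);
  }
}

const registry = new InvariantRegistry();

// Temporal: once a URL was deleted, later GETs for it must return 404.
registry.register('deleted-stays-gone', (history) => {
  const deleted = new Set();
  for (const { method, url, status } of history) {
    if (method === 'DELETE' && status === 204) deleted.add(url);
    if (method === 'GET' && deleted.has(url) && status !== 404) return false;
  }
  return true;
});

// Tenant isolation: a successful response never carries another tenant's data.
registry.register('tenant-isolation', (history) =>
  history.every(({ tenant, status, body }) =>
    status >= 400 || !body?.tenantId || body.tenantId === tenant));

const history = [
  { tenant: 'T1', method: 'POST', url: '/users', status: 201, body: { id: 'u1', tenantId: 'T1' } },
  { tenant: 'T1', method: 'DELETE', url: '/users/u1', status: 204 },
  { tenant: 'T1', method: 'GET', url: '/users/u1', status: 200, body: { id: 'u1', tenantId: 'T1' } },
];
console.log(registry.check(history));
```

In this history the final GET succeeds after the DELETE, so the check reports `deleted-stays-gone`. Running the registry after every step of a generated sequence is what would turn these predicates into sequence-level checks.
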
### 5. Documentation (70%)

**In Place**:
- README.md with quick start, features, API reference
- Architecture document (ARCHITECTURE, 2656 lines)
- Performance analysis (PERF_ANALYSIS.md)
- Inline code comments

**Missing**:
- **skills.md**: LLM-friendly documentation for AI-assisted development
- **Advanced guides**: Stateful testing setup, custom invariant authoring
- **Arbiter-specific examples**: Multi-tenant testing patterns, OAuth flow validation
- **Troubleshooting guide**: Common failures, debugging techniques
- **Migration guide**: From manual testing to contract-driven testing

## Do We Gain from Logic?

### Short Answer: YES, Significantly

Without logic/stateful testing, APOPHIS is essentially a smart fuzzer with runtime assertions. With logic:

1. **State Space Coverage**:
   - Stateless: Tests each route in isolation (~200 tests for 200 routes)
   - Stateful: Tests route sequences (200 routes ^ 5 depth = 3.2 × 10^11, about 320 billion possible sequences)
   - **Gain**: An estimated 10-100x more bugs found in stateful interactions

2. **Arbiter-Specific Bugs Caught**:
   - Authorization escalation after role changes
   - Resource leaks across tenant boundaries
   - Invalid state transitions (e.g., modifying revoked tokens)
   - Cache invalidation failures after mutations
   - Graph inconsistency after node deletion

3. **Regression Prevention**:
   - Stateless: Catches route-level regressions
   - Stateful: Catches system-level regressions (e.g., "deleting user breaks their sessions")

4. **Cost-Benefit**:
   - Implementation: ~1 week
   - Value: Prevents production incidents that could take days to debug
   - ROI: 10x+ for a system like Arbiter

## Recommendations

### Phase 1: Immediate (This Week)
1. Implement object inference from schemas (1-2 days)
2. Fix request structure handling (path/body/query discrimination) (2-3 days)
3. Create skills.md for LLM assistance (1 day)

### Phase 2: Short-term (Next 2 Weeks)
1. Implement stateful test runner with model-based testing (1 week)
2. Add cross-route invariant checking (1 week)
3. Create Arbiter-specific example suite

### Phase 3: Medium-term (Next Month)
1. Graph-aware consistency checker for Arbiter
2. Automatic contract generation from existing tests
3. Performance optimization for 11K routes
4. Integration with Arbiter's CI/CD pipeline

## Conclusion

APOPHIS has a solid foundation for contract-driven testing. The current implementation provides immediate value for:
- Runtime contract validation (preconditions/postconditions)
- Property-based testing of individual routes
- Incremental test execution for CI/CD

However, to fully realize value for Arbiter, we need:
1. **Stateful testing**: Critical for catching multi-route interaction bugs
2. **Better object inference**: Essential for Arbiter's complex resource hierarchies
3. **Request structure handling**: Required for realistic test execution
4. **Logic/invariant analysis**: Needed for authorization-specific testing

The **highest ROI** item is stateful testing with proper object inference, which would catch the class of bugs most likely to cause production incidents in Arbiter.