APOPHIS Test Quality Audit Report
Date: 2026-04-29
Scope: 55 test files, ~20,450 lines
Auditors: 3 parallel subworkers (CLI tests, Domain/Core tests, Feature tests)
Executive Summary
| Category | Count | Lines | Verdict |
|---|---|---|---|
| CLI Tests | 18 files | ~9,209 lines | 10 KEEP, 3 MERGE, 4 REFACTOR, 1 DELETE |
| Domain/Core Tests | 11 files | ~4,500 lines | 8 KEEP, 1 MERGE, 2 REFACTOR |
| Feature Tests | 26 files | ~6,741 lines | 20 KEEP, 2 MERGE, 4 REFACTOR, 3 DELETE |
| Total | 55 files | ~20,450 lines | 38 KEEP, 6 MERGE, 10 REFACTOR, 4 DELETE |
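The Total row can be cross-checked by summing each column of the category rows. The sketch below does that with the figures copied from the table; the `CategoryRow` type and `columnSum` helper are illustrative names, not part of APOPHIS.

```typescript
// One row of the summary table; all numbers are copied from the report.
interface CategoryRow {
  name: string;
  files: number;
  lines: number;
  keep: number;
  merge: number;
  refactor: number;
  del: number; // DELETE verdicts (0 where the report lists none)
}

const rows: CategoryRow[] = [
  { name: "CLI Tests", files: 18, lines: 9209, keep: 10, merge: 3, refactor: 4, del: 1 },
  { name: "Domain/Core Tests", files: 11, lines: 4500, keep: 8, merge: 1, refactor: 2, del: 0 },
  { name: "Feature Tests", files: 26, lines: 6741, keep: 20, merge: 2, refactor: 4, del: 3 },
];

type NumericKey = Exclude<keyof CategoryRow, "name">;

// Sum one numeric column across all category rows.
function columnSum(data: CategoryRow[], key: NumericKey): number {
  return data.reduce((acc, row) => acc + row[key], 0);
}

const totals = {
  files: columnSum(rows, "files"),
  lines: columnSum(rows, "lines"),
  keep: columnSum(rows, "keep"),
  merge: columnSum(rows, "merge"),
  refactor: columnSum(rows, "refactor"),
  del: columnSum(rows, "del"),
};

console.log(totals);
```

Each computed sum matches the corresponding entry in the Total row.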
Key Findings:
- 4 test files are candidates for deletion: 2 test non-production helpers (cascade-validator, hypermedia-validator) and 2 are covered elsewhere (gap-fixes, success-metrics)
- 6 files have significant overlap with other tests
- 10 files need refactoring (broken temp-app approach, tests coupled to implementation, weak assertions)
- 38 files provide unique, valuable coverage
Critical Issues (Fix First)
1. Broken Test Approach: verify-ux.test.ts
- Status: 16 of 20 tests FAIL (80% failure rate)
- Root cause: Creates temp app.js files that aren't valid Fastify apps
- Impact: Unreliable regression protection
- Fix: Switch to fixture apps (src/cli/__fixtures__/) or create new fixtures
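One way to make the fixture approach hard to bypass is a small resolver that only accepts known fixture names. This is a minimal sketch: the fixture names and the `<name>/app.js` layout are hypothetical, and only the `src/cli/__fixtures__` directory comes from this report.

```typescript
// Directory named in the report.
const FIXTURE_DIR = "src/cli/__fixtures__";

// Hypothetical fixture names, for illustration only.
const KNOWN_FIXTURES: string[] = ["basic-app", "broken-contract-app"];

// Resolve a fixture by name and fail fast on unknown names, so a test
// cannot silently fall back to generating a throwaway app.js.
function fixturePath(name: string): string {
  if (KNOWN_FIXTURES.indexOf(name) === -1) {
    throw new Error(
      'Unknown fixture "' + name + '"; expected one of: ' + KNOWN_FIXTURES.join(", ")
    );
  }
  // Assumed layout: one directory per fixture containing an app.js.
  return FIXTURE_DIR + "/" + name + "/app.js";
}

console.log(fixturePath("basic-app"));
```

A test would pass the returned path to the CLI under test instead of writing its own app file.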
2. Duplicate Tests: integration.test.ts
- Status: 3 pairs of duplicate/near-duplicate tests (6 tests)
- Impact: Wasted CI time, no added coverage
- Fix: Remove duplicates
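Exact duplicates can be flagged mechanically by comparing normalized test titles. The sketch below is an assumption-laden illustration: the regex only matches `it("...")` and `test("...")` calls with plain quoted titles, the sample titles are invented, and near-duplicates with different titles still need manual review.

```typescript
// Extract test titles from source text (simplified: quoted string titles only).
function extractTitles(source: string): string[] {
  const pattern = /\b(?:it|test)\(\s*(["'])(.*?)\1/g;
  const titles: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const title = match[2];
    if (title !== undefined) titles.push(title);
  }
  return titles;
}

// Return normalized titles that occur more than once.
function findDuplicateTitles(source: string): string[] {
  const counts: Record<string, number> = {};
  for (const title of extractTitles(source)) {
    // Normalize case and whitespace so trivial variations still collide.
    const key = title.trim().toLowerCase().replace(/\s+/g, " ");
    counts[key] = (counts[key] || 0) + 1;
  }
  return Object.keys(counts).filter((key) => (counts[key] || 0) > 1);
}

// Invented sample standing in for the contents of a test file.
const sample = [
  'it("validates response schema", () => {});',
  'test("Validates  response schema", () => {});',
  'it("rejects unknown routes", () => {});',
].join("\n");

const duplicateTitles = findDuplicateTitles(sample);
console.log(duplicateTitles);
```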
3. Non-Production Helpers: cascade-validator.test.ts, hypermedia-validator.test.ts
- Status: These helpers were merged into test files and are never imported by production code
- Impact: Test maintenance burden for dead code
- Fix: Delete (production coverage exists in relationships.test.ts)
4. Inline Copies: deduplication.test.ts
- Status: Contains stale copies of deduplicatePetit/deduplicateStateful
- Impact: Tests don't exercise actual production code
- Fix: Import from runner-utils.ts instead
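The risk with inline copies is that the test keeps passing after production behavior changes. The sketch below illustrates this with two stand-in functions. Both are hypothetical: the report does not show the signatures or behavior of deduplicatePetit/deduplicateStateful, so `productionDedupe` only plays the role of what would be imported from runner-utils.ts.

```typescript
interface Item {
  id: string;
  value: number;
}

// Stale inline copy living in the test file: keeps the FIRST occurrence of each id.
function staleInlineCopy(items: Item[]): Item[] {
  const seen: string[] = [];
  return items.filter((item) => {
    if (seen.indexOf(item.id) !== -1) return false;
    seen.push(item.id);
    return true;
  });
}

// Stand-in for the production function after a behavior change:
// keeps the LAST occurrence of each id.
function productionDedupe(items: Item[]): Item[] {
  return items.filter(
    (item, index) => !items.slice(index + 1).some((later) => later.id === item.id)
  );
}

// The same assertion, run against whichever implementation is passed in.
function keepsFirstValue(dedupe: (items: Item[]) => Item[]): boolean {
  const input: Item[] = [
    { id: "a", value: 1 },
    { id: "b", value: 2 },
    { id: "a", value: 3 },
  ];
  const a = dedupe(input).filter((item) => item.id === "a")[0];
  return a !== undefined && a.value === 1;
}

const passesAgainstCopy = keepsFirstValue(staleInlineCopy);
const passesAgainstProduction = keepsFirstValue(productionDedupe);
console.log({ passesAgainstCopy, passesAgainstProduction });
```

The check stays green against the copy and fails against the stand-in production function, which is the signal a test importing the real module would have raised.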
CLI Test Audit (18 files)
KEEP (10 files)
| File | Tests | Value | Why |
|---|---|---|---|
| docs-smoke.test.ts | 4 | Unique | Only test verifying documentation accuracy |
| goldens.test.ts | 9 | High | Guards CLI output against accidental changes |
| init.test.ts | 17 | Unique | Only deep init coverage |
| latency.test.ts | 5 | Unique | Performance regression guards |
| migrate-reliability.test.ts | 20 | Unique | Canonical migrate test, 80% coverage |
| observe-safety.test.ts | 20 | Unique | Only policy engine + observe integration |
| packaging.test.ts | 15 | Unique | Only test of built binary |
| qualify-signal.test.ts | 16 | Unique | Only artifact structure validation |
| renderers.test.ts | 18 | Unique | Only renderer function tests |
| replay-integrity.test.ts | 10 | Unique | Only replay loader/schema tests |
MERGE (3 files)
| File | Target | Reason |
|---|---|---|
| core.test.ts | dispatch.test.ts | Tests same CLI entrypoint, weaker assertions |
| migrate.test.ts | migrate-reliability.test.ts | Subset coverage, 15 tests vs 20 |
| observe.test.ts | observe-safety.test.ts | Keep fixture-based tests only |
REFACTOR (4 files)
| File | Issue | Fix |
|---|---|---|
| acceptance.test.ts | 8 tests fail due to fixture instability | Use main() entrypoint, drop failing tests |
| config-validation.test.ts | 271 tests, many permutations | Collapse to ~50 parameterized tests |
| doctor-consistency.test.ts | 5 tests fail (temp apps not valid) | Use fixture apps instead |
| verify-ux.test.ts | 16 of 20 tests fail | Switch to fixture apps |
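Collapsing permutations usually means replacing many near-identical test bodies with one table of cases. The sketch below shows the shape under stated assumptions: the `Config` fields and the `validateConfig` rules are hypothetical and do not reflect the real APOPHIS config schema, and a plain loop is used so that no particular test runner is assumed.

```typescript
// Hypothetical config shape and validator, for illustration only.
interface Config {
  runs?: unknown;
  timeoutMs?: unknown;
}

function validateConfig(config: Config): string[] {
  const errors: string[] = [];
  const runs = config.runs;
  const timeoutMs = config.timeoutMs;
  if (runs !== undefined && (typeof runs !== "number" || runs <= 0)) {
    errors.push("runs must be a positive number");
  }
  if (timeoutMs !== undefined && (typeof timeoutMs !== "number" || timeoutMs < 0)) {
    errors.push("timeoutMs must be a non-negative number");
  }
  return errors;
}

// One table row per permutation instead of one hand-written test per permutation.
interface Case {
  name: string;
  input: Config;
  expectedErrors: number;
}

const cases: Case[] = [
  { name: "empty config is valid", input: {}, expectedErrors: 0 },
  { name: "positive runs is valid", input: { runs: 10 }, expectedErrors: 0 },
  { name: "zero runs is rejected", input: { runs: 0 }, expectedErrors: 1 },
  { name: "string runs is rejected", input: { runs: "10" }, expectedErrors: 1 },
  { name: "negative timeout is rejected", input: { timeoutMs: -1 }, expectedErrors: 1 },
  { name: "two invalid fields give two errors", input: { runs: -1, timeoutMs: "x" }, expectedErrors: 2 },
];

// Collect the names of cases whose error count differs from the expectation.
const failedCases = cases
  .filter((c) => validateConfig(c.input).length !== c.expectedErrors)
  .map((c) => c.name);

console.log(failedCases.length === 0 ? "all cases pass" : failedCases);
```

Adding a permutation then costs one row, and a failing row is reported by name.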
DELETE (after merge)
- core.test.ts → merged into dispatch
- migrate.test.ts → merged into migrate-reliability
- observe.test.ts → merged into observe-safety
Domain/Core Test Audit (11 files)
KEEP (8 files)
| File | Tests | Value |
|---|---|---|
| domain.test.ts | 45 | Foundational classification rules |
| formula.test.ts | ~85 | Core parser/evaluator, property tests |
| extension.test.ts | 36 | Registry/framework, no overlap |
| infrastructure.test.ts | 15 | ScopeRegistry, CleanupManager, HookValidator |
| error-context.test.ts | 24 | Core contract validation |
| error-suggestions.test.ts | 31 | Exhaustive suggestion branches |
| cross-operation-support.test.ts | 8 | Only integration tests for previous() |
| protocol-extensions.test.ts | 22 | Built-in extensions |
MERGE (1 file)
| File | Target | Reason |
|---|---|---|
| examples.test.ts | integration.test.ts | Redundant smoke tests |
REFACTOR (2 files)
| File | Issue | Fix |
|---|---|---|
| integration.test.ts | 6 duplicate/near-duplicate tests | Remove duplicates |
| success-metrics.test.ts | Arbitrary thresholds, covered elsewhere | Delete (assertions in error-context + integration) |
Feature Test Audit (26 files)
KEEP (20 files)
| File | Tests | Value |
|---|---|---|
| cache-hints.test.ts | 7 | Cache invalidation patterns |
| counterexample.test.ts | 17 | Failure analysis + formatting |
| debug-mode.test.ts | 2 | Debug logging toggle |
| incremental.test.ts | 12 | Hash determinism |
| incremental/cache.test.ts | 7 | Cache API round-trip |
| invariant-registry.test.ts | 5 | Invariant resolution |
| outbound-interceptor.test.ts | 16 | Chaos application |
| outbound-runtime.test.ts | 10 | Outbound registry + mocks |
| outbound-stateful.test.ts | 7 | Stateful mock CRUD |
| production-safety.test.ts | 4 | Production guards |
| regex-guard.test.ts | 13 | ReDoS protection |
| relationships.test.ts | 9 | Production relationship predicates |
| resource-inference.test.ts | 13 | Schema-driven identity |
| route-matcher.test.ts | 17 | URL pattern matching |
| scenario-runner.test.ts | 6 | Scenario capture/rebind/cookies |
| schema-to-arbitrary.test.ts | 33 | Schema-to-fast-check (property tests) |
| scope-isolation.test.ts | 4 | Scope filtering |
| serverless.test.ts | 3 | Serverless compatibility |
| stateful-runner.test.ts | 6 | Stateful test execution |
| tap-formatter.test.ts | 15 | TAP output formatting |
MERGE (2 files)
| File | Target | Reason |
|---|---|---|
| format-diff.test.ts | counterexample.test.ts | Only 4 tests, same module |
| seeded-rng.test.ts | schema-to-arbitrary.test.ts | 5 tests, RNG core to generation |
REFACTOR (4 files)
| File | Issue | Fix |
|---|---|---|
| deduplication.test.ts | Stale copies of production code | Import from runner-utils.ts |
| incremental/cache.test.ts | Weak "persists to disk" test | Fix or remove |
| counterexample.test.ts | Growing file (224 lines) | Split if it exceeds 250 lines |
| tap-formatter.test.ts | Same module as counterexample | Consider unified formatters.test.ts |
DELETE (4 files)
| File | Reason | Coverage Moves To |
|---|---|---|
| cascade-validator.test.ts | Tests non-production helpers | relationships.test.ts |
| hypermedia-validator.test.ts | Tests non-production helpers | relationships.test.ts |
| gap-fixes.test.ts | Runtime hooks → infrastructure, chaos → outbound-interceptor | infrastructure.test.ts, outbound-interceptor.test.ts |
| success-metrics.test.ts | Arbitrary metrics, covered elsewhere | error-context.test.ts, integration.test.ts |
Action Plan
Phase A: Fix Broken Tests (Week 1)
- Refactor verify-ux.test.ts - Switch to fixture apps
- Refactor doctor-consistency.test.ts - Use fixture apps for failing tests
- Refactor acceptance.test.ts - Remove failing tests, use main() entrypoint
- Remove duplicates from integration.test.ts - 6 tests
Phase B: Delete Dead Tests (Week 1)
- Delete cascade-validator.test.ts
- Delete hypermedia-validator.test.ts
- Delete gap-fixes.test.ts (after moving valuable tests)
- Delete success-metrics.test.ts
Phase C: Merge Overlapping Tests (Week 2)
- Merge core.test.ts → dispatch.test.ts
- Merge migrate.test.ts → migrate-reliability.test.ts
- Merge observe.test.ts → observe-safety.test.ts
- Merge examples.test.ts → integration.test.ts
- Merge format-diff.test.ts → counterexample.test.ts
- Merge seeded-rng.test.ts → schema-to-arbitrary.test.ts
Phase D: Refactor Implementation Tests (Week 2)
- Refactor deduplication.test.ts - Use real imports
- Refactor config-validation.test.ts - Parameterize permutations
- Fix incremental/cache.test.ts - Strengthen or remove weak test
Impact Projection
| Metric | Current | After | Change |
|---|---|---|---|
| Test files | 55 | ~45 | -10 (-18%) |
| Test lines | ~20,450 | ~18,000 | -2,450 (-12%) |
| Failing tests | ~20 | 0 | -20 (-100%) |
| Duplicate tests | ~15 | 0 | -15 (-100%) |
| Non-production tests | 4 files | 0 | -4 (-100%) |
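The Change column is the difference between After and Current, with the percentage taken relative to Current and rounded to the nearest whole number. A small sketch that reproduces the column from the approximate figures in the table (the `Projection` type and `change` helper are illustrative names):

```typescript
interface Projection {
  metric: string;
  current: number;
  after: number;
}

// Absolute and relative change; percent is rounded to the nearest integer.
function change(p: Projection): { delta: number; percent: number } {
  const delta = p.after - p.current;
  const percent = p.current === 0 ? 0 : Math.round((delta / p.current) * 100);
  return { delta, percent };
}

// Approximate values copied from the table.
const projections: Projection[] = [
  { metric: "Test files", current: 55, after: 45 },
  { metric: "Test lines", current: 20450, after: 18000 },
  { metric: "Failing tests", current: 20, after: 0 },
  { metric: "Duplicate tests", current: 15, after: 0 },
  { metric: "Non-production tests", current: 4, after: 0 },
];

const changes = projections.map((p) => ({ metric: p.metric, ...change(p) }));
for (const c of changes) {
  console.log(c.metric + ": " + c.delta + " (" + c.percent + "%)");
}
```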
Coverage target: Retain or move the useful assertions before deleting overlapping tests.
Test Quality Principles Applied
- Behavior over implementation - Tests should verify observable behavior, not internal structure
- Fixtures over temp files - Use stable fixture apps instead of generating temp app.js files
- Parameterized over permutations - One test with multiple inputs beats 10 identical tests
- Production over helpers - Test production code, not test-only helpers
- Independence - Each test should create its own context, not depend on global state
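The independence principle can be illustrated with a per-test context factory. This is a minimal sketch with hypothetical names (`TestContext`, `createContext`, the route string); it contrasts a fresh context per test with module-level state that leaks between runs.

```typescript
interface TestContext {
  registry: string[];
}

// Anti-pattern: module-level state shared by every test in the file.
const sharedContext: TestContext = { registry: [] };

// Preferred: each test builds its own context.
function createContext(): TestContext {
  return { registry: [] };
}

// A test body that assumes it starts from an empty registry.
function registersExactlyOneRoute(ctx: TestContext): boolean {
  ctx.registry.push("GET /items");
  return ctx.registry.length === 1;
}

// Fresh context per run: order and repetition do not matter.
const isolatedRuns = [
  registersExactlyOneRoute(createContext()),
  registersExactlyOneRoute(createContext()),
];

// Shared context: the second run sees what the first run left behind.
const sharedRuns = [
  registersExactlyOneRoute(sharedContext),
  registersExactlyOneRoute(sharedContext),
];

console.log({ isolatedRuns, sharedRuns });
```

With shared state the second run fails only because of what ran before it, which is the kind of order dependence this principle is meant to rule out.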
Report generated from static analysis of all 55 test files. No code changes made.