APOPHIS Enforce-Readiness Hardening List
This document captures the hardening backlog based on recent multi-persona adoption evaluations (startup product, platform security, QA determinism, enterprise monorepo, and LLM-heavy org workflows).
Goal: move from "Optional standard" to "Enforce" safely.
How to use this list
- Treat this as a release gate checklist.
- Each item includes an outcome and acceptance criteria.
- Do not mark complete without automated tests and clean-environment evidence.
P0 - Must Fix Before Company Enforcement
1) CLI installation and invocation reliability
Problem
- In local file installs/temp projects, users often could not run `npx apophis` directly and had to call `node .../dist/cli/index.js`.
Required outcome
- `npx apophis` works predictably for supported package managers and install modes.
Acceptance criteria
- Fresh temp project matrix (`npm`, `pnpm`, `yarn`, `bun`) passes:
  - install local package
  - `npx apophis --help` exits `0`
  - `npx apophis doctor` runs successfully
- Packaging test asserts executable bin/shebang correctness and command resolution.
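The packaging assertion above could be sketched roughly as follows. This is an illustration only: the `PackageManifest` shape, the helper names, and the assumption that the package is published as `apophis` with a bin pointing at `dist/cli/index.js` are hypothetical, not taken from the APOPHIS source.

```typescript
// Hypothetical shape: only the package.json fields this check reads.
interface PackageManifest {
  name: string;
  bin?: string | Record<string, string>;
}

// Resolve the file a command maps to, following npm's `bin` conventions:
// a string `bin` exposes a single command named after the (unscoped) package.
function resolveBinTarget(manifest: PackageManifest, command: string): string | undefined {
  if (typeof manifest.bin === "string") {
    const unscoped = manifest.name.split("/").pop() ?? manifest.name;
    return unscoped === command ? manifest.bin : undefined;
  }
  return manifest.bin?.[command];
}

// A CLI entry point needs a Node shebang on its first line
// so the installed command shim knows which interpreter to use.
function hasNodeShebang(source: string): boolean {
  const firstLine = source.split(/\r?\n/, 1)[0] ?? "";
  return /^#!\s*\/usr\/bin\/env\s+node\b/.test(firstLine);
}

// Usage: assumed manifest and entry-file contents.
const sampleManifest: PackageManifest = { name: "apophis", bin: { apophis: "dist/cli/index.js" } };
const binTarget = resolveBinTarget(sampleManifest, "apophis");
const shebangOk = hasNodeShebang("#!/usr/bin/env node\nrequire('./main');\n");
```

A real packaging test would read the built artifact from disk and would likely also check the executable bit; both are left out here to keep the sketch free of filesystem access.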
2) Doctor route-discovery consistency with plugin registration
Problem
- `doctor` can report route-discovery failures (e.g., decorator already added) while `verify` works, which undermines trust.
Required outcome
- `doctor` readiness checks are consistent with `verify` behavior and avoid false negatives when the plugin is already present.
Acceptance criteria
- Fixture matrix for app states:
  - plugin pre-registered
  - plugin not registered
  - duplicate registration attempt
- `doctor` emits accurate status (`pass`/`warn` with remediation), never a contradictory hard-fail when `verify` succeeds.
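One way to express the consistency rule above as a test helper is sketched below. The type names, the state-to-status mapping, and the remediation text are all assumptions made for illustration; the actual `doctor` internals may look quite different.

```typescript
type DoctorStatus = "pass" | "warn" | "fail";
type PluginState = "pre-registered" | "not-registered" | "duplicate-attempt";

interface DoctorFinding {
  status: DoctorStatus;
  remediation?: string;
}

// Assumed mapping: none of the three fixture states is a hard failure.
const findingByState: Record<PluginState, DoctorFinding> = {
  "pre-registered": { status: "pass" },
  "not-registered": { status: "pass" },
  "duplicate-attempt": {
    status: "warn",
    remediation: "Plugin is already registered; remove the second registration.",
  },
};

function assessPluginState(state: PluginState): DoctorFinding {
  return findingByState[state];
}

// The invariant from the acceptance criteria: doctor must not hard-fail
// in a situation where verify succeeds.
function contradictsVerify(finding: DoctorFinding, verifySucceeded: boolean): boolean {
  return finding.status === "fail" && verifySucceeded;
}

// Usage: walk the fixture matrix and collect any contradictory states.
const fixtureStates: PluginState[] = ["pre-registered", "not-registered", "duplicate-attempt"];
const contradictions = fixtureStates.filter((s) => contradictsVerify(assessPluginState(s), true));
```

Keeping the invariant in a separate predicate means the same check can be reused across every fixture in the matrix.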
3) First-run contract discoverability and scaffold clarity
Problem
- New users can end up with "No behavioral contracts found" due to missing/unclear contract and plugin wiring expectations.
Required outcome
- First-run path guides users to a successful behavioral check with explicit file names, commands, and expected outputs.
Acceptance criteria
- `init -> doctor -> verify` in a fresh project reaches a known-good contract execution path.
- If contracts are missing, message includes exact next steps and sample contract snippet.
- Docs and scaffold output are fully aligned (no conflicting file names/expectations).
4) Replay trustworthiness for failure triage
Problem
- In some scenarios, replay confidence can degrade when nondeterministic app behavior or identity mismatch is involved.
Required outcome
- Replay remains dependable for intended deterministic paths and clearly labels non-repro conditions.
Acceptance criteria
- Failing verify artifact replay reproduces failure for deterministic fixtures.
- For nondeterministic cases, replay explains why reproduction can differ and points to stabilization guidance.
- Qualify and verify artifacts preserve route identity in replay-compatible form.
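The replay criteria above can be illustrated with a small classifier. The artifact shape (`RunOutcome`), the verdict names, and the guidance strings are hypothetical; the sketch only shows the idea of comparing route identity before comparing outcomes.

```typescript
interface RouteIdentity {
  method: string;
  path: string;
}

// Hypothetical minimal view of a verify artifact and its replay.
interface RunOutcome {
  route: RouteIdentity;
  failed: boolean;
}

type ReplayVerdict = "reproduced" | "not-reproduced" | "identity-mismatch";

// Normalize the route into a stable, comparable key.
function routeKey(route: RouteIdentity): string {
  return `${route.method.toUpperCase()} ${route.path}`;
}

function classifyReplay(original: RunOutcome, replayed: RunOutcome): ReplayVerdict {
  if (routeKey(original.route) !== routeKey(replayed.route)) {
    return "identity-mismatch";
  }
  return original.failed === replayed.failed ? "reproduced" : "not-reproduced";
}

// Each non-repro verdict carries an explanation instead of a bare failure.
function explainVerdict(verdict: ReplayVerdict): string {
  switch (verdict) {
    case "reproduced":
      return "Replay matched the original outcome.";
    case "not-reproduced":
      return "Replay diverged; the app may be nondeterministic. See stabilization guidance.";
    default:
      return "Replay targeted a different route than the artifact recorded.";
  }
}

// Usage: method casing differs, but the route identity is the same.
const originalRun: RunOutcome = { route: { method: "get", path: "/users/:id" }, failed: true };
const replayRun: RunOutcome = { route: { method: "GET", path: "/users/:id" }, failed: true };
const replayVerdict = classifyReplay(originalRun, replayRun);
```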
5) CI truthfulness for real install/runtime parity
Problem
- CI can be green while install/runtime path differences still hurt real users.
Required outcome
- CI includes packaged-distribution smoke checks and fresh-project end-to-end flow.
Acceptance criteria
- CI job runs:
  - package build
  - temp project install of package artifact/local reference
  - `npx apophis --help`
  - `init -> doctor -> verify` scenario
  - failure artifact + replay smoke test
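A fail-fast runner for the smoke sequence above might look like the sketch below. The `SmokeStep` and `Exec` types are hypothetical, the command executor is injected so the logic can be tested without spawning processes, and the exact subcommand invocations are inferred from the command names in this document. Build, install, and replay steps are project-specific and omitted.

```typescript
interface SmokeStep {
  label: string;
  command: string;
}

// Injected executor: runs a command and returns its exit code.
type Exec = (command: string) => number;

interface SmokeReport {
  passed: string[];
  failedAt?: string;
}

// Run steps in order and stop at the first non-zero exit code,
// so later steps never mask an earlier install/runtime failure.
function runSmoke(steps: readonly SmokeStep[], exec: Exec): SmokeReport {
  const passed: string[] = [];
  for (const step of steps) {
    if (exec(step.command) !== 0) {
      return { passed, failedAt: step.label };
    }
    passed.push(step.label);
  }
  return { passed };
}

// Assumed invocations, based on the commands named in the checklist.
const firstRunSteps: readonly SmokeStep[] = [
  { label: "help", command: "npx apophis --help" },
  { label: "init", command: "npx apophis init" },
  { label: "doctor", command: "npx apophis doctor" },
  { label: "verify", command: "npx apophis verify" },
];

// Usage with a fake executor that simulates a doctor failure.
const fakeExec: Exec = (command) => (command.endsWith("doctor") ? 1 : 0);
const smokeReport = runSmoke(firstRunSteps, fakeExec);
```

In CI, the injected executor would wrap a real process spawn inside the temp project directory.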
P1 - High-Value Hardening for Wide Rollout
6) Determinism guardrails and triage quality
Status: Complete
Required outcome
- Clear separation between deterministic product failures and environment/data nondeterminism.
Acceptance criteria
- Deterministic-mode guidance and flags in docs/output.
- Repeated-run CI test for fixed-seed deterministic fixtures (`verify-ux.test.ts`, `qualify-signal.test.ts`).
- Failure text includes nondeterminism guidance when replay diverges.
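The repeated-run idea above can be sketched as follows. This is not the content of `verify-ux.test.ts` or `qualify-signal.test.ts`; the seeded generator (a mulberry32-style PRNG) and the `isRepeatable` helper are stand-ins chosen for illustration.

```typescript
// Small seeded PRNG (mulberry32-style): same seed, same sequence.
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Run the same fixture several times with a fresh generator from the same
// seed and compare serialized results against the first run.
function isRepeatable<T>(run: (random: () => number) => T, seed: number, repeats: number): boolean {
  const baseline = JSON.stringify(run(seededRandom(seed)));
  for (let i = 1; i < repeats; i++) {
    if (JSON.stringify(run(seededRandom(seed))) !== baseline) {
      return false;
    }
  }
  return true;
}

// Usage: a fixture that draws all of its randomness from the injected generator.
const seededFixture = (random: () => number): number[] => [random(), random(), random()];
const fixtureIsStable = isRepeatable(seededFixture, 42, 5);
```

A fixture that reads any state outside the injected generator (clock, global counters, unordered I/O) would fail this check, which is the signal the guardrail is meant to surface.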
7) Qualify profile scoping and route control transparency
Status: Complete
Required outcome
- Users can predict and verify route/profile scope from CLI output and artifacts.
Acceptance criteria
- Artifacts include explicit executed route list.
- Artifacts include skipped-route reasons.
- Qualify summary reports per-profile gate execution counts.
- Route/profile filters covered by integration tests.
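To make the scope criteria above concrete, here is a sketch of what such an artifact and its per-profile summary could look like. Every field name in `QualifyArtifact` is an assumption for illustration; the real artifact schema is not shown in this document.

```typescript
interface SkippedRoute {
  route: string;
  reason: string;
}

interface GateRun {
  profile: string;
  gate: string;
}

// Hypothetical artifact shape covering the three scope-related criteria.
interface QualifyArtifact {
  executedRoutes: string[];
  skippedRoutes: SkippedRoute[];
  gateRuns: GateRun[];
}

// Count how many gates executed under each profile.
function gateCountsByProfile(artifact: QualifyArtifact): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const run of artifact.gateRuns) {
    counts[run.profile] = (counts[run.profile] ?? 0) + 1;
  }
  return counts;
}

// Usage with made-up routes, profiles, and gate names.
const sampleArtifact: QualifyArtifact = {
  executedRoutes: ["GET /users", "POST /users"],
  skippedRoutes: [{ route: "DELETE /users/:id", reason: "excluded by route filter" }],
  gateRuns: [
    { profile: "quick", gate: "schema" },
    { profile: "quick", gate: "status" },
    { profile: "deep", gate: "schema" },
  ],
};
const profileCounts = gateCountsByProfile(sampleArtifact);
```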
8) Monorepo operator ergonomics
Status: Complete
Required outcome
- Multi-service operation is straightforward and scriptable.
Acceptance criteria
- Monorepo example/docs show recommended root/workspace scripts.
- Workspace fan-out command paths work without manual dist entrypoint hacks.
- Doctor/verify output is package-attributed and aggregation-friendly.
9) Machine-output scalability and logging ergonomics
Status: Complete
Required outcome
- Machine outputs remain parseable and practical at scale.
Acceptance criteria
- Concise machine summary modes (`json-summary`, `ndjson-summary`) with CI filtering examples.
- Documented recommended CI parsers and retention strategy.
- ndjson/json schema stability validated in tests.
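A CI-side filter over `ndjson-summary` output could be sketched as below. The record fields (`route`, `status`) are hypothetical, since the actual summary schema is not reproduced here; the parsing approach itself (one JSON document per non-empty line) is what NDJSON implies.

```typescript
// Hypothetical per-line record; the real schema may differ.
interface SummaryRecord {
  route: string;
  status: "pass" | "fail";
}

// NDJSON: one JSON document per line; blank lines are ignored.
// The cast trusts the producer; a stricter parser would validate each field.
function parseNdjson(text: string): SummaryRecord[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as SummaryRecord);
}

// Keep only failing records so CI logs stay short at scale.
function failingRoutes(records: readonly SummaryRecord[]): string[] {
  return records.filter((r) => r.status === "fail").map((r) => r.route);
}

// Usage with made-up output.
const sampleOutput = [
  '{"route":"GET /users","status":"pass"}',
  '{"route":"POST /users","status":"fail"}',
  "",
  '{"route":"GET /health","status":"pass"}',
].join("\n");
const failedRoutes = failingRoutes(parseNdjson(sampleOutput));
```

Because each line parses independently, this style of filter can also be applied to a stream without loading the whole report.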
P2 - Protocol/RFC Conformance Hardening
10) JWT verification depth and keying policy
Status: Complete
Required outcome
- Strong, test-backed JWT conformance behavior for supported algorithms and key configurations.
Acceptance criteria
- Test vectors for valid/invalid signatures, missing keys, malformed tokens, alg mismatch.
- Clear docs on supported algs, key formats, and verification limits.
Evidence
- `src/test/protocol-extensions.test.ts` covers HS256 valid/invalid, missing key, malformed token, alg mismatch, kid lookup.
- `src/test/cli/protocol-conformance-p2.test.ts` adds RS256 and ES256 valid/invalid signature vectors.
- `src/extensions/jwt.ts` documents supported algorithms: `HS256`, `RS256`, `ES256`.
11) HTTP Signature conformance breadth
Status: Complete
Required outcome
- Explicit signature-input parsing and covered-component behavior for the supported subset.
Acceptance criteria
- Negative corpus tests for malformed signature-input/signature headers.
- Multi-label and covered-component edge-case tests.
- Explicitly documented supported subset and known gaps.
Evidence
- `src/test/protocol-extensions.test.ts` covers parsing, coverage, RSA verification, malformed input (missing label, empty components), bad base64, multi-label headers, `@authority` resolution.
- `src/test/cli/protocol-conformance-p2.test.ts` adds unsupported algorithm and mismatched label rejection.
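The negative cases listed above (missing label, empty components, multi-label headers) can be illustrated with a deliberately simplified `Signature-Input` parser. This is a sketch, not the parser in the repository and not a full structured-fields implementation: it splits members on commas, so it assumes no commas appear inside quoted parameter values, and it ignores per-component parameters.

```typescript
interface SignatureInput {
  label: string;
  components: string[];
  params: string; // raw parameter tail, e.g. `;created=...;keyid="..."`
}

// Parse `label=("comp" "comp");params` members separated by commas.
// Throws on a missing label, an empty component list, or an unquoted component.
function parseSignatureInput(header: string): SignatureInput[] {
  return header.split(",").map((rawMember) => {
    const member = rawMember.trim();
    const match = /^([a-z*][a-z0-9_.*-]*)=\(([^()]*)\)(.*)$/.exec(member);
    if (match === null) {
      throw new Error(`malformed signature-input member: ${member}`);
    }
    const label = match[1] ?? "";
    const inner = (match[2] ?? "").trim();
    if (inner.length === 0) {
      throw new Error(`empty covered components for label ${label}`);
    }
    const components = inner.split(/\s+/).map((item) => {
      const quoted = /^"([^"]+)"$/.exec(item);
      if (quoted === null) {
        throw new Error(`malformed covered component: ${item}`);
      }
      return quoted[1] ?? "";
    });
    return { label, components, params: match[3] ?? "" };
  });
}

// Usage: two labels in one header value.
const parsedInputs = parseSignatureInput(
  'sig1=("@method" "@authority");created=1618884473, sig2=("content-type")'
);
```

Rejecting an empty component list follows the malformed-input cases named in the evidence above; whether that matches the repository's exact behavior is an assumption.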
12) X.509 and SPIFFE strictness matrix
Status: Complete
Required outcome
- Deterministic and strict identity parsing behavior with clear support boundaries.
Acceptance criteria
- DER/PEM fixture matrix with multiple SAN combinations and malformed certs.
- SPIFFE invalid-case matrix (path, trust domain, dot segments, authority variants).
- Docs align with actual strictness rules and examples.
Evidence
- `src/test/protocol-extensions.test.ts` covers URI SAN extraction, real PEM certificate, malformed PEM rejection, SPIFFE parsing/validation, empty path, dot-segments, invalid trust domain labels, percent-encoded segments, query/fragment rejection, userinfo/port rejection.
- `src/extensions/x509.ts` and `src/extensions/spiffe.ts` implement strict validation rules.
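The SPIFFE invalid-case matrix above can be pictured with a strict validator along these lines. It is a sketch written for this document, not the code in `src/extensions/spiffe.ts`: the function name and result shape are hypothetical, an empty path is accepted (a bare trust-domain ID) on the assumption that the repository follows the SPIFFE ID specification there, and label-level trust domain rules are not modelled.

```typescript
type SpiffeResult =
  | { ok: true; trustDomain: string; path: string }
  | { ok: false; reason: string };

function validateSpiffeId(id: string): SpiffeResult {
  const prefix = "spiffe://";
  if (!id.startsWith(prefix)) {
    return { ok: false, reason: "scheme must be spiffe://" };
  }
  const rest = id.slice(prefix.length);
  if (/[?#]/.test(rest)) {
    return { ok: false, reason: "query and fragment are not allowed" };
  }
  const slash = rest.indexOf("/");
  const trustDomain = slash === -1 ? rest : rest.slice(0, slash);
  const path = slash === -1 ? "" : rest.slice(slash);
  if (/[@:]/.test(trustDomain)) {
    return { ok: false, reason: "userinfo and port are not allowed" };
  }
  // Lowercase letters, digits, dots, dashes, underscores only.
  if (!/^[a-z0-9._-]+$/.test(trustDomain)) {
    return { ok: false, reason: "invalid trust domain" };
  }
  if (path !== "") {
    for (const segment of path.slice(1).split("/")) {
      if (segment === "") {
        return { ok: false, reason: "empty path segment or trailing slash" };
      }
      if (segment === "." || segment === "..") {
        return { ok: false, reason: "dot segments are not allowed" };
      }
      // Also rejects percent-encoded characters such as %2F.
      if (!/^[A-Za-z0-9._-]+$/.test(segment)) {
        return { ok: false, reason: "invalid path segment characters" };
      }
    }
  }
  return { ok: true, trustDomain, path };
}

// Usage
const workloadId = validateSpiffeId("spiffe://example.org/workload/api");
```

Returning a reason string for each rejection keeps the validator usable as the source of the docs' strictness table, so documentation and behavior are less likely to drift apart.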
Enforcement Gate Checklist
Before switching company policy to Enforce, all of the following must be true:
- P0 items 1-5 are complete and tested in CI.
- A fresh temp project can run `npx apophis --help`, `init`, `doctor`, `verify`, and `replay` without manual workarounds.
- No contradictory `doctor` vs `verify` readiness outcomes in supported app patterns.
- Failure -> artifact -> replay loop is deterministic on designated deterministic fixtures.
- CI includes packaged/install parity tests, not only in-repo source tests.
- Documentation is aligned with actual behavior and first-run commands.
Suggested ownership split
- CLI/Packaging: items 1, 5
- Doctor/Discovery: item 2
- Onboarding UX/Docs: items 3, 9
- Replay/Determinism: items 4, 6, 7
- Platform/Monorepo: item 8
- Protocol Extensions: items 10, 11, 12