9. Runtime Verification Runbook
Use this runbook after policy/config changes, runtime refactors, or infrastructure moves.
1) Prerequisites
Set:
- AGENTID_API_KEY
- AGENTID_SYSTEM_ID
- local API base when testing locally, for example http://127.0.0.1:3000/api/v1
For production verification you also need:
- a blocking-profile system
- an observe-profile system
- the matching API keys for each system
Optional async forensic audit overrides:
- AGENTID_ASYNC_AI_AUDIT_MODEL
- AZURE_OPENAI_ASYNC_AUDIT_DEPLOYMENT_NAME
- AGENTID_ASYNC_AI_AUDIT_PROMPT_VERSION
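A quick pre-flight check can catch a missing variable before any script runs. The following is a minimal sketch, not part of the repository: the helper names missing_required and active_overrides are hypothetical, and only the variable names come from this runbook.

```python
import os

# Variable names are taken from this runbook; the helpers are illustrative only.
REQUIRED = ("AGENTID_API_KEY", "AGENTID_SYSTEM_ID")
OPTIONAL_AUDIT_OVERRIDES = (
    "AGENTID_ASYNC_AI_AUDIT_MODEL",
    "AZURE_OPENAI_ASYNC_AUDIT_DEPLOYMENT_NAME",
    "AGENTID_ASYNC_AI_AUDIT_PROMPT_VERSION",
)


def missing_required(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name, "").strip()]


def active_overrides(env=None):
    """Return the optional async-audit overrides that are currently set."""
    env = os.environ if env is None else env
    return {name: env[name] for name in OPTIONAL_AUDIT_OVERRIDES if env.get(name)}


if __name__ == "__main__":
    missing = missing_required()
    print("missing:", missing or "none")
    print("overrides:", sorted(active_overrides()))
```

Passing a plain dict instead of os.environ makes the check easy to exercise without touching the real environment.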
2) Bootstrap First Active Policy Pack
node scripts/qa/bootstrap-policy-pack-and-verify.mjs --base-url=http://127.0.0.1:3000/api/v1 --system-id=<SYSTEM_UUID>
Pass criteria:
- policy_pack_artifacts contains an active artifact for the system
- ai_systems.policy_pack_version > 0
- the verification event shows:
  - metadata_policy_pack_fallback: false
  - metadata_policy_pack_version > 0
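The pass criteria above can be expressed as a single check. This is a sketch under assumptions: it treats the verification event as a flat dict keyed by the field names listed above, and the function name bootstrap_failures and its arguments are hypothetical. How you obtain the artifact flag and the version (for example from a database query) is outside the sketch.

```python
def bootstrap_failures(has_active_artifact, policy_pack_version, event):
    """Return a list of unmet bootstrap pass criteria (empty means pass)."""
    failures = []
    if not has_active_artifact:
        failures.append("policy_pack_artifacts has no active artifact")
    if int(policy_pack_version or 0) <= 0:
        failures.append("ai_systems.policy_pack_version is not > 0")
    # The fallback flag must be explicitly false, not merely absent.
    if event.get("metadata_policy_pack_fallback") is not False:
        failures.append("metadata_policy_pack_fallback is not false")
    if int(event.get("metadata_policy_pack_version") or 0) <= 0:
        failures.append("metadata_policy_pack_version is not > 0")
    return failures
```

An empty list means all criteria hold; anything else names the criterion to investigate first.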
3) Validate Local Guard + Ingest Lifecycle
powershell -ExecutionPolicy Bypass -File .\scripts\qa\run-guard-diagnostic.ps1 `
-BaseUrl http://127.0.0.1:3000/api/v1 `
-ApiKey $env:AGENTID_API_KEY `
-SystemId $env:AGENTID_SYSTEM_ID `
-SkipBenchmark
Pass criteria:
- GET /api/v1/agent/config returns 200
- guard and ingest lifecycle tests pass
- policy matrix outcome matches current dashboard toggles
4) Validate Production Matrix
The primary production regression command is:
npm run qa:guard-prod-matrix
This exercises both:
- blocking profile
- observe profile
Pass criteria:
- blocking 22/22
- observe 22/22
- no unexpected 401, 436, or 503
Recommended interpretation:
- steady-state allow path should stay in the low hundreds of milliseconds
- observe path should be close to allow-path latency
- if the first blocking request spikes, inspect fallback headers/logs before blaming the matcher
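If you post-process the matrix results yourself, the pass criteria can be checked mechanically. The sketch below assumes a hypothetical result shape (a pass count per profile and a list of expected/actual status pairs); the real output format of qa:guard-prod-matrix is not described here, so adapt the inputs accordingly.

```python
EXPECTED_CASES = 22                  # per profile, from the pass criteria
WATCHED_STATUSES = {401, 436, 503}   # statuses the runbook flags when unexpected


def matrix_failures(passed_by_profile, status_pairs):
    """Return human-readable failures; an empty list means the matrix passed.

    passed_by_profile: e.g. {"blocking": 22, "observe": 22}
    status_pairs: iterable of (expected_status, actual_status) per request
    """
    failures = []
    for profile in ("blocking", "observe"):
        passed = passed_by_profile.get(profile, 0)
        if passed != EXPECTED_CASES:
            failures.append(f"{profile} {passed}/{EXPECTED_CASES}")
    for expected, actual in status_pairs:
        # A watched status only counts when the test case did not expect it.
        if actual in WATCHED_STATUSES and actual != expected:
            failures.append(f"unexpected {actual} (expected {expected})")
    return failures
```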
5) Warm Production Runtime
The internal warm endpoint is:
https://app.getagentid.com/api/internal/guard/warm
Use it to prewarm:
- public auth/config lookup
- public preflight guard path
- direct Fly allow path
- direct Fly blocking path
Manual warm call
$headers = @{ Authorization = "Bearer $env:CRON_SECRET" }
Invoke-RestMethod -Method Get -Uri "https://app.getagentid.com/api/internal/guard/warm" -Headers $headers
Cron guidance
- Vercel Cron should target the path /api/internal/guard/warm
- external cron services should call the full URL and send Authorization: Bearer <CRON_SECRET>
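For an external cron host without PowerShell, the same warm call can be made with the Python standard library. This is a sketch, not an official client: the function names and the 30-second timeout are assumptions, while the URL and the Authorization header format come from this section.

```python
import os
import urllib.request

WARM_URL = "https://app.getagentid.com/api/internal/guard/warm"


def build_warm_request(cron_secret, url=WARM_URL):
    """Build the GET request with the bearer header, without sending it."""
    if not cron_secret:
        raise ValueError("CRON_SECRET is empty")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {cron_secret}"},
        method="GET",
    )


def warm(cron_secret, timeout=30):
    """Send the warm request; the timeout value is an assumption."""
    request = build_warm_request(cron_secret)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    status, body = warm(os.environ.get("CRON_SECRET", ""))
    print(status, body)
```

Keeping request construction separate from sending lets you inspect the header in a dry run before pointing a scheduler at production.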
6) Verify Zero-Latency Shadow Mode
Shadow mode should return immediately while still persisting a background audit trail.
Expected response headers:
- x-agentid-zero-latency-shadow: 1
- x-agentid-upstream: deferred_shadow
Expected behavior:
- client receives allowed: true and shadow_mode: true
- matching ai_events row appears later with:
  - guard_upstream_source = "fly"
  - shadow_mode = true
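The immediate half of this check (headers and response body) is easy to automate; the later ai_events row still has to be confirmed separately. The sketch below assumes the response headers are available as a string-to-string mapping and the body as a parsed dict. The function name shadow_mode_problems is hypothetical.

```python
EXPECTED_HEADERS = {
    "x-agentid-zero-latency-shadow": "1",
    "x-agentid-upstream": "deferred_shadow",
}


def shadow_mode_problems(headers, body):
    """Return mismatches against the expected shadow-mode response."""
    # Header names are case-insensitive, so normalize before comparing.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    problems = []
    for name, want in EXPECTED_HEADERS.items():
        got = normalized.get(name)
        if got != want:
            problems.append(f"{name}: expected {want!r}, got {got!r}")
    for field in ("allowed", "shadow_mode"):
        if body.get(field) is not True:
            problems.append(f"{field}: expected true, got {body.get(field)!r}")
    return problems
```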
7) Validate Labeling + Async Tier-2 Forensic Audit
powershell -ExecutionPolicy Bypass -File .\scripts\qa\run-ai-label-audit-check.ps1 `
-BaseUrl http://127.0.0.1:3000/api/v1 `
-ApiKey $env:AGENTID_API_KEY `
-SystemId $env:AGENTID_SYSTEM_ID `
-Model gpt-4o
Then in Activity:
- verify expected labels on each test case (Injection, PII/Data Leak, DB Access, Code Exec)
- wait 10-30s and refresh for async forensic audit completion
- confirm the detail panel contains the expected auditor fields when AI analysis is enabled:
  - ai_clean_summary
  - ai_intent
  - ai_threat_analysis
  - ai_attack_sophistication
  - ai_detected_signals
  - evaluation_metadata.forensic_audit
Operational interpretation:
- synchronous labels still come from the guard hot path
- async forensic audit can refine generic labels and add secondary signals
- concrete synchronous hard-block classes such as DB Access, Code Exec, and PII/Data Leak should remain authoritative
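The precedence rule in the interpretation above can be pictured as a small merge function. This is only an illustration of the stated rule, not the product's actual merge logic: the function name, its arguments, and the choice to treat exactly the three named classes as hard-block are assumptions.

```python
# Only the three classes named in this runbook; the real set may be larger.
HARD_BLOCK_LABELS = frozenset({"DB Access", "Code Exec", "PII/Data Leak"})


def resolve_labels(sync_label, audit_label=None, audit_signals=()):
    """Return (primary_label, secondary_signals) for display.

    A synchronous hard-block label always stays primary; otherwise the
    async audit label, if present, may refine the synchronous one.
    """
    if sync_label in HARD_BLOCK_LABELS or not audit_label:
        primary = sync_label
    else:
        primary = audit_label
    # De-duplicate signals while preserving order, and drop the primary label.
    secondary = [s for s in dict.fromkeys(audit_signals) if s != primary]
    return primary, secondary
```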
8) Troubleshooting Checklist
Supabase Auth CPU spike / repeated 522 on /auth/v1/token?grant_type=refresh_token
Typical signal:
- Supabase API Gateway shows many POST /auth/v1/token?grant_type=refresh_token
- caller is Vercel Edge Functions
- x_client_info is supabase-ssr/... createServerClient
- origin times are tens of seconds and end in 522
What to verify:
- Current deploy includes the middleware auth-refresh stabilization logic.
- AGENTID_MIDDLEWARE_SESSION_TIMEOUT_MS stays short (~1200ms by default).
- AGENTID_AUTH_REFRESH_LEEWAY_SECONDS is set so refresh only happens near expiry.
- AGENTID_AUTH_REFRESH_BACKOFF_SECONDS is enabled so failed refreshes are throttled.
- After deploy, middleware metrics show backoff/clearing instead of unbounded refresh attempts:
  - auth_refresh_attempt
  - auth_refresh_degraded
  - auth_refresh_backoff_active
  - auth_refresh_cleared_stale_cookies
- Supabase API Gateway refresh-token 522 volume drops sharply within a few minutes of deploy.
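To make the leeway and backoff settings concrete, the decision they describe can be sketched as a pure function. This is an illustration of the intended behavior as described above, not the middleware's actual code; the function name and parameters are hypothetical and all times are in seconds.

```python
def should_refresh(now, expires_at, leeway_seconds,
                   last_failure_at=None, backoff_seconds=0):
    """Decide whether a token refresh should be attempted right now."""
    # Backoff: after a failed refresh, stay quiet until the window elapses.
    if last_failure_at is not None and now - last_failure_at < backoff_seconds:
        return False
    # Leeway: refresh only when the token is close to (or past) expiry.
    return expires_at - now <= leeway_seconds
```

With a 60-second leeway, a token expiring in 1000 seconds is left alone, one expiring in 50 seconds is refreshed, and a failure 10 seconds ago with a 30-second backoff suppresses the attempt either way. That suppression is what keeps a struggling auth backend from being hit on every request.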
policy_pack_fallback=true
- Confirm an active artifact exists for the system
- Confirm ai_systems.policy_pack_version is non-zero
- Rebuild the pack
- Re-run bootstrap verification
No log row after the client used AgentID
This is the most common integration misunderstanding.
Verify these points in order:
- guard() alone only creates a preflight row. It does not create the final complete lifecycle row.
- Dashboard activity/graphs/cost need /ingest (directly or through an SDK wrapper) after the model response.
- The JS/Python wrapOpenAI() wrappers currently instrument chat.completions.create, not arbitrary OpenAI surfaces such as responses.create.
- If the app uses a custom OpenAI helper or background worker, confirm that helper actually calls agent.log() or a supported wrapper path.
- If the request path returns to the client before telemetry is awaited, verify the runtime keeps the background task alive long enough for /ingest to complete.
9) Benchmark Hot Path
npm run bench:policy-pack-hotpath
This benchmark measures:
- normalization time
- trie prefilter time
- regex evaluation time
- total detection hot-path time
Target:
hot_path_total_ms.p95 < 30
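If you collect raw per-iteration timings yourself, the target can be checked as follows. This is a sketch that assumes a nearest-rank percentile; the benchmark script may compute p95 differently, so small discrepancies against its reported value are possible. Function names are hypothetical.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the smallest value with at least pct% at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_hot_path_target(total_ms_samples, limit_ms=30.0):
    """True when the p95 of total hot-path time is under the limit."""
    return percentile_nearest_rank(total_ms_samples, 95) < limit_ms
```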
10) Latency SLO Interpretation
Use these two local benchmark profiles when you want detailed diagnostics:
powershell -ExecutionPolicy Bypass -File .\scripts\qa\run-guard-diagnostic.ps1 `
-BaseUrl http://127.0.0.1:3000/api/v1 `
-ApiKey $env:AGENTID_API_KEY `
-SystemId $env:AGENTID_SYSTEM_ID `
-Warmup 4 -Iterations 30 -Parallel 1 -RequestTimeoutSec 15
powershell -ExecutionPolicy Bypass -File .\scripts\qa\run-guard-diagnostic.ps1 `
-BaseUrl http://127.0.0.1:3000/api/v1 `
-ApiKey $env:AGENTID_API_KEY `
-SystemId $env:AGENTID_SYSTEM_ID `
-Warmup 4 -Iterations 40 -Parallel 3 -RequestTimeoutSec 15
Guidance:
- treat Parallel=1 plus DB telemetry as the primary user-facing SLO
- treat burst runs as pressure validation, not the main latency contract
- after region moves or proxy changes, rerun both profiles and compare p50/p95 deltas
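Comparing runs is simpler with the deltas computed explicitly. The sketch below assumes you have the raw latency samples (in milliseconds) from a run before and after the change; the diagnostic script's own output format is not assumed. Function names are hypothetical, and each run needs at least two samples.

```python
import statistics


def summarize(samples_ms):
    """Return p50 and p95 using inclusive linear interpolation."""
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94]}


def latency_deltas(before_ms, after_ms):
    """Return after-minus-before deltas; positive values mean a regression."""
    before, after = summarize(before_ms), summarize(after_ms)
    return {name: after[name] - before[name] for name in before}
```

Run it once per profile (Parallel=1 and the burst profile) so a regression in the user-facing path is not masked by burst noise.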