10. SDK Runtime Integration
What Automatically Creates Dashboard Activity
You only get a full runtime lifecycle in AgentID when both phases happen:
guard -> model execution -> ingest
That means:
- guard() creates or updates the preflight side of the lifecycle
- log() persists the post-execution telemetry row
- SDK wrappers combine both when you use a supported wrapped surface
If the application only calls guard and never calls ingest, Activity, cost, and downstream graphs can look empty or incomplete.
Token, Cost, and ROI Telemetry Is Required
AgentID cost charts and ROI charts are computed from stored runtime telemetry. They are not inferred from the fact that a guard check happened.
For every successful model execution, the final completion event must include:
- event_type: "complete"
- the real provider model id, for example gpt-4o-mini
- token usage from the provider response
- model latency when available
- the same client_event_id / event_id correlation used for the guarded call
Accepted token usage shapes include OpenAI-style and normalized names:
usage: {
prompt_tokens: 33,
completion_tokens: 9,
total_tokens: 42,
}
// or
tokens: {
input: 33,
output: 9,
total: 42,
}
Official wrappers collect this automatically when the provider exposes usage.
Manual integrations must pass it explicitly to agent.log(...) or
/api/v1/ingest. If usage is missing, AgentID can still show Activity rows,
but token counters, cost_usd, Total Spend, cost charts, and ROI graphs will be
empty or understated.
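As a rough illustration of the two accepted usage shapes, the sketch below normalizes either one before a manual integration attaches it to the completion event. The helper names (normalizeUsage, buildCompletionFields) and the latency_ms field name are assumptions made for this example; they are not part of agentid-sdk.

```typescript
// Hypothetical helpers for illustration only; not exported by agentid-sdk.
type OpenAIUsage = { prompt_tokens: number; completion_tokens: number; total_tokens?: number };
type NormalizedTokens = { input: number; output: number; total: number };
type AnyUsage = OpenAIUsage | NormalizedTokens | null | undefined;

// Accept either documented shape; return the normalized shape, or null when usage is absent.
function normalizeUsage(raw: AnyUsage): NormalizedTokens | null {
  if (!raw) return null;
  if ("prompt_tokens" in raw) {
    const input = raw.prompt_tokens;
    const output = raw.completion_tokens;
    return { input, output, total: raw.total_tokens ?? input + output };
  }
  return { input: raw.input, output: raw.output, total: raw.total };
}

// Collect the fields a completion event needs. latency_ms is an assumed field name.
function buildCompletionFields(model: string, clientEventId: string, raw: AnyUsage, latencyMs?: number) {
  const tokens = normalizeUsage(raw);
  return {
    event_type: "complete" as const,
    model,
    client_event_id: clientEventId,
    ...(tokens ? { tokens } : {}),
    ...(latencyMs !== undefined ? { latency_ms: latencyMs } : {}),
  };
}
```

When normalizeUsage returns null, the event is still well-formed, which matches the behavior described above: the Activity row appears but token and cost fields stay empty.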
ROI also depends on the system's business context:
- human_hourly_rate
- human_time_per_task_min
- a priced model id in the AgentID pricing catalog
Projected savings are calculated as human_cost_usd - cost_usd. If provider
usage is present but the model id is unknown to pricing, token counts can appear
while spend and ROI remain N/A.
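The arithmetic can be sketched as follows. The text above only states that projected savings equal human_cost_usd - cost_usd; deriving human_cost_usd as hourly rate times minutes divided by 60 is an assumption made for this example, and projectedSavingsUsd is a hypothetical name.

```typescript
// Assumption: human_cost_usd = human_hourly_rate * human_time_per_task_min / 60.
// Only "savings = human_cost_usd - cost_usd" is stated in the documentation.
function projectedSavingsUsd(
  humanHourlyRate: number,
  humanTimePerTaskMin: number,
  costUsd: number | null, // null models the "unknown to pricing" case
): number | null {
  if (costUsd === null) return null; // spend and ROI remain N/A
  const humanCostUsd = (humanHourlyRate * humanTimePerTaskMin) / 60;
  return humanCostUsd - costUsd;
}
```

For example, a task that takes a person 5 minutes at 60 USD per hour has a human cost of 5 USD under this assumption; with a model cost of 0.50 USD the projected savings would be 4.50 USD.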
Activity Rows vs Workflow Timeline
AgentID intentionally keeps prompt/guard telemetry inspectable as its own row, even when that prompt is part of a larger workflow.
For an agent run, expect two complementary views:
- a standalone prompt/guard row with View Details and View Prompt
- a workflow summary row with Open Workflow and the grouped step timeline
Use workflow_run_id / workflowRunId to group tool calls, delivery events,
inbox events, workflow lifecycle rows, guard checks, and LLM calls. Do not reuse
one client_event_id for the whole workflow; each event should keep its own
idempotency key. Non-LLM workflow/tool/delivery rows should show Model: Not applicable and should not be spend-bearing unless model/cost fields are
explicitly present.
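A small check along these lines can catch the most common correlation mistake before events are sent. This is a sketch: the WorkflowEvent shape, the kind field, and findCorrelationProblems are hypothetical and exist only for illustration.

```typescript
// Hypothetical event shape; only workflow_run_id and client_event_id come from the docs.
type WorkflowEvent = { workflow_run_id: string; client_event_id: string; kind: string };

// Report events that break the rule: one shared workflow_run_id,
// but a distinct client_event_id (idempotency key) per event.
function findCorrelationProblems(events: WorkflowEvent[]): string[] {
  const problems: string[] = [];
  const runIds = new Set(events.map((e) => e.workflow_run_id));
  if (runIds.size > 1) problems.push("events span more than one workflow_run_id");
  const seen = new Set<string>();
  for (const e of events) {
    if (seen.has(e.client_event_id)) {
      problems.push(`client_event_id reused: ${e.client_event_id}`);
    }
    seen.add(e.client_event_id);
  }
  return problems;
}
```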
Supported Automatic Wrapper Surfaces
Current supported official wrapper surfaces:
- Node.js / TypeScript SDK (agentid-sdk): wrapOpenAI(...).chat.completions.create(...)
- Python SDK: wrap_openai(...).chat.completions.create(...)
- Vercel AI SDK wrapper: generateText() / streamText() with withAgentId(...) wrapped models
Current unsupported surfaces unless you add your own integration:
- responses.create
- Assistants API
- arbitrary custom provider methods
- app-local helper functions that never call agent.log()
The unsupported list still applies across JS, Python, and Vercel AI SDK flows unless you add an explicit integration path.
Enterprise Rollout Rule
The main rollout risk is usually not classifier quality. It is integration coverage.
If any production path still sends prompt or chat history through an unsupported provider surface, raw OpenAI client, custom fetch, or app-local helper outside a supported wrapper, AgentID does not automatically protect that path.
Before calling a deployment enterprise-ready:
- inventory every production LLM callsite
- map each one to a supported wrapper or an explicit guard -> provider -> log flow
- remove parallel raw provider calls in the same request path
- verify chat routes protect the exact full history that reaches the provider
- verify telemetry reflects that coverage, for example full_history_protected=true on wrapped chat flows
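One way to keep the inventory honest is to record it as data and fail a review step when any callsite is still unprotected. The sketch below assumes a hand-maintained list; the Integration labels, the Callsite shape, and unprotectedCallsites are hypothetical names for this example.

```typescript
// Hypothetical inventory model; the labels mirror the categories named in this section.
type Integration = "supported_wrapper" | "explicit_guard_provider_log" | "raw_provider" | "custom_fetch";
type Callsite = { path: string; integration: Integration };

// Return every callsite that AgentID would not automatically protect.
function unprotectedCallsites(inventory: Callsite[]): Callsite[] {
  return inventory.filter(
    (c) => c.integration === "raw_provider" || c.integration === "custom_fetch",
  );
}
```

An empty result does not prove coverage by itself; it only shows that the recorded inventory has no known gaps.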
Current wrapper scope note:
- OpenAI chat wrappers in this repo currently derive preflight guard input from the last user message text.
- The Vercel AI SDK wrapper currently derives preflight guard input from the last user message text plus supported inline attachments on that last user turn.
- Full-history masking/protection for provider dispatch can still apply on wrapped routes even when preflight scope is narrower.
- If you need preflight evaluation over the exact assembled multi-turn history today, use an explicit guard -> provider -> log flow for that route.
Pre-LLM Protection Is Not The Same As Masked Logging
This distinction is critical for chat and agent applications.
AgentID protects data before the LLM only when the actual provider call goes
through a supported wrapper or an equivalent explicit guard -> provider -> log
integration.
A common incorrect integration is:
raw user input -> provider / LLM
masked copy -> AgentID log
That can make the dashboard look masked after refresh, but it does not protect the model. The LLM has already seen the raw value.
The correct integration is:
raw user input/history -> AgentID wrapper -> protected input/history -> provider / LLM
-> protected output -> app UI + AgentID ingest
Official Vercel AI SDK wrapper telemetry includes full_history_protected=true
when SDK-side masking is enabled and the wrapper protects the full prompt
history. Manual integrations should set the same signal only after they have
protected every message text part that will be sent to the provider.
For a chat app, protect the complete message history. Do not protect only the latest input field. The provider often receives:
- system prompts
- previous user messages
- previous assistant messages
- tool results
- retrieval/memory context
- the latest user message
If any of those entries contains raw PII or secrets and bypasses the wrapper, the model can still remember or repeat it later.
Example failure signal:
User: My name is Jan Kroupa
Assistant after refresh/log view: My name is <PERSON_1>
User: what name did I send in the previous message?
Assistant: Jan Kroupa
This means the raw name reached the model context. A masked dashboard row alone does not prove pre-LLM protection.
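A regression check for this failure can be sketched as a plain string scan over everything sent to the provider and everything it returned. findRawLeaks is a hypothetical helper written for this example, and a case-insensitive substring match is only a rough heuristic.

```typescript
// Hypothetical helper: list the raw sensitive values that still appear in any text.
// Pass in the message contents sent to the provider plus the model's reply.
function findRawLeaks(texts: string[], rawValues: string[]): string[] {
  const haystack = texts.map((t) => t.toLowerCase());
  return rawValues.filter((value) => {
    const needle = value.toLowerCase();
    return haystack.some((t) => t.includes(needle));
  });
}
```

In a test, send a name in an earlier turn, ask the model to repeat it in a later turn, and assert that findRawLeaks returns an empty array for both the provider payload and the reply.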
Node.js / TypeScript SDK Semantics
The published agentid-sdk package is designed so that:
- agent.guard(...) is synchronous and awaited
- agent.log(...) returns a promise and can be awaited directly
- wrapOpenAI() calls /guard before chat.completions.create
- on non-streaming completions, the wrapper performs the primary /ingest write before the wrapped call resolves
Practical implication:
If an app logs only create:start / create:ok, that does not prove AgentID ingest happened. It only proves the application believes the wrapped model call succeeded.
You still need to confirm one of these:
- the call was really secured.chat.completions.create(...)
- or the application explicitly awaited agent.log(...)
Vercel AI SDK Wrapper Semantics
The dedicated agentid-vercel-sdk package is designed so that:
- withAgentId(...) wraps an existing Vercel AI SDK model
- generateText() and streamText() stay unchanged at the application callsite
- the wrapper calls /guard before the provider request
- denied prompts throw AgentIdSecurityError before the provider is billed
- allowed prompts can be rewritten from transformed_input before execution
- completion telemetry is written through /ingest
- sdk_ingest_ms is finalized through /ingest/finalize
- SDK-side masking, when enabled by runtime config, rewrites sensitive input before provider dispatch and rewrites sensitive output before it is returned to the app caller
Safe Vercel AI SDK pattern:
import { openai } from "@ai-sdk/openai";
import { streamText } from "ai";
import { withAgentId } from "agentid-vercel-sdk";
const secureModel = withAgentId(openai("gpt-4o"), {
systemId: process.env.AGENTID_SYSTEM_ID!,
apiKey: process.env.AGENTID_API_KEY!,
});
const result = streamText({
model: secureModel,
messages: fullConversationHistory,
});
return result.toTextStreamResponse();
Unsafe pattern:
// This bypasses AgentID for the real LLM call.
const result = streamText({
model: openai("gpt-4o"),
messages: rawConversationHistory,
});
// Logging a masked copy later does not undo the provider leak.
await agent.log({ input: maskedInput, output: maskedOutput, system_id: systemId });
For AI coding agents implementing an app, the checklist is:
- Locate the exact server route/action/function that calls the LLM.
- Wrap the model/client at that callsite.
- Pass the wrapped model/client into the real provider call.
- Route the full messages array through that wrapped call.
- Remove direct provider calls in the same request path.
- Render the wrapped response/stream, not a raw provider stream.
- Preserve the provider usage object on completion telemetry so token, cost, and ROI dashboards populate.
- Add a test where a previous message contains a name and a later message asks the model to repeat it. The answer must not contain the raw name.
- Add a telemetry test or dashboard check that the completion row has input_tokens, output_tokens, and cost_usd when the model is priced.
- Confirm the Activity detail shows full_history_protected=true for chat integrations with multi-message context.
Streaming behavior:
- the user-visible stream is not blocked by post-flight telemetry
- the wrapper observes a forked stream branch and finalizes telemetry after the stream completes
Current provider coverage in this repo:
- @ai-sdk/openai non-stream
- @ai-sdk/openai stream
- @ai-sdk/anthropic non-stream
- @ai-sdk/anthropic stream
Browser Extension Masking Semantics
The browser extension is not an SDK integration. It reports browser-origin capabilities and relies on the backend to protect stored logs when client-side masking is not applied.
In Activity detail:
- Server masking means AgentID masked PII or secrets before storage
- it is expected for browser extension flows to show SDK masking as - or no
- this should not be labelled as SDK fallback
- Input transformed before storage: yes indicates stored prompt/log content was protected
Use this distinction when debugging extension events: SDK-side masking applies
to agentid-sdk, agentid-vercel-sdk, Python SDK, and LangChain wrappers;
browser-extension masking labels describe the extension/backend storage path.
Common Reasons No AgentID Event Appears
1) The app used an unsupported provider surface
Example:
await client.responses.create(...)
If only chat.completions.create is wrapped, this path bypasses AgentID wrapper telemetry.
For Vercel AI SDK apps, the equivalent mistake is calling an unwrapped model directly instead of the result of withAgentId(...).
2) The app only called guard
Guard can allow the prompt and still produce no final complete row if the post-model ingest step never runs.
3) The app logged completion without token usage
Activity can show the request while cost and ROI stay empty if the completion
event omits usage / tokens, uses Model: Not applicable, or uses a custom
model id that is not mapped to pricing.
4) The app returned HTTP 200 before its own background telemetry completed
If the application starts background work after sending the response, the worker/runtime may drop the ingest request depending on the framework and hosting model.
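The safest shape is to await the telemetry write before the handler returns. The sketch below uses injected functions so it stays independent of any framework; handleRequest, callModel, and logTelemetry are hypothetical names, and logTelemetry stands in for an awaited agent.log(...) or /api/v1/ingest call.

```typescript
// Hypothetical handler shape for illustration; not tied to a specific framework.
type Telemetry = { client_event_id: string; output: string };

async function handleRequest(
  callModel: () => Promise<string>,
  logTelemetry: (t: Telemetry) => Promise<void>,
  clientEventId: string,
): Promise<{ status: number; body: string }> {
  const output = await callModel();
  // Await the ingest write here. A fire-and-forget call started after the
  // response is sent may be dropped when the runtime freezes or exits.
  await logTelemetry({ client_event_id: clientEventId, output });
  return { status: 200, body: output };
}
```

This trades a little response latency for a telemetry write that is known to have completed before the runtime can suspend the request.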
5) The wrong system or key was used
Always confirm:
- AGENTID_API_KEY
- AGENTID_SYSTEM_ID
- baseUrl
match the target AgentID environment.
Recommended Verification Pattern
When debugging a client integration, verify in this order:
1. GET /api/v1/agent/config returns 200
2. POST /api/v1/guard returns 200 or 403 with the expected client_event_id
3. POST /api/v1/ingest returns 200 with success: true
4. the same client_event_id appears in ai_events
If step 2 works but steps 3 and 4 do not, the bug is in the post-model telemetry path, not the guard engine.
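The ordering can be encoded as a small triage function that takes results you have already collected by hand or from a script. The Probe shape, the diagnose name, and the returned hint strings are assumptions for this sketch, not AgentID output.

```typescript
// Hypothetical triage helper: inputs are observations gathered from the four checks.
type Probe = {
  configStatus: number;   // GET /api/v1/agent/config
  guardStatus: number;    // POST /api/v1/guard
  ingestStatus: number;   // POST /api/v1/ingest
  ingestSuccess: boolean; // success flag in the ingest response body
  eventFound: boolean;    // same client_event_id visible in ai_events
};

// Return the first stage that fails, checked in the recommended order.
function diagnose(p: Probe): string {
  if (p.configStatus !== 200) return "config: check API key, system id, and baseUrl";
  if (p.guardStatus !== 200 && p.guardStatus !== 403) return "guard: preflight call is failing";
  if (p.ingestStatus !== 200 || !p.ingestSuccess) return "ingest: post-model telemetry path is failing";
  if (!p.eventFound) return "storage: client_event_id not found in ai_events";
  return "ok";
}
```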
Minimal Explicit Integration Pattern
If you are not using a supported automatic wrapper surface, do it explicitly:
1. assemble the full prompt/message history
2. await agent.guard(...) on that full input
3. if denied, stop before calling the provider
4. if allowed/masked, call the provider with the protected input/history
5. mask/protect provider output before returning it to the user
6. read model, usage tokens, and latency from the provider response
7. await agent.log(...) with the protected input/output plus model, usage, and latency
8. include `full_history_protected=true` metadata only if every message in the
provider payload was protected
This is the most reliable integration pattern for custom app architectures.
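The eight steps above can be sketched end to end with injected dependencies, which keeps the example runnable without any SDK installed. Everything named here (Deps, protect, guardedCall, and the latency_ms, input, and output field names) is hypothetical; in a real route, guard and log would be thin adapters over agent.guard(...) and agent.log(...).

```typescript
type Msg = { role: string; content: string };
type Verdict = { allowed: boolean; reason?: string };
type Usage = { prompt_tokens: number; completion_tokens: number; total_tokens: number };
type ProviderResult = { model: string; output: string; usage: Usage };

// Hypothetical dependency bundle so the flow can be exercised with stubs.
type Deps = {
  protect: (text: string) => string; // masks PII/secrets in a single text
  guard: (history: Msg[]) => Promise<Verdict>;
  callProvider: (history: Msg[]) => Promise<ProviderResult>;
  log: (event: Record<string, unknown>) => Promise<void>;
};

async function guardedCall(history: Msg[], deps: Deps): Promise<string> {
  // Steps 1-2: protect every message of the assembled history, then guard once.
  const protectedHistory = history.map((m) => ({ ...m, content: deps.protect(m.content) }));
  const verdict = await deps.guard(protectedHistory);
  // Step 3: stop before the provider is called.
  if (!verdict.allowed) throw new Error(`Blocked: ${verdict.reason ?? "denied"}`);
  // Step 4: the provider only ever sees the protected history.
  const startedAt = Date.now();
  const result = await deps.callProvider(protectedHistory);
  const latencyMs = Date.now() - startedAt;
  // Step 5: protect the output before it leaves this function.
  const safeOutput = deps.protect(result.output);
  // Steps 6-8: log protected input/output with model, usage, and latency.
  await deps.log({
    event_type: "complete",
    model: result.model,
    usage: result.usage,
    latency_ms: latencyMs,
    input: protectedHistory,
    output: safeOutput,
    // True here only because every message went through deps.protect above.
    metadata: { full_history_protected: true },
  });
  return safeOutput;
}
```

The sketch issues exactly one guard call and one log call per provider call, and the log is awaited so the caller cannot return before telemetry is written.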
Since agentid-sdk@0.1.40, fail-open dependency fallback keeps deterministic
local PII and secret masking active when /agent/config or /guard is
unreachable. Fail-open can preserve availability, but it must not be interpreted
as permission to send raw sensitive text to the model provider.
The important integration detail is the unit of work: one provider call should
produce one protected history and one guard() call. Do not iterate over old
messages and emit one guard event per prior turn.
For Node/OpenAI manual routes, use the SDK helper on the same messages object
that will be sent to the provider:
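import OpenAI from "openai";
import { AgentID, protectMessageHistory } from "agentid-sdk";
const agent = new AgentID();
// The provider client used below; by default it reads OPENAI_API_KEY from the environment.
const openai = new OpenAI();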
const protectedHistory = protectMessageHistory(body.messages, {
pii: true,
secrets: true,
});
const verdict = await agent.guard({
system_id: process.env.AGENTID_SYSTEM_ID!,
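// extractLatestUserInput is not defined in this snippet; it is assumed to be an
// app-local helper that returns the text of the last user message.
input: extractLatestUserInput(protectedHistory.messages),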
model: "gpt-4o-mini",
metadata: {
runtime_surface: "manual_provider_integration",
full_history_protected: true,
messages_count: body.messages.length,
protected_messages_count: protectedHistory.messages.length,
prompt_text_parts_count: protectedHistory.textPartsCount,
transformed_prompt_text_parts_count:
protectedHistory.transformedTextPartsCount,
},
});
if (!verdict.allowed) throw new Error(`Blocked: ${verdict.reason}`);
const response = await openai.chat.completions.create({
model: "gpt-4o-mini",
messages: protectedHistory.messages,
});
If the application already uses Vercel AI SDK and does not need custom manual orchestration, prefer agentid-vercel-sdk instead of rebuilding this lifecycle by hand.
Release Verification
Before deploying or publishing SDK packages from the monorepo, run:
npm run audit:all
npm run qa:production-gate
audit:all checks the root app, agentid-sdk, packages/vercel-sdk, and
packages/browser-extension. The production gate then runs secret scanning,
audits, lint, typecheck, unit tests, SDK builds/tests, browser extension
tests/typecheck/review build, and the Next production build.