
10. SDK Runtime Integration

What Automatically Creates Dashboard Activity

You only get a full runtime lifecycle in AgentID when both phases happen:

guard -> model execution -> ingest

That means:

  • guard() creates or updates the preflight side of the lifecycle
  • log() persists the post-execution telemetry row (the ingest phase)
  • SDK wrappers combine both when you use a supported wrapped surface

If the application only calls guard and never calls ingest, Activity, cost, and downstream graphs can look empty or incomplete.

Token, Cost, and ROI Telemetry Is Required

AgentID cost charts and ROI charts are computed from stored runtime telemetry. They are not inferred from the fact that a guard check happened.

For every successful model execution, the final completion event must include:

  • event_type: "complete"
  • the real provider model id, for example gpt-4o-mini
  • token usage from the provider response
  • model latency when available
  • the same client_event_id / event_id correlation used for the guarded call

Accepted token usage shapes include OpenAI-style and normalized names:

usage: {
  prompt_tokens: 33,
  completion_tokens: 9,
  total_tokens: 42,
}

// or
tokens: {
  input: 33,
  output: 9,
  total: 42,
}

Official wrappers collect this automatically when the provider exposes usage. Manual integrations must pass it explicitly to agent.log(...) or /api/v1/ingest. If usage is missing, AgentID can still show Activity rows, but token counters, cost_usd, Total Spend, cost charts, and ROI graphs will be empty or understated.
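
For manual integrations, it can help to normalize whichever usage shape the provider returns before building the completion event. The following is a minimal sketch, not part of agentid-sdk: normalizeUsage and its types are hypothetical names, and the only assumption taken from this page is the two accepted shapes shown above.

```typescript
// Hypothetical helper, not an agentid-sdk export.
type OpenAIStyleUsage = {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens?: number;
};

type NormalizedTokens = { input: number; output: number; total: number };

function normalizeUsage(payload: {
  usage?: OpenAIStyleUsage;
  tokens?: { input: number; output: number; total?: number };
}): NormalizedTokens | null {
  if (payload.usage) {
    const { prompt_tokens, completion_tokens, total_tokens } = payload.usage;
    return {
      input: prompt_tokens,
      output: completion_tokens,
      // Fall back to the sum when the provider omits the total.
      total: total_tokens ?? prompt_tokens + completion_tokens,
    };
  }
  if (payload.tokens) {
    const { input, output, total } = payload.tokens;
    return { input, output, total: total ?? input + output };
  }
  // No usage at all: the caller should treat cost and ROI as unavailable.
  return null;
}
```

A null result is a useful signal in tests: it marks exactly the case where Activity rows would appear without token or cost data.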

ROI also depends on the system's business context:

  • human_hourly_rate
  • human_time_per_task_min
  • a priced model id in the AgentID pricing catalog

Projected savings are calculated as human_cost_usd - cost_usd. If provider usage is present but the model id is unknown to pricing, token counts can appear while spend and ROI remain N/A.
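
The arithmetic can be sketched as follows. This is an illustration under stated assumptions, not AgentID's actual implementation: the pricing table is made up, and deriving human_cost_usd as human_hourly_rate × human_time_per_task_min / 60 is an assumption about how the two business-context fields combine. Only the final subtraction comes from this page.

```typescript
// USD per one million tokens. The values below are invented for the example
// and do not reflect any real pricing catalog.
type Pricing = { inputPerMTok: number; outputPerMTok: number };

const EXAMPLE_PRICING: Record<string, Pricing> = {
  "example-model": { inputPerMTok: 1, outputPerMTok: 2 },
};

function projectedSavingsUsd(args: {
  model: string;
  inputTokens: number;
  outputTokens: number;
  humanHourlyRate: number;
  humanTimePerTaskMin: number;
}): number | null {
  const price = EXAMPLE_PRICING[args.model];
  // Unknown model id: tokens may be known, but spend and ROI stay N/A.
  if (!price) return null;

  const costUsd =
    (args.inputTokens * price.inputPerMTok +
      args.outputTokens * price.outputPerMTok) /
    1_000_000;

  // Assumed derivation of the human baseline cost for one task.
  const humanCostUsd = args.humanHourlyRate * (args.humanTimePerTaskMin / 60);

  return humanCostUsd - costUsd;
}
```

With 33 input tokens, 9 output tokens, a 60 USD hourly rate, and 5 minutes per task, the example pricing gives a model cost of 0.000051 USD against a human cost of 5 USD.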

Activity Rows vs Workflow Timeline

AgentID intentionally keeps prompt/guard telemetry inspectable as its own row, even when that prompt is part of a larger workflow.

For an agent run, expect two complementary views:

  • a standalone prompt/guard row with View Details and View Prompt
  • a workflow summary row with Open Workflow and the grouped step timeline

Use workflow_run_id / workflowRunId to group tool calls, delivery events, inbox events, workflow lifecycle rows, guard checks, and LLM calls. Do not reuse one client_event_id for the whole workflow; each event should keep its own idempotency key. Non-LLM workflow/tool/delivery rows should show Model: Not applicable and should not be spend-bearing unless model/cost fields are explicitly present.
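
One way to keep the two identifiers apart is to create the workflow_run_id once per run and mint a fresh client_event_id for every event. The sketch below is illustrative only: newWorkflowRun, the WorkflowEvent shape, and the step names other than "complete" are hypothetical.

```typescript
import { randomUUID } from "node:crypto";

type WorkflowEvent = {
  workflow_run_id: string; // shared by every event in the run
  client_event_id: string; // unique idempotency key per event
  event_type: string;
  model?: string; // omitted for non-LLM steps
};

// Returns an event factory bound to a single workflow run.
function newWorkflowRun(): (eventType: string, model?: string) => WorkflowEvent {
  const workflowRunId = randomUUID();
  return (eventType, model) => ({
    workflow_run_id: workflowRunId,
    client_event_id: randomUUID(),
    event_type: eventType,
    // Only LLM steps carry a model, so tool rows stay non-spend-bearing.
    ...(model ? { model } : {}),
  });
}

const emit = newWorkflowRun();
const guardEvent = emit("guard", "gpt-4o-mini");
const toolEvent = emit("tool_call"); // hypothetical step name
const completeEvent = emit("complete", "gpt-4o-mini");
```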

Supported Automatic Wrapper Surfaces

Current supported official wrapper surfaces:

  • Node.js / TypeScript SDK (agentid-sdk): wrapOpenAI(...).chat.completions.create(...)
  • Python SDK: wrap_openai(...).chat.completions.create(...)
  • Vercel AI SDK wrapper: generateText() / streamText() with withAgentId(...) wrapped models

Current unsupported surfaces unless you add your own integration:

  • responses.create
  • Assistants API
  • arbitrary custom provider methods
  • app-local helper functions that never call agent.log()

The unsupported list still applies across JS, Python, and Vercel AI SDK flows unless you add an explicit integration path.

Enterprise Rollout Rule

The main rollout risk is usually not classifier quality. It is integration coverage.

If any production path still sends prompt or chat history through an unsupported provider surface, raw OpenAI client, custom fetch, or app-local helper outside a supported wrapper, AgentID does not automatically protect that path.

Before calling a deployment enterprise-ready:

  1. inventory every production LLM callsite
  2. map each one to a supported wrapper or an explicit guard -> provider -> log flow
  3. remove parallel raw provider calls in the same request path
  4. verify chat routes protect the exact full history that reaches the provider
  5. verify telemetry reflects that coverage, for example full_history_protected=true on wrapped chat flows
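
The inventory from steps 1 and 2 can be kept as data and checked mechanically, for example in CI. This is a rough sketch: the Callsite shape and the surface strings are hypothetical conventions for the example, chosen to mirror the supported wrapper surfaces listed earlier.

```typescript
type Callsite = {
  file: string;
  surface: string; // the provider method actually invoked
  wrapped: boolean; // goes through an official wrapper
  explicitGuardLog: boolean; // hand-written guard -> provider -> log flow
};

// Surfaces that official wrappers cover, per the list earlier on this page.
const AUTO_WRAPPED_SURFACES = new Set([
  "chat.completions.create",
  "generateText",
  "streamText",
]);

// Returns every callsite that AgentID would not protect automatically.
function uncoveredCallsites(inventory: Callsite[]): Callsite[] {
  return inventory.filter((c) => {
    const autoCovered = c.wrapped && AUTO_WRAPPED_SURFACES.has(c.surface);
    return !autoCovered && !c.explicitGuardLog;
  });
}

const inventory: Callsite[] = [
  { file: "app/api/chat/route.ts", surface: "streamText", wrapped: true, explicitGuardLog: false },
  { file: "lib/summarize.ts", surface: "responses.create", wrapped: false, explicitGuardLog: false },
  { file: "lib/classify.ts", surface: "responses.create", wrapped: false, explicitGuardLog: true },
];

const gaps = uncoveredCallsites(inventory);
```

Here only lib/summarize.ts is reported: it uses an unsupported surface and has no explicit flow.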

Current wrapper scope note:

  • OpenAI chat wrappers in this repo currently derive preflight guard input from the last user message text.
  • The Vercel AI SDK wrapper currently derives preflight guard input from the last user message text plus supported inline attachments on that last user turn.
  • Full-history masking/protection for provider dispatch can still apply on wrapped routes even when preflight scope is narrower.
  • If you need preflight evaluation over the exact assembled multi-turn history today, use an explicit guard -> provider -> log flow for that route.

Pre-LLM Protection Is Not The Same As Masked Logging

This distinction is critical for chat and agent applications.

AgentID protects data before the LLM only when the actual provider call goes through a supported wrapper or an equivalent explicit guard -> provider -> log integration.

A common incorrect integration is:

raw user input -> provider / LLM
masked copy -> AgentID log

That can make the dashboard look masked after refresh, but it does not protect the model. The LLM has already seen the raw value.

The correct integration is:

raw user input/history -> AgentID wrapper -> protected input/history -> provider / LLM
-> protected output -> app UI + AgentID ingest

Official Vercel AI SDK wrapper telemetry includes full_history_protected=true when SDK-side masking is enabled and the wrapper protects the full prompt history. Manual integrations should set the same signal only after they have protected every message text part that will be sent to the provider.

For a chat app, protect the complete message history. Do not protect only the latest input field. The provider often receives:

  • system prompts
  • previous user messages
  • previous assistant messages
  • tool results
  • retrieval/memory context
  • the latest user message

If any of those entries contains raw PII or secrets and bypasses the wrapper, the model can still remember or repeat it later.

Example failure signal:

User: My name is Jan Kroupa
Assistant after refresh/log view: My name is <PERSON_1>
User: What name did I send in the previous message?
Assistant: Jan Kroupa

This means the raw name reached the model context. A masked dashboard row alone does not prove pre-LLM protection.
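
The failure signal above can be turned into a regression check. The sketch below is a simplified, hypothetical helper (findLeakedValues is not an SDK function): it only does a case-insensitive substring match, so it catches verbatim repeats but not paraphrases or partial leaks.

```typescript
// Returns the raw sensitive values that appear verbatim in a model reply.
function findLeakedValues(reply: string, rawValues: string[]): string[] {
  const haystack = reply.toLowerCase();
  return rawValues.filter((value) => {
    const needle = value.trim().toLowerCase();
    return needle !== "" && haystack.includes(needle);
  });
}

// A reply that repeats the raw name indicates the model saw unprotected history.
const leaked = findLeakedValues("Jan Kroupa", ["Jan Kroupa"]);

// A reply that only knows the placeholder is the expected protected behavior.
const clean = findLeakedValues("You told me your name is <PERSON_1>.", ["Jan Kroupa"]);
```

In an integration test, the reply would come from the real wrapped route, and the test would fail whenever the returned list is non-empty.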

Node.js / TypeScript SDK Semantics

The published agentid-sdk package is designed so that:

  • agent.guard(...) is a blocking preflight step: it returns a promise that must be awaited before the provider call
  • agent.log(...) returns a promise and can be awaited directly
  • wrapOpenAI() calls /guard before chat.completions.create
  • on non-streaming completions, the wrapper performs the primary /ingest write before the wrapped call resolves

Practical implication:

If an app logs only create:start / create:ok, that does not prove AgentID ingest happened. It only proves the application believes the wrapped model call succeeded.

You still need to confirm one of these:

  • the call was really secured.chat.completions.create(...)
  • or the application explicitly awaited agent.log(...)

Vercel AI SDK Wrapper Semantics

The dedicated agentid-vercel-sdk package is designed so that:

  • withAgentId(...) wraps an existing Vercel AI SDK model
  • generateText() and streamText() stay unchanged at the application callsite
  • the wrapper calls /guard before the provider request
  • denied prompts throw AgentIdSecurityError before the provider is billed
  • allowed prompts can be rewritten from transformed_input before execution
  • completion telemetry is written through /ingest
  • sdk_ingest_ms is finalized through /ingest/finalize
  • SDK-side masking, when enabled by runtime config, rewrites sensitive input before provider dispatch and rewrites sensitive output before it is returned to the app caller

Safe Vercel AI SDK pattern:

import { openai } from "@ai-sdk/openai";
import { streamText } from "ai";
import { withAgentId } from "agentid-vercel-sdk";

const secureModel = withAgentId(openai("gpt-4o"), {
  systemId: process.env.AGENTID_SYSTEM_ID!,
  apiKey: process.env.AGENTID_API_KEY!,
});

const result = streamText({
  model: secureModel,
  messages: fullConversationHistory,
});

return result.toTextStreamResponse();

Unsafe pattern:

// This bypasses AgentID for the real LLM call.
const result = streamText({
  model: openai("gpt-4o"),
  messages: rawConversationHistory,
});

// Logging a masked copy later does not undo the provider leak.
await agent.log({ input: maskedInput, output: maskedOutput, system_id: systemId });

For AI coding agents implementing an app, the checklist is:

  1. Locate the exact server route/action/function that calls the LLM.
  2. Wrap the model/client at that callsite.
  3. Pass the wrapped model/client into the real provider call.
  4. Route the full messages array through that wrapped call.
  5. Remove direct provider calls in the same request path.
  6. Render the wrapped response/stream, not a raw provider stream.
  7. Preserve the provider usage object on completion telemetry so token, cost, and ROI dashboards populate.
  8. Add a test where a previous message contains a name and a later message asks the model to repeat it. The answer must not contain the raw name.
  9. Add a telemetry test or dashboard check that the completion row has input_tokens, output_tokens, and cost_usd when the model is priced.
  10. Confirm the Activity detail shows full_history_protected=true for chat integrations with multi-message context.

Streaming behavior:

  • the user-visible stream is not blocked by post-flight telemetry
  • the wrapper observes a forked stream branch and finalizes telemetry after the stream completes

Current provider coverage in this repo:

  • @ai-sdk/openai non-stream
  • @ai-sdk/openai stream
  • @ai-sdk/anthropic non-stream
  • @ai-sdk/anthropic stream

Browser Extension Masking Semantics

The browser extension is not an SDK integration. It reports browser-origin capabilities and relies on the backend to protect stored logs when client-side masking is not applied.

In Activity detail:

  • Server masking means AgentID masked PII or secrets before storage
  • browser extension flows are expected to show SDK masking as - or no; do not label this as SDK fallback
  • Input transformed before storage: yes indicates that the stored prompt/log content was protected

Use this distinction when debugging extension events: SDK-side masking applies to agentid-sdk, agentid-vercel-sdk, Python SDK, and LangChain wrappers; browser-extension masking labels describe the extension/backend storage path.

Common Reasons No AgentID Event Appears

1) The app used an unsupported provider surface

Example:

await client.responses.create(...)

If only chat.completions.create is wrapped, this path bypasses AgentID wrapper telemetry.

For Vercel AI SDK apps, the equivalent mistake is calling an unwrapped model directly instead of the result of withAgentId(...).

2) The app only called guard

Guard can allow the prompt and still produce no final complete row if the post-model ingest step never runs.

3) The app logged completion without token usage

Activity can show the request while cost and ROI stay empty if the completion event omits usage / tokens, uses Model: Not applicable, or uses a custom model id that is not mapped to pricing.

4) The app returned HTTP 200 before its own background telemetry completed

If the application starts background work after sending the response, the worker/runtime may drop the ingest request depending on the framework and hosting model.

5) The wrong system or key was used

Always confirm that these values match the target AgentID environment:

  • AGENTID_API_KEY
  • AGENTID_SYSTEM_ID
  • baseUrl

When debugging a client integration, verify in this order:

  1. GET /api/v1/agent/config returns 200
  2. POST /api/v1/guard returns 200 or 403 with the expected client_event_id
  3. POST /api/v1/ingest returns 200 with success: true
  4. the same client_event_id appears in ai_events

If step 2 works but steps 3 and 4 do not, the bug is in the post-model telemetry path, not the guard engine.
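
The verification order can be encoded as a small triage function that takes the observed results and names the first failing step. This is a sketch with hypothetical names (ProbeResults, firstFailingStep); it does not perform the HTTP calls itself and assumes the statuses were collected by hand or by a separate script.

```typescript
type ProbeResults = {
  configStatus: number; // GET /api/v1/agent/config
  guardStatus: number; // POST /api/v1/guard
  ingestStatus: number; // POST /api/v1/ingest
  ingestSuccess: boolean; // success field in the ingest response body
  eventFoundInAiEvents: boolean; // same client_event_id visible in ai_events
};

type Step = "config" | "guard" | "ingest" | "ai_events" | "ok";

// Checks the steps in the documented order and stops at the first failure.
function firstFailingStep(r: ProbeResults): Step {
  if (r.configStatus !== 200) return "config";
  // 403 is a valid guard outcome: the prompt was evaluated and denied.
  if (r.guardStatus !== 200 && r.guardStatus !== 403) return "guard";
  if (r.ingestStatus !== 200 || !r.ingestSuccess) return "ingest";
  if (!r.eventFoundInAiEvents) return "ai_events";
  return "ok";
}

// Guard works but ingest never succeeds: a post-model telemetry bug.
const diagnosis = firstFailingStep({
  configStatus: 200,
  guardStatus: 200,
  ingestStatus: 500,
  ingestSuccess: false,
  eventFoundInAiEvents: false,
});
```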

Minimal Explicit Integration Pattern

If you are not using a supported automatic wrapper surface, do it explicitly:

  1. assemble the full prompt/message history
  2. await agent.guard(...) on that full input
  3. if denied, stop before calling the provider
  4. if allowed/masked, call the provider with the protected input/history
  5. mask/protect provider output before returning it to the user
  6. read model, usage tokens, and latency from the provider response
  7. await agent.log(...) with the protected input/output plus model, usage, and latency
  8. include full_history_protected=true metadata only if every message in the provider payload was protected

This is the most reliable integration pattern for custom app architectures.

Since agentid-sdk@0.1.40, fail-open dependency fallback keeps deterministic local PII and secret masking active when /agent/config or /guard is unreachable. Fail-open can preserve availability, but it must not be interpreted as permission to send raw sensitive text to the model provider.

The important integration detail is the unit of work: one provider call should produce one protected history and one guard() call. Do not iterate over old messages and emit one guard event per prior turn.

For Node/OpenAI manual routes, use the SDK helper on the same messages object that will be sent to the provider:

import OpenAI from "openai";
import { AgentID, protectMessageHistory } from "agentid-sdk";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const agent = new AgentID();

// Protect the exact messages array that will be sent to the provider.
const protectedHistory = protectMessageHistory(body.messages, {
  pii: true,
  secrets: true,
});

// One guard() call per provider call.
// extractLatestUserInput is assumed to be an app-local helper (not shown here).
const verdict = await agent.guard({
  system_id: process.env.AGENTID_SYSTEM_ID!,
  input: extractLatestUserInput(protectedHistory.messages),
  model: "gpt-4o-mini",
  metadata: {
    runtime_surface: "manual_provider_integration",
    full_history_protected: true,
    messages_count: body.messages.length,
    protected_messages_count: protectedHistory.messages.length,
    prompt_text_parts_count: protectedHistory.textPartsCount,
    transformed_prompt_text_parts_count:
      protectedHistory.transformedTextPartsCount,
  },
});
if (!verdict.allowed) throw new Error(`Blocked: ${verdict.reason}`);

// Send only the protected history to the provider.
const response = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: protectedHistory.messages,
});
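
The route above stops at the provider call. Steps 6 and 7 of the explicit pattern still need a completion event that carries model, usage, and latency. The sketch below only assembles that payload as a plain object and does not call the SDK. The system_id, input, output, event_type, usage, and client_event_id fields follow this page; the model and latency_ms field names and the buildCompletionEvent helper are assumptions, so check them against the SDK types before use.

```typescript
// Minimal shape of the provider response fields this sketch reads.
type ChatCompletionLike = {
  model: string;
  usage?: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
};

function buildCompletionEvent(args: {
  systemId: string;
  clientEventId: string; // same id that was used for the guarded call
  protectedInput: string;
  protectedOutput: string;
  response: ChatCompletionLike;
  startedAtMs: number;
  finishedAtMs: number;
}) {
  return {
    system_id: args.systemId,
    event_type: "complete" as const,
    client_event_id: args.clientEventId,
    input: args.protectedInput,
    output: args.protectedOutput,
    // Real provider model id and usage, taken from the response itself.
    model: args.response.model,
    usage: args.response.usage,
    // Assumed field name for model latency.
    latency_ms: args.finishedAtMs - args.startedAtMs,
  };
}

const completionEvent = buildCompletionEvent({
  systemId: "sys_example",
  clientEventId: "evt_example",
  protectedInput: "My name is <PERSON_1>",
  protectedOutput: "Nice to meet you, <PERSON_1>.",
  response: {
    model: "gpt-4o-mini",
    usage: { prompt_tokens: 33, completion_tokens: 9, total_tokens: 42 },
  },
  startedAtMs: 1_000,
  finishedAtMs: 1_250,
});
```

In a real route, the resulting object would be passed to an awaited agent.log(...) call before the response is returned, so the hosting runtime cannot drop the ingest request.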

If the application already uses Vercel AI SDK and does not need custom manual orchestration, prefer agentid-vercel-sdk instead of rebuilding this lifecycle by hand.

Release Verification

Before deploying or publishing SDK packages from the monorepo, run:

npm run audit:all
npm run qa:production-gate

audit:all checks the root app, agentid-sdk, packages/vercel-sdk, and packages/browser-extension. The production gate then runs secret scanning, audits, lint, typecheck, unit tests, SDK builds/tests, browser extension tests/typecheck/review build, and the Next production build.