Tools

A tool is an atomic action the LLM can take during a turn: search memory, send a Slack message, create a HubSpot contact, schedule a future task, spawn a sub-agent. Tools have typed inputs (validated by Zod), typed outputs, a stable name, and a description the LLM reads when deciding whether to call them. Polyant tools self-register at boot — adding a new tool is one file.

This page covers the registry, the ToolContext passed to every tool, the strict-mode constraints on Zod schemas, the per-instance enablement model, harness tools, and the conventions for shipping a reliable tool.

Self-registering registry

Every tool lives in packages/engine/src/agents/tools/ as a file matching *.tool.ts. At boot, loadAllTools() scans the directory and dynamic-imports every match. Each file calls registerTool(...) at module level as a side effect, populating an in-memory registry keyed by tool name.

The supervisor queries the registry at request time, filters it by what the active instance has enabled (more on that below), and hands the filtered set to the Vercel AI SDK as a Record<string, Tool>. There are no hardcoded imports, no central tool list to keep in sync.

After all imports complete, the registry prunes any tool whose requiredEnv vars (legacy, process-env-based) are missing, logging a warning. Most tools today rely on per-instance secrets instead — see below.
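As a rough illustration of the side-effect registration and the requiredEnv pruning described above, the sketch below models the registry as a plain Map. The ToolDef shape and the pruneMissingEnv name are hypothetical simplifications; the real definitions in registry.ts carry more fields.

```typescript
// Hypothetical, trimmed-down stand-in for the real ToolDefinition.
interface ToolDef {
  name: string;
  description: string;
  requiredEnv?: string[]; // legacy, process-env-based requirements
}

// In-memory registry keyed by tool name.
const registry = new Map<string, ToolDef>();

// Each *.tool.ts file calls this at module level, so importing the file
// is enough to register the tool.
function registerTool(def: ToolDef): void {
  registry.set(def.name, def);
}

// Runs once all imports have completed: drops every tool whose requiredEnv
// vars are missing and returns the pruned names so the caller can log them.
function pruneMissingEnv(env: Record<string, string | undefined>): string[] {
  const pruned: string[] = [];
  for (const def of Array.from(registry.values())) {
    const missing = (def.requiredEnv ?? []).filter((key) => !env[key]);
    if (missing.length > 0) {
      registry.delete(def.name);
      pruned.push(def.name);
    }
  }
  return pruned;
}

registerTool({ name: "searchMemory", description: "Search stored memories." });
registerTool({
  name: "legacyTool",
  description: "Needs a process-level env var.",
  requiredEnv: ["LEGACY_API_KEY"],
});

// No env vars are set in this example, so only legacyTool is pruned.
const prunedNames = pruneMissingEnv({});
```

In the engine the env argument would be process.env; it is passed in here only to keep the sketch deterministic.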

Tool definition API


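import { z } from "zod";
// registry.ts sits next to the *.tool.ts files; adjust the path or extension to your build setup.
import { registerTool } from "./registry";

// doTheThing and errMsg are placeholders: the tool's own logic, and a helper
// that turns an unknown error into a readable string.
registerTool({
  name: "myTool",
  description: "What this tool does, in one sentence. The LLM reads this.",
  category: "integration",
  requiredSecrets: [
    { key: "openai_api_key", type: "text", label: "OpenAI API key", optional: false },
  ],
  inputExamples: [
    { label: "Common use case", input: { foo: "bar" } },
  ],
  // create runs once per request and receives a fresh ToolContext.
  create: (ctx) => ({
    parameters: z.object({ foo: z.string() }),
    execute: async (params) => {
      try {
        const result = await doTheThing(params.foo, ctx.secrets?.openai_api_key);
        return { success: true, result };
      } catch (err) {
        // Never throw: return a structured failure the LLM can read.
        return { success: false, error: errMsg(err) };
      }
    },
  }),
});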

The key field is create. It is a factory that runs per request: it receives a fresh ToolContext and returns the Zod schema plus the execute function bound to that context. This is how the same tool talks to instance A’s HubSpot portal and instance B’s HubSpot portal in parallel without leaking state.
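To make the isolation claim concrete, here is a minimal sketch of a factory that closes over its context. The trimmed ToolContext, the createContactLookup name, and the result shape are assumptions for illustration; a real tool would call HubSpot where the comment indicates.

```typescript
// Hypothetical, trimmed-down ToolContext with only the fields used here.
interface ToolContext {
  instanceId: string;
  secrets?: Record<string, string>;
}

type LookupResult =
  | { success: true; instanceId: string; email: string }
  | { success: false; error: string };

// The factory closes over ctx, so every request gets its own bound execute().
function createContactLookup(ctx: ToolContext) {
  return {
    execute: async (params: { email: string }): Promise<LookupResult> => {
      const key = ctx.secrets?.hubspot_api_key;
      if (!key) {
        return { success: false, error: "hubspot_api_key is not configured" };
      }
      // A real tool would call the HubSpot API with `key` here. The sketch
      // only reports which instance the call was bound to.
      return { success: true, instanceId: ctx.instanceId, email: params.email };
    },
  };
}

// Two requests for two instances, built from the same factory.
const toolA = createContactLookup({ instanceId: "acme", secrets: { hubspot_api_key: "key-a" } });
const toolB = createContactLookup({ instanceId: "globex", secrets: { hubspot_api_key: "key-b" } });
const toolWithoutKey = createContactLookup({ instanceId: "initech" });
```

Because no state lives outside the closure, toolA and toolB can run concurrently without seeing each other's secrets.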

ToolContext

Every tool factory receives a ToolContext carrying everything the tool needs to act on behalf of the current turn:

instanceId — the instance slug (not a UUID). Most tool-level operations key off the slug; the few places that need the UUID resolve it via resolveInstanceId().
secrets — decrypted per-instance secrets and any tool-declared requiredSecrets values.
audit — a scoped AuditLogger that already carries the tool name, instance, and conversation context.
conversationId — for correlation in audit logs and tool results.
attachments — images, files, audio carried by the current user message.
apiKeys — per-instance AI provider keys, used by tools that themselves call the LLM (e.g. verifyDocument).
provider — the instance’s AI provider name (openai, anthropic, bedrock) for tool-level LLM calls.

The audit logger writes one row per tool call into a dedicated audit table. Sensitive arguments (Slack message bodies, secret values) are deliberately not logged — only the field’s length or a redacted marker.
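The redaction rule can be pictured with a small helper. This is a sketch under stated assumptions: the redactArgs name, the list of sensitive field names, and the marker format are all hypothetical and are not taken from audit-logger.ts.

```typescript
// Assumed set of argument names that must never be logged verbatim.
const SENSITIVE_FIELDS = new Set(["text", "body", "token", "apiKey"]);

// Returns a copy of the arguments that is safe to store in an audit row:
// sensitive strings are replaced by their length, other sensitive values
// by a plain marker.
function redactArgs(args: Record<string, unknown>): Record<string, unknown> {
  const safe: Record<string, unknown> = {};
  for (const [field, value] of Object.entries(args)) {
    if (!SENSITIVE_FIELDS.has(field)) {
      safe[field] = value;
    } else if (typeof value === "string") {
      safe[field] = `[redacted, ${value.length} chars]`;
    } else {
      safe[field] = "[redacted]";
    }
  }
  return safe;
}

const logged = redactArgs({ channel: "C123", text: "hello team" });
// logged is { channel: "C123", text: "[redacted, 10 chars]" }
```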

Strict-mode Zod constraints

To keep tool schemas portable across OpenAI and Anthropic function-calling formats, Polyant enforces a small set of Zod conventions. There is no separate strict-mode.ts module — the contract is codified entirely in the test suite at packages/engine/src/agents/tools/strict-mode.test.ts, which iterates over every registered tool and fails the build on any violation. The forbidden patterns are:

No .url() — many models reject the JSON Schema format: "uri" annotation. Use plain z.string() with documentation.
No .optional() without .nullable() — optional-but-not-nullable produces inconsistent behaviour across providers. Use z.string().nullable() or z.string().nullish().
No z.record(z.unknown()) — open-ended records confuse function-calling. Pin the shape (z.record(z.string(), z.string())) or use a typed object.

Because the check runs with the test suite, a schema violation fails locally or in CI and never reaches production.
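To show what the three rules mean in practice, the sketch below checks them against a simplified JSON-Schema-like object. This is not how strict-mode.test.ts works (that suite inspects the Zod schemas of registered tools); the SchemaNode shape and the findViolations name are assumptions chosen so the example runs without dependencies.

```typescript
// Minimal JSON-Schema-like node, assumed for this sketch only.
interface SchemaNode {
  type?: string | string[];
  format?: string;
  properties?: Record<string, SchemaNode>;
  required?: string[];
  additionalProperties?: boolean | SchemaNode;
}

function allowsNull(node: SchemaNode): boolean {
  return Array.isArray(node.type) ? node.type.includes("null") : node.type === "null";
}

// Returns one message per violation of the three rules.
function findViolations(node: SchemaNode, path = "$"): string[] {
  const issues: string[] = [];

  // Rule 1: .url() becomes format "uri".
  if (node.format === "uri") issues.push(`${path}: format "uri"`);

  // Rule 3: a record of unknown values becomes an unconstrained additionalProperties.
  const extra = node.additionalProperties;
  if (extra === true || (typeof extra === "object" && Object.keys(extra).length === 0)) {
    issues.push(`${path}: open-ended record`);
  } else if (typeof extra === "object") {
    issues.push(...findViolations(extra, `${path}.*`));
  }

  // Rule 2: a property that may be omitted must also accept null.
  const required = new Set(node.required ?? []);
  for (const [key, child] of Object.entries(node.properties ?? {})) {
    if (!required.has(key) && !allowsNull(child)) {
      issues.push(`${path}.${key}: optional but not nullable`);
    }
    issues.push(...findViolations(child, `${path}.${key}`));
  }
  return issues;
}

const badSchema: SchemaNode = {
  type: "object",
  required: ["link", "meta"],
  properties: {
    link: { type: "string", format: "uri" },
    note: { type: "string" },
    meta: { type: "object", additionalProperties: {} },
  },
};

const goodSchema: SchemaNode = {
  type: "object",
  required: ["tags"],
  properties: {
    note: { type: ["string", "null"] },
    tags: { type: "object", additionalProperties: { type: "string" } },
  },
};
```

Here badSchema trips each rule once, while goodSchema mirrors the recommended replacements (a nullable optional string and a record with pinned value types).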

Per-instance enablement

The registry holds every tool the framework knows about. What an individual instance can actually use is gated by the instance_tools table — one row per (instance_id, tool_name) with an enabled flag.

The catalogue and instance_tools are kept in sync automatically:

On boot, every registered tool is upserted into the tools catalogue table.
When a skill is enabled on an instance, the skill’s requiredTools are auto-enabled on instance_tools.
When a skill is disabled, the dependency is recomputed (no orphan enablement).
Operators can also toggle tools directly from the admin panel.

The supervisor’s buildTools pulls the enabled set via getEnabledToolNames() and only constructs Vercel AI SDK tool({...}) instances for those names. Tools that depend on missing secrets are surfaced in the admin panel as “misconfigured” so operators can spot the gap.

`syncToolsToDb` and the orphan cascade

On every boot, syncToolsToDb() (in tools-sync.ts) treats the freshly loaded registry as the source of truth and reconciles the tools catalogue table against it. Tools present in the DB but not in the registry (e.g. a *.tool.ts file you deleted, renamed, or refactored away) are deleted from tools. A foreign-key cascade on instance_tools(tool_name) then removes every per-instance enablement row that references the now-gone tool. The net effect: rename a tool on disk, restart the engine, and every instance that had the old name enabled silently loses that row. Re-enable the tool from the admin UI under its new name; there is no automated migration.
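The rename scenario is easier to follow with the two tables modelled in memory. The sketch below is an assumption-laden stand-in: Db, syncToolsSketch, and the row shapes are hypothetical, and the cascade that the database performs through its foreign key is emulated with a filter.

```typescript
// In-memory stand-ins for the tools catalogue and instance_tools.
interface Db {
  tools: Set<string>;
  instanceTools: { instanceId: string; toolName: string; enabled: boolean }[];
}

function syncToolsSketch(db: Db, registryNames: string[]): { removed: string[] } {
  const live = new Set(registryNames);

  // Upsert every registered tool into the catalogue.
  for (const name of registryNames) db.tools.add(name);

  // Delete catalogue rows that no longer have a registry entry.
  const removed = Array.from(db.tools).filter((name) => !live.has(name));
  for (const name of removed) db.tools.delete(name);

  // Emulate the foreign-key cascade on instance_tools(tool_name).
  db.instanceTools = db.instanceTools.filter((row) => !removed.includes(row.toolName));

  return { removed };
}

// Before the restart: acme has hubspotContact and searchMemory enabled.
const db: Db = {
  tools: new Set(["hubspotContact", "searchMemory"]),
  instanceTools: [
    { instanceId: "acme", toolName: "hubspotContact", enabled: true },
    { instanceId: "acme", toolName: "searchMemory", enabled: true },
  ],
};

// The tool file was renamed on disk, so the registry now reports a new name.
const syncResult = syncToolsSketch(db, ["hubspotContacts", "searchMemory"]);
```

After the call the catalogue knows hubspotContacts, but acme has no enablement row for it, which is exactly the gap an operator has to close by hand.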

Conditional visibility — what an agent actually sees on a given turn

The registry holds every tool, but the set actually handed to the LLM on a specific turn is filtered through five layers. A tool can disappear from a turn without ever being “disabled” in the admin UI.

1. Per-agent enablement (`instance_tools` table)

The standard gate. getEnabledToolNames(instanceId) returns the union of tools toggled on for this agent (manually or via skill requiredTools). Tools not in the set are skipped — except harness tools (see below), which bypass this layer entirely.

2. Feature flags on the agent (`memoryEnabled`, `knowledgeEnabled`)

Two flags on the instances row act as category-level kill-switches in the supervisor:

memoryEnabled = false → tools with category: "memory" (searchMemory, saveMemory) are skipped.
knowledgeEnabled = false → tools with category: "knowledge" (searchKnowledge, getKnowledge, writeKnowledge) are skipped.

The flags trump per-agent enablement: even if instance_tools has the row, the tool is dropped when the flag is off. This is what lets you offer a fast, memory-less agent variant by flipping one switch.

3. Missing `requiredSecrets`

Tools that declare requiredSecrets with optional: false are silently skipped whenever any of those secrets is absent from instance_secrets. Examples:

searchKnowledge, searchMemory, saveMemory → need openai_api_key (embeddings).
HubSpot tools (hubspotContact, hubspotTicket, …) → need hubspot_api_key.
GitHub tools (ghIssue, gitCloneRepo) → need github_token.
webSearch → needs tavily_api_key (or another configured provider key).
uploadAttachment → needs aws_access_key_id, aws_secret_access_key, aws_region, s3_bucket_name.

The skip is silent by design: the LLM never sees the tool, so it cannot try to call it and fail. Operators see a “misconfigured” badge in the admin panel where the secret is missing.

4. Harness tools — only injected in specific runtimes

Some tools are deliberately not user-toggled. They are wired into the framework for a specific runtime path and would be meaningless (or harmful) anywhere else. They declare harness: true plus a category and are:

Hidden from the admin panel’s tool list.
Skipped by the instance_tools enablement gate.
Injected by the supervisor only when it runs with a matching includeHarness: Set<string>.

The supervisor’s caller decides which harness categories to admit. Today’s runtimes:

Runtime	`includeHarness` value	Tools injected
Inbound (chat)	(none)	none — chat turns get only the standard catalogue
Scheduled task	(none)	none — same pipeline as Inbound
Sub-agent via `spawnTask`	(none, parent’s tools minus `spawnTask`)	none — sub-agents inherit parent tools, harness excluded
Room cycle	`{ "room" }`	`send_message_to_human`, `mark_room_event_completed`, `compact_room_history`
Webhook trigger	`{ <outboundChannel> }` (e.g. `"whatsapp"`)	`send_outbound_message` (always when triggered), plus channel-specific tools like `send_whatsapp_template` when the trigger’s `outboundChannel` matches

The categories are not interchangeable: a Room turn cannot send a WhatsApp template, and a webhook trigger cannot mark Room events completed. Each runtime carries the exact subset it needs.

spawnTask itself is a special case — registered with category: "agent" and built separately by createTaskTool() so it can be passed the parent’s tool list minus itself (which prevents recursion). It is exposed to the supervisor by default but never propagated to sub-agents.

5. Agent-to-agent tools — synthesised on the fly

When an instance_tools row has a name of the form agent:<targetSlug>, the supervisor synthesises a one-off tool at request time that lets this agent dispatch a task to the target agent. The dynamic tool is built only if the target agent has its agent channel enabled, and nesting is capped at depth 1 to keep loops bounded.

These tools exist nowhere on disk — they are pure synthesis driven by per-agent configuration.

Putting it together

The order of evaluation in buildTools() is roughly:


for every tool in registry:
  if tool.harness:
    keep iff tool.category ∈ includeHarness
  else:
    skip iff not enabled in instance_tools
    skip iff memoryEnabled=false and category="memory"
    skip iff knowledgeEnabled=false and category="knowledge"
    skip iff any non-optional requiredSecret is missing
then:
  add spawnTask (if available, parent context)
  add agent:<slug> tools (synthesised from instance_tools rows)

Net effect: the set of tools the LLM sees is a function of the agent’s row in instances, its instance_tools rows, its instance_secrets rows, and the runtime path. The same agent on the same DB exposes a different toolset to chat vs Room vs a WhatsApp webhook trigger.
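The registry loop in the pseudocode translates almost line for line into a predicate. The sketch below covers only that loop, under assumed type names (ToolDef, TurnState) and an assumed harness category of "room"; spawnTask and the synthesised agent:<slug> tools are left out.

```typescript
interface ToolDef {
  name: string;
  category: string;
  harness?: boolean;
  requiredSecrets?: { key: string; optional: boolean }[];
}

interface TurnState {
  enabledNames: Set<string>;        // from instance_tools
  memoryEnabled: boolean;           // flags on the instances row
  knowledgeEnabled: boolean;
  secrets: Record<string, string>;  // from instance_secrets
  includeHarness: Set<string>;      // chosen by the runtime
}

function isVisible(tool: ToolDef, turn: TurnState): boolean {
  // Harness tools bypass every other gate and depend only on the runtime.
  if (tool.harness) return turn.includeHarness.has(tool.category);
  if (!turn.enabledNames.has(tool.name)) return false;
  if (!turn.memoryEnabled && tool.category === "memory") return false;
  if (!turn.knowledgeEnabled && tool.category === "knowledge") return false;
  // Every non-optional secret must be present.
  return (tool.requiredSecrets ?? []).every((s) => s.optional || Boolean(turn.secrets[s.key]));
}

const tools: ToolDef[] = [
  { name: "searchMemory", category: "memory", requiredSecrets: [{ key: "openai_api_key", optional: false }] },
  { name: "hubspotContact", category: "integration", requiredSecrets: [{ key: "hubspot_api_key", optional: false }] },
  { name: "mark_room_event_completed", category: "room", harness: true },
];

const chatTurn: TurnState = {
  enabledNames: new Set(["searchMemory", "hubspotContact"]),
  memoryEnabled: true,
  knowledgeEnabled: true,
  secrets: { openai_api_key: "sk-test" }, // hubspot_api_key is missing
  includeHarness: new Set<string>(),
};

// Same agent, but running a Room cycle with memory switched off.
const roomTurn: TurnState = { ...chatTurn, memoryEnabled: false, includeHarness: new Set(["room"]) };

const visibleInChat = tools.filter((t) => isVisible(t, chatTurn)).map((t) => t.name);
const visibleInRoom = tools.filter((t) => isVisible(t, roomTurn)).map((t) => t.name);
```

With this data the chat turn exposes only searchMemory, and the Room turn exposes only mark_room_event_completed.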

Input examples

Every tool can declare inputExamples: short labelled snippets that the supervisor inlines next to the Zod schema description. Models pick up tool usage faster from a concrete example than from a schema alone. Each example is validated at boot against parameters.partial() so a stale example cannot ship.

Keep example labels to a single sentence; keep the input objects minimal.

Execute contract: never throw

Tool execute functions follow a strict convention:


return { success: true, ...payload };
// or
return { success: false, error: "human-readable explanation" };

They never throw. A throw bubbles into the supervisor as a runtime crash and tends to wedge the tool loop in retry storms. Catching at the tool boundary and returning a structured failure lets the LLM read the error, recover, and either try a different tool or surface a clean message to the user.
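One way to guarantee the contract is to wrap the tool body once instead of repeating try/catch in every tool. The neverThrow wrapper below is a hypothetical helper, and its errMsg is a local stand-in for whatever formatter the codebase provides.

```typescript
type ToolResult<T> =
  | { success: true; result: T }
  | { success: false; error: string };

// Local stand-in for an error formatter.
function errMsg(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

// Wraps an async function so that any throw becomes a structured failure.
function neverThrow<P, T>(
  fn: (params: P) => Promise<T>,
): (params: P) => Promise<ToolResult<T>> {
  return async (params) => {
    try {
      return { success: true, result: await fn(params) };
    } catch (err) {
      return { success: false, error: errMsg(err) };
    }
  };
}

const lookup = neverThrow(async (params: { id: string }) => {
  if (params.id === "") throw new Error("id must not be empty");
  return { id: params.id };
});
```

Calling lookup({ id: "" }) resolves to { success: false, error: "id must not be empty" } rather than rejecting, so the tool loop sees an ordinary result it can reason about.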

Execution isolation: tools that touch the filesystem or spawn subprocesses

A handful of tools do not just call HTTP APIs — they read or write files, or they spawn external processes (git, the GitHub gh CLI). These need stricter guardrails than a normal API-call tool, because a misstep can clobber the host filesystem, leak secrets through the process environment, or hold a token at rest. Polyant sandboxes them along three orthogonal axes.

Per-conversation workspace, never the host filesystem

Filesystem-touching tools (readFile, writeFile, listDirectory, gitCloneRepo) operate inside a per-conversation workspace:


{WORKSPACES_ROOT}/{instanceId}/conversations/{sanitizedConversationId}/

WORKSPACES_ROOT defaults to packages/engine/workspaces/ and is overridable via env var. The instance slug is validated against ^[a-z0-9][a-z0-9-]*$. The conversation id is sanitised — any character outside [a-zA-Z0-9._-] is replaced with _ — so a Slack id like slack:channel:U123 becomes slack_channel_U123 and never appears literally in a path.
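The two rules above fit in a few lines. This is a sketch that follows the patterns quoted in the text; conversationWorkspace is a hypothetical helper name, and the real sanitizeConversationId in workspace-utils.ts may do more than this.

```typescript
// Slug pattern quoted above.
const SLUG_PATTERN = /^[a-z0-9][a-z0-9-]*$/;

// Replace every character outside [a-zA-Z0-9._-] with an underscore.
function sanitizeConversationId(id: string): string {
  return id.replace(/[^a-zA-Z0-9._-]/g, "_");
}

// Build {WORKSPACES_ROOT}/{instanceId}/conversations/{sanitizedConversationId}/.
function conversationWorkspace(root: string, instanceId: string, conversationId: string): string {
  if (!SLUG_PATTERN.test(instanceId)) {
    throw new Error(`Invalid instance slug: ${instanceId}`);
  }
  const trimmedRoot = root.replace(/\/+$/, ""); // tolerate a trailing slash on the root
  return `${trimmedRoot}/${instanceId}/conversations/${sanitizeConversationId(conversationId)}/`;
}

const workspaceDir = conversationWorkspace("packages/engine/workspaces/", "acme", "slack:channel:U123");
// workspaceDir is "packages/engine/workspaces/acme/conversations/slack_channel_U123/"
```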

Every path the tool resolves passes through workspace-utils.ts:

resolveWorkspacePath() rejects null bytes, absolute paths, and ../ traversal that would escape the workspace.
After string normalisation it walks the path to the first existing ancestor and calls realpath() on it, then checks containment again — catching symlinks that would otherwise re-route the path outside the sandbox.
The same two checks gate absolute paths returned by gitCloneRepo and later handed to readFile (assertInsideConversationWorkspace).

Net effect: a malicious or confused agent cannot read /etc/passwd, write into a sibling conversation’s directory, or follow a symlink to escape its own workspace.
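The string-level half of these checks can be sketched with Node's path module. The function name and result shape below are assumptions, and the second pass (realpath on the first existing ancestor, to catch symlinks) is deliberately left out because it needs a real directory tree.

```typescript
import * as path from "node:path";

type Resolved = { ok: true; path: string } | { ok: false; reason: string };

// True when candidate is root itself or sits somewhere below it.
function isInside(root: string, candidate: string): boolean {
  const rel = path.relative(root, candidate);
  const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  return !escapes;
}

// String-level containment only; symlink resolution is not covered here.
function resolveInWorkspace(workspace: string, userPath: string): Resolved {
  if (userPath.includes("\0")) return { ok: false, reason: "null byte in path" };
  if (path.isAbsolute(userPath)) return { ok: false, reason: "absolute path" };

  const root = path.resolve(workspace);
  const target = path.resolve(root, userPath); // collapses "." and ".." segments
  if (!isInside(root, target)) return { ok: false, reason: "escapes workspace" };

  return { ok: true, path: target };
}

const workspace = "/srv/workspaces/acme/conversations/slack_channel_U123";
const allowed = resolveInWorkspace(workspace, "notes/todo.md");
const traversal = resolveInWorkspace(workspace, "../other_conversation/secret.txt");
const hiddenTraversal = resolveInWorkspace(workspace, "notes/../../other_conversation");
const absolute = resolveInWorkspace(workspace, "/etc/passwd");
```

Comparing the relative path against ".." plus the separator, rather than a bare ".." prefix, keeps legitimate names such as "..cache" inside the workspace from being rejected.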

Subprocess environment is filtered to a whitelist

The engine runs with the deployment’s full environment: DATABASE_URL, ENCRYPTION_KEY, AUTH_SECRET, and every provider API key. None of that may leak into a child git or gh process. The safeEnv() helper builds a fresh env from a small whitelist (PATH, HOME, USER, SHELL, TMPDIR, TERM, locale, Node CA-cert overrides, and HTTP-proxy vars) and lets the caller add only the tokens it actually needs:


execFile("git", args, {
  env: safeEnv({
    GIT_ASKPASS: askPassPath,
    GIT_TERMINAL_PROMPT: "0",
    OA_GITHUB_TOKEN: token,   // GitHub token, nothing else
  }),
})

DATABASE_URL, ENCRYPTION_KEY, AUTH_SECRET, and AI provider keys are never propagated to subprocesses.
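A whitelist builder of this kind might look like the sketch below. The exact variable list is an assumption derived from the categories named above, and buildSafeEnv takes the source environment as an argument (the real safeEnv presumably reads process.env) so the example stays deterministic.

```typescript
// Assumed whitelist; the list in safe-env.ts may differ.
const ALLOWED_VARS = [
  "PATH", "HOME", "USER", "SHELL", "TMPDIR", "TERM",
  "LANG", "LC_ALL",                        // locale
  "NODE_EXTRA_CA_CERTS",                   // Node CA-cert override
  "HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY", // proxy settings
];

// Start from an empty object, copy only whitelisted vars, then add the
// caller's explicit extras. Nothing else from the source env survives.
function buildSafeEnv(
  source: Record<string, string | undefined>,
  extra: Record<string, string> = {},
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const key of ALLOWED_VARS) {
    const value = source[key];
    if (value !== undefined) env[key] = value;
  }
  return { ...env, ...extra };
}

const childEnv = buildSafeEnv(
  { PATH: "/usr/bin", HOME: "/home/engine", DATABASE_URL: "postgres://secret", AUTH_SECRET: "s3cret" },
  { GIT_TERMINAL_PROMPT: "0" },
);
// childEnv is { PATH: "/usr/bin", HOME: "/home/engine", GIT_TERMINAL_PROMPT: "0" }
```

Building up from an empty object, instead of deleting known-dangerous keys from a copy, means a newly added deployment secret is excluded by default.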

`gitCloneRepo` — token lifecycle and stale cleanup

gitCloneRepo is the most invasive tool: it spawns git, writes a credential helper to disk, and leaves a token file behind so later git operations (push, fetch, follow-up clones) can authenticate without round-tripping through the engine. The lifecycle is explicit:

File	Path	Mode	Lifetime
Ephemeral askpass shim	`mkdtempSync()` temp dir	`0o700`	Deleted in `finally` immediately after the clone returns
Token file (at rest)	`<repo>/.git/polyant-token`	`0o600`	Lives as long as the cloned repo
Persistent helper	`<repo>/.git/polyant-askpass.sh`	`0o700`	Lives as long as the cloned repo

Repos themselves land in <conversationWorkspace>/.repos/<owner>/<repo>-<8charSuffix>/. The random suffix lets the same conversation clone the same repo more than once without name collisions.

Cleanup runs on two clocks:

Per-clone, at the start of every gitCloneRepo call, cleanupStaleRepos() removes any directory older than 2 hours in the current conversation’s .repos/. If it finds a leftover polyant-token, it logs a warning — that signals a prior engine crash mid-operation.
Per-instance, on instance deletion, the whole instance workspace tree is removed.

There is no per-conversation cleanup endpoint today; long-lived conversations with many clones accumulate .repos/ entries until they age past the 2-hour stale threshold or the instance is deleted.
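The per-clone sweep boils down to an age comparison. The sketch below separates that decision from the filesystem work; RepoDir, selectStale, and the use of a modification timestamp as the measure of age are assumptions, since the text does not say which timestamp cleanupStaleRepos() reads.

```typescript
const STALE_AFTER_MS = 2 * 60 * 60 * 1000; // the 2-hour threshold

// Hypothetical description of one directory under .repos/.
interface RepoDir {
  path: string;
  modifiedAtMs: number;  // assumed age source
  hasTokenFile: boolean; // a polyant-token file is still present
}

// Decide which directories to remove and which removals deserve a warning.
function selectStale(dirs: RepoDir[], nowMs: number): { remove: string[]; warn: string[] } {
  const stale = dirs.filter((dir) => nowMs - dir.modifiedAtMs > STALE_AFTER_MS);
  return {
    remove: stale.map((dir) => dir.path),
    warn: stale.filter((dir) => dir.hasTokenFile).map((dir) => dir.path),
  };
}

const HOUR_MS = 60 * 60 * 1000;
const now = 100 * HOUR_MS;
const sweep = selectStale(
  [
    { path: ".repos/acme/site-1a2b3c4d", modifiedAtMs: now - 0.5 * HOUR_MS, hasTokenFile: true },
    { path: ".repos/acme/site-5e6f7a8b", modifiedAtMs: now - 3 * HOUR_MS, hasTokenFile: false },
    { path: ".repos/acme/api-9c0d1e2f", modifiedAtMs: now - 5 * HOUR_MS, hasTokenFile: true },
  ],
  now,
);
```

In this example the 30-minute-old clone is kept, the other two are removed, and only the api clone triggers a warning because its token file was still on disk.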

Subprocess timeouts and buffer caps

Every execFile call is bounded:

git operations: 120 s timeout, 5 MB combined stdout/stderr buffer.
gh CLI operations: 30 s timeout, 5 MB buffer.

A timeout returns a structured { success: false, error: "..." } to the LLM — the supervisor sees the failure as a normal tool error and decides whether to retry or report it upstream.

What is not sandboxed

The threat model has explicit gaps:

No network egress filtering. A cloned repo’s post-checkout hook, or a gh subcommand, can reach any host the engine itself can reach.
No CPU / RAM limits. A pathological repository or a runaway git pack will consume whatever the OS lets it. In containerised deployments, the container’s resource limits are the only backstop.
No syscall sandbox. There is no seccomp / AppArmor / SELinux profile. The subprocess runs as the engine’s OS user.

The trust model is: the operator controls which agents are deployed and which secrets they hold; the agent’s prompt and tool gating control which repos can be cloned; the filesystem boundary and environment whitelist prevent collateral damage within that trust envelope. Tighter sandboxing (gVisor, ephemeral containers per tool call) is the natural next step for deployments that need to run code from untrusted sources.

How it works


boot:
  loadAllTools()
    readdir(*.tool.ts)
       |
       v
    dynamic import each --> registerTool(...) side effect
       |
       v
    prune missing requiredEnv
       |
       v
    syncToolsToDb()  -- upserts catalogue + reconciles instance_tools


per request:
                +-------------------------+
   user msg --> | Supervisor.buildTools() |
                +-----------+-------------+
                            |
                            v
                getEnabledToolNames(instanceId)
                            |
                            v
   for each enabled tool:
       def = registry.get(name)
       inst = def.create(ToolContext { instanceId, secrets,
                                       audit, conversationId,
                                       attachments, apiKeys,
                                       provider })
       tool({ description: def.description,
              parameters: inst.parameters,
              execute: wrap(inst.execute) })  // logs + audit
                            |
                            v
                +-------------------------+
                | Vercel AI SDK tool loop |
                | (maxSteps = 15)         |
                +-------------------------+

Code reference

packages/engine/src/agents/tools/registry.ts — registerTool, ToolDefinition, ToolContext, loadAllTools, listAvailableTools.
packages/engine/src/agents/tools/strict-mode.test.ts — Enforced Zod constraints (no .url(), no .optional() without .nullable(), no z.record(z.unknown())).
packages/engine/src/agents/tools/tools-sync.ts — syncToolsToDb() and the instance_tools reconciliation logic.
packages/engine/src/agents/tools/tools.schema.ts — Catalogue table.
packages/engine/src/agents/supervisor/index.ts — buildTools() consumer of the registry.
packages/engine/src/audit/audit-logger.ts — The scoped audit logger passed via ToolContext.
packages/engine/src/agents/tools/shared/workspace-utils.ts — resolveWorkspacePath, assertInsideConversationWorkspace, sanitizeConversationId, symlink containment checks.
packages/engine/src/agents/tools/safe-env.ts — Subprocess env whitelist (safeEnv).
packages/engine/src/agents/tools/git-clone-repo.tool.ts — Clone flow, askpass + token-file lifecycle, cleanupStaleRepos (2 h threshold).
Sample tools: search-memory.tool.ts, web-search.tool.ts, http-request.tool.ts, slack-post-message.tool.ts, hubspot-contact.tool.ts, schedule-task.tool.ts, read-skill.tool.ts, task-tool.ts (spawnTask), room-mark-completed.tool.ts (harness).