Glossary

This document defines the core concepts you’ll encounter when working with Polyant.

Instance

A self-contained AI assistant configuration. Each instance has its own:

Prompts (8 sections: identity, soul, tooling, safety, skills, memory, user-identity, datetime)
Enabled skills and tools
Secrets (API keys, channel tokens — AES-256-GCM encrypted)
Channel configurations (Telegram, Slack, WhatsApp, agent — see channels.md)

Instances are identified by a slug (e.g. hello-world, support-bot) and appear as selectable models in the OpenAI-compatible API.

Tier (`fast | standard | heavy`)

Polyant code never references a specific model name directly. Instead it asks the AI Gateway for a tier — an abstract capability level. The actual model is configured per-instance in ai-gateway/config.ts.

Tier	Use for	Default model (OpenAI / Anthropic)
`fast`	Classification, extraction, title generation	`gpt-4o-mini` / `claude-haiku-4-5-20251001`
`standard`	Main supervisor, user-facing conversations	`gpt-4o` / `claude-sonnet-4-5-20250929`
`heavy`	Complex reasoning, final review	`o3` / `claude-opus-4-6`

This decoupling means upgrading models is a configuration change, not a code change.
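A minimal sketch of how tier resolution could work, assuming a per-provider lookup table with optional per-instance overrides. The names `Tier`, `Provider`, `DEFAULT_MODELS`, and `resolveModel` are hypothetical and are not taken from ai-gateway/config.ts; only the model identifiers come from the table above.

```typescript
// Hypothetical sketch: these names are illustrative, not Polyant's actual API.
type Tier = "fast" | "standard" | "heavy";
type Provider = "openai" | "anthropic";

// Default models copied from the tier table.
const DEFAULT_MODELS: Record<Provider, Record<Tier, string>> = {
  openai: { fast: "gpt-4o-mini", standard: "gpt-4o", heavy: "o3" },
  anthropic: {
    fast: "claude-haiku-4-5-20251001",
    standard: "claude-sonnet-4-5-20250929",
    heavy: "claude-opus-4-6",
  },
};

// Assumed behavior: a per-instance override wins over the default,
// so swapping a model never touches the calling code.
function resolveModel(
  provider: Provider,
  tier: Tier,
  overrides: Partial<Record<Tier, string>> = {},
): string {
  return overrides[tier] ?? DEFAULT_MODELS[provider][tier];
}
```

Calling code would pass only the tier, for example `resolveModel("openai", "fast")`, and never a model name.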

Supervisor

The central orchestrator agent. It reads the instance’s system prompt, has access to all enabled tools, and runs up to 15 reasoning steps per user turn. Powered by the Vercel AI SDK.

Tool

A named function the supervisor can invoke mid-conversation. Examples: httpRequest, searchMemory, saveMemory, spawnTask, readSkill.

Tools self-register at boot: drop a *.tool.ts file in packages/engine/src/agents/tools/ that calls registerTool(...) and it’s automatically discovered. There are no imports to update and no dependency-injection wiring. The supervisor queries the tool registry at each turn.

Per-instance enablement lives in the instance_tools table.
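The self-registration pattern can be sketched as a module-level registry. The real registerTool signature is not shown in this glossary, so the `ToolDefinition` shape, the `toolsFor` helper, and the `echo` tool below are all assumptions made for illustration.

```typescript
// Hypothetical sketch: the ToolDefinition shape and helper names are assumed.
interface ToolDefinition {
  name: string;
  description: string;
  execute: (input: Record<string, unknown>) => Promise<unknown>;
}

const registry = new Map<string, ToolDefinition>();

// Adds a tool to the registry; duplicate names are treated as a bug.
function registerTool(tool: ToolDefinition): void {
  if (registry.has(tool.name)) {
    throw new Error(`Tool already registered: ${tool.name}`);
  }
  registry.set(tool.name, tool);
}

// Keeps only the tools enabled for one instance. The list of names stands in
// for what the instance_tools table would supply.
function toolsFor(enabledNames: string[]): ToolDefinition[] {
  return enabledNames
    .map((name) => registry.get(name))
    .filter((tool): tool is ToolDefinition => tool !== undefined);
}

// What a *.tool.ts file might contain: a top-level call that runs on import.
registerTool({
  name: "echo",
  description: "Returns its input unchanged (illustrative only).",
  execute: async (input) => input,
});
```

Because registration happens as a side effect of loading the file, discovery only has to import every matching file once at boot.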

Skill

A Markdown-defined capability the supervisor can choose to “read” before acting. Skills give the agent domain-specific instructions, workflows, or knowledge that would clutter the main prompt.

A skill is invoked via the readSkill tool — the supervisor picks the best match from the <available_skills> list and loads its content on demand.

Skills can declare requiredEnv in YAML frontmatter (e.g. an OpenWeatherMap API key). These env vars are stored AES-256-GCM encrypted per-instance and injected at runtime.

Skills live in the skills + skill_versions tables. Per-instance enablement is in instance_skills.

Memory

Long-term, searchable facts the agent has learned about the user or the world. Implemented as a five-stage pipeline:

Extraction: after each response, an LLM extracts candidate facts as structured JSON.
Embedding: each fact is embedded via OpenAI text-embedding-3-small.
Deduplication: cosine similarity > 0.90 against existing memories prevents duplicates.
Storage: pgvector column in the memories table.
Retrieval: hybrid search — pgvector cosine similarity + PostgreSQL full-text search, fused via Reciprocal Rank Fusion.

Memory extraction is fire-and-forget: it runs after the response is sent and does not block the user’s next message.
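The deduplication step can be illustrated with a small in-memory check. This is a sketch only: the function names are hypothetical, the vectors are toy two-dimensional examples rather than real embeddings, and whether Polyant runs the comparison in SQL or in application code is not stated here. The 0.90 threshold is the one given above.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Threshold from the glossary: similarity above 0.90 counts as a duplicate.
const DEDUP_THRESHOLD = 0.9;

// Hypothetical helper: true when the candidate is too close to any stored memory.
function isDuplicate(candidate: number[], existing: number[][]): boolean {
  return existing.some(
    (stored) => cosineSimilarity(candidate, stored) > DEDUP_THRESHOLD,
  );
}
```

A candidate fact whose embedding points in nearly the same direction as a stored one is skipped; an unrelated one is saved.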

Room

A proactive workspace where the agent can act without being prompted. Rooms are event-driven: webhooks push events into the event_backlog, and a scheduler runs a ReAct cycle every 30 seconds that processes pending events and optionally sends outbound messages.

Each cycle creates a fresh conversation (room:{instanceId}:{timestamp}) — the room is not a persistent chat.

Core tables: instance_room, event_sources, event_definitions, event_backlog, room_activity_log.
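The backlog side of a Room cycle can be sketched as a capped per-source queue that is drained once per cycle. Everything named below is hypothetical. The cap of 100 per event source comes from the Backlog entry in this glossary, but the overflow policy (rejecting the new event) and the use of epoch milliseconds in the conversation ID are assumptions.

```typescript
// Hypothetical sketch: class and field names are illustrative, not the real schema.
const MAX_BACKLOG_PER_SOURCE = 100; // cap stated in the Backlog glossary entry

interface BacklogEvent {
  sourceId: string;
  payload: unknown;
}

class EventBacklog {
  private readonly pending = new Map<string, BacklogEvent[]>();

  // Returns false when the source is already at its cap.
  // ASSUMPTION: overflow rejects the new event; the real policy is not documented here.
  push(event: BacklogEvent): boolean {
    const queue = this.pending.get(event.sourceId) ?? [];
    if (queue.length >= MAX_BACKLOG_PER_SOURCE) return false;
    queue.push(event);
    this.pending.set(event.sourceId, queue);
    return true;
  }

  // Called once per cycle: returns everything pending and empties the backlog.
  drain(): BacklogEvent[] {
    const events: BacklogEvent[] = [];
    for (const queue of Array.from(this.pending.values())) {
      events.push(...queue);
    }
    this.pending.clear();
    return events;
  }
}

// Builds a fresh conversation ID in the room:{instanceId}:{timestamp} format.
// ASSUMPTION: the timestamp is epoch milliseconds.
function cycleConversationId(instanceId: string, now: number = Date.now()): string {
  return `room:${instanceId}:${now}`;
}
```

Each scheduler tick would call `drain()`, run one cycle over the returned events under a new conversation ID, and leave the backlog empty for the next tick.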

Channel

A delivery medium for messages. Polyant tracks every message source through a union type — see concepts/channels.md for the full breakdown. In short:

Network transports (per-instance encrypted credentials): Telegram, Slack, WhatsApp (via Twilio).
In-process / system sources: web (OpenAI-compatible HTTP API + Playground), room (event-driven cycles), scheduled (cron-style tasks), agent (instance-to-instance calls).

Channel adapters for the network transports are per-instance and started/stopped dynamically.

Fire-and-forget

A pattern used for post-response tasks that shouldn’t block the user: memory extraction, conversation summary update, audit logging. These tasks are scheduled via setImmediate() after the response is sent, so if one of them fails, the user still gets a reply.
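A minimal sketch of the pattern, assuming each post-response task is an async function. The `PostResponseTask` type, the `fireAndForget` helper, and the error callback are hypothetical names; only the use of setImmediate() comes from the description above.

```typescript
// Hypothetical sketch: the function and task names are illustrative, not Polyant's code.
interface PostResponseTask {
  name: string;
  run: () => Promise<void>;
}

// Defers the tasks with setImmediate() and reports failures instead of rethrowing,
// so a broken background job cannot affect a reply that has already been sent.
function fireAndForget(
  tasks: PostResponseTask[],
  onError: (name: string, error: unknown) => void = (name, error) =>
    console.error(`post-response task ${name} failed`, error),
): void {
  setImmediate(() => {
    for (const task of tasks) {
      // Starting from a resolved promise also captures synchronous throws in run().
      Promise.resolve()
        .then(() => task.run())
        .catch((error: unknown) => onError(task.name, error));
    }
  });
}
```

The caller sends the response first and then calls `fireAndForget([...])` without awaiting it. Each task gets its own catch handler, so one failure does not stop the others.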

Pipeline trace

Per-request phase timings written to the pipeline_traces table. Captures: context prep, tool building, LLM call (time to first byte, TTFB, and total duration), and the end-to-end total. Used by the analytics dashboard.
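Phase timing of this kind can be sketched with a small timer object. The phase keys, the `PipelineTimer` class, and the row shape are hypothetical; the actual columns of pipeline_traces are not listed in this glossary.

```typescript
// Hypothetical sketch: phase keys and the row shape are illustrative, not the real schema.
type Phase = "contextPrep" | "toolBuilding" | "llmTtfb" | "llmTotal" | "total";

class PipelineTimer {
  private readonly timings: Partial<Record<Phase, number>> = {};
  private readonly now: () => number;

  // The clock is injectable so the timer can be tested deterministically.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Starts timing a phase and returns a function that records its duration in ms.
  start(phase: Phase): () => void {
    const startedAt = this.now();
    return () => {
      this.timings[phase] = this.now() - startedAt;
    };
  }

  // The object a caller might persist as one pipeline_traces row.
  toRow(requestId: string): { requestId: string } & Partial<Record<Phase, number>> {
    return { requestId, ...this.timings };
  }
}
```

A request handler would call `start("total")` on entry, wrap each phase in its own `start(...)` and stop pair, and write `toRow(requestId)` once the response is complete.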

ESM `.js` imports

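All relative TypeScript imports include a .js extension at the source level (e.g. import { foo } from "./bar.js") because the compiled output is pure ESM. Node.js ESM resolution requires an explicit file extension on relative specifiers, and TypeScript does not rewrite specifiers at compile time, so the source must name the .js file that will exist after compilation. This is a requirement, not a typo.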

More terms

Short definitions of the rest of the vocabulary you will encounter.

Activity log — the chronological record of room cycles, with daily/weekly/monthly compaction.

Activity stream — the in-process pub/sub bus that fans operational events (inbound, outbound, tool calls, memory writes, webhook matches, agent handoffs) to subscribers — chiefly the admin panel’s GET /api/activity-stream/live SSE endpoint. Bounded ring buffer, non-durable. See concepts/activity-stream.md.
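A bounded, non-durable bus of this kind can be sketched as a fixed-size ring buffer plus a set of listeners. The class name, method names, and capacity below are hypothetical, and the sketch says nothing about how the SSE endpoint consumes events.

```typescript
// Hypothetical sketch: names and capacity are illustrative, not Polyant's implementation.
type Listener<T> = (event: T) => void;

class ActivityStream<T> {
  private readonly slots: (T | undefined)[];
  private readonly listeners = new Set<Listener<T>>();
  private next = 0; // index where the next event will be written
  private size = 0; // number of valid events currently stored

  constructor(capacity: number) {
    if (capacity < 1) throw new Error("capacity must be at least 1");
    this.slots = new Array<T | undefined>(capacity);
  }

  // Overwrites the oldest slot once the buffer is full, then fans the event out.
  publish(event: T): void {
    this.slots[this.next] = event;
    this.next = (this.next + 1) % this.slots.length;
    this.size = Math.min(this.size + 1, this.slots.length);
    this.listeners.forEach((listener) => listener(event));
  }

  // Buffered events, oldest first. Nothing is persisted across restarts.
  recent(): T[] {
    const out: T[] = [];
    const start = (this.next - this.size + this.slots.length) % this.slots.length;
    for (let i = 0; i < this.size; i++) {
      out.push(this.slots[(start + i) % this.slots.length] as T);
    }
    return out;
  }

  // Registers a live listener and returns an unsubscribe function.
  subscribe(listener: Listener<T>): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }
}
```

Memory use stays constant because old events are overwritten in place rather than accumulated.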

Agent channel — the in-process agent ChannelAdapter (packages/engine/src/channels/adapters/agent.adapter.ts) that lets one instance synchronously invoke another, with no network hop. Carries callerSlug, callerConversationId, depth, parentTraceId. Distinct from spawnTask, which delegates inside the same instance.

AES-256-GCM — the symmetric encryption used for instance secrets and skill env vars.
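For illustration, AES-256-GCM encryption with Node's built-in crypto module looks roughly like this. The `encryptSecret` and `decryptSecret` helpers and the dot-separated `iv.tag.ciphertext` envelope are assumptions; how Polyant actually stores the nonce, tag, and key is not described in this glossary.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical sketch: the iv.tag.ciphertext envelope is an assumed storage format.
function encryptSecret(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit nonce, the recommended size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag(); // authentication tag, available after final()
  return [iv, tag, ciphertext].map((part) => part.toString("base64")).join(".");
}

function decryptSecret(envelope: string, key: Buffer): string {
  const [iv, tag, ciphertext] = envelope
    .split(".")
    .map((part) => Buffer.from(part, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the key is wrong or the ciphertext or tag was altered.
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

The key must be exactly 32 bytes, and a fresh random nonce is generated for every encryption. GCM authenticates as well as encrypts, so tampering is detected at decryption time.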

Audit log — the database table that records every tool call.

Backlog — the queue of pending events for a Room, capped at 100 per event source.

Compaction — the process of collapsing old activity-log entries into summaries.

Conversation — a persistent thread of messages tied to one instance and one external user.

Embedding — a high-dimensional vector representation of text. Used for semantic search.

Event definition — within a Room event source, a named pattern that matches incoming payloads.

Event source — within a Room, a connection to one external sender (e.g. Stripe).

FTS — PostgreSQL Full-Text Search. Used in conversations and memory.

Harness tool — a tool injected only in specific contexts (e.g. inside Room cycles), invisible from the Tools tab.

JWE / JWT — JSON Web Encryption / JSON Web Token. The session-cookie format.

pgvector — the PostgreSQL extension for vector storage and similarity search.

Playground — the in-panel chat for testing instances.

Provider — an LLM vendor (OpenAI, Anthropic, AWS Bedrock).

RBAC — Role-Based Access Control.

Reciprocal Rank Fusion (RRF) — the algorithm that merges vector and keyword search rankings into one list.
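As a worked sketch, RRF scores each item by summing 1 / (k + rank) over every list it appears in, then sorts by that score. The function name and the memory IDs are hypothetical, and k = 60 is the value commonly used for RRF rather than a constant confirmed for Polyant.

```typescript
// Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank of d in that list).
// ASSUMPTION: k = 60 is the conventional default; Polyant's constant is not stated here.
function reciprocalRankFusion(rankings: string[][], k: number = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Illustrative inputs: one ranking from vector search, one from full-text search.
const vectorHits = ["m1", "m2", "m3"];
const keywordHits = ["m3", "m1", "m4"];
const fused = reciprocalRankFusion([vectorHits, keywordHits]);
// fused is ["m1", "m3", "m2", "m4"]: items found by both searches rise to the top.
```

Because only ranks are used, the two searches do not need comparable score scales.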

Scheduled task — a row in the scheduled_tasks table that fires at a cron, fixed-interval, or one-shot trigger and runs a synthetic supervisor turn (with optional outbound delivery). Distinct from Rooms, which are event-driven rather than time-driven. See packages/engine/src/scheduled-tasks/scheduler.service.ts.

Slug — a URL-safe identifier for an instance or skill.

Soft debounce — the inbound-message coordinator’s coalescence window.

Streaming — sending the assistant response as Server-Sent Events token-by-token.

STT — Speech-To-Text. The stt-gateway transcribes voice notes (default: OpenAI whisper-1) into text before the supervisor sees the message. Provider adapters for Deepgram and AWS Transcribe ship alongside Whisper for future selection.