Polyant — Architecture
Monorepo Structure
This project is an npm workspaces monorepo with two packages:
| Package | Path | Role |
|---|---|---|
| @polyant/engine | packages/engine/ | NestJS server: AI runtime + management API |
| @polyant/web | packages/web/ | Next.js App Router: admin panel |
The root package.json orchestrates both packages. Infrastructure (Docker Compose, .env) lives at the monorepo root.
Overview
Polyant is an open-source platform for building AI assistants with long-term memory, multi-channel support, and per-instance customization. A single instance of the engine can host any number of independently-configured assistants (prompts, skills, tools, channels, secrets), each addressable as an OpenAI-compatible model. Every user interaction is routed through a Supervisor agent that orchestrates tools, sub-agents, and memory to produce contextual and proactive responses.
Tech Stack
| Component | Technology |
|---|---|
| Language | TypeScript / Node.js (ESM) |
| Agent Framework | Vercel AI SDK v4 |
| LLM | Provider-agnostic via AI Gateway (OpenAI + Anthropic + Bedrock) |
| Database | PostgreSQL 16 (Drizzle ORM) + pgvector + Full-Text Search (tsvector) |
| Memory | Native LLM extraction + pgvector (cosine similarity) + PostgreSQL FTS |
| Search | Hybrid: pgvector semantic + PostgreSQL FTS keyword, fused with RRF |
| Web Research | Tavily API |
| Encryption | AES-256-GCM (Node.js crypto) |
| Tracing | LangSmith |
| Channels | Telegram (grammY), Slack (Bolt), WhatsApp (Twilio Programmable Messaging) |
| HTTP Server | NestJS (OpenAI-compatible API + Management REST API) |
| Admin Panel | Next.js 15, React 19, Tailwind CSS 4, shadcn/ui |
| Infrastructure | Docker Compose |
Architecture
+------------------------------------------------------------+
| HTTP SERVER (NestJS) |
| /v1/chat/completions | /v1/models | /health |
| /api/instances | /api/conversations | /api/skills |
+------------------------------------------------------------+
| CHANNEL LAYER |
| Telegram | Slack | WhatsApp (Twilio) | Web (REST/SSE) |
| Room | Scheduled | Agent (in-process) |
| ChannelAdapter abstraction |
+------------------------------------------------------------+
| AGENT LAYER |
| Supervisor (tier: standard, max 15 steps) |
| Tools: searchMemory | webSearch | curl/httpRequest | |
| saveMemory | readSkill | readFile/writeFile | |
| HubSpot suite | GitHub | Slack/WhatsApp out | |
| scheduleTask | spawnTask | ... |
| Sub-agents: ad-hoc via spawnTask |
| Agent-to-agent: via `agent` channel adapter (not spawnTask)|
+------------------------------------------------------------+
| MEMORY LAYER |
| pgvector (cosine similarity) — extracted memories |
| PostgreSQL Full-Text Search (keyword) — conversations |
| Hybrid Search: RRF (Reciprocal Rank Fusion) |
| Native LLM extraction post-response (fire-and-forget) |
+------------------------------------------------------------+
| AI GATEWAY |
| Tier abstraction: fast | standard | heavy |
| Provider: OpenAI | Anthropic | Bedrock |
| Per-instance provider/model override |
| Logging: tokens, costs, latency -> PostgreSQL |
| Tracing: LangSmith |
+------------------------------------------------------------+
| DATA LAYER |
| PostgreSQL (conversations, memories, instances, |
| instance_secrets, instance_channels, ai_logs, |
| pipeline_traces) |
| pgvector extension (memory embeddings) |
+------------------------------------------------------------+
Directory Structure
packages/engine/
src/
index.ts # Boot sequence
config.ts # Configuration (Zod schema)
workspace/
index.ts # Workspace resolver (per-instance parameterization)
ai-gateway/
index.ts # Gateway init + chat/chatStream
types.ts # ChatRequest, ChatResponse, ProviderAdapter
config.ts # Model tier mappings + pricing
logger.ts # AILogger (batch write to PostgreSQL)
langsmith.ts # LangSmith tracing setup
providers/
openai.ts # OpenAI via @ai-sdk/openai
anthropic.ts # Anthropic via @ai-sdk/anthropic
agents/
supervisor/
index.ts # supervise() + superviseStream()
prompt.ts # System prompt builder (DB-stored sections + skills + datetime)
tools/
registry.ts # Self-registration system + loadAllTools()
tools-sync.ts # Catalogue reconciliation (orphan cascade)
# ~35 *.tool.ts files — see reference/tool-catalog.md for the authoritative list
activity-stream/ # In-process pub/sub bus + SSE controller + LLM tap
attachments/ # Image/document/audio upload + S3 + signed URLs
audit/ # Per-tool-call audit table and scoped AuditLogger
knowledge/ # Per-instance knowledge files (search/read/write tools)
scheduled-tasks/ # Cron / interval / one-shot scheduler
stt-gateway/ # Voice-note transcription gateway
providers/
openai-whisper.ts # Default Whisper provider (whisper-1)
deepgram.ts # Alternative provider
aws-transcribe.ts # Alternative provider
users/ # Credentials provider controller + user-management routes + seed
auth/ # NestJS auth module (guard, JWT decryption) + Auth.js v5 schema (users, accounts, sessions)
workspace/ # Workspace resolver (ephemeral per-instance dir)
analytics/ # Analytics module logic (no controller — endpoints live in server/analytics/)
traces.schema.ts # Drizzle schema (pipeline_traces table)
trace.store.ts # TraceStore: buffered fire-and-forget pipeline trace writer
latency.store.ts # Latency analytics queries (percentiles, phase breakdown, tool stats)
memory/
index.ts # Entry point: initMemory() + re-exports
types.ts # Memory, ExtractedFact interfaces
embedder.ts # OpenAI embeddings (text-embedding-3-small)
memory-store.ts # pgvector upsert + cosine similarity dedup
hybrid-search.ts # Hybrid RRF search (pgvector + PG FTS)
extractor.ts # Native LLM extraction post-response
schema.ts # Drizzle schema (memories table with vector column)
conversations/
index.ts # Re-exports
store.ts # ConversationStore (messages, summaries, FTS)
schema.ts # Drizzle tables (conversations, conversation_messages)
types.ts # Conversation interfaces
instances/
schema.ts # Drizzle schema (instances table)
store.ts # Instance CRUD
skill-env.schema.ts # Drizzle schema (instance_skill_env table)
skill-env.store.ts # Encrypted skill env CRUD
secrets.schema.ts # Drizzle schema (instance_secrets table)
secrets.store.ts # Encrypted secrets CRUD
channels.schema.ts # Drizzle schema (instance_channels table)
channels.store.ts # Channel config CRUD
config-resolver.ts # Per-instance config with 30s TTL cache
skills/
skills.service.ts # Global skill library management
skills.controller.ts # /api/skills CRUD endpoints
crypto/
index.ts # AES-256-GCM encrypt/decrypt
channels/
types.ts # ChannelAdapter, IncomingMessage, OutgoingMessage
channel-manager.ts # Adapter orchestrator
adapters/
telegram/index.ts # grammY long polling
slack/index.ts # @slack/bolt Socket Mode
whatsapp/
index.ts # WhatsApp adapter (webhook)
twilio-client.ts # Twilio Programmable Messaging client
server/
main.ts # NestJS bootstrap
server.module.ts # Root module
health/health.controller.ts # GET /health
openai/
openai.controller.ts # /v1/chat/completions, /v1/models
openai.service.ts # Chat completion logic
openai.types.ts # Request/response OpenAI types
openai.module.ts # NestJS module
instances/instances.controller.ts # /api/instances CRUD + prompts/tools/skills/secrets/channels
conversations/conversations.controller.ts # /api/conversations
analytics/analytics.controller.ts # /api/analytics + per-instance analytics
memories/memories.controller.ts # /memories CRUD
database/
client.ts # Drizzle connection
migrations/ # Generated migrations
utils/
pipeline-logger.ts # Structured pipeline logging
frontmatter.ts # YAML frontmatter parser for skills
workspaces/ # Per-conversation tool sandboxes ONLY (gitignored)
<instanceId>/ # Knowledge lives in PostgreSQL (knowledge_documents + knowledge_chunks) — never here
conversations/<convId>/ # Ephemeral scratch dir used by readFile / writeFile / gitCloneRepo
packages/web/
src/
app/
globals.css # Design tokens
layout.tsx # Root layout (Inter font, ThemeProvider, I18nProvider)
(admin)/
layout.tsx # Sidebar + header layout
page.tsx # Dashboard
instances/ # Instance management (list, detail, tabs)
conversations/ # Conversation browsing + search
skills/ # Global skill library CRUD
playground/page.tsx # Playground chat page
memory/page.tsx # Memory management (list, search, create/delete)
settings/page.tsx # Global settings
components/
layout/ # Sidebar, header, nav, theme/lang toggles
analytics/ # Analytics chart components (KPIs, trends, latency)
ui/ # shadcn/ui components
lib/
api.ts # API client for engine
utils.ts # cn() helper
i18n/ # Italian/English internationalization
hooks/ # use-mobile, etc.
Boot Sequence (packages/engine/src/index.ts)
At startup, the system executes in order:
- AI Gateway - Initializes logging
- Trace Store - Initializes pipeline latency trace writer (buffered, periodic flush)
- Tool Loading - Auto-discovers and registers all *.tool.ts files
- Memory - Verifies pgvector extension is available
- NestJS Server - OpenAI-compatible API + Management API on configurable port
- Channel Adapters - Loads enabled channel configs from DB per active instance, starts adapters dynamically
Request Flow
User message
|
v
[Channel Adapter] normalizes to IncomingMessage
|
v
[Pre-enrichment]
|- Load conversation summary (in-memory cache -> PostgreSQL)
'- Create conversation row if missing (fire-and-forget)
|
v
[Conversation History] getRecentMessages(conversationId, 15) from PostgreSQL
|
v
[Instance Resolution] resolveInstance(instanceId) -> load config from PostgreSQL (cache 30s)
|
v
[Supervisor] (tier: standard, max 15 steps)
|- System prompt: 8 sections from instance_prompts (per-instance, DB-stored)
|- Skills: discovered from skills + skill_versions + instance_skills (DB joins)
|- Last 15 messages + new message
'- Available tools (filtered per-instance via instance_tools):
searchMemory -> hybrid pgvector + PG FTS + RRF search
webSearch -> web search (Tavily, optional)
saveMemory -> explicit save (only on user request)
updateSoul -> modify personality
updateUserProfile -> update user info
readSkill -> load a skill's instructions
spawnTask -> delegate to sub-agent
... -> channel-specific and integration-specific tools
|
v
[Response] sent to channel
|
v
[After Response] (async, fire-and-forget)
|- 1. traceStore.record() -> pipeline latency trace (phase breakdown + tool timings)
|- 2. appendMessages() -> save to PostgreSQL conversation_messages
|- 3. Generate updated summary (tier: fast) -> updateSummary()
'- 4. extractMemories() -> LLM extraction -> embeddings -> pgvector upsert
Streaming: For the HTTP server, superviseStream() returns an AsyncIterable<string> that gets converted to Server-Sent Events (SSE) in OpenAI format.
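The streaming conversion can be sketched roughly as follows. This is a minimal illustration that assumes the standard OpenAI streaming shape (one `chat.completion.chunk` object per `data:` frame, closed by a `data: [DONE]` sentinel). The names `ChunkMeta`, `formatSseChunk`, and `toOpenAiSse` are hypothetical and are not taken from the engine source.

```typescript
// Hypothetical helpers: only superviseStream() and the SSE/OpenAI format come from the text.
interface ChunkMeta {
  id: string; // completion id shared by every chunk of one response
  model: string; // the instance exposed as an OpenAI-compatible model
  created: number; // Unix timestamp in seconds
}

// Serialize one text delta as a single SSE frame in OpenAI chunk format.
function formatSseChunk(
  meta: ChunkMeta,
  content: string | null,
  finishReason: "stop" | null = null,
): string {
  const payload = {
    id: meta.id,
    object: "chat.completion.chunk",
    created: meta.created,
    model: meta.model,
    choices: [
      {
        index: 0,
        delta: content === null ? {} : { content },
        finish_reason: finishReason,
      },
    ],
  };
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// Turn a token stream into SSE frames, ending with a stop chunk and the [DONE] sentinel.
async function* toOpenAiSse(
  tokens: AsyncIterable<string>,
  meta: ChunkMeta,
): AsyncGenerator<string> {
  for await (const token of tokens) {
    yield formatSseChunk(meta, token);
  }
  yield formatSseChunk(meta, null, "stop");
  yield "data: [DONE]\n\n";
}
```

A controller would write each yielded frame to the response with `Content-Type: text/event-stream`.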
AI Gateway
Tier Abstraction
Components don’t request specific models. They request a tier:
| Tier | OpenAI | Anthropic | Bedrock | Use |
|---|---|---|---|---|
| fast | gpt-4o-mini | claude-haiku-4-5-20251001 | amazon.nova-lite-v1:0 | Summary generation, memory extraction, classification |
| standard | gpt-4o | claude-sonnet-4-5-20250929 | anthropic.claude-sonnet-4-20250514-v1:0 | Supervisor, sub-agents |
| heavy | o3 | claude-opus-4-6 | anthropic.claude-opus-4-20250514-v1:0 | Complex analysis |
The exact mappings live in packages/engine/src/ai-gateway/config.ts.
The AI provider is configured per-instance via the admin panel (Settings tab). There is no global AI_PROVIDER env var.
Individual instances can also override the model via the Management API.
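As an illustration of the tier lookup, the sketch below resolves a tier to a model ID for a given instance. The model IDs are copied from the table above. The `InstanceAiConfig` shape, the `modelOverride` field, and the `resolveModel` name are assumptions, as is the choice to let an override win for every tier; the authoritative logic lives in ai-gateway/config.ts.

```typescript
type Tier = "fast" | "standard" | "heavy";
type Provider = "openai" | "anthropic" | "bedrock";

// Mappings copied from the tier table; treat ai-gateway/config.ts as the source of truth.
const TIER_MODELS: Record<Provider, Record<Tier, string>> = {
  openai: { fast: "gpt-4o-mini", standard: "gpt-4o", heavy: "o3" },
  anthropic: {
    fast: "claude-haiku-4-5-20251001",
    standard: "claude-sonnet-4-5-20250929",
    heavy: "claude-opus-4-6",
  },
  bedrock: {
    fast: "amazon.nova-lite-v1:0",
    standard: "anthropic.claude-sonnet-4-20250514-v1:0",
    heavy: "anthropic.claude-opus-4-20250514-v1:0",
  },
};

// Hypothetical slice of the per-instance settings relevant to model selection.
interface InstanceAiConfig {
  provider: Provider;
  modelOverride?: string;
}

// Assumption: an explicit per-instance override takes precedence over the tier mapping.
function resolveModel(tier: Tier, instance: InstanceAiConfig): string {
  return instance.modelOverride ?? TIER_MODELS[instance.provider][tier];
}
```

A caller such as the summary generator would then ask for `resolveModel("fast", instanceConfig)` rather than naming a model directly.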
Logging and Costs
Every LLM call is logged to the ai_logs table with:
- Provider, model, tier
- Token usage (prompt, completion, total)
- Estimated cost in USD (calculated from per-token pricing)
- Duration in ms
- conversationId and instanceId for correlation
The logger uses a buffer with periodic flush to minimize DB writes.
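The cost estimate can be sketched as a simple function of token usage and per-token pricing. The `TokenUsage` and `ModelPricing` shapes and the per-million-token unit are assumptions made for illustration; actual prices are not shown here and live in the gateway's pricing config.

```typescript
// Hypothetical shapes: field names are illustrative, not the ai_logs column names.
interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
}

interface ModelPricing {
  inputPerMillionUsd: number; // USD per 1M prompt tokens
  outputPerMillionUsd: number; // USD per 1M completion tokens
}

// Estimated cost = prompt tokens at the input rate + completion tokens at the output rate.
function estimateCostUsd(usage: TokenUsage, pricing: ModelPricing): number {
  const inputCost = (usage.promptTokens / 1e6) * pricing.inputPerMillionUsd;
  const outputCost = (usage.completionTokens / 1e6) * pricing.outputPerMillionUsd;
  return inputCost + outputCost;
}

// Total tokens as logged alongside the prompt and completion counts.
function totalTokens(usage: TokenUsage): number {
  return usage.promptTokens + usage.completionTokens;
}
```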
LangSmith Tracing
LangSmith tracing is per-instance, configured via admin panel Settings tab (langsmithEnabled, API key, project name).
Uses wrapAISDK from langsmith/experimental/vercel to wrap generateText/streamText at module level.
Per-request tracing config is built via buildLangSmithProviderOptions() and passed as providerOptions.langsmith to the AI SDK calls.
When providerOptions.langsmith is absent (LangSmith disabled), wrapped functions behave identically to the originals with zero overhead.
Produces hierarchical traces: parent run per generateText/streamText call, with child runs for each LLM step and tool execution within maxSteps.
Thread grouping via metadata.thread_id = conversationId. Instance filtering via metadata.instance_id.
Client instances are cached per API key in-memory.
Agent System
Supervisor
The Supervisor is the system’s decision-making center. It receives the user message, the last 15 messages of history, and the conversation summary.
Configuration:
- Tier: standard
- Max steps: 15 (reasoning cycles)
- System prompt: 8 modular sections stored in the instance_prompts table (per-instance, DB-backed). Defaults are seeded from packages/engine/src/instances/defaults.ts on instance creation
- Available tools: filtered per-instance via the instance_tools table (auto-recomputed when skills change)
Tools shipped with the framework (non-exhaustive — the authoritative list is generated from the registry and lives in reference/tool-catalog.md; the actual set per instance is determined by instance_tools):
| Tool | Description | When to use |
|---|---|---|
| searchMemory | Hybrid pgvector + PG FTS search across memories and conversations | Proactively for any question about past facts |
| saveMemory | Explicit memory save to pgvector | Only on user request |
| webSearch | Web search via Tavily API | For external/current information |
| httpRequest / curl | Generic HTTP request | Fetch pages, APIs, JSON, RSS |
| updateSoul | Modify the assistant’s personality (section 02-soul) | Only on user request |
| updateUserProfile | Update the user profile (section 07-user-identity) | When user shares personal info |
| readSkill | Load a skill’s instructions on-demand | When the supervisor needs to apply a skill |
| readFile / writeFile / listDirectory | Filesystem access inside the workspace | When operating on files inside an ephemeral workspace |
| searchKnowledge / getKnowledge / writeKnowledge | Knowledge base operations | When working with the instance’s knowledge files |
| gitCloneRepo | Clone a GitHub repo into the workspace | Code-touching workflows |
| ghIssue / ghPR | GitHub issue + pull request operations | GitHub integrations |
| HubSpot suite (8 tools) | hubspotContact, hubspotDeal, hubspotMeeting, hubspotNote, hubspotTicket, hubspotCreateTask, hubspotSendEmail, hubspotGetCompany | HubSpot CRM workflows |
| slackPostMessage | Post to a Slack channel or DM via the instance’s Slack credentials | Outbound Slack messages |
| send_whatsapp_template | Send a Twilio-approved WhatsApp template | WhatsApp 24h-window outbound |
| send_outbound_message | Send a message to any configured outbound channel | Channel-agnostic outbound |
| scheduleTask | Create a cron / interval / one-shot task | Time-based automation |
| fileUpload | Upload an attachment to S3 | Attachment workflows |
| verifyDocument | Validate a document via a tool-level LLM call | Document QA |
| Room harness tools | mark_events_completed, compact_room_history, send_message_to_human | Injected only inside Room cycles |
| spawnTask | Delegate to an isolated sub-agent in the same instance | Multi-step tasks within one instance |
Channel-specific and integration-specific tools are also included in the framework and can be enabled per-instance.
Sub-Agent System
Sub-agents are isolated agents that the Supervisor can delegate tasks to via spawnTask.
They receive all enabled tools except spawnTask, to prevent infinite recursion.
There is no dedicated sub-agents/ directory or SubAgentDefinition type today — the spawnTask tool (packages/engine/src/agents/tools/task-tool.ts) creates ad-hoc sub-agents on the fly with a generic system prompt and the parent’s filtered tool set.
Note: spawnTask is unrelated to agent-to-agent calls between different instances, which go through the agent channel adapter (channels/adapters/agent.adapter.ts). See agents.md for the disambiguation.
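The recursion guard for sub-agents amounts to dropping one entry from the parent's tool set. The sketch below is an assumption about how that filtering might look; `toolsForSubAgent` is a hypothetical name and the tool values are left generic rather than modeled on the real tool type.

```typescript
// Name of the delegation tool that sub-agents must not receive.
const SPAWN_TASK = "spawnTask";

// Copy the parent's enabled tools, omitting spawnTask so a sub-agent cannot spawn further sub-agents.
function toolsForSubAgent<T>(parentTools: Record<string, T>): Record<string, T> {
  const subAgentTools: Record<string, T> = {};
  for (const [name, tool] of Object.entries(parentTools)) {
    if (name !== SPAWN_TASK) {
      subAgentTools[name] = tool;
    }
  }
  return subAgentTools;
}
```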
Memory System
Architecture
The memory system is fully native — no external services required beyond PostgreSQL with pgvector.
| Component | Role |
|---|---|
| LLM Extractor (extractor.ts) | Sends recent messages to the project’s LLM (tier: fast) for structured fact extraction |
| Embedder (embedder.ts) | Generates embeddings via OpenAI (text-embedding-3-small) |
| Memory Store (memory-store.ts) | Upserts memories into pgvector with cosine similarity deduplication (threshold 0.90) |
| PostgreSQL FTS | Full-text search on raw conversations (conversation_messages table with tsvector column) |
Memory Flow
User conversation <-> Supervisor
|
v (fire-and-forget after response)
extractMemories(conversationId, instanceId)
|
v
Load last 15 messages from PostgreSQL
|
v
Send transcript to LLM (tier: fast)
|
v
LLM returns JSON: [{content, category, importance}]
|
v
Generate embeddings (OpenAI text-embedding-3-small)
|
v
Upsert each memory into pgvector:
- Cosine similarity check against existing memories
- If similarity > 0.90: update existing memory
- Otherwise: insert new memory
Categories: preference, fact, event, relationship, decision, general
Importance: 1-10 scale (10 = critical life fact, 1 = trivial)
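The dedup step at the end of the flow can be illustrated with an in-memory stand-in for pgvector. This is a sketch under stated assumptions: the real store runs the similarity check in PostgreSQL, and `StoredMemory`, `cosineSimilarity`, and `upsertMemory` are hypothetical names. Overwriting the matched memory's content and embedding is also an assumption about what "update" means. Only the 0.90 threshold comes from the text.

```typescript
interface StoredMemory {
  id: number;
  content: string;
  embedding: number[];
}

// Cosine similarity of two equal-length vectors; returns 0 if either has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Update the closest existing memory when similarity exceeds the threshold, else insert.
function upsertMemory(
  store: StoredMemory[],
  content: string,
  embedding: number[],
  threshold = 0.9,
): "updated" | "inserted" {
  let best: StoredMemory | undefined;
  let bestScore = -Infinity;
  for (const memory of store) {
    const score = cosineSimilarity(memory.embedding, embedding);
    if (score > bestScore) {
      best = memory;
      bestScore = score;
    }
  }
  if (best !== undefined && bestScore > threshold) {
    best.content = content;
    best.embedding = embedding;
    return "updated";
  }
  store.push({ id: store.length + 1, content, embedding });
  return "inserted";
}
```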
Hybrid Search (RRF)
Search combines two backends via Reciprocal Rank Fusion:
searchMemory(query)
|
+-----------+-----------+
| |
pgvector semantic search PostgreSQL FTS
(extracted memories) (raw conversations)
cosine similarity websearch_to_tsquery
top 20 results top 20 results
| |
+-----------+-----------+
|
Reciprocal Rank Fusion
score = Σ(1 / (k + rank + 1))
k = 60
|
Sort + dedup + top N
|
HybridSearchResult[]
{content, type, score, source, createdAt}
Why RRF: The two backends produce scores on different scales (cosine similarity vs ts_rank). RRF implicitly normalizes based only on ranking position, not absolute values.
Complementarity:
- Semantic (pgvector): finds memories by meaning, even with different words
- Keyword (PG FTS): finds conversations by exact words, proper nouns, dates
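The fusion step can be sketched as follows. This is a minimal illustration, not the contents of hybrid-search.ts: the `RankedHit` shape, the id-based dedup key, and the function name are assumptions. The constant k = 60 and the 1 / (k + rank + 1) term follow the formula above, with rank assumed to be counted from 0.

```typescript
interface RankedHit {
  id: string; // assumed dedup key shared across both backends
  content: string;
}

interface FusedHit extends RankedHit {
  score: number;
}

// Reciprocal Rank Fusion: each list contributes 1 / (k + rank + 1) per hit, summed by id.
function reciprocalRankFusion(lists: RankedHit[][], k = 60, topN = 10): FusedHit[] {
  const fused = new Map<string, FusedHit>();
  for (const list of lists) {
    list.forEach((hit, rank) => {
      const contribution = 1 / (k + rank + 1);
      const existing = fused.get(hit.id);
      if (existing !== undefined) {
        existing.score += contribution;
      } else {
        fused.set(hit.id, { id: hit.id, content: hit.content, score: contribution });
      }
    });
  }
  // Highest fused score first, then keep the top N.
  return Array.from(fused.values())
    .sort((a, b) => b.score - a.score)
    .slice(0, topN);
}
```

Because only positions enter the score, a hit that appears in both lists outranks one that tops a single list, regardless of the raw cosine or ts_rank values.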
Note: Embeddings always use OpenAI (text-embedding-3-small), so the per-instance openai_api_key secret is required regardless of the instance’s AI provider. Anthropic does not offer an embedding API. The extraction LLM uses the configured provider via ai-gateway.
Conversation System
PostgreSQL Tables
conversations:
- conversationId (text, unique) - format channelType:channelId
- summary (text, nullable) - updated after each turn
- instanceId (text, nullable)
- createdAt, updatedAt
conversation_messages (with Full-Text Search):
- conversationId (text) - indexed
- role (user | assistant)
- content (text)
- toolCalls (jsonb, nullable)
- search_vector (tsvector) - auto-generated from content (config: simple, managed via SQL migration)
- createdAt - indexed
- GIN index on search_vector
ConversationStore
// Summary management (in-memory cache)
getSummary(conversationId) // cache hit -> return; miss -> query DB -> cache
updateSummary(conversationId) // write DB + update cache
ensureConversation(id, instanceId) // INSERT ... ON CONFLICT DO NOTHING
// Message management
appendMessages(conversationId, messages[]) // batch insert
getRecentMessages(conversationId, limit=15) // last N messages (chronological)
// Full-text search
searchConversations(query, {instanceId, limit, offset}) // websearch_to_tsquery + ts_rank
// Listing
listConversations({instanceId, limit, offset}) // paginated with instance JOIN
getConversation(conversationId) // single conversation detail
deleteConversation(conversationId) // delete conversation + messages
Summary Generation
After each response, an LLM (tier: fast) generates an updated conversation summary in 2-3 sentences. The previous summary is provided as context. This summary is injected into the Supervisor’s system prompt in subsequent conversations.
Channel Layer
The adapters below share a common ChannelAdapter interface:
| Channel | Package | Reception | Notes |
|---|---|---|---|
| Telegram | grammY | Long polling | Markdown parse mode, optional user ID whitelist |
| Slack | @slack/bolt | Socket Mode | Thread awareness, rich metadata |
| WhatsApp | Twilio Programmable Messaging | Inbound webhook | twilio-client.ts posts outbound; media (MediaUrl) downloaded and re-uploaded to S3 |
| Agent | (in-process) | Synchronous call | agent.adapter.ts; carries callerSlug, callerConversationId, depth, parentTraceId |
Channels are DB-driven and per-instance. Config is stored encrypted in the instance_channels table.
The channel manager starts/stops adapters dynamically via admin panel or API. No global env vars for channel credentials.
Webhooks & Event Sources
External systems (HubSpot, custom APIs, etc.) can trigger agent actions via webhooks.
A webhook is received at POST /webhooks/:webhookToken (always returns 200 OK immediately, processing is fire-and-forget).
The payload (max 64KB) is passed to a webhook matcher LLM (tier: fast) which sequentially evaluates matching definitions (first match wins).
Matched events are inserted into the event_backlog table with pending status, where they await processing by the Room scheduler.
Key components:
- webhook-engine.ts: Central dispatch and queue management
- webhook-matcher.ts: LLM-based event classification against priority-ordered definitions
- webhook-sources.store.ts: Event source CRUD (with AES-256-GCM encrypted config)
- webhook-backlog.store.ts: Pending event queue with status lifecycle (pending → processing → completed)
- webhook.validators.ts, webhooks.schema.ts: Payload validation and schemas
- template-renderer.ts, trigger-context.ts: Context building for interpretation
- active-triggers.ts, webhook-logger.ts: Trigger state and audit logging
Rate limit: 60 events/min. Backlog capacity: 100 pending events per instance (excess dropped).
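The two limits above can be combined into a single admission check, sketched below. The text does not say whether the rate limit is a sliding or fixed window or what it is scoped to, so the sliding window is an assumption, as are the `AdmissionState` shape and the `admitEvent` name. Only the numbers 60 and 100 come from the text.

```typescript
// Hypothetical per-instance state; the engine keeps the backlog in PostgreSQL.
interface AdmissionState {
  recentEventTimes: number[]; // ms timestamps of events accepted within the window
  pendingCount: number; // events currently pending in the backlog
}

type Admission = "accepted" | "rate_limited" | "backlog_full";

const RATE_LIMIT_PER_MINUTE = 60;
const BACKLOG_CAP = 100;
const WINDOW_MS = 60000;

// Decide whether an incoming event may be queued; excess events are dropped by the caller.
function admitEvent(state: AdmissionState, nowMs: number): Admission {
  // Forget accepted events that have left the one-minute window.
  state.recentEventTimes = state.recentEventTimes.filter((t) => nowMs - t < WINDOW_MS);
  if (state.recentEventTimes.length >= RATE_LIMIT_PER_MINUTE) return "rate_limited";
  if (state.pendingCount >= BACKLOG_CAP) return "backlog_full";
  state.recentEventTimes.push(nowMs);
  state.pendingCount += 1;
  return "accepted";
}
```

Since the webhook endpoint always answers 200 OK, a rejected event would be dropped silently from the sender's point of view.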
Scheduled Tasks
The scheduler provides time-based and event-based task execution with optional result delivery via outbound channels.
Schema:
- schedule_type: cron (recurring), interval (every N seconds), or oneShot (run once then delete)
- status: enabled, disabled (suspended), error (automatically disabled after N consecutive failures)
- retry_count, retry_delay_ms, max_consecutive_errors: Backoff and auto-disable logic
- result_channel, result_target: Optional outbound channel for delivery (e.g., Slack, WhatsApp, email)
Key components:
- scheduler.service.ts: Singleton scheduler with per-task queuing and tick-based evaluation
- store.ts: Task CRUD and status management
- schedule-utils.ts: Cron parsing and interval calculation
- run-log.store.ts: Execution history (run_id, start_time, duration, status, output)
Tick interval is configurable. Each completed run can send its result to a configured channel (e.g., notification via Slack).
Automatic disable occurs after max_consecutive_errors consecutive failures.
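The auto-disable rule can be expressed as a small state transition, sketched below. The `TaskState` shape and the `consecutiveErrors` counter are hypothetical (the schema above lists max_consecutive_errors but not how the running count is stored), and resetting the counter on success is an assumption implied by the word "consecutive".

```typescript
type TaskStatus = "enabled" | "disabled" | "error";

// Hypothetical in-memory view of a scheduled task's failure bookkeeping.
interface TaskState {
  status: TaskStatus;
  consecutiveErrors: number;
  maxConsecutiveErrors: number;
}

// Return the task state after one run: success resets the counter,
// failure increments it and flips the status to "error" once the limit is reached.
function applyRunResult(task: TaskState, succeeded: boolean): TaskState {
  if (succeeded) {
    return { ...task, consecutiveErrors: 0 };
  }
  const consecutiveErrors = task.consecutiveErrors + 1;
  const status: TaskStatus =
    consecutiveErrors >= task.maxConsecutiveErrors ? "error" : task.status;
  return { ...task, consecutiveErrors, status };
}
```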
Activity Stream (SSE)
The activity stream provides real-time updates to the admin panel via Server-Sent Events (SSE), replacing the prior polling-based feed. It emits structured events from the agent pipeline: inbound/outbound messages, tool calls, memory extraction, scheduled-task fires, webhook matches, and agent-to-agent handoffs.
Components:
- bus-emitter.ts: Event bus with subscription/unsubscription
- event-formatters.ts: Transforms domain events into SSE payloads
- Controller endpoint: GET /api/activity-stream/live with client-managed reconnection
Used by the admin panel’s activity dashboard for live visibility without polling overhead.
Attachment Pipeline
Messages can carry non-text payloads (images, documents, audio) from inbound channels.
WhatsApp media (via Twilio MediaUrl) is downloaded, validated, and uploaded to S3.
Outbound attachment references are served via a reverse proxy endpoint or signed S3 URLs to prevent direct exposure.
Data flow:
- Channel adapter (WhatsApp) extracts MediaUrl and metadata from inbound message
- Binary blob is uploaded to S3 via platform-storage.ts when PLATFORM_S3_* env vars are configured (otherwise the attachment is skipped or left as a remote reference)
- Attachment metadata (key, mimeType, size, etc.) is appended to the attachments JSONB column on the corresponding conversation_messages row
- Outbound delivery uses signed S3 URLs
Key components:
- packages/engine/src/attachments/platform-storage.ts: S3 helpers (upload + signed URL generation). No dedicated controller, no message_attachments table.
- Attachment storage: jsonb attachments column on conversation_messages (see packages/engine/src/conversations/schema.ts); each entry holds the S3 key plus metadata.
HTTP Server (NestJS)
OpenAI-Compatible API
| Method | Path | Description |
|---|---|---|
| GET | /health | Health check |
| GET | /v1/models | List instances as models |
| POST | /v1/chat/completions | Chat completion (sync and streaming SSE) |
Management API — Instances
| Method | Path | Description |
|---|---|---|
| GET | /api/instances | List all instances |
| POST | /api/instances | Create instance |
| GET | /api/instances/models | List available providers and models |
| GET | /api/instances/:slug | Get instance by slug |
| PATCH | /api/instances/:slug | Update instance |
| DELETE | /api/instances/:slug | Delete instance + workspace |
| GET | /api/instances/:slug/prompts | Get prompt sections |
| PATCH | /api/instances/:slug/prompts | Update prompt sections |
| GET | /api/instances/:slug/tools | Get tools with enabled status |
| PATCH | /api/instances/:slug/tools | Update enabled tools |
| GET | /api/instances/:slug/skills | Get skills with enabled/env status |
| PATCH | /api/instances/:slug/skills | Update enabled skills |
| GET | /api/instances/:slug/skills/:skillSlug/env | Get skill env vars |
| PUT | /api/instances/:slug/skills/:skillSlug/env | Set skill env vars |
| DELETE | /api/instances/:slug/skills/:skillSlug/env/:key | Delete skill env var |
Management API — Conversations
| Method | Path | Description |
|---|---|---|
| GET | /api/conversations | List conversations (paginated, filterable, searchable) |
| GET | /api/conversations/:id | Get conversation detail |
| GET | /api/conversations/:id/messages | Get conversation messages (paginated) |
| DELETE | /api/conversations/:id | Delete conversation |
Management API — Secrets
| Method | Path | Description |
|---|---|---|
| GET | /api/instances/:slug/secrets | List secret keys + configured status (never values) |
| PUT | /api/instances/:slug/secrets | Bulk upsert secrets |
| DELETE | /api/instances/:slug/secrets/:key | Delete secret |
Management API — Channels
| Method | Path | Description |
|---|---|---|
| GET | /api/instances/:slug/channels | List channel configs |
| PUT | /api/instances/:slug/channels/:type | Set channel config |
| DELETE | /api/instances/:slug/channels/:type | Delete channel config |
Management API — Analytics
| Method | Path | Description |
|---|---|---|
| GET | /api/analytics | Global analytics (KPIs, token usage, latency) |
| GET | /api/instances/:slug/analytics | Per-instance analytics (incl. latency) |
Management API — Memories
| Method | Path | Description |
|---|---|---|
| GET | /memories | List memories (paginated, searchable) |
| POST | /memories | Create memory |
| DELETE | /memories/:id | Delete memory |
| DELETE | /memories | Delete all memories (with optional instanceId filter) |
Management API — Skills (Global Library)
| Method | Path | Description |
|---|---|---|
| GET | /api/skills | List all skills in the global library |
| GET | /api/skills/:name | Get skill by name |
| POST | /api/skills | Create skill |
| PUT | /api/skills/:name | Update skill |
| DELETE | /api/skills/:name | Delete skill |
Authentication
Auth is per-instance. If authEnabled is set on the instance, calls to /v1/* require
Authorization: Bearer <auth_api_key> (configured via admin panel Settings tab).
If authEnabled is false, access is open for that instance.
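A guard implementing this rule might look like the sketch below. The `InstanceAuth` shape, the `authApiKey` field name, and the `isAuthorized` helper are hypothetical. The constant-time comparison via Node's `timingSafeEqual` is a suggested hardening, not something the text states the engine does.

```typescript
import { timingSafeEqual } from "node:crypto";

// Hypothetical slice of the instance settings relevant to /v1/* authentication.
interface InstanceAuth {
  authEnabled: boolean;
  authApiKey?: string;
}

const BEARER_PREFIX = "Bearer ";

// Open access when auth is disabled; otherwise require a matching Bearer token.
function isAuthorized(instance: InstanceAuth, authorizationHeader: string | undefined): boolean {
  if (!instance.authEnabled) return true;
  if (!instance.authApiKey || authorizationHeader === undefined) return false;
  if (!authorizationHeader.startsWith(BEARER_PREFIX)) return false;

  const presented = Buffer.from(authorizationHeader.slice(BEARER_PREFIX.length));
  const expected = Buffer.from(instance.authApiKey);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return presented.length === expected.length && timingSafeEqual(presented, expected);
}
```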
Infrastructure (Docker Compose)
services:
postgres: # PostgreSQL 16 (pgvector), port 5432
Persistent volume: polyant-pgdata (PostgreSQL).
Configuration (.env)
Only infrastructure variables remain in .env. AI provider keys, LangSmith, auth, Tavily, and channel credentials are configured per-instance via the admin panel Settings/Channels tabs.
# Database
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=polyant
POSTGRES_USER=polyant
POSTGRES_PASSWORD=changeme
# HTTP Server
API_PORT=4000
# Encryption
ENCRYPTION_KEY=... # 32-byte hex key for AES-256-GCM (instance secrets)
# Instance
DEFAULT_INSTANCE_ID=default # default instanceId for single-instance setup
Design Patterns and Technical Choices
1. Native Memory Extraction
Memory extraction runs entirely in-process: the project’s LLM (tier: fast) extracts structured
facts from conversation transcripts, OpenAI generates embeddings, and pgvector stores them with
cosine similarity deduplication. No external memory services required.
2. Hybrid Search with RRF
Search combines pgvector (semantic) and PostgreSQL FTS (keyword) via Reciprocal Rank Fusion. This approach implicitly normalizes heterogeneous scores based only on rankings.
3. Real-time Extraction (not Batch)
After each supervisor response, a fire-and-forget process extracts memories from the last 15 messages. No scheduled jobs, no nightly batches.
4. PostgreSQL for Conversation Storage
All messages are saved in PostgreSQL (conversation_messages). The auto-generated tsvector
column enables full-text search without additional components.
5. Tier Abstraction (not Model Binding)
Components request a tier (fast, standard, heavy), not a specific model.
The tier-to-model mapping is centralized in packages/engine/src/ai-gateway/config.ts.
Instances can override provider/model for fine-grained control.
6. Pipeline Latency Tracing
Every user message (excluding auto-tasks like Open WebUI title/summary generation) is instrumented
with per-phase timing. The TraceStore buffers entries and flushes to pipeline_traces in batches
(every 10 entries or every 5 seconds), following the same pattern as AILogger.
Phases tracked: context prep, tool building, LLM call, total.
Additional data: individual tool call durations (toolCallTraces JSONB), streaming TTFB, token counts.
The pipeline_traces table is separate from ai_logs: ai_logs tracks individual LLM API calls
(including background tasks like summary generation), while pipeline_traces tracks end-to-end
pipeline latency for user-facing messages only.
Analytics queries in latency.store.ts use PostgreSQL percentile_cont() for p50/p95/p99
and jsonb_array_elements() for tool call breakdown. Results are served alongside existing
analytics via both global and per-instance endpoints.
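The buffering pattern shared by TraceStore and AILogger (flush every 10 entries or every 5 seconds) can be sketched generically. `BufferedWriter` is a hypothetical class written for illustration; the real stores write to PostgreSQL and may handle failures differently. Dropping a failed batch after logging it is an assumption.

```typescript
// Hypothetical generic buffer illustrating size-or-time based flushing.
class BufferedWriter<T> {
  private buffer: T[] = [];
  private timer: ReturnType<typeof setInterval> | undefined;
  private readonly writeBatch: (batch: T[]) => Promise<void>;
  private readonly maxEntries: number;
  private readonly flushIntervalMs: number;

  constructor(
    writeBatch: (batch: T[]) => Promise<void>,
    maxEntries = 10,
    flushIntervalMs = 5000,
  ) {
    this.writeBatch = writeBatch;
    this.maxEntries = maxEntries;
    this.flushIntervalMs = flushIntervalMs;
  }

  // Begin the periodic flush; call stop() on shutdown.
  start(): void {
    this.timer = setInterval(() => void this.flush(), this.flushIntervalMs);
  }

  // Queue one entry; flush immediately once the buffer is full.
  record(entry: T): void {
    this.buffer.push(entry);
    if (this.buffer.length >= this.maxEntries) void this.flush();
  }

  // Swap the buffer out before writing so new entries are not lost during the write.
  async flush(): Promise<void> {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    try {
      await this.writeBatch(batch);
    } catch (err) {
      console.error(`batch write failed, dropping ${batch.length} entries`, err);
    }
  }

  // Stop the timer and write whatever is still buffered.
  async stop(): Promise<void> {
    if (this.timer !== undefined) clearInterval(this.timer);
    await this.flush();
  }
}
```

Callers never await `record()`, which keeps tracing off the request's critical path.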
7. Fire-and-Forget Post-Processing
The afterResponse() function runs asynchronously without blocking the response:
it records the pipeline trace, saves messages, updates the summary, and extracts memories.
If it fails, the error is logged, but the user has already received the response.
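A fire-and-forget runner along these lines is sketched below. It is not the engine's afterResponse(): the `PostStep` shape and the `runAfterResponse` name are hypothetical, and running the steps sequentially while letting a failed step not block later ones is an assumption.

```typescript
// Hypothetical description of one post-processing step.
interface PostStep {
  name: string;
  run: () => Promise<void>;
}

// Start the steps without awaiting them. Errors are reported to logError and never rethrown,
// so the caller (the request handler) is unaffected by post-processing failures.
function runAfterResponse(
  steps: PostStep[],
  logError: (stepName: string, err: unknown) => void,
): void {
  void (async () => {
    for (const step of steps) {
      try {
        await step.run();
      } catch (err) {
        logError(step.name, err);
      }
    }
  })();
}
```

The function returns immediately; the response has already been sent by the time any step completes or fails.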
8. OpenAI-Compatible API
The server exposes endpoints in OpenAI format (/v1/chat/completions, /v1/models).
Compatible with Open WebUI, ChatBox, and any OpenAI-compatible client.
9. Self-Registering Tools
Tools are defined as *.tool.ts files that call registerTool() at module level.
They are auto-discovered at boot by loadAllTools(). No hardcoded imports or manual
wiring needed. Each tool can declare requiredEnv — if the env var is missing, the tool
is silently excluded.
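The self-registration pattern can be sketched with a module-level registry. `registerTool()` and `requiredEnv` are named in the text; the `ToolDefinition` shape, the `availableTools` helper, and the example tool with its `EXAMPLE_API_KEY` variable are hypothetical and stand in for whatever the real registry.ts defines.

```typescript
// Hypothetical tool shape; the real definition lives in agents/tools/registry.ts.
interface ToolDefinition {
  name: string;
  description: string;
  requiredEnv?: string[];
  execute: (input: unknown) => Promise<unknown>;
}

const registry = new Map<string, ToolDefinition>();

// Called at module level by each *.tool.ts file when it is imported.
function registerTool(tool: ToolDefinition): void {
  if (registry.has(tool.name)) {
    throw new Error(`Tool "${tool.name}" is already registered`);
  }
  registry.set(tool.name, tool);
}

// Tools whose required env vars are all present; the others are skipped without an error.
function availableTools(env: Record<string, string | undefined>): ToolDefinition[] {
  return Array.from(registry.values()).filter((tool) =>
    (tool.requiredEnv ?? []).every((key) => Boolean(env[key])),
  );
}

// What a hypothetical example.tool.ts might contain at module level.
registerTool({
  name: "exampleLookup",
  description: "Illustrative tool that needs an API key",
  requiredEnv: ["EXAMPLE_API_KEY"],
  execute: async (input) => ({ echoed: input }),
});
```

With this shape, adding a tool means adding one file; the loader only has to import every `*.tool.ts` module for the registrations to run.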
10. Instance Personalization (Database-First)
All instance configuration — prompts, skills, tool enablement, secrets, channel
credentials, and knowledge documents — is stored in PostgreSQL, not on the
filesystem. New instances are seeded from the defaults defined in
packages/engine/src/instances/defaults.ts. The workspaces/<instanceId>/
directory exists only as a sandbox root for per-conversation tool work
(readFile / writeFile / gitCloneRepo under conversations/<convId>/); it
is never the source of truth for any agent configuration.
11. Encrypted Skill Environment Variables
Skills can declare requiredEnv in YAML frontmatter. Values are encrypted with AES-256-GCM
per-instance in the instance_skill_env table and injected at runtime when the agent
reads the skill.
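For readers unfamiliar with AES-256-GCM in Node.js, the following sketch shows one common way to encrypt and decrypt a single value with the built-in crypto module. The function names, the `iv.tag.ciphertext` payload layout, and the key handling are assumptions; the source does not specify how instance_skill_env stores its values.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Encrypts one value with a 32-byte key. The output layout
// "iv.tag.ciphertext" (base64 parts) is an assumption for this sketch.
function encryptValue(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit nonce, must be unique per encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag(); // integrity tag, verified on decrypt
  return [iv, tag, ciphertext].map((part) => part.toString("base64")).join(".");
}

// Reverses encryptValue; throws if the key is wrong or the payload was altered.
function decryptValue(payload: string, key: Buffer): string {
  const [iv, tag, ciphertext] = payload
    .split(".")
    .map((part) => Buffer.from(part, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

GCM is authenticated, so a tampered or wrongly keyed value fails loudly at decryption time instead of yielding garbage that would be injected into the skill.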
Main Dependencies
| Package | Version | Role |
|---|---|---|
| ai | ^4.0.0 | Vercel AI SDK (core) |
| @ai-sdk/openai | ^1.0.0 | OpenAI provider |
| @ai-sdk/anthropic | ^1.0.0 | Anthropic provider |
| @nestjs/core | ^11.1.13 | HTTP server |
| drizzle-orm | ^0.38.0 | PostgreSQL ORM |
| grammy | ^1.40.0 | Telegram bot |
| @slack/bolt | ^4.6.0 | Slack bot |
| @tavily/core | ^0.0.3 | Web search |
| langsmith | ^0.3.0 | LLM tracing |
| zod | ^3.23.0 | Schema validation |
Commands
All commands run from the monorepo root and delegate to the appropriate workspace:
# Engine (AI runtime)
npm run dev # Start engine with tsx watch
npm run dev:engine # Same as above (explicit)
npm run build:engine # Build engine only
npm start # Run engine from dist/
# Web (admin panel)
npm run dev:web # Start Next.js dev server
npm run build:web # Build web only
# All workspaces
npm run build # Build all packages
npm run lint # ESLint all packages
npm run typecheck # TypeScript check all packages
npm test # Run all tests
# Database (engine)
npm run db:generate # Generate Drizzle migrations
npm run db:migrate # Apply migrations
npm run db:studio # Drizzle Studio GUI
# Engine tests
npm run test:unit # Unit tests only
npm run test:integration # Integration tests only
npm run test:functional # Functional tests only
# Docker
docker compose up -d # Start postgres + open-webui
Room & Event Sources
The Room system enables proactive, event-driven agent behavior — the agent doesn’t just respond to user messages, it listens to external system events and takes initiative.
Data Flow
External System (HubSpot, etc.)
│
▼
POST /webhooks/:webhookToken ← always returns 200 OK (fire-and-forget)
│
▼
WebhookController.processEvent() ← validation cascade:
│ token → source enabled → room enabled
│ → backlog cap (100) → definitions → slug
▼
Event Matcher (LLM tier: fast) ← sequential, first-match-wins
│
▼
Event Backlog (status: pending) ← queue in PostgreSQL
│
▼
Room Scheduler (30s tick) ← per-room mutex, parallel across rooms
│
▼
Room Engine (executeRoomCycle) ← builds synthetic message with:
│ pending events + interpretation prompts
│ + human message (if any)
▼
Supervisor (standard LLM call) ← same supervisor as user messages,
│ memory disabled for room cycles
▼
Outbound Channel ← Slack / WhatsApp / Telegram
│
▼
Human ← reply routes back via triggerImmediate()
Key Design Decisions
- One Room per instance: stored in instance_room (1:1 with instances). Room prompt defines the agent’s mandate for event processing.
- Event Sources are generic: source_type field supports extensible source types (currently hubspot, webhook). Config is AES-256-GCM encrypted.
- Event Definitions are priority-ordered: sequential LLM evaluation, first match wins. Each definition has a matchingPrompt (for the classifier) and an interpretationPrompt (injected into the Room cycle).
- Activity log auto-compacts: daily entries kept 7 days → merged into weekly (4 weeks) → merged into monthly (12 months) → deleted.
- Room scheduler is a singleton: per-room mutex prevents concurrent processing of the same room. Different rooms run in parallel.
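The per-room mutex in the last point can be sketched with an in-process set of busy room IDs. This is a simplified illustration under assumptions: the `RoomScheduler` class shape, `tick()`, and the `onError` callback are hypothetical, and only the name executeRoomCycle is taken from the data flow above.

```typescript
type RoomCycle = (roomId: string) => Promise<void>;

class RoomScheduler {
  // Rooms whose cycle is currently in flight (the per-room "mutex").
  private readonly running = new Set<string>();
  private readonly executeRoomCycle: RoomCycle;
  private readonly onError: (roomId: string, error: unknown) => void;

  constructor(
    executeRoomCycle: RoomCycle,
    onError: (roomId: string, error: unknown) => void = () => {},
  ) {
    this.executeRoomCycle = executeRoomCycle;
    this.onError = onError;
  }

  // One tick: start a cycle for every room that is not already busy.
  // Rooms run in parallel; a busy room is skipped until a later tick.
  async tick(roomIds: string[]): Promise<string[]> {
    const started = [...new Set(roomIds)].filter((id) => !this.running.has(id));
    await Promise.all(started.map((id) => this.runLocked(id)));
    return started;
  }

  private async runLocked(roomId: string): Promise<void> {
    this.running.add(roomId); // acquired synchronously, before any await
    try {
      await this.executeRoomCycle(roomId);
    } catch (error) {
      this.onError(roomId, error); // one failing room must not stop the others
    } finally {
      this.running.delete(roomId); // always release the lock
    }
  }
}
```

In a sketch like this, the 30s tick would be driven by something such as `setInterval(() => void scheduler.tick(roomIds), 30_000)`. An in-memory set only protects a single process, which is consistent with the scheduler being a singleton.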
Database Tables
| Table | Purpose |
|---|---|
| instance_room | Room config: prompt, outbound channel/target, intervals, rate limit |
| event_sources | External event connectors with encrypted config and webhook token |
| event_definitions | Matching + interpretation prompts per event type |
| event_backlog | Queue of matched events: pending → processing → completed |
| room_activity_log | Time-decaying activity chronicle with auto-compaction |
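The auto-compaction of room_activity_log follows the retention tiers listed under Key Design Decisions (daily 7 days, weekly 4 weeks, monthly 12 months). The sketch below shows one way to decide what happens to an entry. The type names, `nextAction()`, measuring age from the end of the entry's period, and treating 12 months as 365 days are all assumptions made for illustration.

```typescript
type Granularity = "daily" | "weekly" | "monthly";
type CompactionAction = "keep" | "merge-into-weekly" | "merge-into-monthly" | "delete";

const DAY_MS = 24 * 60 * 60 * 1000;

// Retention per tier in days. 12 months is approximated as 365 days here.
const RETENTION_DAYS: Record<Granularity, number> = {
  daily: 7,
  weekly: 28,
  monthly: 365,
};

// Decides what a compaction pass should do with one log entry.
// Age is measured from the end of the period the entry covers (an assumption).
function nextAction(granularity: Granularity, periodEnd: Date, now: Date): CompactionAction {
  const ageDays = (now.getTime() - periodEnd.getTime()) / DAY_MS;
  if (ageDays <= RETENTION_DAYS[granularity]) return "keep";

  switch (granularity) {
    case "daily":
      return "merge-into-weekly";
    case "weekly":
      return "merge-into-monthly";
    case "monthly":
      return "delete";
  }
}
```

Each tier trades detail for reach: recent activity stays granular, while older activity survives only as progressively coarser summaries.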
API Endpoints
| Endpoint | Purpose |
|---|---|
| GET/PUT/DELETE /api/instances/:slug/room | Room CRUD |
| GET/POST /api/instances/:slug/event-sources | Event source list (with definitions inline) + create |
| PUT/DELETE /api/instances/:slug/event-sources/:id | Update/delete event source |
| POST /api/instances/:slug/event-sources/:id/rotate-token | Rotate webhook token |
| GET/POST /api/instances/:slug/event-sources/:id/definitions | Definition list + create |
| PUT/DELETE /api/instances/:slug/event-sources/:id/definitions/:defId | Update/delete definition |
| POST /webhooks/:webhookToken | External event ingestion |
Extension Points
| What | How |
|---|---|
| New channel | Implement ChannelAdapter, register it in packages/engine/src/index.ts |
| New sub-agent | See AGENTS.md for architecture and extension patterns |
| New AI provider | Implement ProviderAdapter, add it to packages/engine/src/ai-gateway/index.ts |
| New supervisor tool | Create a *.tool.ts file calling registerTool() in packages/engine/src/agents/tools/ — the registry picks it up at boot and the tools DB table is synced automatically. Per-instance enablement is then controlled via instance_tools (admin panel or PATCH /api/instances/:slug/tools) |
| New LLM tier | Extend ModelTier and mapping in packages/engine/src/ai-gateway/config.ts |
| New global skill | Add a row to skills + skill_versions via the management API (POST /api/skills) or insert directly. Per-instance assignment is then controlled via instance_skills |
| New event source type | Add a config schema to eventSourceConfigSchemas (event-sources.store.ts); the webhook payload is handled by the generic matcher |