Polyant is open source under AGPL-3.0.

Polyant — Architecture

Monorepo Structure

This project is an npm workspaces monorepo with two packages:

| Package | Path | Role |
| --- | --- | --- |
| @polyant/engine | packages/engine/ | NestJS server: AI runtime + management API |
| @polyant/web | packages/web/ | Next.js App Router: admin panel |

The root package.json orchestrates both packages. Infrastructure (Docker Compose, .env) lives at the monorepo root.

Overview

Polyant is an open-source platform for building AI assistants with long-term memory, multi-channel support, and per-instance customization. A single instance of the engine can host any number of independently configured assistants (prompts, skills, tools, channels, secrets), each addressable as an OpenAI-compatible model. Every user interaction is routed through a Supervisor agent that orchestrates tools, sub-agents, and memory to produce contextual and proactive responses.

Tech Stack

| Component | Technology |
| --- | --- |
| Language | TypeScript / Node.js (ESM) |
| Agent Framework | Vercel AI SDK v4 |
| LLM | Provider-agnostic via AI Gateway (OpenAI + Anthropic + Bedrock) |
| Database | PostgreSQL 16 (Drizzle ORM) + pgvector + Full-Text Search (tsvector) |
| Memory | Native LLM extraction + pgvector (cosine similarity) + PostgreSQL FTS |
| Search | Hybrid: pgvector semantic + PostgreSQL FTS keyword, fused with RRF |
| Web Research | Tavily API |
| Encryption | AES-256-GCM (Node.js crypto) |
| Tracing | LangSmith |
| Channels | Telegram (grammY), Slack (Bolt), WhatsApp (Twilio Programmable Messaging) |
| HTTP Server | NestJS (OpenAI-compatible API + Management REST API) |
| Admin Panel | Next.js 15, React 19, Tailwind CSS 4, shadcn/ui |
| Infrastructure | Docker Compose |

Architecture

+------------------------------------------------------------+
| HTTP SERVER (NestJS)                                       |
| /v1/chat/completions | /v1/models | /health                |
| /api/instances | /api/conversations | /api/skills          |
+------------------------------------------------------------+
| CHANNEL LAYER                                              |
| Telegram | Slack | WhatsApp (Twilio) | Web (REST/SSE)      |
| Room | Scheduled | Agent (in-process)                      |
| ChannelAdapter abstraction                                 |
+------------------------------------------------------------+
| AGENT LAYER                                                |
| Supervisor (tier: standard, max 15 steps)                  |
| Tools: searchMemory | webSearch | curl/httpRequest         |
|        saveMemory | readSkill | readFile/writeFile         |
|        HubSpot suite | GitHub | Slack/WhatsApp out         |
|        scheduleTask | spawnTask | ...                      |
| Sub-agents: ad-hoc via spawnTask                           |
| Agent-to-agent: via `agent` channel adapter (not spawnTask)|
+------------------------------------------------------------+
| MEMORY LAYER                                               |
| pgvector (cosine similarity): extracted memories           |
| PostgreSQL Full-Text Search (keyword): conversations       |
| Hybrid Search: RRF (Reciprocal Rank Fusion)                |
| Native LLM extraction post-response (fire-and-forget)      |
+------------------------------------------------------------+
| AI GATEWAY                                                 |
| Tier abstraction: fast | standard | heavy                  |
| Provider: OpenAI | Anthropic | Bedrock                     |
| Per-instance provider/model override                       |
| Logging: tokens, costs, latency -> PostgreSQL              |
| Tracing: LangSmith                                         |
+------------------------------------------------------------+
| DATA LAYER                                                 |
| PostgreSQL (conversations, memories, instances,            |
|   instance_secrets, instance_channels, ai_logs,            |
|   pipeline_traces)                                         |
| pgvector extension (memory embeddings)                     |
+------------------------------------------------------------+

Directory Structure

packages/engine/
  src/
    index.ts  # Boot sequence
    config.ts  # Configuration (Zod schema)
    workspace/
      index.ts  # Workspace resolver (per-instance parameterization)
    ai-gateway/
      index.ts  # Gateway init + chat/chatStream
      types.ts  # ChatRequest, ChatResponse, ProviderAdapter
      config.ts  # Model tier mappings + pricing
      logger.ts  # AILogger (batch write to PostgreSQL)
      langsmith.ts  # LangSmith tracing setup
      providers/
        openai.ts  # OpenAI via @ai-sdk/openai
        anthropic.ts  # Anthropic via @ai-sdk/anthropic
    agents/
      supervisor/
        index.ts  # supervise() + superviseStream()
        prompt.ts  # System prompt builder (DB-stored sections + skills + datetime)
      tools/
        registry.ts  # Self-registration system + loadAllTools()
        tools-sync.ts  # Catalogue reconciliation (orphan cascade)
        # ~35 *.tool.ts files; see reference/tool-catalog.md for the authoritative list
    activity-stream/  # In-process pub/sub bus + SSE controller + LLM tap
    attachments/  # Image/document/audio upload + S3 + signed URLs
    audit/  # Per-tool-call audit table and scoped AuditLogger
    knowledge/  # Per-instance knowledge files (search/read/write tools)
    scheduled-tasks/  # Cron / interval / one-shot scheduler
    stt-gateway/  # Voice-note transcription gateway
      providers/
        openai-whisper.ts  # Default Whisper provider (whisper-1)
        deepgram.ts  # Alternative provider
        aws-transcribe.ts  # Alternative provider
    users/  # Credentials provider controller + user-management routes + seed
    auth/  # NestJS auth module (guard, JWT decryption) + Auth.js v5 schema (users, accounts, sessions)
    workspace/  # Workspace resolver (ephemeral per-instance dir)
    analytics/  # Analytics module logic (no controller; endpoints live in server/analytics/)
      traces.schema.ts  # Drizzle schema (pipeline_traces table)
      trace.store.ts  # TraceStore: buffered fire-and-forget pipeline trace writer
      latency.store.ts  # Latency analytics queries (percentiles, phase breakdown, tool stats)
    memory/
      index.ts  # Entry point: initMemory() + re-exports
      types.ts  # Memory, ExtractedFact interfaces
      embedder.ts  # OpenAI embeddings (text-embedding-3-small)
      memory-store.ts  # pgvector upsert + cosine similarity dedup
      hybrid-search.ts  # Hybrid RRF search (pgvector + PG FTS)
      extractor.ts  # Native LLM extraction post-response
      schema.ts  # Drizzle schema (memories table with vector column)
    conversations/
      index.ts  # Re-exports
      store.ts  # ConversationStore (messages, summaries, FTS)
      schema.ts  # Drizzle tables (conversations, conversation_messages)
      types.ts  # Conversation interfaces
    instances/
      schema.ts  # Drizzle schema (instances table)
      store.ts  # Instance CRUD
      skill-env.schema.ts  # Drizzle schema (instance_skill_env table)
      skill-env.store.ts  # Encrypted skill env CRUD
      secrets.schema.ts  # Drizzle schema (instance_secrets table)
      secrets.store.ts  # Encrypted secrets CRUD
      channels.schema.ts  # Drizzle schema (instance_channels table)
      channels.store.ts  # Channel config CRUD
      config-resolver.ts  # Per-instance config with 30s TTL cache
    skills/
      skills.service.ts  # Global skill library management
      skills.controller.ts  # /api/skills CRUD endpoints
    crypto/
      index.ts  # AES-256-GCM encrypt/decrypt
    channels/
      types.ts  # ChannelAdapter, IncomingMessage, OutgoingMessage
      channel-manager.ts  # Adapter orchestrator
      adapters/
        telegram/index.ts  # grammY long polling
        slack/index.ts  # @slack/bolt Socket Mode
        whatsapp/
          index.ts  # WhatsApp adapter (webhook)
          twilio-client.ts  # Twilio Programmable Messaging client
    server/
      main.ts  # NestJS bootstrap
      server.module.ts  # Root module
      health/health.controller.ts  # GET /health
      openai/
        openai.controller.ts  # /v1/chat/completions, /v1/models
        openai.service.ts  # Chat completion logic
        openai.types.ts  # Request/response OpenAI types
        openai.module.ts  # NestJS module
      instances/instances.controller.ts  # /api/instances CRUD + prompts/tools/skills/secrets/channels
      conversations/conversations.controller.ts  # /api/conversations
      analytics/analytics.controller.ts  # /api/analytics + per-instance analytics
      memories/memories.controller.ts  # /memories CRUD
    database/
      client.ts  # Drizzle connection
      migrations/  # Generated migrations
    utils/
      pipeline-logger.ts  # Structured pipeline logging
      frontmatter.ts  # YAML frontmatter parser for skills
  workspaces/  # Per-conversation tool sandboxes ONLY (gitignored)
    <instanceId>/  # Knowledge lives in PostgreSQL (knowledge_documents + knowledge_chunks), never here
      conversations/<convId>/  # Ephemeral scratch dir used by readFile / writeFile / gitCloneRepo

packages/web/
  src/
    app/
      globals.css  # Design tokens
      layout.tsx  # Root layout (Inter font, ThemeProvider, I18nProvider)
      (admin)/
        layout.tsx  # Sidebar + header layout
        page.tsx  # Dashboard
        instances/  # Instance management (list, detail, tabs)
        conversations/  # Conversation browsing + search
        skills/  # Global skill library CRUD
        playground/page.tsx  # Playground chat page
        memory/page.tsx  # Memory management (list, search, create/delete)
        settings/page.tsx  # Global settings
    components/
      layout/  # Sidebar, header, nav, theme/lang toggles
      analytics/  # Analytics chart components (KPIs, trends, latency)
      ui/  # shadcn/ui components
    lib/
      api.ts  # API client for engine
      utils.ts  # cn() helper
    i18n/  # Italian/English internationalization
    hooks/  # use-mobile, etc.

Boot Sequence (packages/engine/src/index.ts)

At startup, the system executes in order:

  1. AI Gateway - Initializes logging
  2. Trace Store - Initializes pipeline latency trace writer (buffered, periodic flush)
  3. Tool Loading - Auto-discovers and registers all *.tool.ts files
  4. Memory - Verifies pgvector extension is available
  5. NestJS Server - OpenAI-compatible API + Management API on configurable port
  6. Channel Adapters - Loads enabled channel configs from DB per active instance, starts adapters dynamically

Request Flow

User message
  |
  v
[Channel Adapter] normalizes to IncomingMessage
  |
  v
[Pre-enrichment]
  |- Load conversation summary (in-memory cache -> PostgreSQL)
  '- Create conversation row if missing (fire-and-forget)
  |
  v
[Conversation History] getRecentMessages(conversationId, 15) from PostgreSQL
  |
  v
[Instance Resolution] resolveInstance(instanceId) -> load config from PostgreSQL (cache 30s)
  |
  v
[Supervisor] (tier: standard, max 15 steps)
  |- System prompt: 8 sections from instance_prompts (per-instance, DB-stored)
  |- Skills: discovered from skills + skill_versions + instance_skills (DB joins)
  |- Last 15 messages + new message
  '- Available tools (filtered per-instance via instance_tools):
       searchMemory      -> hybrid pgvector + PG FTS + RRF search
       webSearch         -> web search (Tavily, optional)
       saveMemory        -> explicit save (only on user request)
       updateSoul        -> modify personality
       updateUserProfile -> update user info
       readSkill         -> load a skill's instructions
       spawnTask         -> delegate to sub-agent
       ...               -> channel-specific and integration-specific tools
  |
  v
[Response] sent to channel
  |
  v
[After Response] (async, fire-and-forget)
  |- 1. traceStore.record() -> pipeline latency trace (phase breakdown + tool timings)
  |- 2. appendMessages() -> save to PostgreSQL conversation_messages
  |- 3. Generate updated summary (tier: fast) -> updateSummary()
  '- 4. extractMemories() -> LLM extraction -> embeddings -> pgvector upsert

Streaming: For the HTTP server, superviseStream() returns an AsyncIterable<string> that gets converted to Server-Sent Events (SSE) in OpenAI format.
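The conversion from text deltas to SSE frames can be sketched roughly as follows. This is an illustration only: it assumes the standard OpenAI `chat.completion.chunk` shape, and the names `toSseFrame` and `SSE_DONE` are hypothetical rather than taken from the engine.

```typescript
// Hypothetical helper: wraps one text delta in an OpenAI-style streaming chunk.
interface ChatCompletionChunk {
  id: string;
  object: "chat.completion.chunk";
  created: number; // Unix timestamp in seconds
  model: string;
  choices: { index: number; delta: { content?: string }; finish_reason: string | null }[];
}

function toSseFrame(id: string, model: string, created: number, text: string): string {
  const chunk: ChatCompletionChunk = {
    id,
    object: "chat.completion.chunk",
    created,
    model,
    choices: [{ index: 0, delta: { content: text }, finish_reason: null }],
  };
  // An SSE event is a "data:" line terminated by a blank line.
  return `data: ${JSON.stringify(chunk)}\n\n`;
}

// OpenAI-format streams end with a literal [DONE] sentinel.
const SSE_DONE = "data: [DONE]\n\n";
```

A controller would write one frame per string yielded by superviseStream() and finish with SSE_DONE. The sketch omits the final chunk carrying a finish_reason that OpenAI-format streams normally include.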


AI Gateway

Tier Abstraction

Components don’t request specific models. They request a tier:

| Tier | OpenAI | Anthropic | Bedrock | Use |
| --- | --- | --- | --- | --- |
| fast | gpt-4o-mini | claude-haiku-4-5-20251001 | amazon.nova-lite-v1:0 | Summary generation, memory extraction, classification |
| standard | gpt-4o | claude-sonnet-4-5-20250929 | anthropic.claude-sonnet-4-20250514-v1:0 | Supervisor, sub-agents |
| heavy | o3 | claude-opus-4-6 | anthropic.claude-opus-4-20250514-v1:0 | Complex analysis |

The exact mappings live in packages/engine/src/ai-gateway/config.ts.

The AI provider is configured per-instance via the admin panel (Settings tab). There is no global AI_PROVIDER env var. Individual instances can also override the model via the Management API.
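A minimal sketch of how tier resolution with a per-instance override might look. The model IDs are copied from the tier table in this section; the `InstanceAiConfig` shape and the `resolveModel` name are assumptions for illustration, not the actual contents of ai-gateway/config.ts.

```typescript
type Tier = "fast" | "standard" | "heavy";
type Provider = "openai" | "anthropic" | "bedrock";

// Values copied from the tier table; the real mappings live in ai-gateway/config.ts.
const TIER_MODELS: Record<Provider, Record<Tier, string>> = {
  openai: { fast: "gpt-4o-mini", standard: "gpt-4o", heavy: "o3" },
  anthropic: {
    fast: "claude-haiku-4-5-20251001",
    standard: "claude-sonnet-4-5-20250929",
    heavy: "claude-opus-4-6",
  },
  bedrock: {
    fast: "amazon.nova-lite-v1:0",
    standard: "anthropic.claude-sonnet-4-20250514-v1:0",
    heavy: "anthropic.claude-opus-4-20250514-v1:0",
  },
};

// Hypothetical per-instance settings: a provider plus an optional model override.
interface InstanceAiConfig {
  provider: Provider;
  modelOverride?: string;
}

// An explicit override wins; otherwise the tier is mapped through the provider table.
function resolveModel(tier: Tier, instance: InstanceAiConfig): string {
  return instance.modelOverride ?? TIER_MODELS[instance.provider][tier];
}
```

Callers only ever name a tier, so swapping a provider or pinning a model for one instance does not touch the calling code.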

Logging and Costs

Every LLM call is logged to the ai_logs table with:

  • Provider, model, tier
  • Token usage (prompt, completion, total)
  • Estimated cost in USD (calculated from per-token pricing)
  • Duration in ms
  • conversationId and instanceId for correlation

The logger uses a buffer with periodic flush to minimize DB writes.

LangSmith Tracing

LangSmith tracing is per-instance and is configured via the admin panel Settings tab (langsmithEnabled, API key, project name). The engine uses wrapAISDK from langsmith/experimental/vercel to wrap generateText/streamText at module level. Per-request tracing config is built via buildLangSmithProviderOptions() and passed as providerOptions.langsmith to the AI SDK calls. When providerOptions.langsmith is absent (LangSmith disabled), the wrapped functions behave identically to the originals with zero overhead. Tracing produces hierarchical traces: a parent run per generateText/streamText call, with child runs for each LLM step and tool execution within maxSteps. Threads are grouped via metadata.thread_id = conversationId, and instances can be filtered via metadata.instance_id. Client instances are cached in memory per API key.


Agent System

Supervisor

The Supervisor is the system’s decision-making center. It receives the user message, the last 15 messages of history, and the conversation summary.

Configuration:

  • Tier: standard
  • Max steps: 15 (reasoning cycles)
  • System prompt: 8 modular sections stored in the instance_prompts table (per-instance, DB-backed). Defaults are seeded from packages/engine/src/instances/defaults.ts on instance creation
  • Available tools: filtered per-instance via the instance_tools table (auto-recomputed when skills change)

Tools shipped with the framework (non-exhaustive — the authoritative list is generated from the registry and lives in reference/tool-catalog.md; the actual set per instance is determined by instance_tools):

| Tool | Description | When to use |
| --- | --- | --- |
| searchMemory | Hybrid pgvector + PG FTS search across memories and conversations | Proactively for any question about past facts |
| saveMemory | Explicit memory save to pgvector | Only on user request |
| webSearch | Web search via Tavily API | For external/current information |
| httpRequest / curl | Generic HTTP request | Fetch pages, APIs, JSON, RSS |
| updateSoul | Modify the assistant’s personality (section 02-soul) | Only on user request |
| updateUserProfile | Update the user profile (section 07-user-identity) | When user shares personal info |
| readSkill | Load a skill’s instructions on-demand | When the supervisor needs to apply a skill |
| readFile / writeFile / listDirectory | Filesystem access inside the workspace | When operating on files inside an ephemeral workspace |
| searchKnowledge / getKnowledge / writeKnowledge | Knowledge base operations | When working with the instance’s knowledge files |
| gitCloneRepo | Clone a GitHub repo into the workspace | Code-touching workflows |
| ghIssue / ghPR | GitHub issue + pull request operations | GitHub integrations |
| HubSpot suite (8 tools) | hubspotContact, hubspotDeal, hubspotMeeting, hubspotNote, hubspotTicket, hubspotCreateTask, hubspotSendEmail, hubspotGetCompany | HubSpot CRM workflows |
| slackPostMessage | Post to a Slack channel or DM via the instance’s Slack credentials | Outbound Slack messages |
| send_whatsapp_template | Send a Twilio-approved WhatsApp template | WhatsApp 24h-window outbound |
| send_outbound_message | Send a message to any configured outbound channel | Channel-agnostic outbound |
| scheduleTask | Create a cron / interval / one-shot task | Time-based automation |
| fileUpload | Upload an attachment to S3 | Attachment workflows |
| verifyDocument | Validate a document via a tool-level LLM call | Document QA |
| Room harness tools | mark_events_completed, compact_room_history, send_message_to_human | Injected only inside Room cycles |
| spawnTask | Delegate to an isolated sub-agent in the same instance | Multi-step tasks within one instance |

Channel-specific and integration-specific tools are also included in the framework and can be enabled per-instance.

Sub-Agent System

Sub-agents are isolated agents that the Supervisor can delegate tasks to via spawnTask. They receive all enabled tools except spawnTask, to prevent infinite recursion.

There is no dedicated sub-agents/ directory or SubAgentDefinition type today — the spawnTask tool (packages/engine/src/agents/tools/task-tool.ts) creates ad-hoc sub-agents on the fly with a generic system prompt and the parent’s filtered tool set.

Note: spawnTask is unrelated to agent-to-agent calls between different instances, which go through the agent channel adapter (channels/adapters/agent.adapter.ts). See agents.md for the disambiguation.


Memory System

Architecture

The memory system is fully native — no external services required beyond PostgreSQL with pgvector.

| Component | Role |
| --- | --- |
| LLM Extractor (extractor.ts) | Sends recent messages to the project’s LLM (tier: fast) for structured fact extraction |
| Embedder (embedder.ts) | Generates embeddings via OpenAI (text-embedding-3-small) |
| Memory Store (memory-store.ts) | Upserts memories into pgvector with cosine similarity deduplication (threshold 0.90) |
| PostgreSQL FTS | Full-text search on raw conversations (conversation_messages table with tsvector column) |

Memory Flow

User conversation <-> Supervisor
  |
  v (fire-and-forget after response)
extractMemories(conversationId, instanceId)
  |
  v
Load last 15 messages from PostgreSQL
  |
  v
Send transcript to LLM (tier: fast)
  |
  v
LLM returns JSON: [{content, category, importance}]
  |
  v
Generate embeddings (OpenAI text-embedding-3-small)
  |
  v
Upsert each memory into pgvector:
  - Cosine similarity check against existing memories
  - If similarity > 0.90: update existing memory
  - Otherwise: insert new memory

Categories: preference, fact, event, relationship, decision, general

Importance: 1-10 scale (10 = critical life fact, 1 = trivial)
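The deduplicating upsert at the end of the memory flow can be illustrated with an in-memory sketch. The real store runs this comparison in pgvector; the `StoredMemory` shape and the `upsertMemory` name here are hypothetical, and only the 0.90 threshold comes from this document.

```typescript
interface StoredMemory {
  id: number;
  content: string;
  embedding: number[];
}

// Cosine similarity of two equal-length, non-zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const DEDUP_THRESHOLD = 0.9;

// Updates the closest existing memory when it is similar enough, otherwise inserts.
function upsertMemory(store: StoredMemory[], content: string, embedding: number[]): "updated" | "inserted" {
  let best: StoredMemory | undefined;
  let bestScore = -1;
  for (const memory of store) {
    const score = cosineSimilarity(memory.embedding, embedding);
    if (score > bestScore) {
      best = memory;
      bestScore = score;
    }
  }
  if (best !== undefined && bestScore > DEDUP_THRESHOLD) {
    best.content = content;
    best.embedding = embedding;
    return "updated";
  }
  store.push({ id: store.length + 1, content, embedding });
  return "inserted";
}
```

With a threshold this high, only near-paraphrases of an existing memory are merged; loosely related facts are stored as separate rows.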

Hybrid Search (RRF)

Search combines two backends via Reciprocal Rank Fusion:

                 searchMemory(query)
                         |
             +-----------+-----------+
             |                       |
  pgvector semantic search     PostgreSQL FTS
  (extracted memories)         (raw conversations)
  cosine similarity            websearch_to_tsquery
  top 20 results               top 20 results
             |                       |
             +-----------+-----------+
                         |
               Reciprocal Rank Fusion
            score = Σ(1 / (k + rank + 1))
                      k = 60
                         |
                Sort + dedup + top N
                         |
                HybridSearchResult[]
       {content, type, score, source, createdAt}

Why RRF: The two backends produce scores on different scales (cosine similarity vs ts_rank). RRF implicitly normalizes based only on ranking position, not absolute values.
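As a worked illustration of the fusion formula (score = Σ 1 / (k + rank + 1), with 0-based rank and k = 60), here is a small self-contained sketch. The `RankedItem` shape, the `id`-based deduplication, and the function name are assumptions for the example, not the code in hybrid-search.ts.

```typescript
interface RankedItem {
  id: string;
  content: string;
}

interface FusedResult extends RankedItem {
  score: number;
}

// Fuses several ranked lists; each list contributes 1 / (k + rank + 1) per item.
function reciprocalRankFusion(lists: RankedItem[][], k = 60, topN = 10): FusedResult[] {
  const fused = new Map<string, FusedResult>();
  for (const list of lists) {
    list.forEach((item, rank) => {
      const contribution = 1 / (k + rank + 1);
      const existing = fused.get(item.id);
      if (existing) {
        existing.score += contribution; // item appeared in more than one list
      } else {
        fused.set(item.id, { ...item, score: contribution });
      }
    });
  }
  return Array.from(fused.values())
    .sort((a, b) => b.score - a.score)
    .slice(0, topN);
}
```

An item ranked first in one list scores 1/61, while an item ranked second in one list and first in the other scores 1/62 + 1/61, so agreement between backends outweighs a single top position.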

Complementarity:

  • Semantic (pgvector): finds memories by meaning, even with different words
  • Keyword (PG FTS): finds conversations by exact words, proper nouns, dates

Note: Embeddings always use OpenAI (text-embedding-3-small) — the per-instance openai_api_key secret is required regardless of the instance’s AI provider. Anthropic does not offer an embedding API. The extraction LLM uses the configured provider via ai-gateway.


Conversation System

PostgreSQL Tables

conversations:

  • conversationId (text, unique) - format channelType:channelId
  • summary (text, nullable) - updated after each turn
  • instanceId (text, nullable)
  • createdAt, updatedAt

conversation_messages (with Full-Text Search):

  • conversationId (text) - indexed
  • role (user|assistant)
  • content (text)
  • toolCalls (jsonb, nullable)
  • search_vector (tsvector) - auto-generated from content (config: simple, managed via SQL migration)
  • createdAt - indexed
  • GIN index on search_vector

ConversationStore

// Summary management (in-memory cache)
getSummary(conversationId)            // cache hit -> return; miss -> query DB -> cache
updateSummary(conversationId)         // write DB + update cache
ensureConversation(id, instanceId)    // INSERT ... ON CONFLICT DO NOTHING

// Message management
appendMessages(conversationId, messages[])     // batch insert
getRecentMessages(conversationId, limit=15)    // last N messages (chronological)

// Full-text search
searchConversations(query, {instanceId, limit, offset})  // websearch_to_tsquery + ts_rank

// Listing
listConversations({instanceId, limit, offset})  // paginated with instance JOIN
getConversation(conversationId)                 // single conversation detail
deleteConversation(conversationId)              // delete conversation + messages

Summary Generation

After each response, an LLM (tier: fast) generates an updated conversation summary in 2-3 sentences. The previous summary is provided as context. This summary is injected into the Supervisor’s system prompt on subsequent turns of the same conversation.


Channel Layer

The adapters below share a common ChannelAdapter interface:

| Channel | Package | Reception | Notes |
| --- | --- | --- | --- |
| Telegram | grammY | Long polling | Markdown parse mode, optional user ID whitelist |
| Slack | @slack/bolt | Socket Mode | Thread awareness, rich metadata |
| WhatsApp | Twilio Programmable Messaging | Inbound webhook | twilio-client.ts posts outbound; media (MediaUrl) downloaded and re-uploaded to S3 |
| Agent | (in-process) | Synchronous call | agent.adapter.ts; carries callerSlug, callerConversationId, depth, parentTraceId |

Channels are DB-driven and per-instance. Config is stored encrypted in the instance_channels table. The channel manager starts/stops adapters dynamically via admin panel or API. No global env vars for channel credentials.


Webhooks & Event Sources

External systems (HubSpot, custom APIs, etc.) can trigger agent actions via webhooks. A webhook is received at POST /webhooks/:webhookToken (always returns 200 OK immediately, processing is fire-and-forget). The payload (max 64KB) is passed to a webhook matcher LLM (tier: fast) which sequentially evaluates matching definitions (first match wins). Matched events are inserted into the event_backlog table with pending status, where they await processing by the Room scheduler.

Key components:

  • webhook-engine.ts — Central dispatch and queue management
  • webhook-matcher.ts — LLM-based event classification against priority-ordered definitions
  • webhook-sources.store.ts — Event source CRUD (with AES-256-GCM encrypted config)
  • webhook-backlog.store.ts — Pending event queue with status lifecycle (pending → processing → completed)
  • webhook.validators.ts, webhooks.schema.ts — Payload validation and schemas
  • template-renderer.ts, trigger-context.ts — Context building for interpretation
  • active-triggers.ts, webhook-logger.ts — Trigger state and audit logging

Rate limit: 60 events/min. Backlog capacity: 100 pending events per instance (excess dropped).


Scheduled Tasks

The scheduler provides time-based and event-based task execution with optional result delivery via outbound channels.

Schema:

  • schedule_type: cron (recurring), interval (every N seconds), or oneShot (run once then delete)
  • status: enabled, disabled (suspended), error (automatically disabled after N consecutive failures)
  • retry_count, retry_delay_ms, max_consecutive_errors — Backoff and auto-disable logic
  • result_channel, result_target — Optional outbound channel for delivery (e.g., Slack, WhatsApp, email)

Key components:

  • scheduler.service.ts — Singleton scheduler with per-task queuing and tick-based evaluation
  • store.ts — Task CRUD and status management
  • schedule-utils.ts — Cron parsing and interval calculation
  • run-log.store.ts — Execution history (run_id, start_time, duration, status, output)

Tick interval is configurable. Each completed run can send its result to a configured channel (e.g., notification via Slack). Automatic disable occurs after max_consecutive_errors consecutive failures.
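The auto-disable rule can be sketched as a small state transition. The `TaskState` shape and the `recordRun` name are hypothetical, and the camelCase fields stand in for the max_consecutive_errors column; the assumption that a success resets the counter is mine, based on the word "consecutive".

```typescript
type TaskStatus = "enabled" | "disabled" | "error";

interface TaskState {
  status: TaskStatus;
  consecutiveErrors: number;
  maxConsecutiveErrors: number;
}

// Returns the next task state after one run.
function recordRun(task: TaskState, succeeded: boolean): TaskState {
  if (succeeded) {
    // Assumption: any success resets the consecutive-failure counter.
    return { ...task, consecutiveErrors: 0 };
  }
  const consecutiveErrors = task.consecutiveErrors + 1;
  const status: TaskStatus =
    consecutiveErrors >= task.maxConsecutiveErrors ? "error" : task.status;
  return { ...task, consecutiveErrors, status };
}
```

A task in the error state would then be skipped by the scheduler until an operator re-enables it.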


Activity Stream (SSE)

The activity stream provides real-time updates to the admin panel via Server-Sent Events (SSE), replacing the prior polling-based feed. It emits structured events from the agent pipeline: inbound/outbound messages, tool calls, memory extraction, scheduled-task fires, webhook matches, and agent-to-agent handoffs.

Components:

  • bus-emitter.ts — Event bus with subscription/unsubscription
  • event-formatters.ts — Transforms domain events into SSE payloads
  • Controller endpoint — GET /api/activity-stream/live with client-managed reconnection

Used by the admin panel’s activity dashboard for live visibility without polling overhead.


Attachment Pipeline

Messages can carry non-text payloads (images, documents, audio) from inbound channels. WhatsApp media (via Twilio MediaUrl) is downloaded, validated, and uploaded to S3. Outbound attachment references are served via a reverse proxy endpoint or signed S3 URLs to prevent direct exposure.

Data flow:

  1. Channel adapter (WhatsApp) extracts MediaUrl and metadata from inbound message
  2. Binary blob is uploaded to S3 via platform-storage.ts when PLATFORM_S3_* env vars are configured (otherwise the attachment is skipped or left as a remote reference)
  3. Attachment metadata (key, mimeType, size, etc.) is appended to the attachments JSONB column on the corresponding conversation_messages row
  4. Outbound delivery uses signed S3 URLs

Key components:

  • packages/engine/src/attachments/platform-storage.ts — S3 helpers (upload + signed URL generation). No dedicated controller, no message_attachments table.
  • Attachment storage — jsonb attachments column on conversation_messages (see packages/engine/src/conversations/schema.ts); each entry holds the S3 key plus metadata.

HTTP Server (NestJS)

OpenAI-Compatible API

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Health check |
| GET | /v1/models | List instances as models |
| POST | /v1/chat/completions | Chat completion (sync and streaming SSE) |

Management API — Instances

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/instances | List all instances |
| POST | /api/instances | Create instance |
| GET | /api/instances/models | List available providers and models |
| GET | /api/instances/:slug | Get instance by slug |
| PATCH | /api/instances/:slug | Update instance |
| DELETE | /api/instances/:slug | Delete instance + workspace |
| GET | /api/instances/:slug/prompts | Get prompt sections |
| PATCH | /api/instances/:slug/prompts | Update prompt sections |
| GET | /api/instances/:slug/tools | Get tools with enabled status |
| PATCH | /api/instances/:slug/tools | Update enabled tools |
| GET | /api/instances/:slug/skills | Get skills with enabled/env status |
| PATCH | /api/instances/:slug/skills | Update enabled skills |
| GET | /api/instances/:slug/skills/:skillSlug/env | Get skill env vars |
| PUT | /api/instances/:slug/skills/:skillSlug/env | Set skill env vars |
| DELETE | /api/instances/:slug/skills/:skillSlug/env/:key | Delete skill env var |

Management API — Conversations

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/conversations | List conversations (paginated, filterable, searchable) |
| GET | /api/conversations/:id | Get conversation detail |
| GET | /api/conversations/:id/messages | Get conversation messages (paginated) |
| DELETE | /api/conversations/:id | Delete conversation |

Management API — Secrets

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/instances/:slug/secrets | List secret keys + configured status (never values) |
| PUT | /api/instances/:slug/secrets | Bulk upsert secrets |
| DELETE | /api/instances/:slug/secrets/:key | Delete secret |

Management API — Channels

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/instances/:slug/channels | List channel configs |
| PUT | /api/instances/:slug/channels/:type | Set channel config |
| DELETE | /api/instances/:slug/channels/:type | Delete channel config |

Management API — Analytics

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/analytics | Global analytics (KPIs, token usage, latency) |
| GET | /api/instances/:slug/analytics | Per-instance analytics (incl. latency) |

Management API — Memories

| Method | Path | Description |
| --- | --- | --- |
| GET | /memories | List memories (paginated, searchable) |
| POST | /memories | Create memory |
| DELETE | /memories/:id | Delete memory |
| DELETE | /memories | Delete all memories (with optional instanceId filter) |

Management API — Skills (Global Library)

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/skills | List all skills in the global library |
| GET | /api/skills/:name | Get skill by name |
| POST | /api/skills | Create skill |
| PUT | /api/skills/:name | Update skill |
| DELETE | /api/skills/:name | Delete skill |

Authentication

Auth is per-instance. If authEnabled is set on the instance, calls to /v1/* require Authorization: Bearer <auth_api_key> (configured via admin panel Settings tab). If authEnabled is false, access is open for that instance.


Infrastructure (Docker Compose)

services:
  postgres:  # PostgreSQL 16 (pgvector), port 5432

Persistent volume: polyant-pgdata (PostgreSQL).


Configuration (.env)

Only infrastructure variables remain in .env. AI provider keys, LangSmith, auth, Tavily, and channel credentials are configured per-instance via the admin panel Settings/Channels tabs.

# Database
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=polyant
POSTGRES_USER=polyant
POSTGRES_PASSWORD=changeme

# HTTP Server
API_PORT=4000

# Encryption
ENCRYPTION_KEY=...  # 32-byte hex key for AES-256-GCM (instance secrets)

# Instance
DEFAULT_INSTANCE_ID=default  # default instanceId for single-instance setup

Design Patterns and Technical Choices

1. Native Memory Extraction

Memory extraction runs entirely in-process: the project’s LLM (tier: fast) extracts structured facts from conversation transcripts, OpenAI generates embeddings, and pgvector stores them with cosine similarity deduplication. No external memory services required.

2. Hybrid Search with RRF

Search combines pgvector (semantic) and PostgreSQL FTS (keyword) via Reciprocal Rank Fusion. This approach implicitly normalizes heterogeneous scores based only on rankings.

3. Real-time Extraction (not Batch)

After each supervisor response, a fire-and-forget process extracts memories from the last 15 messages. No scheduled jobs, no nightly batches.

4. PostgreSQL for Conversation Storage

All messages are saved in PostgreSQL (conversation_messages). The auto-generated tsvector column enables full-text search without additional components.

5. Tier Abstraction (not Model Binding)

Components request a tier (fast, standard, heavy), not a specific model. The tier-to-model mapping is centralized in packages/engine/src/ai-gateway/config.ts. Instances can override provider/model for fine-grained control.

6. Pipeline Latency Tracing

Every user message (excluding auto-tasks like Open WebUI title/summary generation) is instrumented with per-phase timing. The TraceStore buffers entries and flushes to pipeline_traces in batches (every 10 entries or every 5 seconds), following the same pattern as AILogger.
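The buffer-and-flush pattern shared by TraceStore and AILogger can be sketched as below. This is a simplified, synchronous illustration: the `BufferedWriter` class and its method names are hypothetical, and the real writers persist batches to PostgreSQL asynchronously. Only the thresholds (10 entries, 5 seconds) come from this document.

```typescript
// Hypothetical buffered writer: flushes when the buffer is full or on a timer.
class BufferedWriter<T> {
  private buffer: T[] = [];
  private timer: ReturnType<typeof setInterval> | undefined;
  private readonly writeBatch: (entries: T[]) => void;
  private readonly maxEntries: number;
  private readonly intervalMs: number;

  constructor(writeBatch: (entries: T[]) => void, maxEntries = 10, intervalMs = 5000) {
    this.writeBatch = writeBatch;
    this.maxEntries = maxEntries;
    this.intervalMs = intervalMs;
  }

  // Starts the periodic flush; call stop() on shutdown.
  start(): void {
    this.timer = setInterval(() => this.flush(), this.intervalMs);
  }

  // Queues an entry and flushes immediately once the size threshold is reached.
  record(entry: T): void {
    this.buffer.push(entry);
    if (this.buffer.length >= this.maxEntries) this.flush();
  }

  flush(): void {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = []; // swap first so entries recorded during the write are kept
    this.writeBatch(batch);
  }

  // Stops the timer and writes whatever is still buffered.
  stop(): void {
    if (this.timer !== undefined) clearInterval(this.timer);
    this.timer = undefined;
    this.flush();
  }
}
```

Batching trades a few seconds of visibility delay for far fewer INSERT statements on the hot path.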

Phases tracked: context prep, tool building, LLM call, total. Additional data: individual tool call durations (toolCallTraces JSONB), streaming TTFB, token counts.

The pipeline_traces table is separate from ai_logs: ai_logs tracks individual LLM API calls (including background tasks like summary generation), while pipeline_traces tracks end-to-end pipeline latency for user-facing messages only.

Analytics queries in latency.store.ts use PostgreSQL percentile_cont() for p50/p95/p99 and jsonb_array_elements() for tool call breakdown. Results are served alongside existing analytics via both global and per-instance endpoints.

7. Fire-and-Forget Post-Processing

The afterResponse() function runs asynchronously without blocking the response: saves messages, updates summary, extracts memories. If it fails, the error is logged but the user has already received the response.

8. OpenAI-Compatible API

The server exposes endpoints in OpenAI format (/v1/chat/completions, /v1/models). Compatible with Open WebUI, ChatBox, and any OpenAI-compatible client.

9. Self-Registering Tools

Tools are defined as *.tool.ts files that call registerTool() at module level. They are auto-discovered at boot by loadAllTools(). No hardcoded imports or manual wiring needed. Each tool can declare requiredEnv — if the env var is missing, the tool is silently excluded.
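A minimal sketch of the self-registration idea follows. Only the names registerTool and requiredEnv come from this document; the `ToolDefinition` shape, the `availableTools` helper, and the TAVILY_API_KEY variable name are assumptions made for the example.

```typescript
interface ToolDefinition {
  name: string;
  description: string;
  requiredEnv?: string[]; // env vars that must be set for the tool to be offered
  execute: (input: unknown) => unknown;
}

const registry = new Map<string, ToolDefinition>();

// Called at module level by each *.tool.ts file.
function registerTool(tool: ToolDefinition): void {
  if (registry.has(tool.name)) throw new Error(`Duplicate tool: ${tool.name}`);
  registry.set(tool.name, tool);
}

// Returns only the tools whose required env vars are all present.
function availableTools(env: Record<string, string | undefined>): ToolDefinition[] {
  return Array.from(registry.values()).filter((tool) =>
    (tool.requiredEnv ?? []).every((key) => Boolean(env[key])),
  );
}

// What a tool file might contain (illustrative bodies only).
registerTool({
  name: "webSearch",
  description: "Web search via Tavily API",
  requiredEnv: ["TAVILY_API_KEY"], // hypothetical variable name
  execute: (input) => ({ query: input, results: [] }),
});

registerTool({
  name: "searchMemory",
  description: "Hybrid memory search",
  execute: (input) => ({ query: input, results: [] }),
});
```

Because registration is a side effect of importing the file, adding a tool only requires dropping a new *.tool.ts file where the loader can discover it.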

10. Instance Personalization (Database-First)

All instance configuration — prompts, skills, tool enablement, secrets, channel credentials, and knowledge documents — is stored in PostgreSQL, not on the filesystem. New instances are seeded from the defaults defined in packages/engine/src/instances/defaults.ts. The workspaces/<instanceId>/ directory exists only as a sandbox root for per-conversation tool work (readFile / writeFile / gitCloneRepo under conversations/<convId>/); it is never the source of truth for any agent configuration.

11. Encrypted Skill Environment Variables

Skills can declare requiredEnv in YAML frontmatter. Values are encrypted with AES-256-GCM per-instance in the instance_skill_env table and injected at runtime when the agent reads the skill.
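For readers unfamiliar with AES-256-GCM in Node.js, here is a minimal round-trip sketch using the built-in crypto module. The payload layout (IV, auth tag, and ciphertext joined with dots) and the function names are assumptions for illustration; the engine's crypto/index.ts may store these parts differently.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Encrypts a UTF-8 string with a 32-byte key given as 64 hex characters.
function encrypt(plaintext: string, keyHex: string): string {
  const key = Buffer.from(keyHex, "hex");
  const iv = randomBytes(12); // 96-bit IV, the recommended size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  // Hypothetical storage format: base64(iv).base64(tag).base64(ciphertext)
  return [iv, tag, ciphertext].map((part) => part.toString("base64")).join(".");
}

// Reverses encrypt(); throws if the key is wrong or the payload was tampered with.
function decrypt(payload: string, keyHex: string): string {
  const [iv, tag, ciphertext] = payload.split(".").map((part) => Buffer.from(part, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", Buffer.from(keyHex, "hex"), iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

GCM authenticates as well as encrypts, so a modified ciphertext fails at decipher.final() instead of silently decrypting to garbage. A fresh random IV per call means the same secret never produces the same stored value twice.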


Main Dependencies

| Package | Version | Role |
| --- | --- | --- |
| ai | ^4.0.0 | Vercel AI SDK (core) |
| @ai-sdk/openai | ^1.0.0 | OpenAI provider |
| @ai-sdk/anthropic | ^1.0.0 | Anthropic provider |
| @nestjs/core | ^11.1.13 | HTTP server |
| drizzle-orm | ^0.38.0 | PostgreSQL ORM |
| grammy | ^1.40.0 | Telegram bot |
| @slack/bolt | ^4.6.0 | Slack bot |
| @tavily/core | ^0.0.3 | Web search |
| langsmith | ^0.3.0 | LLM tracing |
| zod | ^3.23.0 | Schema validation |

Commands

All commands run from the monorepo root and delegate to the appropriate workspace:

# Engine (AI runtime)
npm run dev               # Start engine with tsx watch
npm run dev:engine        # Same as above (explicit)
npm run build:engine      # Build engine only
npm start                 # Run engine from dist/

# Web (admin panel)
npm run dev:web           # Start Next.js dev server
npm run build:web         # Build web only

# All workspaces
npm run build             # Build all packages
npm run lint              # ESLint all packages
npm run typecheck         # TypeScript check all packages
npm test                  # Run all tests

# Database (engine)
npm run db:generate       # Generate Drizzle migrations
npm run db:migrate        # Apply migrations
npm run db:studio         # Drizzle Studio GUI

# Engine tests
npm run test:unit         # Unit tests only
npm run test:integration  # Integration tests only
npm run test:functional   # Functional tests only

# Docker
docker compose up -d      # Start postgres + open-webui

Room & Event Sources

The Room system enables proactive, event-driven agent behavior: the agent does not just respond to user messages, it also listens to external system events and takes initiative.

Data Flow

    External System (HubSpot, etc.)
        ↓
    POST /webhooks/:webhookToken        ← always returns 200 OK (fire-and-forget)
        ↓
    WebhookController.processEvent()    ← validation cascade:
        │                                 token → source enabled → room enabled
        │                                 → backlog cap (100) → definitions → slug
        ↓
    Event Matcher (LLM tier: fast)      ← sequential, first-match-wins
        ↓
    Event Backlog (status: pending)     ← queue in PostgreSQL
        ↓
    Room Scheduler (30s tick)           ← per-room mutex, parallel across rooms
        ↓
    Room Engine (executeRoomCycle)      ← builds synthetic message with:
        │                                 pending events + interpretation prompts
        │                                 + human message (if any)
        ↓
    Supervisor (standard LLM call)      ← same supervisor as user messages,
        │                                 memory disabled for room cycles
        ↓
    Outbound Channel                    ← Slack / WhatsApp / Telegram
        ↓
    Human                               ← reply routes back via triggerImmediate()
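The validation cascade in this flow can be pictured as an ordered list of checks where the first failure drops the event while the caller still receives 200. The sketch below is only an illustration under assumptions: the order of the checks comes from the flow, but every type and function name is hypothetical, the cap is assumed to reject once 100 events are pending, and the meaning given to the final "slug" check is a guess.

```typescript
// Only the order of the checks comes from the data flow; all names are hypothetical.
type CascadeStep =
  | "token"
  | "source enabled"
  | "room enabled"
  | "backlog cap"
  | "definitions"
  | "slug";

interface WebhookContext {
  tokenKnown: boolean; // webhook token resolves to an event source
  sourceEnabled: boolean;
  roomEnabled: boolean;
  pendingCount: number; // events currently waiting in the backlog
  definitionCount: number; // event definitions configured for the source
  slugResolved: boolean; // assumed meaning of the "slug" check
}

const BACKLOG_CAP = 100;

const cascade: Array<[CascadeStep, (ctx: WebhookContext) => boolean]> = [
  ["token", (ctx) => ctx.tokenKnown],
  ["source enabled", (ctx) => ctx.sourceEnabled],
  ["room enabled", (ctx) => ctx.roomEnabled],
  ["backlog cap", (ctx) => ctx.pendingCount < BACKLOG_CAP],
  ["definitions", (ctx) => ctx.definitionCount > 0],
  ["slug", (ctx) => ctx.slugResolved],
];

// Returns the first failed check, or null when the event may proceed to matching.
function firstFailedStep(ctx: WebhookContext): CascadeStep | null {
  for (const [step, passes] of cascade) {
    if (!passes(ctx)) return step;
  }
  return null;
}

// The HTTP status never depends on the outcome: the sender always sees 200.
function handleWebhook(ctx: WebhookContext): { status: 200; dropped: CascadeStep | null } {
  return { status: 200, dropped: firstFailedStep(ctx) };
}
```

Answering 200 regardless of the outcome keeps the sending system from retrying or learning which tokens are valid; the reason an event was dropped stays internal.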

Key Design Decisions

  • One Room per instance: stored in instance_room (1:1 with instances). Room prompt defines the agent’s mandate for event processing.
  • Event Sources are generic: source_type field supports extensible source types (currently hubspot, webhook). Config is AES-256-GCM encrypted.
  • Event Definitions are priority-ordered: sequential LLM evaluation, first match wins. Each definition has a matchingPrompt (for the classifier) and an interpretationPrompt (injected into the Room cycle).
  • Activity log auto-compacts: daily entries kept 7 days → merged into weekly (4 weeks) → merged into monthly (12 months) → deleted.
  • Room scheduler is a singleton: per-room mutex prevents concurrent processing of the same room. Different rooms run in parallel.
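The last decision, a per-room mutex with parallelism across rooms, can be sketched with an in-memory set of room IDs that are currently being processed. This is a minimal sketch, assuming a single engine process; the class name, the `RoomCycle` callback, and the use of a `Set` are illustrative and not taken from the engine's source.

```typescript
// Hypothetical scheduler; stands in for whatever invokes the room cycle every tick.
type RoomCycle = (roomId: string) => Promise<void>;

class RoomScheduler {
  private readonly inFlight = new Set<string>();
  private readonly runCycle: RoomCycle;

  constructor(runCycle: RoomCycle) {
    this.runCycle = runCycle;
  }

  // One tick: start a cycle for every room that is not already being processed.
  // Resolves with the IDs of the rooms started by this tick.
  async tick(roomIds: string[]): Promise<string[]> {
    const started: string[] = [];
    const runs: Promise<void>[] = [];
    for (const id of roomIds) {
      if (this.inFlight.has(id)) continue; // mutex held: skip this room for now
      this.inFlight.add(id);
      started.push(id);
      // Release the mutex whether the cycle succeeds or fails.
      runs.push(this.runCycle(id).finally(() => this.inFlight.delete(id)));
    }
    await Promise.allSettled(runs); // rooms run concurrently with each other
    return started;
  }
}
```

A timer could call `tick` every 30 seconds without awaiting the previous call; a room whose cycle is still running is then simply skipped until a later tick, while other rooms proceed.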

Database Tables

| Table | Purpose |
|---|---|
| instance_room | Room config: prompt, outbound channel/target, intervals, rate limit |
| event_sources | External event connectors with encrypted config and webhook token |
| event_definitions | Matching + interpretation prompts per event type |
| event_backlog | Queue of matched events: pending → processing → completed |
| room_activity_log | Time-decaying activity chronicle with auto-compaction |
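To show how event_definitions and event_backlog relate, the sketch below evaluates definitions one at a time in priority order and turns the first match into a pending backlog entry. It is an illustration under assumptions: the field names, the convention that a lower priority number is evaluated first, and the `Classifier` callback (standing in for the fast-tier LLM call) are all hypothetical.

```typescript
// Illustrative shapes; the real column names are not given in this document.
interface EventDefinition {
  slug: string;
  priority: number; // assumption: lower number is evaluated first
  matchingPrompt: string;
  interpretationPrompt: string;
}

interface BacklogEntry {
  definitionSlug: string;
  payload: unknown;
  status: "pending" | "processing" | "completed";
}

// Stand-in for the fast-tier LLM call that decides whether a payload matches.
type Classifier = (matchingPrompt: string, payload: unknown) => Promise<boolean>;

// Sequential, first-match-wins: later definitions are never evaluated after a hit.
async function matchEvent(
  definitions: EventDefinition[],
  payload: unknown,
  classify: Classifier,
): Promise<BacklogEntry | null> {
  const ordered = [...definitions].sort((a, b) => a.priority - b.priority);
  for (const definition of ordered) {
    if (await classify(definition.matchingPrompt, payload)) {
      return { definitionSlug: definition.slug, payload, status: "pending" };
    }
  }
  return null; // no definition matched, so nothing is queued
}
```

Evaluating sequentially costs more latency than classifying in parallel, but it keeps the outcome deterministic with respect to priority and avoids paying for classifier calls after the first hit.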

API Endpoints

| Endpoint | Purpose |
|---|---|
| GET/PUT/DELETE /api/instances/:slug/room | Room CRUD |
| GET/POST /api/instances/:slug/event-sources | Event source list (with definitions inline) + create |
| PUT/DELETE /api/instances/:slug/event-sources/:id | Update/delete event source |
| POST /api/instances/:slug/event-sources/:id/rotate-token | Rotate webhook token |
| GET/POST /api/instances/:slug/event-sources/:id/definitions | Definition list + create |
| PUT/DELETE /api/instances/:slug/event-sources/:id/definitions/:defId | Update/delete definition |
| POST /webhooks/:webhookToken | External event ingestion |
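For the ingestion endpoint, an external system only needs the webhook token. The helper below is a hedged sketch of how a sender might assemble such a request; only the route comes from the table above, while the function name, the JSON content type, and the payload shape are assumptions for illustration.

```typescript
// Assumptions: base URL, JSON body, and payload shape are illustrative.
// Only the POST /webhooks/:webhookToken route is taken from the endpoint table.
interface WebhookRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildWebhookRequest(
  baseUrl: string,
  webhookToken: string,
  payload: unknown,
): WebhookRequest {
  // Encode the token so unusual characters cannot alter the path.
  const url = new URL(`/webhooks/${encodeURIComponent(webhookToken)}`, baseUrl).toString();
  return {
    url,
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}
```

A sender would pass the result to `fetch(request.url, request.init)`. Since the endpoint always answers 200, the response status cannot be used to tell whether the event was accepted into the backlog.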

Extension Points

| What | How |
|---|---|
| New channel | Implement ChannelAdapter and register it in packages/engine/src/index.ts |
| New sub-agent | See AGENTS.md for architecture and extension patterns |
| New AI provider | Implement ProviderAdapter and add it to packages/engine/src/ai-gateway/index.ts |
| New supervisor tool | Create a *.tool.ts file that calls registerTool() in packages/engine/src/agents/tools/. The registry picks it up at boot and the tools DB table is synced automatically. Per-instance enablement is then controlled via instance_tools (admin panel or PATCH /api/instances/:slug/tools) |
| New LLM tier | Extend ModelTier and its mapping in packages/engine/src/ai-gateway/config.ts |
| New global skill | Add a row to skills + skill_versions via the management API (POST /api/skills) or insert it directly. Per-instance assignment is then controlled via instance_skills |
| New event source type | Add a config schema to eventSourceConfigSchemas (event-sources.store.ts); webhook payloads are handled by the generic matcher |