Polyant — Architecture

Monorepo Structure

This project is an npm workspaces monorepo with two packages:

Package	Path	Role
`@polyant/engine`	`packages/engine/`	NestJS server — AI runtime + management API
`@polyant/web`	`packages/web/`	Next.js App Router — admin panel

The root package.json orchestrates both packages. Infrastructure (Docker Compose, .env) lives at the monorepo root.

Overview

Polyant is an open-source platform for building AI assistants with long-term memory, multi-channel support, and per-instance customization. A single instance of the engine can host any number of independently configured assistants (prompts, skills, tools, channels, secrets), each addressable as an OpenAI-compatible model. Every user interaction is routed through a Supervisor agent that orchestrates tools, sub-agents, and memory to produce contextual and proactive responses.

Tech Stack

Component	Technology
Language	TypeScript / Node.js (ESM)
Agent Framework	Vercel AI SDK v4
LLM	Provider-agnostic via AI Gateway (OpenAI + Anthropic + Bedrock)
Database	PostgreSQL 16 (Drizzle ORM) + pgvector + Full-Text Search (tsvector)
Memory	Native LLM extraction + pgvector (cosine similarity) + PostgreSQL FTS
Search	Hybrid: pgvector semantic + PostgreSQL FTS keyword, fused with RRF
Web Research	Tavily API
Encryption	AES-256-GCM (Node.js crypto)
Tracing	LangSmith
Channels	Telegram (grammY), Slack (Bolt), WhatsApp (Twilio Programmable Messaging)
HTTP Server	NestJS (OpenAI-compatible API + Management REST API)
Admin Panel	Next.js 15, React 19, Tailwind CSS 4, shadcn/ui
Infrastructure	Docker Compose

Architecture


+------------------------------------------------------------+
|                   HTTP SERVER (NestJS)                      |
|   /v1/chat/completions  |  /v1/models  |  /health          |
|   /api/instances  |  /api/conversations  |  /api/skills     |
+------------------------------------------------------------+
|                   CHANNEL LAYER                             |
|   Telegram | Slack | WhatsApp (Twilio) | Web (REST/SSE)     |
|   Room | Scheduled | Agent (in-process)                     |
|        ChannelAdapter abstraction                           |
+------------------------------------------------------------+
|                   AGENT LAYER                               |
|   Supervisor (tier: standard, max 15 steps)                 |
|     Tools: searchMemory | webSearch | curl/httpRequest |    |
|            saveMemory | readSkill | readFile/writeFile |    |
|            HubSpot suite | GitHub | Slack/WhatsApp out |    |
|            scheduleTask | spawnTask | ...                   |
|   Sub-agents: ad-hoc via spawnTask                          |
|   Agent-to-agent: via `agent` channel adapter (not spawnTask)|
+------------------------------------------------------------+
|                   MEMORY LAYER                              |
|   pgvector (cosine similarity) — extracted memories         |
|   PostgreSQL Full-Text Search (keyword) — conversations     |
|   Hybrid Search: RRF (Reciprocal Rank Fusion)              |
|   Native LLM extraction post-response (fire-and-forget)    |
+------------------------------------------------------------+
|                   AI GATEWAY                                |
|   Tier abstraction: fast | standard | heavy                 |
|   Provider: OpenAI | Anthropic | Bedrock                    |
|   Per-instance provider/model override                      |
|   Logging: tokens, costs, latency -> PostgreSQL             |
|   Tracing: LangSmith                                        |
+------------------------------------------------------------+
|                   DATA LAYER                                |
|   PostgreSQL (conversations, memories, instances,           |
|     instance_secrets, instance_channels, ai_logs,           |
|     pipeline_traces)                                        |
|   pgvector extension (memory embeddings)                    |
+------------------------------------------------------------+

Directory Structure


packages/engine/
  src/
    index.ts                              # Boot sequence
    config.ts                             # Configuration (Zod schema)
    workspace/
      index.ts                            # Workspace resolver (per-instance parameterization)
    ai-gateway/
      index.ts                            # Gateway init + chat/chatStream
      types.ts                            # ChatRequest, ChatResponse, ProviderAdapter
      config.ts                           # Model tier mappings + pricing
      logger.ts                           # AILogger (batch write to PostgreSQL)
      langsmith.ts                        # LangSmith tracing setup
      providers/
        openai.ts                         # OpenAI via @ai-sdk/openai
        anthropic.ts                      # Anthropic via @ai-sdk/anthropic
    agents/
      supervisor/
        index.ts                          # supervise() + superviseStream()
        prompt.ts                         # System prompt builder (DB-stored sections + skills + datetime)
      tools/
        registry.ts                       # Self-registration system + loadAllTools()
        tools-sync.ts                     # Catalogue reconciliation (orphan cascade)
        # ~35 *.tool.ts files — see reference/tool-catalog.md for the authoritative list
    activity-stream/                       # In-process pub/sub bus + SSE controller + LLM tap
    attachments/                           # Image/document/audio upload + S3 + signed URLs
    audit/                                 # Per-tool-call audit table and scoped AuditLogger
    knowledge/                             # Per-instance knowledge files (search/read/write tools)
    scheduled-tasks/                       # Cron / interval / one-shot scheduler
    stt-gateway/                           # Voice-note transcription gateway
      providers/
        openai-whisper.ts                  # Default Whisper provider (whisper-1)
        deepgram.ts                        # Alternative provider
        aws-transcribe.ts                  # Alternative provider
    users/                                 # Credentials provider controller + user-management routes + seed
    auth/                                  # NestJS auth module (guard, JWT decryption) + Auth.js v5 schema (users, accounts, sessions)
    workspace/                             # Workspace resolver (ephemeral per-instance dir)
    analytics/                            # Analytics module logic (no controller — endpoints live in server/analytics/)
      traces.schema.ts                    # Drizzle schema (pipeline_traces table)
      trace.store.ts                      # TraceStore: buffered fire-and-forget pipeline trace writer
      latency.store.ts                    # Latency analytics queries (percentiles, phase breakdown, tool stats)
    memory/
      index.ts                            # Entry point: initMemory() + re-exports
      types.ts                            # Memory, ExtractedFact interfaces
      embedder.ts                         # OpenAI embeddings (text-embedding-3-small)
      memory-store.ts                     # pgvector upsert + cosine similarity dedup
      hybrid-search.ts                    # Hybrid RRF search (pgvector + PG FTS)
      extractor.ts                        # Native LLM extraction post-response
      schema.ts                           # Drizzle schema (memories table with vector column)
    conversations/
      index.ts                            # Re-exports
      store.ts                            # ConversationStore (messages, summaries, FTS)
      schema.ts                           # Drizzle tables (conversations, conversation_messages)
      types.ts                            # Conversation interfaces
    instances/
      schema.ts                           # Drizzle schema (instances table)
      store.ts                            # Instance CRUD
      skill-env.schema.ts                 # Drizzle schema (instance_skill_env table)
      skill-env.store.ts                  # Encrypted skill env CRUD
      secrets.schema.ts                   # Drizzle schema (instance_secrets table)
      secrets.store.ts                    # Encrypted secrets CRUD
      channels.schema.ts                  # Drizzle schema (instance_channels table)
      channels.store.ts                   # Channel config CRUD
      config-resolver.ts                  # Per-instance config with 30s TTL cache
    skills/
      skills.service.ts                   # Global skill library management
      skills.controller.ts                # /api/skills CRUD endpoints
    crypto/
      index.ts                            # AES-256-GCM encrypt/decrypt
    channels/
      types.ts                            # ChannelAdapter, IncomingMessage, OutgoingMessage
      channel-manager.ts                  # Adapter orchestrator
      adapters/
        telegram/index.ts                 # grammY long polling
        slack/index.ts                    # @slack/bolt Socket Mode
        whatsapp/
          index.ts                        # WhatsApp adapter (webhook)
          twilio-client.ts                # Twilio Programmable Messaging client
    server/
      main.ts                             # NestJS bootstrap
      server.module.ts                    # Root module
      health/health.controller.ts         # GET /health
      openai/
        openai.controller.ts              # /v1/chat/completions, /v1/models
        openai.service.ts                 # Chat completion logic
        openai.types.ts                   # Request/response OpenAI types
        openai.module.ts                  # NestJS module
      instances/instances.controller.ts   # /api/instances CRUD + prompts/tools/skills/secrets/channels
      conversations/conversations.controller.ts  # /api/conversations
      analytics/analytics.controller.ts   # /api/analytics + per-instance analytics
      memories/memories.controller.ts     # /memories CRUD
    database/
      client.ts                           # Drizzle connection
      migrations/                         # Generated migrations
    utils/
      pipeline-logger.ts                  # Structured pipeline logging
      frontmatter.ts                      # YAML frontmatter parser for skills
  workspaces/                             # Per-conversation tool sandboxes ONLY (gitignored)
    <instanceId>/                         # Knowledge lives in PostgreSQL (knowledge_documents + knowledge_chunks) — never here
      conversations/<convId>/             # Ephemeral scratch dir used by readFile / writeFile / gitCloneRepo

packages/web/
  src/
    app/
      globals.css                         # Design tokens
      layout.tsx                          # Root layout (Inter font, ThemeProvider, I18nProvider)
      (admin)/
        layout.tsx                        # Sidebar + header layout
        page.tsx                          # Dashboard
        instances/                        # Instance management (list, detail, tabs)
        conversations/                    # Conversation browsing + search
        skills/                           # Global skill library CRUD
        playground/page.tsx                # Playground chat page
        memory/page.tsx                   # Memory management (list, search, create/delete)
        settings/page.tsx                 # Global settings
    components/
      layout/                             # Sidebar, header, nav, theme/lang toggles
      analytics/                          # Analytics chart components (KPIs, trends, latency)
      ui/                                 # shadcn/ui components
    lib/
      api.ts                              # API client for engine
      utils.ts                            # cn() helper
      i18n/                               # Italian/English internationalization
    hooks/                                # use-mobile, etc.

Boot Sequence (`packages/engine/src/index.ts`)

At startup, the system runs the following steps in order:

1. AI Gateway - Initializes logging
2. Trace Store - Initializes the pipeline latency trace writer (buffered, periodic flush)
3. Tool Loading - Auto-discovers and registers all *.tool.ts files
4. Memory - Verifies that the pgvector extension is available
5. NestJS Server - Starts the OpenAI-compatible API and the Management API on a configurable port
6. Channel Adapters - Loads enabled channel configs from the DB for each active instance and starts adapters dynamically

Request Flow


User message
    |
    v
[Channel Adapter] normalizes to IncomingMessage
    |
    v
[Pre-enrichment]
  |- Load conversation summary (in-memory cache -> PostgreSQL)
  '- Create conversation row if missing (fire-and-forget)
    |
    v
[Conversation History] getRecentMessages(conversationId, 15) from PostgreSQL
    |
    v
[Instance Resolution] resolveInstance(instanceId) -> load config from PostgreSQL (cache 30s)
    |
    v
[Supervisor] (tier: standard, max 15 steps)
  |- System prompt: 8 sections from instance_prompts (per-instance, DB-stored)
  |- Skills: discovered from skills + skill_versions + instance_skills (DB joins)
  |- Last 15 messages + new message
  '- Available tools (filtered per-instance via instance_tools):
       searchMemory   -> hybrid pgvector + PG FTS + RRF search
       webSearch      -> web search (Tavily, optional)
       saveMemory     -> explicit save (only on user request)
       updateSoul     -> modify personality
       updateUserProfile -> update user info
       readSkill      -> load a skill's instructions
       spawnTask      -> delegate to sub-agent
       ...            -> channel-specific and integration-specific tools
    |
    v
[Response] sent to channel
    |
    v
[After Response] (async, fire-and-forget)
  |- 1. traceStore.record() -> pipeline latency trace (phase breakdown + tool timings)
  |- 2. appendMessages() -> save to PostgreSQL conversation_messages
  |- 3. Generate updated summary (tier: fast) -> updateSummary()
  '- 4. extractMemories() -> LLM extraction -> embeddings -> pgvector upsert

Streaming: For the HTTP server, superviseStream() returns an AsyncIterable<string> that gets converted to Server-Sent Events (SSE) in OpenAI format.
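
The conversion from a token stream to OpenAI-style SSE frames can be sketched as follows. This is a minimal illustration, not the engine's actual code: toSseFrame and toOpenAiSse are hypothetical helper names, and the chunk shape follows the public OpenAI chat.completion.chunk format.

```typescript
// Shape of one streamed chunk in the OpenAI chat-completions format.
interface ChatCompletionChunk {
  id: string;
  object: "chat.completion.chunk";
  created: number;
  model: string;
  choices: {
    index: number;
    delta: { content?: string };
    finish_reason: "stop" | null;
  }[];
}

// Serialize one chunk as a single SSE frame: "data: <json>" plus a blank line.
function toSseFrame(chunk: ChatCompletionChunk): string {
  return `data: ${JSON.stringify(chunk)}\n\n`;
}

// Hypothetical helper: wraps a token stream (such as the AsyncIterable<string>
// returned by superviseStream()) into SSE frames, ending with the [DONE] marker.
async function* toOpenAiSse(
  tokens: AsyncIterable<string>,
  id: string,
  model: string,
): AsyncGenerator<string> {
  const created = Math.floor(Date.now() / 1000);
  const base = { id, object: "chat.completion.chunk" as const, created, model };
  for await (const content of tokens) {
    yield toSseFrame({
      ...base,
      choices: [{ index: 0, delta: { content }, finish_reason: null }],
    });
  }
  // Final chunk carries an empty delta and the stop reason.
  yield toSseFrame({
    ...base,
    choices: [{ index: 0, delta: {}, finish_reason: "stop" }],
  });
  yield "data: [DONE]\n\n";
}
```

Keeping the frame serializer separate from the generator makes the wire format easy to check without a running server.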

AI Gateway

Tier Abstraction

Components don’t request specific models. They request a tier:

Tier	OpenAI	Anthropic	Bedrock	Use
`fast`	gpt-4o-mini	claude-haiku-4-5-20251001	amazon.nova-lite-v1:0	Summary generation, memory extraction, classification
`standard`	gpt-4o	claude-sonnet-4-5-20250929	anthropic.claude-sonnet-4-20250514-v1:0	Supervisor, sub-agents
`heavy`	o3	claude-opus-4-6	anthropic.claude-opus-4-20250514-v1:0	Complex analysis

The exact mappings live in packages/engine/src/ai-gateway/config.ts.

The AI provider is configured per-instance via the admin panel (Settings tab). There is no global AI_PROVIDER env var. Individual instances can also override the model via the Management API.
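
As an illustration, tier resolution could look like the sketch below. The model IDs are copied from the tier table; the InstanceAiSettings shape, the resolveModel name, and the rule that an override applies to every tier are assumptions made for the example, not the contents of ai-gateway/config.ts.

```typescript
type Tier = "fast" | "standard" | "heavy";
type Provider = "openai" | "anthropic" | "bedrock";

// Model IDs taken from the tier table; the authoritative mapping lives in
// packages/engine/src/ai-gateway/config.ts.
const TIER_MODELS: Record<Provider, Record<Tier, string>> = {
  openai: { fast: "gpt-4o-mini", standard: "gpt-4o", heavy: "o3" },
  anthropic: {
    fast: "claude-haiku-4-5-20251001",
    standard: "claude-sonnet-4-5-20250929",
    heavy: "claude-opus-4-6",
  },
  bedrock: {
    fast: "amazon.nova-lite-v1:0",
    standard: "anthropic.claude-sonnet-4-20250514-v1:0",
    heavy: "anthropic.claude-opus-4-20250514-v1:0",
  },
};

// Hypothetical per-instance settings; field names are illustrative only.
interface InstanceAiSettings {
  provider: Provider;
  modelOverride?: string;
}

// Assumption: an explicit model override wins over the tier mapping.
function resolveModel(tier: Tier, settings: InstanceAiSettings): string {
  return settings.modelOverride ?? TIER_MODELS[settings.provider][tier];
}
```

A caller would then write resolveModel("fast", { provider: "anthropic" }) and never name a model directly.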

Logging and Costs

Every LLM call is logged to the ai_logs table with:

Provider, model, tier
Token usage (prompt, completion, total)
Estimated cost in USD (calculated from per-token pricing)
Duration in ms
conversationId and instanceId for correlation

The logger uses a buffer with periodic flush to minimize DB writes.

LangSmith Tracing

LangSmith tracing is configured per instance in the admin panel’s Settings tab (langsmithEnabled, API key, project name). The gateway wraps generateText/streamText once, at module level, using wrapAISDK from langsmith/experimental/vercel. For each request, buildLangSmithProviderOptions() builds the tracing config, which is passed to the AI SDK call as providerOptions.langsmith. When providerOptions.langsmith is absent (LangSmith disabled), the wrapped functions behave identically to the originals and add no tracing overhead.

Traces are hierarchical: one parent run per generateText/streamText call, with a child run for each LLM step and tool execution within maxSteps. Runs are grouped into threads via metadata.thread_id = conversationId and can be filtered by instance via metadata.instance_id. LangSmith client instances are cached in memory, one per API key.

Agent System

Supervisor

The Supervisor is the system’s decision-making center. It receives the user message, the last 15 messages of history, and the conversation summary.

Configuration:

Tier: standard
Max steps: 15 (reasoning cycles)
System prompt: 8 modular sections stored in the instance_prompts table (per-instance, DB-backed). Defaults are seeded from packages/engine/src/instances/defaults.ts on instance creation
Available tools: filtered per-instance via the instance_tools table (auto-recomputed when skills change)

Tools shipped with the framework (non-exhaustive — the authoritative list is generated from the registry and lives in reference/tool-catalog.md; the actual set per instance is determined by instance_tools):

Tool	Description	When to use
`searchMemory`	Hybrid pgvector + PG FTS search across memories and conversations	Proactively for any question about past facts
`saveMemory`	Explicit memory save to pgvector	Only on user request
`webSearch`	Web search via Tavily API	For external/current information
`httpRequest` / `curl`	Generic HTTP request	Fetch pages, APIs, JSON, RSS
`updateSoul`	Modify the assistant’s personality (section 02-soul)	Only on user request
`updateUserProfile`	Update the user profile (section 07-user-identity)	When user shares personal info
`readSkill`	Load a skill’s instructions on-demand	When the supervisor needs to apply a skill
`readFile` / `writeFile` / `listDirectory`	Filesystem access inside the workspace	When operating on files inside an ephemeral workspace
`searchKnowledge` / `getKnowledge` / `writeKnowledge`	Knowledge base operations	When working with the instance’s knowledge files
`gitCloneRepo`	Clone a GitHub repo into the workspace	Code-touching workflows
`ghIssue` / `ghPR`	GitHub issue + pull request operations	GitHub integrations
HubSpot suite (8 tools)	`hubspotContact`, `hubspotDeal`, `hubspotMeeting`, `hubspotNote`, `hubspotTicket`, `hubspotCreateTask`, `hubspotSendEmail`, `hubspotGetCompany`	HubSpot CRM workflows
`slackPostMessage`	Post to a Slack channel or DM via the instance’s Slack credentials	Outbound Slack messages
`send_whatsapp_template`	Send a Twilio-approved WhatsApp template	WhatsApp 24h-window outbound
`send_outbound_message`	Send a message to any configured outbound channel	Channel-agnostic outbound
`scheduleTask`	Create a cron / interval / one-shot task	Time-based automation
`fileUpload`	Upload an attachment to S3	Attachment workflows
`verifyDocument`	Validate a document via a tool-level LLM call	Document QA
Room harness tools	`mark_events_completed`, `compact_room_history`, `send_message_to_human`	Injected only inside Room cycles
`spawnTask`	Delegate to an isolated sub-agent in the same instance	Multi-step tasks within one instance

Channel-specific and integration-specific tools are also included in the framework and can be enabled per-instance.

Sub-Agent System

Sub-agents are isolated agents that the Supervisor can delegate tasks to via spawnTask. They receive all enabled tools except spawnTask, to prevent infinite recursion.

There is no dedicated sub-agents/ directory or SubAgentDefinition type today — the spawnTask tool (packages/engine/src/agents/tools/task-tool.ts) creates ad-hoc sub-agents on the fly with a generic system prompt and the parent’s filtered tool set.
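
The recursion guard described above amounts to a single filter over the parent’s tool set. A minimal sketch, assuming a simplified ToolEntry shape (the real registry entries carry more fields) and a hypothetical toolsForSubAgent helper:

```typescript
// Hypothetical minimal tool shape used only for this example.
interface ToolEntry {
  name: string;
  description: string;
}

// A sub-agent gets the parent's enabled tools minus spawnTask,
// so it cannot spawn further sub-agents.
function toolsForSubAgent(parentTools: ToolEntry[]): ToolEntry[] {
  return parentTools.filter((tool) => tool.name !== "spawnTask");
}
```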

Note: spawnTask is unrelated to agent-to-agent calls between different instances, which go through the agent channel adapter (channels/adapters/agent.adapter.ts). See agents.md for the disambiguation.

Memory System

Architecture

The memory system is fully native — no external services required beyond PostgreSQL with pgvector.

Component	Role
LLM Extractor (`extractor.ts`)	Sends recent messages to the project’s LLM (tier: `fast`) for structured fact extraction
Embedder (`embedder.ts`)	Generates embeddings via OpenAI (`text-embedding-3-small`)
Memory Store (`memory-store.ts`)	Upserts memories into pgvector with cosine similarity deduplication (threshold 0.90)
PostgreSQL FTS	Full-text search on raw conversations (`conversation_messages` table with `tsvector` column)

Memory Flow


User conversation <-> Supervisor
         |
         v (fire-and-forget after response)
  extractMemories(conversationId, instanceId)
         |
         v
  Load last 15 messages from PostgreSQL
         |
         v
  Send transcript to LLM (tier: fast)
         |
         v
  LLM returns JSON: [{content, category, importance}]
         |
         v
  Generate embeddings (OpenAI text-embedding-3-small)
         |
         v
  Upsert each memory into pgvector:
    - Cosine similarity check against existing memories
    - If similarity > 0.90: update existing memory
    - Otherwise: insert new memory

Categories: preference, fact, event, relationship, decision, general
Importance: 1-10 scale (10 = critical life fact, 1 = trivial)
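
The deduplicating upsert at the end of the flow can be sketched in memory as follows. This is an illustration only: the real store runs the similarity check inside PostgreSQL with pgvector, and StoredMemory, cosineSimilarity, and upsertMemory are hypothetical names. The 0.90 threshold is the one stated above.

```typescript
// Hypothetical in-memory stand-in for a row of the memories table.
interface StoredMemory {
  id: number;
  content: string;
  embedding: number[];
}

const DEDUP_THRESHOLD = 0.9;

// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Update the closest existing memory if it is similar enough, else insert.
function upsertMemory(
  store: StoredMemory[],
  content: string,
  embedding: number[],
): "updated" | "inserted" {
  let best: StoredMemory | undefined;
  let bestScore = -Infinity;
  for (const memory of store) {
    const score = cosineSimilarity(memory.embedding, embedding);
    if (score > bestScore) {
      best = memory;
      bestScore = score;
    }
  }
  if (best !== undefined && bestScore > DEDUP_THRESHOLD) {
    best.content = content;
    best.embedding = embedding;
    return "updated";
  }
  store.push({ id: store.length + 1, content, embedding });
  return "inserted";
}
```

With a strict "greater than" comparison, a new fact whose similarity is exactly 0.90 is inserted as a separate memory.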

Hybrid Search (RRF)

Search combines two backends via Reciprocal Rank Fusion:


                searchMemory(query)
                       |
           +-----------+-----------+
           |                       |
    pgvector semantic search PostgreSQL FTS
    (extracted memories)     (raw conversations)
    cosine similarity        websearch_to_tsquery
    top 20 results           top 20 results
           |                       |
           +-----------+-----------+
                       |
              Reciprocal Rank Fusion
              score = Σ(1 / (k + rank + 1))
              k = 60
                       |
              Sort + dedup + top N
                       |
              HybridSearchResult[]
              {content, type, score, source, createdAt}

Why RRF: The two backends produce scores on different scales (cosine similarity vs ts_rank). RRF implicitly normalizes based only on ranking position, not absolute values.
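
A minimal sketch of the fusion step, using the formula from the diagram with a 0-based rank (which is what the "+ 1" term implies). RankedHit, FusedHit, and fuseWithRrf are hypothetical names, and deduplicating on a shared key field is an assumption about how the two result lists are matched.

```typescript
// Hypothetical result shape; "key" is the assumed deduplication key.
interface RankedHit {
  key: string;
  content: string;
  source: "memory" | "conversation";
}

interface FusedHit extends RankedHit {
  score: number;
}

// Reciprocal Rank Fusion: each list contributes 1 / (k + rank + 1) per hit,
// where rank is the 0-based position of the hit in that list.
function fuseWithRrf(lists: RankedHit[][], k = 60, topN = 10): FusedHit[] {
  const fused = new Map<string, FusedHit>();
  for (const list of lists) {
    list.forEach((hit, rank) => {
      const contribution = 1 / (k + rank + 1);
      const existing = fused.get(hit.key);
      if (existing !== undefined) {
        existing.score += contribution; // same item found by both backends
      } else {
        fused.set(hit.key, { ...hit, score: contribution });
      }
    });
  }
  return Array.from(fused.values())
    .sort((a, b) => b.score - a.score)
    .slice(0, topN);
}
```

Only positions enter the score, so the raw cosine similarity and ts_rank values never need to be put on a common scale.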

Complementarity:

Semantic (pgvector): finds memories by meaning, even with different words
Keyword (PG FTS): finds conversations by exact words, proper nouns, dates

Note: Embeddings always use OpenAI (text-embedding-3-small) — the per-instance openai_api_key secret is required regardless of the instance’s AI provider. Anthropic does not offer an embedding API. The extraction LLM uses the configured provider via ai-gateway.

Conversation System

PostgreSQL Tables

conversations:

conversationId (text, unique) - format channelType:channelId
summary (text, nullable) - updated after each turn
instanceId (text, nullable)
createdAt, updatedAt

conversation_messages (with Full-Text Search):

conversationId (text) - indexed
role (user|assistant)
content (text)
toolCalls (jsonb, nullable)
search_vector (tsvector) - auto-generated from content (config: simple, managed via SQL migration)
createdAt - indexed
GIN index on search_vector

ConversationStore


// Summary management (in-memory cache)
getSummary(conversationId)          // cache hit -> return; miss -> query DB -> cache
updateSummary(conversationId)       // write DB + update cache
ensureConversation(id, instanceId)  // INSERT ... ON CONFLICT DO NOTHING
 
// Message management
appendMessages(conversationId, messages[])      // batch insert
getRecentMessages(conversationId, limit=15)     // last N messages (chronological)
 
// Full-text search
searchConversations(query, {instanceId, limit, offset})  // websearch_to_tsquery + ts_rank
 
// Listing
listConversations({instanceId, limit, offset})  // paginated with instance JOIN
getConversation(conversationId)                 // single conversation detail
deleteConversation(conversationId)              // delete conversation + messages

Summary Generation

After each response, an LLM (tier: fast) generates an updated conversation summary in 2-3 sentences. The previous summary is provided as context. This summary is injected into the Supervisor’s system prompt on subsequent turns of the same conversation.

Channel Layer

The following adapters share a common ChannelAdapter interface:

Channel	Package	Reception	Notes
Telegram	grammY	Long polling	Markdown parse mode, optional user ID whitelist
Slack	@slack/bolt	Socket Mode	Thread awareness, rich metadata
WhatsApp	Twilio Programmable Messaging	Inbound webhook	`twilio-client.ts` posts outbound; media (`MediaUrl`) downloaded and re-uploaded to S3
Agent	(in-process)	Synchronous call	`agent.adapter.ts`; carries `callerSlug`, `callerConversationId`, `depth`, `parentTraceId`

Channels are DB-driven and per-instance. Config is stored encrypted in the instance_channels table. The channel manager starts/stops adapters dynamically via admin panel or API. No global env vars for channel credentials.

Webhooks & Event Sources

External systems (HubSpot, custom APIs, etc.) can trigger agent actions via webhooks. A webhook is received at POST /webhooks/:webhookToken. The endpoint always returns 200 OK immediately and processes the payload in the background (fire-and-forget). The payload (max 64KB) is passed to a webhook matcher LLM (tier: fast), which evaluates the candidate definitions sequentially in priority order; the first match wins. Matched events are inserted into the event_backlog table with pending status, where they await processing by the Room scheduler.

Key components:

webhook-engine.ts — Central dispatch and queue management
webhook-matcher.ts — LLM-based event classification against priority-ordered definitions
webhook-sources.store.ts — Event source CRUD (with AES-256-GCM encrypted config)
webhook-backlog.store.ts — Pending event queue with status lifecycle (pending → processing → completed)
webhook.validators.ts, webhooks.schema.ts — Payload validation and schemas
template-renderer.ts, trigger-context.ts — Context building for interpretation
active-triggers.ts, webhook-logger.ts — Trigger state and audit logging

Rate limit: 60 events/min. Backlog capacity: 100 pending events per instance (excess dropped).
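
The backlog cap can be illustrated with an in-memory queue. This is a sketch under assumptions: the real backlog is the event_backlog table, and BacklogEvent and enqueue are hypothetical names. The limit of 100 pending events per instance and the status values come from the text above.

```typescript
type BacklogStatus = "pending" | "processing" | "completed";

// Hypothetical in-memory stand-in for a row of event_backlog.
interface BacklogEvent {
  id: number;
  instanceId: string;
  status: BacklogStatus;
  payload: unknown;
}

const MAX_PENDING_PER_INSTANCE = 100;

// Returns false when the instance already has 100 pending events;
// in that case the new event is dropped rather than queued.
function enqueue(
  backlog: BacklogEvent[],
  instanceId: string,
  payload: unknown,
): boolean {
  const pending = backlog.filter(
    (event) => event.instanceId === instanceId && event.status === "pending",
  ).length;
  if (pending >= MAX_PENDING_PER_INSTANCE) return false;
  backlog.push({ id: backlog.length + 1, instanceId, status: "pending", payload });
  return true;
}
```

Because only pending rows count toward the cap, events that have moved to processing or completed free up capacity for new ones.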

Scheduled Tasks

The scheduler provides time-based and event-based task execution with optional result delivery via outbound channels.

Schema:

schedule_type: cron (recurring), interval (every N seconds), or oneShot (run once then delete)
status: enabled, disabled (suspended), error (automatically disabled after N consecutive failures)
retry_count, retry_delay_ms, max_consecutive_errors — Backoff and auto-disable logic
result_channel, result_target — Optional outbound channel for delivery (e.g., Slack, WhatsApp, email)

Key components:

scheduler.service.ts — Singleton scheduler with per-task queuing and tick-based evaluation
store.ts — Task CRUD and status management
schedule-utils.ts — Cron parsing and interval calculation
run-log.store.ts — Execution history (run_id, start_time, duration, status, output)

The tick interval is configurable. Each completed run can deliver its result to a configured channel (e.g., a Slack notification). A task is disabled automatically after max_consecutive_errors consecutive failures.
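
The auto-disable rule can be sketched as a pure state transition. The status values and max_consecutive_errors come from the schema above; the TaskState shape, the consecutiveErrors counter, and the applyRunResult name are assumptions made for the example.

```typescript
type TaskStatus = "enabled" | "disabled" | "error";

// Hypothetical in-memory view of a task row; consecutiveErrors is an
// assumed counter that is not listed in the schema summary.
interface TaskState {
  status: TaskStatus;
  consecutiveErrors: number;
  maxConsecutiveErrors: number;
}

// A success resets the failure streak; a failure extends it and moves the
// task to "error" once the streak reaches maxConsecutiveErrors.
function applyRunResult(task: TaskState, succeeded: boolean): TaskState {
  if (succeeded) {
    return { ...task, consecutiveErrors: 0 };
  }
  const consecutiveErrors = task.consecutiveErrors + 1;
  const status: TaskStatus =
    consecutiveErrors >= task.maxConsecutiveErrors ? "error" : task.status;
  return { ...task, consecutiveErrors, status };
}
```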

Activity Stream (SSE)

The activity stream provides real-time updates to the admin panel via Server-Sent Events (SSE), replacing the prior polling-based feed. It emits structured events from the agent pipeline: inbound/outbound messages, tool calls, memory extraction, scheduled-task fires, webhook matches, and agent-to-agent handoffs.

Components:

bus-emitter.ts — Event bus with subscription/unsubscription
event-formatters.ts — Transforms domain events into SSE payloads
Controller endpoint — GET /api/activity-stream/live with client-managed reconnection

Used by the admin panel’s activity dashboard for live visibility without polling overhead.
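
The bus-plus-formatter pairing can be sketched as follows. This is a minimal illustration rather than the contents of bus-emitter.ts or event-formatters.ts: ActivityEvent, ActivityBus, formatSse, and the "tool_call" event name are hypothetical. The frame layout (event line, data line, blank line) is the standard SSE wire format.

```typescript
// Hypothetical event shape; real events likely carry more fields.
interface ActivityEvent {
  type: string;
  instanceId: string;
  payload: unknown;
}

type Listener = (event: ActivityEvent) => void;

// Minimal in-process pub/sub bus.
class ActivityBus {
  private listeners = new Set<Listener>();

  // Returns an unsubscribe function, e.g. for when an SSE client disconnects.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  emit(event: ActivityEvent): void {
    this.listeners.forEach((listener) => listener(event));
  }
}

// Format a domain event as one SSE frame.
function formatSse(event: ActivityEvent): string {
  return `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
}
```

An SSE controller would subscribe once per connected client, write formatSse(event) to the response for each event, and call the unsubscribe function when the connection closes.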

Attachment Pipeline

Messages can carry non-text payloads (images, documents, audio) from inbound channels. WhatsApp media (via Twilio MediaUrl) is downloaded, validated, and uploaded to S3. Outbound attachment references are served via signed S3 URLs, so stored objects are never exposed through direct bucket links.

Data flow:

Channel adapter (WhatsApp) extracts MediaUrl and metadata from inbound message
Binary blob is uploaded to S3 via platform-storage.ts when PLATFORM_S3_* env vars are configured (otherwise the attachment is skipped or left as a remote reference)
Attachment metadata (key, mimeType, size, etc.) is appended to the attachments JSONB column on the corresponding conversation_messages row
Outbound delivery uses signed S3 URLs

Key components:

packages/engine/src/attachments/platform-storage.ts — S3 helpers (upload + signed URL generation). No dedicated controller, no message_attachments table.
Attachment storage — jsonb attachments column on conversation_messages (see packages/engine/src/conversations/schema.ts); each entry holds the S3 key plus metadata.

HTTP Server (NestJS)

OpenAI-Compatible API

Method	Path	Description
GET	`/health`	Health check
GET	`/v1/models`	List instances as models
POST	`/v1/chat/completions`	Chat completion (sync and streaming SSE)

Management API — Instances

Method	Path	Description
GET	`/api/instances`	List all instances
POST	`/api/instances`	Create instance
GET	`/api/instances/models`	List available providers and models
GET	`/api/instances/:slug`	Get instance by slug
PATCH	`/api/instances/:slug`	Update instance
DELETE	`/api/instances/:slug`	Delete instance + workspace
GET	`/api/instances/:slug/prompts`	Get prompt sections
PATCH	`/api/instances/:slug/prompts`	Update prompt sections
GET	`/api/instances/:slug/tools`	Get tools with enabled status
PATCH	`/api/instances/:slug/tools`	Update enabled tools
GET	`/api/instances/:slug/skills`	Get skills with enabled/env status
PATCH	`/api/instances/:slug/skills`	Update enabled skills
GET	`/api/instances/:slug/skills/:skillSlug/env`	Get skill env vars
PUT	`/api/instances/:slug/skills/:skillSlug/env`	Set skill env vars
DELETE	`/api/instances/:slug/skills/:skillSlug/env/:key`	Delete skill env var

Management API — Conversations

Method	Path	Description
GET	`/api/conversations`	List conversations (paginated, filterable, searchable)
GET	`/api/conversations/:id`	Get conversation detail
GET	`/api/conversations/:id/messages`	Get conversation messages (paginated)
DELETE	`/api/conversations/:id`	Delete conversation

Management API — Secrets

Method	Path	Description
GET	`/api/instances/:slug/secrets`	List secret keys + configured status (never values)
PUT	`/api/instances/:slug/secrets`	Bulk upsert secrets
DELETE	`/api/instances/:slug/secrets/:key`	Delete secret

Management API — Channels

Method	Path	Description
GET	`/api/instances/:slug/channels`	List channel configs
PUT	`/api/instances/:slug/channels/:type`	Set channel config
DELETE	`/api/instances/:slug/channels/:type`	Delete channel config

Management API — Analytics

Method	Path	Description
GET	`/api/analytics`	Global analytics (KPIs, token usage, latency)
GET	`/api/instances/:slug/analytics`	Per-instance analytics (incl. latency)

Management API — Memories

Method	Path	Description
GET	`/memories`	List memories (paginated, searchable)
POST	`/memories`	Create memory
DELETE	`/memories/:id`	Delete memory
DELETE	`/memories`	Delete all memories (with optional instanceId filter)

Management API — Skills (Global Library)

Method	Path	Description
GET	`/api/skills`	List all skills in the global library
GET	`/api/skills/:name`	Get skill by name
POST	`/api/skills`	Create skill
PUT	`/api/skills/:name`	Update skill
DELETE	`/api/skills/:name`	Delete skill

Authentication

Auth is per-instance. If authEnabled is set on the instance, calls to /v1/* require Authorization: Bearer <auth_api_key> (configured via admin panel Settings tab). If authEnabled is false, access is open for that instance.

Infrastructure (Docker Compose)


services:
  postgres:     # PostgreSQL 16 (pgvector), port 5432

Persistent volume: polyant-pgdata (PostgreSQL).

Configuration (`.env`)

Only infrastructure variables remain in .env. AI provider keys, LangSmith, auth, Tavily, and channel credentials are configured per-instance via the admin panel Settings/Channels tabs.


# Database
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=polyant
POSTGRES_USER=polyant
POSTGRES_PASSWORD=changeme
 
# HTTP Server
API_PORT=4000
 
# Encryption
ENCRYPTION_KEY=...              # 32-byte key, hex-encoded (64 hex characters), for AES-256-GCM (instance secrets)
 
# Instance
DEFAULT_INSTANCE_ID=default     # default instanceId for single-instance setup

Design Patterns and Technical Choices

1. Native Memory Extraction

Memory extraction runs entirely in-process: the project’s LLM (tier: fast) extracts structured facts from conversation transcripts, an OpenAI embedding model generates embeddings, and the results are stored in pgvector, where cosine similarity is used to deduplicate memories. No external memory service is required.
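To illustrate the deduplication step, here is a small in-memory sketch. The 0.9 threshold and the function names are assumptions for illustration; in the engine the comparison happens inside pgvector rather than in application code:

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as dissimilar
  return dot / (normA * normB);
}

// A candidate memory is a duplicate when it is close enough to any stored one.
// The threshold value is an assumption, not taken from the project.
function isDuplicateMemory(
  candidate: number[],
  stored: number[][],
  threshold = 0.9,
): boolean {
  return stored.some((existing) => cosineSimilarity(candidate, existing) >= threshold);
}
```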

2. Hybrid Search with RRF

Search combines pgvector (semantic) and PostgreSQL FTS (keyword) via Reciprocal Rank Fusion. Because RRF uses only each result's rank in each list, never its raw score, the two heterogeneous score scales do not need explicit normalization.
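A minimal sketch of Reciprocal Rank Fusion over two ranked ID lists. Each document scores the sum of 1 / (k + rank) across the lists it appears in, with 1-based ranks. The constant k = 60 is the value commonly used for RRF and is an assumption here, as are the function name and the sample IDs:

```typescript
// Fuse several ranked lists of IDs into one ranking using RRF.
function reciprocalRankFusion(
  rankedLists: string[][],
  k = 60, // assumption: the conventional RRF constant
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      // index is 0-based, so the 1-based rank is index + 1
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical result lists from the semantic and keyword searches.
const semanticHits = ["m1", "m2", "m3"];
const keywordHits = ["m3", "m1", "m4"];
const fusedHits = reciprocalRankFusion([semanticHits, keywordHits]);
console.log(fusedHits.map((hit) => hit.id)); // [ 'm1', 'm3', 'm2', 'm4' ]
```

Items found by both searches (m1, m3) rise above items found by only one, regardless of how each search scores internally.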

3. Real-time Extraction (not Batch)

After each supervisor response, a fire-and-forget process extracts memories from the last 15 messages. No scheduled jobs, no nightly batches.

4. PostgreSQL for Conversation Storage

All messages are saved in PostgreSQL (conversation_messages). The auto-generated tsvector column enables full-text search without additional components.

5. Tier Abstraction (not Model Binding)

Components request a tier (fast, standard, heavy), not a specific model. The tier-to-model mapping is centralized in packages/engine/src/ai-gateway/config.ts. Instances can override provider/model for fine-grained control.
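The idea can be sketched as follows. `ModelTier` is named in this document; the `ModelRef` shape, the `resolveModel` helper, and the placeholder provider and model names are assumptions for illustration and do not reflect the contents of config.ts:

```typescript
type ModelTier = "fast" | "standard" | "heavy";

// Hypothetical provider/model pair (assumption).
interface ModelRef {
  provider: string;
  model: string;
}

// Centralized default mapping; the values are placeholders, not real models.
const defaultTierMap: Record<ModelTier, ModelRef> = {
  fast: { provider: "example-provider", model: "example-small" },
  standard: { provider: "example-provider", model: "example-medium" },
  heavy: { provider: "example-provider", model: "example-large" },
};

// Callers pass a tier; an instance-level override may replace either field.
function resolveModel(tier: ModelTier, override?: Partial<ModelRef>): ModelRef {
  const base = defaultTierMap[tier];
  return {
    provider: override?.provider ?? base.provider,
    model: override?.model ?? base.model,
  };
}
```

Swapping the model behind a tier then touches one mapping instead of every call site.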

6. Pipeline Latency Tracing

Every user message (excluding auto-tasks like Open WebUI title/summary generation) is instrumented with per-phase timing. The TraceStore buffers entries and flushes to pipeline_traces in batches (every 10 entries or every 5 seconds), following the same pattern as AILogger.
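The buffering behavior (flush at 10 entries or every 5 seconds) can be sketched as below. The class name `BufferedWriter` and its API are hypothetical; the real TraceStore may be structured differently:

```typescript
type FlushFn<T> = (batch: T[]) => Promise<void>;

class BufferedWriter<T> {
  private buffer: T[] = [];
  private timer: ReturnType<typeof setInterval> | undefined;
  private readonly flushFn: FlushFn<T>;
  private readonly maxEntries: number;
  private readonly intervalMs: number;

  constructor(flushFn: FlushFn<T>, maxEntries = 10, intervalMs = 5_000) {
    this.flushFn = flushFn;
    this.maxEntries = maxEntries;
    this.intervalMs = intervalMs;
  }

  // Begin time-based flushing; call stop() on shutdown.
  start(): void {
    if (this.timer === undefined) {
      this.timer = setInterval(() => void this.flush(), this.intervalMs);
    }
  }

  stop(): void {
    if (this.timer !== undefined) {
      clearInterval(this.timer);
      this.timer = undefined;
    }
  }

  // Queue an entry and flush immediately once the size limit is reached.
  add(entry: T): void {
    this.buffer.push(entry);
    if (this.buffer.length >= this.maxEntries) void this.flush();
  }

  // Swap the buffer out before writing so new entries are never lost mid-flush.
  async flush(): Promise<void> {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    try {
      await this.flushFn(batch);
    } catch (err) {
      console.error("trace flush failed", err); // tracing must not break requests
    }
  }
}
```

In this sketch a failed flush drops the batch; whether the engine retries is not stated in the source.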

Phases tracked: context prep, tool building, LLM call, total. Additional data: individual tool call durations (toolCallTraces JSONB), streaming TTFB, token counts.

The pipeline_traces table is separate from ai_logs: ai_logs tracks individual LLM API calls (including background tasks like summary generation), while pipeline_traces tracks end-to-end pipeline latency for user-facing messages only.

Analytics queries in latency.store.ts use PostgreSQL percentile_cont() for p50/p95/p99 and jsonb_array_elements() for tool call breakdown. Results are served alongside existing analytics via both global and per-instance endpoints.
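For readers unfamiliar with percentile_cont(), the function below reproduces its linear-interpolation semantics in TypeScript. It is only an illustration of what the SQL aggregate computes; the actual queries run in PostgreSQL, and the function name and sample latencies are made up:

```typescript
// Continuous percentile with linear interpolation between neighboring values,
// matching the semantics of PostgreSQL's percentile_cont(fraction).
function percentileCont(values: number[], fraction: number): number {
  if (values.length === 0) throw new Error("no values");
  if (fraction < 0 || fraction > 1) throw new Error("fraction must be in [0, 1]");
  const sorted = [...values].sort((a, b) => a - b);
  const position = fraction * (sorted.length - 1); // 0-based fractional index
  const lower = Math.floor(position);
  const upper = Math.ceil(position);
  const low = sorted[lower] ?? 0;
  const high = sorted[upper] ?? low;
  return low + (high - low) * (position - lower);
}

// Hypothetical total pipeline latencies in milliseconds.
const latenciesMs = [400, 100, 300, 200];
console.log(percentileCont(latenciesMs, 0.5)); // 250
```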

7. Fire-and-Forget Post-Processing

The afterResponse() function runs asynchronously without blocking the response: saves messages, updates summary, extracts memories. If it fails, the error is logged but the user has already received the response.
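A sketch of the pattern, under stated assumptions: the `PostProcessDeps` interface, `afterResponseSketch`, and `handleTurn` are hypothetical names, and the 15-message window is taken from the extraction description earlier in this section. The real afterResponse() signature may differ:

```typescript
interface ChatTurnMessage {
  role: "user" | "assistant";
  content: string;
}

// Hypothetical persistence and extraction hooks (assumption).
interface PostProcessDeps {
  saveMessages: (messages: ChatTurnMessage[]) => Promise<void>;
  updateSummary: (messages: ChatTurnMessage[]) => Promise<void>;
  extractMemories: (recent: ChatTurnMessage[]) => Promise<void>;
}

const EXTRACTION_WINDOW = 15; // last N messages considered for memory extraction

async function afterResponseSketch(
  messages: ChatTurnMessage[],
  deps: PostProcessDeps,
): Promise<void> {
  await deps.saveMessages(messages);
  await deps.updateSummary(messages);
  await deps.extractMemories(messages.slice(-EXTRACTION_WINDOW));
}

// Returns the reply immediately; post-processing continues in the background.
function handleTurn(
  history: ChatTurnMessage[],
  reply: ChatTurnMessage,
  deps: PostProcessDeps,
): string {
  void afterResponseSketch([...history, reply], deps).catch((err) => {
    console.error("post-processing failed", err); // logged, never surfaced to the user
  });
  return reply.content;
}
```

The `.catch` is what keeps a failure in any step from becoming an unhandled rejection.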

8. OpenAI-Compatible API

The server exposes endpoints in OpenAI format (/v1/chat/completions, /v1/models). Compatible with Open WebUI, ChatBox, and any OpenAI-compatible client.
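A client call might look like the sketch below. The `buildChatRequest` helper is hypothetical; port 4000 comes from API_PORT above, and using the instance ID ("default") as the model name is an assumption based on each assistant being addressable as a model:

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build an OpenAI-style chat completion request for a given instance.
function buildChatRequest(
  baseUrl: string,
  model: string,
  messages: ChatMessage[],
  apiKey?: string,
): ChatRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`; // only when auth is enabled
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    init: { method: "POST", headers, body: JSON.stringify({ model, messages }) },
  };
}

// Usage (not executed here):
//   const { url, init } = buildChatRequest("http://localhost:4000", "default",
//     [{ role: "user", content: "Hello" }], "my-key");
//   const response = await fetch(url, init);
```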

9. Self-Registering Tools

Tools are defined as *.tool.ts files that call registerTool() at module level. They are auto-discovered at boot by loadAllTools(). No hardcoded imports or manual wiring needed. Each tool can declare requiredEnv — if the env var is missing, the tool is silently excluded.
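The registration pattern can be sketched like this. `registerTool()` and `requiredEnv` are named in this document, but the `ToolDefinition` shape, the `availableTools` helper, and both sample tools are assumptions for illustration:

```typescript
// Hypothetical tool shape (assumption).
interface ToolDefinition {
  name: string;
  description: string;
  requiredEnv?: string[];
  execute: (input: unknown) => Promise<unknown>;
}

const toolRegistry = new Map<string, ToolDefinition>();

// Called at module level by each *.tool.ts file.
function registerTool(tool: ToolDefinition): void {
  toolRegistry.set(tool.name, tool);
}

// Tools whose required env vars are missing are silently excluded.
function availableTools(env: Record<string, string | undefined>): ToolDefinition[] {
  return [...toolRegistry.values()].filter((tool) =>
    (tool.requiredEnv ?? []).every((key) => Boolean(env[key])),
  );
}

// What a hypothetical echo.tool.ts would do on import:
registerTool({
  name: "echo",
  description: "Returns its input unchanged",
  execute: async (input) => input,
});

// A tool that only appears when its (hypothetical) key is configured:
registerTool({
  name: "exampleSearch",
  description: "Placeholder for a tool that needs an API key",
  requiredEnv: ["EXAMPLE_API_KEY"],
  execute: async () => [],
});
```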

10. Instance Personalization (Database-First)

All instance configuration — prompts, skills, tool enablement, secrets, channel credentials, and knowledge documents — is stored in PostgreSQL, not on the filesystem. New instances are seeded from the defaults defined in packages/engine/src/instances/defaults.ts. The workspaces/<instanceId>/ directory exists only as a sandbox root for per-conversation tool work (readFile / writeFile / gitCloneRepo under conversations/<convId>/); it is never the source of truth for any agent configuration.

11. Encrypted Skill Environment Variables

Skills can declare requiredEnv in YAML frontmatter. Values are encrypted with AES-256-GCM per-instance in the instance_skill_env table and injected at runtime when the agent reads the skill.
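A minimal sketch of AES-256-GCM with Node.js crypto, assuming a 64-character hex ENCRYPTION_KEY as described in the configuration section. The function names and the `iv.tag.ciphertext` payload layout are assumptions; the engine's stored format may differ:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Validate and decode a 32-byte key given as 64 hex characters.
function parseEncryptionKey(hex: string): Buffer {
  if (!/^[0-9a-fA-F]{64}$/.test(hex)) {
    throw new Error("ENCRYPTION_KEY must be 64 hex characters (32 bytes)");
  }
  return Buffer.from(hex, "hex");
}

// Encrypt with a fresh 12-byte IV; output is "iv.tag.ciphertext" in base64.
function encryptValue(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv, tag, ciphertext].map((part) => part.toString("base64")).join(".");
}

// Decrypt and verify the auth tag; throws if the payload was tampered with.
function decryptValue(payload: string, key: Buffer): string {
  const [ivB64, tagB64, dataB64] = payload.split(".");
  if (!ivB64 || !tagB64 || dataB64 === undefined) throw new Error("Malformed payload");
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(ivB64, "base64"));
  decipher.setAuthTag(Buffer.from(tagB64, "base64"));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(dataB64, "base64")),
    decipher.final(),
  ]);
  return plaintext.toString("utf8");
}
```

GCM authenticates as well as encrypts, so a modified ciphertext fails at `decipher.final()` instead of yielding garbage.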

Main Dependencies

Package	Version	Role
`ai`	^4.0.0	Vercel AI SDK (core)
`@ai-sdk/openai`	^1.0.0	OpenAI provider
`@ai-sdk/anthropic`	^1.0.0	Anthropic provider
`@nestjs/core`	^11.1.13	HTTP server
`drizzle-orm`	^0.38.0	PostgreSQL ORM
`grammy`	^1.40.0	Telegram bot
`@slack/bolt`	^4.6.0	Slack bot
`@tavily/core`	^0.0.3	Web search
`langsmith`	^0.3.0	LLM tracing
`zod`	^3.23.0	Schema validation

Commands

All commands run from the monorepo root and delegate to the appropriate workspace:


# Engine (AI runtime)
npm run dev              # Start engine with tsx watch
npm run dev:engine       # Same as above (explicit)
npm run build:engine     # Build engine only
npm start                # Run engine from dist/
 
# Web (admin panel)
npm run dev:web          # Start Next.js dev server
npm run build:web        # Build web only
 
# All workspaces
npm run build            # Build all packages
npm run lint             # ESLint all packages
npm run typecheck        # TypeScript check all packages
npm test                 # Run all tests
 
# Database (engine)
npm run db:generate      # Generate Drizzle migrations
npm run db:migrate       # Apply migrations
npm run db:studio        # Drizzle Studio GUI
 
# Engine tests
npm run test:unit        # Unit tests only
npm run test:integration # Integration tests only
npm run test:functional  # Functional tests only
 
# Docker
docker compose up -d     # Start postgres

Room & Event Sources

The Room system enables proactive, event-driven agent behavior — the agent doesn’t just respond to user messages, it listens to external system events and takes initiative.

Data Flow


External System (HubSpot, etc.)
    │
    ▼
POST /webhooks/:webhookToken        ← always returns 200 OK (fire-and-forget)
    │
    ▼
WebhookController.processEvent()    ← validation cascade:
    │                                   token → source enabled → room enabled
    │                                   → backlog cap (100) → definitions → slug
    ▼
Event Matcher (LLM tier: fast)      ← sequential, first-match-wins
    │
    ▼
Event Backlog (status: pending)     ← queue in PostgreSQL
    │
    ▼
Room Scheduler (30s tick)           ← per-room mutex, parallel across rooms
    │
    ▼
Room Engine (executeRoomCycle)      ← builds synthetic message with:
    │                                   pending events + interpretation prompts
    │                                   + human message (if any)
    ▼
Supervisor (standard LLM call)      ← same supervisor as user messages,
    │                                   memory disabled for room cycles
    ▼
Outbound Channel                    ← Slack / WhatsApp / Telegram
    │
    ▼
Human                               ← reply routes back via triggerImmediate()
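The sequential, first-match-wins matching step in the flow above can be sketched as follows. The `priority` and `slug` fields, the `Classifier` type, and the rule that a lower priority number is evaluated first are assumptions; in the engine the classifier is an LLM call at the fast tier:

```typescript
// Hypothetical event definition shape (assumption); the two prompt fields
// are named in this document.
interface EventDefinition {
  slug: string;
  priority: number;
  matchingPrompt: string;
  interpretationPrompt: string;
}

// Stand-in for the LLM classifier: does this payload match the prompt?
type Classifier = (matchingPrompt: string, payload: unknown) => Promise<boolean>;

// Evaluate definitions one at a time in priority order; stop at the first match.
async function matchEvent(
  definitions: EventDefinition[],
  payload: unknown,
  classify: Classifier,
): Promise<EventDefinition | null> {
  const ordered = [...definitions].sort((a, b) => a.priority - b.priority);
  for (const definition of ordered) {
    if (await classify(definition.matchingPrompt, payload)) return definition;
  }
  return null; // no definition matched; nothing is queued
}
```

Evaluating sequentially keeps the number of classifier calls low when an early definition matches, at the cost of higher latency for events that match late or not at all.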

Key Design Decisions

One Room per instance: stored in instance_room (1:1 with instances). Room prompt defines the agent’s mandate for event processing.
Event Sources are generic: source_type field supports extensible source types (currently hubspot, webhook). Config is AES-256-GCM encrypted.
Event Definitions are priority-ordered: sequential LLM evaluation, first match wins. Each definition has a matchingPrompt (for the classifier) and an interpretationPrompt (injected into the Room cycle).
Activity log auto-compacts: daily entries kept 7 days → merged into weekly (4 weeks) → merged into monthly (12 months) → deleted.
Room scheduler is a singleton: per-room mutex prevents concurrent processing of the same room. Different rooms run in parallel.
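The per-room mutex in the last point can be sketched as below. The class and method names are hypothetical; `executeRoomCycle` is only referenced in a comment, and skipping a busy room until the next tick is an assumption about how contention is handled:

```typescript
type RunCycle = (roomId: string) => Promise<void>;

class RoomSchedulerSketch {
  private readonly running = new Set<string>(); // rooms with a cycle in flight
  private readonly runCycle: RunCycle;

  constructor(runCycle: RunCycle) {
    this.runCycle = runCycle; // in the engine this role is played by executeRoomCycle
  }

  // One scheduler tick: start all rooms in parallel, each guarded by its own lock.
  async tick(roomIds: string[]): Promise<void> {
    await Promise.all(roomIds.map((roomId) => this.runExclusive(roomId)));
  }

  private async runExclusive(roomId: string): Promise<void> {
    if (this.running.has(roomId)) return; // previous cycle still running: skip
    this.running.add(roomId);
    try {
      await this.runCycle(roomId);
    } catch (err) {
      console.error(`room cycle failed for ${roomId}`, err);
    } finally {
      this.running.delete(roomId); // always release, even on failure
    }
  }
}
```

A `Set` is sufficient as a lock here because the check and the add happen in the same synchronous step of a single Node.js process.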

Database Tables

Table	Purpose
`instance_room`	Room config: prompt, outbound channel/target, intervals, rate limit
`event_sources`	External event connectors with encrypted config and webhook token
`event_definitions`	Matching + interpretation prompts per event type
`event_backlog`	Queue of matched events: pending → processing → completed
`room_activity_log`	Time-decaying activity chronicle with auto-compaction

API Endpoints

Endpoint	Purpose
`GET/PUT/DELETE /api/instances/:slug/room`	Room CRUD
`GET/POST /api/instances/:slug/event-sources`	Event source list (with definitions inline) + create
`PUT/DELETE /api/instances/:slug/event-sources/:id`	Update/delete event source
`POST /api/instances/:slug/event-sources/:id/rotate-token`	Rotate webhook token
`GET/POST /api/instances/:slug/event-sources/:id/definitions`	Definition list + create
`PUT/DELETE /api/instances/:slug/event-sources/:id/definitions/:defId`	Update/delete definition
`POST /webhooks/:webhookToken`	External event ingestion

Extension Points

What	How
New channel	Implement `ChannelAdapter`, register it in `packages/engine/src/index.ts`
New sub-agent	See AGENTS.md for architecture and extension patterns
New AI provider	Implement `ProviderAdapter`, add it to `packages/engine/src/ai-gateway/index.ts`
New supervisor tool	Create a `*.tool.ts` file calling `registerTool()` in `packages/engine/src/agents/tools/` — the registry picks it up at boot and the `tools` DB table is synced automatically. Per-instance enablement is then controlled via `instance_tools` (admin panel or `PATCH /api/instances/:slug/tools`)
New LLM tier	Extend `ModelTier` and mapping in `packages/engine/src/ai-gateway/config.ts`
New global skill	Add a row to `skills` + `skill_versions` via the management API (`POST /api/skills`) or insert directly. Per-instance assignment is then controlled via `instance_skills`
New event source type	Add config schema in `eventSourceConfigSchemas` (`event-sources.store.ts`), webhook payload handled by generic matcher
