Prompt

The prompt is the single most important artifact Polyant produces. Every LLM call is the result of assembling, in a deterministic order, a static identity, a per-agent personality, a list of tools and skills, the relevant slice of conversation history, an optional summary of older turns, and a handful of contextual hints. This page is the reference for what goes in, in which order, and why.

If memory, knowledge, tools, and skills are the organs of an agent, the prompt is the bloodstream: every turn it gets rebuilt fresh and pumped into the model.

The system prompt is a join of 8 sections

The system prompt is composed of eight named sections, stored per-agent in the instance_prompts table. Each section has a stable key (01-identity … 08-datetime), a human title, and a markdown body. Default content for new agents lives in packages/engine/src/instances/defaults.ts.

Key	Title	Purpose
`01-identity`	Identity	Who the agent is, who it serves, what context it operates in.
`02-soul`	Soul (personality)	Tone, values, conversational style, information economy. The `updateSoul` tool rewrites this section.
`03-tooling`	Tooling	How to use tools — delegation, parallel calls, when to ask vs act. Includes the `{{toolCatalog}}` placeholder.
`04-safety`	Safety	Guardrails for destructive actions, error handling, refusal patterns.
`05-skills`	Skills	How to discover and load skills. Includes the `{{skillsList}}` placeholder.
`06-memory`	Memory	When to search memory and what to save. Omitted entirely when `memoryEnabled = false`.
`07-user-identity`	User identity	Per-agent profile of the typical user (audience, language, formality).
`08-datetime`	Datetime	Current date, time, and timezone. Filled via `{{datetime}}` and `{{timezone}}`.

Editing happens in the admin UI under each agent’s Prompts tab. Changes go straight to instance_prompts — there is no filesystem fallback. A 60-second TTL cache (getPrompts(instanceId)) shields the engine from repeated reads.
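A TTL cache of this kind is small enough to sketch. The version below is an illustration under stated assumptions, not the contents of prompts.store.ts: `TtlCache`, its `get(key, load)` signature, and the injectable clock are hypothetical, and the loader is synchronous for brevity where the real one presumably awaits a database read.

```typescript
// Hypothetical sketch of a per-key TTL cache; names are illustrative only.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value while it is still fresh; otherwise reload and re-cache.
  get(key: string, load: () => V): V {
    const t = this.now();
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > t) {
      return hit.value;
    }
    const value = load();
    this.entries.set(key, { value, expiresAt: t + this.ttlMs });
    return value;
  }
}

// Usage: one cache for all agents, keyed by instanceId, with a 60-second TTL.
const promptCache = new TtlCache<string[]>(60_000);
const sections = promptCache.get("instance-1", () => ["(rows read from instance_prompts)"]);
```

Passing the clock in as a parameter keeps the expiry logic testable without waiting for real time to pass.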

Placeholders in three sections are expanded at assembly time

Three sections contain template tokens that are replaced fresh on every LLM call:

{{toolCatalog}} in 03-tooling → bulleted markdown list of tools enabled for this agent (- **toolName**: description). Generated from the tool registry filtered by the active turn’s gates: per-agent enablement, feature flags, missing secrets, and the runtime’s includeHarness set. See Tools → Conditional visibility for the full evaluation order — the same agent shows a different {{toolCatalog}} to a chat turn vs a Room cycle vs a webhook trigger.
{{skillsList}} in 05-skills → an <available_skills> XML block listing each enabled skill. Auto-loaded skills carry autoLoaded="true" and have their full markdown body inlined; on-demand skills only carry name + description, to be loaded later via the readSkill meta-tool.
{{datetime}} / {{timezone}} in 08-datetime → current values from the engine clock, so the LLM never invents a date.

Sections are joined with \n\n---\n\n as a hard separator. The model sees a clean, ordered document with horizontal rules between concerns.
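The expansion and the join can be sketched in a few lines. This is a minimal illustration, not the code in prompt.ts: the `PromptSection` and `ToolInfo` shapes, the function names, and the decision to leave unknown tokens untouched are assumptions made for the example. Only the section keys, the `- **toolName**: description` format, the `06-memory` omission rule, and the `\n\n---\n\n` separator come from the description above.

```typescript
// Hypothetical shapes; the real rows in instance_prompts may differ.
interface PromptSection {
  key: string;   // e.g. "01-identity"
  title: string;
  body: string;  // markdown, may contain {{placeholders}}
}

interface ToolInfo {
  name: string;
  description: string;
}

const SECTION_SEPARATOR = "\n\n---\n\n";

// Render the bulleted list that replaces {{toolCatalog}}.
function renderToolCatalog(tools: ToolInfo[]): string {
  return tools.map((t) => `- **${t.name}**: ${t.description}`).join("\n");
}

// Replace each {{name}} token that has a value; unknown tokens stay as-is (assumption).
function expandPlaceholders(body: string, values: Record<string, string>): string {
  return body.replace(/\{\{(\w+)\}\}/g, (token: string, name: string) => {
    const value = values[name];
    return value !== undefined ? value : token;
  });
}

// Drop 06-memory when memory is disabled, order by key, expand, and join.
function assembleSections(
  sections: PromptSection[],
  values: Record<string, string>,
  memoryEnabled: boolean
): string {
  return sections
    .filter((s) => memoryEnabled || s.key !== "06-memory")
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0))
    .map((s) => expandPlaceholders(s.body, values))
    .join(SECTION_SEPARATOR);
}
```

Because the keys carry a zero-padded numeric prefix, a plain string comparison is enough to restore the 01 to 08 order regardless of how the rows come back from storage.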

Optional sections appended after section 8

Depending on the runtime path, up to three extra blocks are appended after 08-datetime:


## Current channel
You are talking via {channel}.
- Channel ID: {channelId}
- User name: {userName ?? "unknown"}

Injected whenever the supervisor receives a channelIdentity — that is, for every Inbound, Webhook, and Room turn. Plain web playground calls omit it.


## Conversation Context
{rendered contextPrompt}

Injected by the Webhook engine when an event source defines a contextPrompt template. The template is rendered against the webhook payload ({{event.field}} interpolation), inlined here, persisted to the conversation, and then cleared after the turn so it never leaks into a follow-up inbound message.
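The `{{event.field}}` interpolation can be pictured as a dotted-path lookup against the payload. The sketch below is an assumption about how such rendering might work, not the Webhook engine's actual code: `renderContextPrompt` and `lookup` are hypothetical names, and rendering a missing field as an empty string is a choice made for the example.

```typescript
// Resolve a dotted path such as "event.order.id" against a nested object.
function lookup(root: unknown, path: string): unknown {
  let current: unknown = root;
  for (const segment of path.split(".")) {
    if (typeof current !== "object" || current === null) {
      return undefined;
    }
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

// Replace every {{event.some.path}} token with the matching payload value.
// Missing values render as an empty string (assumption for this sketch).
function renderContextPrompt(template: string, event: unknown): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (_token: string, path: string) => {
    const value = lookup({ event }, path);
    return value === undefined || value === null ? "" : String(value);
  });
}

// Usage with a made-up payload:
const rendered = renderContextPrompt(
  "Order {{event.order.id}} was placed by {{event.customer}}.",
  { order: { id: 42 }, customer: "Ada" }
);
// rendered === "Order 42 was placed by Ada."
```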


## Previous conversation context (summary)
{summary}

Note: this is a summary of earlier messages. When tool results from the current turn contain data (dates, names, figures), always use the current tool results — they take precedence over this summary.

Injected whenever the history has overflowed (see below) and a summary exists in the conversations.summary column. The closing note is verbatim and intentional: it tells the model that fresh tool output always wins against potentially stale summary content.
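Taken together, the three optional blocks amount to conditional string appends. The following sketch shows one way to express that; `appendOptionalBlocks`, the `OptionalBlocks` shape, and the blank-line joiner between blocks are assumptions. The closing note is passed in as a parameter so the verbatim wording stays in one place.

```typescript
// Hypothetical shapes for the inputs described above.
interface ChannelIdentity {
  channel: string;
  channelId: string;
  userName?: string;
}

interface OptionalBlocks {
  channelIdentity?: ChannelIdentity;
  contextPrompt?: string;
  summary?: string;
}

// Append whichever optional blocks apply, in the documented order.
function appendOptionalBlocks(
  base: string,
  blocks: OptionalBlocks,
  summaryNote: string
): string {
  const parts: string[] = [base];

  const id = blocks.channelIdentity;
  if (id !== undefined) {
    parts.push(
      [
        "## Current channel",
        `You are talking via ${id.channel}.`,
        `- Channel ID: ${id.channelId}`,
        `- User name: ${id.userName ?? "unknown"}`,
      ].join("\n")
    );
  }

  if (blocks.contextPrompt) {
    parts.push(`## Conversation Context\n${blocks.contextPrompt}`);
  }

  if (blocks.summary) {
    parts.push(
      `## Previous conversation context (summary)\n${blocks.summary}\n\n${summaryNote}`
    );
  }

  return parts.join("\n\n"); // joiner between blocks is an assumption
}
```

A plain web playground call would pass an empty `blocks` object and get the base prompt back unchanged.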

History is a sliding window with overflow summarization

For Inbound and Webhook turns, the pipeline loads the last 16 messages of the conversation (getRecentMessages(conversationId, 16)). Then:

No overflow (≤ 15 messages exist): all messages are passed as-is. No summary block.
Overflow (> 15 messages exist): only the last 10 messages are passed in the messages array. The older messages are condensed into the ## Previous conversation context (summary) block described above.

The summary itself lives in conversations.summary. It is updated after the response completes (fire-and-forget, via updateSummary() in runPipelinePost), so it never blocks the user-visible reply. Once a conversation overflows for the first time, the very next turn already has a summary available.
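The Inbound and Webhook window rule reduces to a length check and a slice. The sketch below assumes that `recent` is the result of getRecentMessages(conversationId, 16) in oldest-first order; `selectHistory` and `StoredMessage` are hypothetical names. Loading 16 rows, one more than the 15-message threshold, appears to be what makes the overflow detectable from the loaded rows alone.

```typescript
// Hypothetical message shape; only the count matters for the window rule.
interface StoredMessage {
  role: string;
  content: string;
}

const OVERFLOW_THRESHOLD = 15; // more than this many messages means overflow
const WINDOW_SIZE = 10;        // messages kept once the history has overflowed

// Decide which messages go to the LLM and whether the summary block is used.
function selectHistory(
  recent: StoredMessage[],
  storedSummary: string | null
): { history: StoredMessage[]; summary: string | null } {
  if (recent.length <= OVERFLOW_THRESHOLD) {
    // No overflow: pass everything through, no summary block.
    return { history: recent, summary: null };
  }
  // Overflow: keep the last 10 and rely on the stored summary for the rest.
  return { history: recent.slice(-WINDOW_SIZE), summary: storedSummary };
}
```

With 16 loaded messages the function keeps messages 7 through 16 and returns whatever summary is stored, which may still be null on the turn that first overflows.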

Room turns use a different policy: up to 50 messages are loaded, no summary, no sliding window. Room cycles are deliberately short-lived (a new conversation is created per cycle, room:{instanceId}:{timestamp}), so the 50-message budget rarely fills.

The full message array shipped to the LLM


system: <8 sections joined with "---">
        + optional "## Current channel"
        + optional "## Conversation Context"
        + optional "## Previous conversation context (summary)"

messages: [
  ...conversationHistory,       // last 10 (overflow) or all (≤15) messages
  { role: "user", content }     // the new turn
]

For multimodal Inbound turns (Telegram / WhatsApp / web playground with attachments), the new user content is a parts array mixing text and image/file references rather than a plain string. Everything else is unchanged.
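As a rough picture of the two content shapes, the sketch below models the new user turn as either a plain string or a parts array. The `ContentPart` variants and field names are assumptions chosen for illustration; the actual part schema depends on the provider SDK that Polyant uses.

```typescript
// Hypothetical part shapes; real field names depend on the provider SDK.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image"; url: string }
  | { type: "file"; url: string; mimeType: string };

interface ChatMessage {
  role: "user" | "assistant" | "tool";
  content: string | ContentPart[];
}

// Append the new user turn to the selected history without mutating it.
function buildMessages(
  history: ChatMessage[],
  newContent: string | ContentPart[]
): ChatMessage[] {
  return [...history, { role: "user", content: newContent }];
}

// Usage: a text question accompanied by an image attachment.
const multimodal = buildMessages(
  [{ role: "assistant", content: "How can I help?" }],
  [
    { type: "text", text: "What is in this picture?" },
    { type: "image", url: "https://example.com/photo.png" },
  ]
);
```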

Memory and Knowledge are NOT injected into the prompt

A frequent misconception is that memories and knowledge documents are part of the system prompt. They are not. They live behind tools (searchMemory, searchKnowledge) that the LLM invokes when section 06-memory or its own judgment tells it to. Their results arrive as tool-call outputs in the message stream, not as system text.

Why this matters:

Prompts stay short and stable. Token cost grows with conversation length, not with knowledge base size.
The agent decides when recall is worth a tool call, instead of every turn paying for a guess.
Search results are scoped to the query, so irrelevant facts don’t crowd the context window.

The trade-off is one extra round-trip when the agent does decide to search. Polyant accepts that latency cost as the price of clean separation.
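To make the "results arrive as tool-call outputs" point concrete, here is a small sketch of a tool call being turned into a message. Everything in it is hypothetical except the tool name searchMemory: the `ToolCall` and `ToolResultMessage` shapes, the in-memory store, and the substring match stand in for whatever the engine and provider SDK actually use.

```typescript
// Hypothetical shapes for a tool call and the message that carries its result.
interface ToolCall {
  id: string;
  name: string;
  args: { query: string };
}

interface ToolResultMessage {
  role: "tool";
  toolCallId: string;
  content: string; // JSON-encoded result
}

type ToolHandler = (args: { query: string }) => string[];

// Stand-in for a real memory store.
const memories = ["User prefers metric units", "User's dog is called Rex"];

const handlers: Record<string, ToolHandler> = {
  // Naive substring search, for illustration only.
  searchMemory: ({ query }) =>
    memories.filter((m) => m.toLowerCase().includes(query.toLowerCase())),
};

// Execute a tool call and wrap the outcome as a message for the next model step.
function runToolCall(
  call: ToolCall,
  registry: Record<string, ToolHandler>
): ToolResultMessage {
  const handler = registry[call.name];
  const content =
    handler === undefined
      ? JSON.stringify({ error: `unknown tool: ${call.name}` })
      : JSON.stringify({ results: handler(call.args) });
  return { role: "tool", toolCallId: call.id, content };
}
```

The system string never changes in this flow. Only the messages array grows by one tool-result entry, and only on the turns where the model chose to search.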

Skills are mixed: list always present, content on-demand

Section 05-skills is the one place skills do enter the prompt — but with a careful split:

Discovery is eager: the <available_skills> block always lists every enabled skill (name + description). The model knows what exists.
Loading is lazy: only skills marked autoLoad = true ship their full body inline. The rest must be requested via readSkill(slug).

This keeps the prompt small for agents with dozens of skills while still letting the LLM “page in” what it needs. The selection of auto-loaded vs on-demand happens in the admin UI per agent.
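The eager-list, lazy-body split can be illustrated with a small renderer for the `<available_skills>` block. The `<skill>` element name, the attribute layout, and the `Skill` shape are assumptions; the source only specifies the outer tag, the `autoLoaded="true"` marker, and that on-demand skills carry just a name and description.

```typescript
// Hypothetical skill record; `slug` is what readSkill(slug) would receive.
interface Skill {
  slug: string;
  description: string;
  autoLoad: boolean;
  body: string; // full markdown, inlined only for auto-loaded skills
}

// Escape characters that would break an XML attribute value.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Every enabled skill is listed; only auto-loaded ones include their body.
function renderSkillsList(skills: Skill[]): string {
  const items = skills.map((s) => {
    const attrs = `name="${escapeAttr(s.slug)}" description="${escapeAttr(s.description)}"`;
    return s.autoLoad
      ? `<skill ${attrs} autoLoaded="true">\n${s.body}\n</skill>`
      : `<skill ${attrs} />`;
  });
  return ["<available_skills>", ...items, "</available_skills>"].join("\n");
}
```

The prompt cost of an on-demand skill is one short line no matter how long its body is, which is why the split scales to agents with dozens of skills.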

What the supervisor sees vs what sub-agents see

The supervisor (the main loop) gets the full assembled prompt described above. Sub-agents spawned via spawnTask get a scoped variant: same identity / soul / safety sections, a filtered tool catalog (sub-agents cannot themselves call spawnTask, preventing infinite recursion), and a custom task brief in the user message. They run on tier standard with a hard cap of 10 steps.

The skills list, the memory section, and the history window are inherited from the supervisor’s context unless the sub-agent is invoked with an explicit override.
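The sub-agent constraints can be summarized as a small configuration builder. This is a sketch under assumptions: `buildSubAgentConfig` and `SubAgentConfig` are hypothetical names, and tools are represented by name only. The removal of spawnTask, the standard tier, and the 10-step cap are the parts taken from the description above.

```typescript
// Hypothetical configuration handed to a sub-agent run.
interface SubAgentConfig {
  tier: string;
  maxSteps: number;
  tools: string[];
  userMessage: string;
}

const SUB_AGENT_MAX_STEPS = 10;

// Derive a sub-agent configuration from the supervisor's enabled tools.
function buildSubAgentConfig(
  supervisorTools: string[],
  taskBrief: string
): SubAgentConfig {
  return {
    tier: "standard",
    maxSteps: SUB_AGENT_MAX_STEPS,
    // Removing spawnTask is what prevents sub-agents from spawning sub-agents.
    tools: supervisorTools.filter((name) => name !== "spawnTask"),
    userMessage: taskBrief,
  };
}

// Usage with an illustrative tool set:
const subConfig = buildSubAgentConfig(
  ["searchMemory", "spawnTask", "readSkill"],
  "Summarize the attached report in five bullet points."
);
```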

What is NOT in the prompt today

For transparency, here is what an agent does not receive automatically:

Prior conversations on the same channel (only the current conversationId is loaded).
The list of past tool invocations across conversations.
Other agents’ prompts or memories (each agent is fully isolated).
A global “platform” footer or watermark.
Any prompt-cache prefix headers. Polyant does not currently use Anthropic prompt caching, so every turn rebuilds the full system prompt. Caching is on the roadmap as an optimization for static sections 01-04 and 07-08.

Anatomy of a single turn

To make the assembly concrete, here is the sequence executed by preparePipeline() → supervise() for one Inbound user message:


 1. resolveInstanceConfig(instanceId)        → AI provider, model, secrets
 2. getRecentMessages(conversationId, 16)    → up to 16 prior messages
 3. getSummary(conversationId)               → summary string, if any
 4. getContextPrompt(conversationId)         → webhook-injected, if any
 5. getPrompts(instanceId)                   → 8 sections (60s cache)
 6. discoverSkills(instanceId)               → enabled skills (env-gated)
 7. buildTools(instanceId)                   → enabled tools (DB join)
 8. buildSupervisorSystemPrompt({...})       → final system string
 9. messages = [...history, newUserMessage]  → LLM input
10. chat() / chatStream()                    → tier-routed provider call

Steps 1-7 are cheap (cached or single DB queries). Steps 8 and 9 are pure in-memory assembly: one string build and one array construction. Step 10 is the one paid call. The pipeline is intentionally linear: each turn rebuilds the prompt from scratch, with no hidden state between turns beyond what the conversations and instance_prompts tables hold.

Code reference

Sections definition + defaults: packages/engine/src/instances/defaults.ts
Prompt assembly: packages/engine/src/agents/supervisor/prompt.ts (buildSupervisorSystemPrompt, loadSkillsList, generateToolCatalog)
History + summary loading: packages/engine/src/pipeline.ts (preparePipeline)
Sliding window logic: packages/engine/src/pipeline.ts, lines 193-208
Summary write-back: packages/engine/src/pipeline.ts (runPipelinePost → updateSummary)
Per-agent prompt store: packages/engine/src/instances/prompts.store.ts (getPrompts, 60s TTL cache)