
Channels

A channel is the surface where an end user (or an external system, or another agent) meets a Polyant instance. Channels are interchangeable transports: the supervisor and the pipeline never care whether a turn arrived from WhatsApp or from the OpenAI-compatible HTTP API. Each channel is implemented as a ChannelAdapter that normalises traffic into an IncomingMessage going in and an OutgoingMessage going out.
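As a rough illustration of the adapter contract described above, the sketch below shows one way the normalised shapes could look. Only the names ChannelAdapter, IncomingMessage, OutgoingMessage and MessageChannelType come from this page; every field, method, and the InMemoryAdapter class are assumptions made for the example, and the real definitions live in packages/engine/src/channels/types.ts.

```typescript
// Hypothetical shapes for illustration only; field and method names are assumed.
type MessageChannelType =
  | "telegram" | "slack" | "whatsapp" | "web" | "scheduled" | "room" | "agent";

interface IncomingMessage {
  channelType: MessageChannelType;
  instanceSlug: string;
  senderId: string;
  text: string;
}

interface OutgoingMessage {
  channelType: MessageChannelType;
  recipientId: string;
  text: string;
}

interface ChannelAdapter {
  readonly channelType: MessageChannelType;
  start(): Promise<void>;
  stop(): Promise<void>;
  send(message: OutgoingMessage): Promise<void>;
}

// A channel-agnostic handler: it only ever sees the normalised shape.
function echoHandler(incoming: IncomingMessage): OutgoingMessage {
  return {
    channelType: incoming.channelType,
    recipientId: incoming.senderId,
    text: `echo: ${incoming.text}`,
  };
}

// Toy adapter that keeps sent messages in memory instead of using a network.
class InMemoryAdapter implements ChannelAdapter {
  readonly channelType: MessageChannelType = "web";
  readonly sent: OutgoingMessage[] = [];

  async start(): Promise<void> {}
  async stop(): Promise<void> {}

  async send(message: OutgoingMessage): Promise<void> {
    this.sent.push(message);
  }

  // Converts a transport-specific payload into the shared IncomingMessage shape.
  normalise(raw: { user: string; body: string }, instanceSlug: string): IncomingMessage {
    return {
      channelType: this.channelType,
      instanceSlug,
      senderId: raw.user,
      text: raw.body.trim(),
    };
  }
}
```

Because the handler depends only on the normalised types, swapping InMemoryAdapter for a network-backed adapter would not change it.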

This page covers the full list of message sources, how channel credentials are scoped, how messy fragmented bursts are collapsed before they hit the LLM, how voice notes are transcribed, and how the in-process agent channel powers agent-to-agent calls.

The seven message sources

Polyant distinguishes two related but distinct concepts:

  • ChannelType — the narrow set of channels with stored, admin-API-configurable rows in instance_channels. The tuple is ["telegram", "slack", "whatsapp", "agent"] (see packages/engine/src/instances/channels.store.ts:29). The agent entry is API-configurable because per-instance agent-to-agent settings live in instance_channels just like the network transports — but its “credentials” are routing metadata, not external tokens.
  • MessageChannelType — the wide superset, covering every possible source of an inbound pipeline message. It adds web (REST API + Playground), scheduled (cron-style task), and room (event-driven cycles) to the four ChannelType entries.
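The relationship between the two types can be sketched as a narrow tuple plus a wider union. The tuple values and the three extra sources are taken from the list above; the constant name CHANNEL_TYPES and the isChannelType guard are assumptions for illustration, not necessarily what channels.store.ts exports.

```typescript
// Narrow set: channels with stored, admin-configurable rows in instance_channels.
// The constant name is hypothetical; the values come from the text above.
const CHANNEL_TYPES = ["telegram", "slack", "whatsapp", "agent"] as const;
type ChannelType = (typeof CHANNEL_TYPES)[number];

// Wide superset: every possible source of an inbound pipeline message.
type MessageChannelType = ChannelType | "web" | "scheduled" | "room";

// Hypothetical guard, e.g. for validating the :type segment of an admin route.
function isChannelType(value: string): value is ChannelType {
  return (CHANNEL_TYPES as readonly string[]).includes(value);
}

// Every ChannelType is a MessageChannelType, so this assignment type-checks.
const example: MessageChannelType = CHANNEL_TYPES[0];
```

A guard like this would reject web, scheduled, and room, which have no configurable row.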

Putting them together, the seven message sources are:

| channelType | Direction | Configurable | Transport / Trigger |
|---|---|---|---|
| telegram | Inbound + outbound | Yes (per-instance) | grammY long polling |
| slack | Inbound + outbound | Yes (per-instance) | @slack/bolt Socket Mode |
| whatsapp | Inbound + outbound | Yes (per-instance) | Twilio Programmable Messaging webhook |
| web | Inbound + outbound | No (always on) | /v1/chat/completions REST + Playground |
| scheduled | Inbound (system) | No | Scheduler ticks |
| room | Inbound (system) | No | Room scheduler (event-driven, 30 s tick) |
| agent | Inbound + outbound | Yes (per-instance) | In-process agent-to-agent invocation |

The supervisor handles all seven the same way. The difference shows up only at the edges: which adapter received the message, and which adapter (if any) sends the reply.

Per-instance credentials, encrypted at rest

There are no global channel credentials in .env. Every channel config lives in instance_channels, keyed by (instance_id, channel_type), with the JSON payload AES-256-GCM encrypted via the engine’s ENCRYPTION_KEY. The admin panel’s Channels tab is the canonical write surface; the same data is exposed via PUT /api/instances/:slug/channels/:type and DELETE /api/instances/:slug/channels/:type.
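To make the at-rest encryption concrete, here is a minimal sketch of AES-256-GCM applied to a JSON channel config with Node's built-in crypto module. The function names, the iv.tag.ciphertext storage layout, and the assumption that ENCRYPTION_KEY decodes to a 32-byte key are all illustrative guesses; the engine's actual format is not documented on this page.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical helper: encrypts a channel config as "iv.tag.ciphertext" (base64 parts).
function encryptConfig(config: unknown, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(config), "utf8"),
    cipher.final(),
  ]);
  const tag = cipher.getAuthTag(); // authentication tag, checked on decrypt
  return [iv, tag, ciphertext].map((part) => part.toString("base64")).join(".");
}

// Hypothetical helper: reverses encryptConfig and throws if the key or data is wrong.
function decryptConfig(payload: string, key: Buffer): unknown {
  const parts = payload.split(".");
  if (parts.length !== 3) throw new Error("malformed payload");
  const [iv, tag, ciphertext] = parts.map((part) => Buffer.from(part, "base64")) as [
    Buffer,
    Buffer,
    Buffer,
  ];
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  const plaintext = Buffer.concat([decipher.update(ciphertext), decipher.final()]);
  return JSON.parse(plaintext.toString("utf8"));
}
```

GCM authenticates as well as encrypts, so a row decrypted with the wrong key or a tampered payload fails loudly instead of yielding garbage credentials.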

The ChannelManager reads enabled configs from the database, instantiates the right adapter (Telegram, Slack, WhatsApp), and starts it. Adapters can be started, stopped, and reconfigured at runtime — no engine restart required. Channel boot is fire-and-forget: a Slack workspace whose socket hangs cannot block the rest of the engine.
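The fire-and-forget boot can be pictured roughly as follows. This is a sketch under assumptions, not the ChannelManager's code: the StartableAdapter interface, the bootChannels function, and the onError callback are hypothetical names.

```typescript
// Hypothetical minimal view of an adapter for boot purposes.
interface StartableAdapter {
  name: string;
  start(): Promise<void>;
}

// Kicks off every adapter without awaiting any of them, so a start() call
// that never settles (for example a hung socket) cannot delay the others.
function bootChannels(
  adapters: StartableAdapter[],
  onError: (name: string, error: unknown) => void,
): void {
  for (const adapter of adapters) {
    // Deliberately not awaited; failures are reported instead of thrown.
    void adapter.start().catch((error: unknown) => onError(adapter.name, error));
  }
}
```

The trade-off is that the caller learns about a failed channel only through the error callback, never through a rejected boot.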

Inbound message coordinator

Two of the human-facing channels, WhatsApp and Telegram, tend to deliver a single thought in fragments. A user types one thought across three quick bubbles, and three near-simultaneous inbound events (Twilio webhooks or Telegram updates) land at the engine. Running the supervisor three times for what is conceptually one message wastes tokens and confuses the LLM.

The MessageCoordinator debounces fragmented bursts using a cancel-and-restart model:

  1. The first fragment arms a soft-debounce timer (MESSAGE_SOFT_DEBOUNCE_MS, default 2 s) and a typing-indicator timer (MESSAGE_TYPING_DELAY_MS, default 1.5 s).
  2. Additional fragments inside the soft-debounce window reset both timers.
  3. When the soft-debounce timer fires, the buffered fragments are flushed and the pipeline runs with a fresh AbortSignal.
  4. If a new fragment lands after the pipeline has started, the in-flight run is aborted, the buffered fragments are restored to the head of the buffer, and the soft-debounce re-arms — up to MESSAGE_MAX_RESTARTS (default 3) consecutive cancel cycles.
  5. Once the cap is reached, the in-flight run is no longer cancelled; later fragments accumulate in the buffer and flush in a follow-up run.
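The steps above can be sketched as a small class. This is an illustrative model under assumptions, not the MessageCoordinator's code: the BurstCoordinator name, its option names, and joining fragments with a newline are all hypothetical, and the typing-indicator timer is left out for brevity.

```typescript
type RunPipeline = (text: string, signal: AbortSignal) => Promise<void>;

interface BurstOptions {
  softDebounceMs: number; // plays the role of MESSAGE_SOFT_DEBOUNCE_MS
  maxRestarts: number; // plays the role of MESSAGE_MAX_RESTARTS
}

interface Flight {
  fragments: string[];
  controller: AbortController;
}

class BurstCoordinator {
  private readonly run: RunPipeline;
  private readonly options: BurstOptions;
  private buffer: string[] = [];
  private inFlight: Flight | null = null;
  private timer: ReturnType<typeof setTimeout> | null = null;
  private restarts = 0;

  constructor(run: RunPipeline, options: BurstOptions) {
    this.run = run;
    this.options = options;
  }

  push(fragment: string): void {
    if (this.inFlight !== null) {
      if (this.restarts >= this.options.maxRestarts) {
        // Step 5: cap reached, let the run finish and queue for a follow-up.
        this.buffer.push(fragment);
        return;
      }
      // Step 4: abort the run and restore its fragments to the head of the buffer.
      this.restarts += 1;
      this.inFlight.controller.abort();
      this.buffer = [...this.inFlight.fragments, ...this.buffer];
      this.inFlight = null;
    }
    this.buffer.push(fragment);
    this.arm(); // Steps 1 and 2: arm or reset the soft-debounce timer.
  }

  private arm(): void {
    if (this.timer !== null) clearTimeout(this.timer);
    this.timer = setTimeout(() => void this.flush(), this.options.softDebounceMs);
  }

  private async flush(): Promise<void> {
    // Step 3: run the pipeline with a fresh AbortSignal.
    this.timer = null;
    const flight: Flight = { fragments: this.buffer, controller: new AbortController() };
    this.buffer = [];
    this.inFlight = flight;
    try {
      await this.run(flight.fragments.join("\n"), flight.controller.signal);
    } catch {
      // Aborted or failed runs are swallowed in this sketch.
    }
    if (this.inFlight !== flight) return; // Aborted: a newer burst owns the state.
    this.inFlight = null;
    this.restarts = 0;
    if (this.buffer.length > 0) this.arm(); // Follow-up run for queued fragments.
  }
}
```

The pipeline function is expected to honour the AbortSignal; a run that ignores it would keep working after being superseded, even though the coordinator discards its result.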

Only the channels in DEBOUNCED_CHANNELS = {"whatsapp", "telegram"} go through the coordinator; every other source runs the pipeline immediately. Aborted runs leave no trace in the database: no conversation row, no message row, no memory extraction. For slow turns the coordinator also schedules a typing indicator: on WhatsApp through Twilio's /v2/Indicators/Typing.json endpoint, and on Telegram through the native sendChatAction call.

STT pipeline

Voice notes are first-class. When an adapter receives an audio attachment, the engine routes it through the stt-gateway before the supervisor sees the text. The default provider is OpenAI Whisper (whisper-1). The provider abstraction (STTProviderAdapter) is designed to host additional engines: Deepgram and AWS Transcribe adapters already live alongside Whisper in packages/engine/src/stt-gateway/providers/, ready for future per-instance selection.

The transcribed text replaces the audio payload in the IncomingMessage, while the original audio reference is kept in attachment metadata for audit and later replay.
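The replacement step might look like the sketch below. Only the STTProviderAdapter name comes from this page; its transcribe signature, the message and attachment fields, and the originalAudio metadata key are assumptions made for illustration.

```typescript
// Hypothetical shapes; the real types are defined in the engine.
interface AudioAttachment {
  url: string;
  mimeType: string;
}

interface VoiceMessage {
  text: string;
  attachments: AudioAttachment[];
  metadata: Record<string, unknown>;
}

interface STTProviderAdapter {
  transcribe(audio: AudioAttachment): Promise<string>;
}

// Pure step: swap the audio payload for its transcript and keep a reference
// to the original audio in metadata for audit and later replay.
function withTranscript(
  message: VoiceMessage,
  audio: AudioAttachment,
  transcript: string,
): VoiceMessage {
  return {
    ...message,
    text: transcript,
    attachments: message.attachments.filter((attachment) => attachment !== audio),
    metadata: { ...message.metadata, originalAudio: audio },
  };
}

// Runs before the supervisor: messages without audio pass through untouched.
async function transcribeIfAudio(
  message: VoiceMessage,
  provider: STTProviderAdapter,
): Promise<VoiceMessage> {
  const audio = message.attachments.find((a) => a.mimeType.startsWith("audio/"));
  if (audio === undefined) return message;
  return withTranscript(message, audio, await provider.transcribe(audio));
}
```

Keeping the provider behind a single method is what would let Whisper be swapped for another engine without touching the pipeline step.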

The agent channel

The agent channel is a virtual, in-process transport with no network hop. When agent A invokes agent B via the supervisor’s delegation path, the call lands on agent B’s AgentChannelAdapter, runs through B’s full pipeline (history, summary, supervisor, tools), and the reply is returned synchronously as a tool result on A’s side.

Each agent-to-agent call carries:

  • callerSlug — which instance initiated the call.
  • callerConversationId — the parent conversation that spawned this turn.
  • depth — recursion counter, capped to prevent infinite loops.
  • parentTraceId — used to stitch LangSmith traces into a single tree.

This is how compositions like “concierge agent delegates billing questions to billing agent” work without any external Slack/Telegram round-trip.
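The call metadata listed above could be built and depth-checked along these lines. The four field names come from this page; the cap value, the CallerContext shape, and the buildAgentCall function are hypothetical, since the actual limit and plumbing are not stated here.

```typescript
interface AgentCallMetadata {
  callerSlug: string;
  callerConversationId: string;
  depth: number;
  parentTraceId: string;
}

// Hypothetical cap; the real limit is not documented on this page.
const MAX_AGENT_CALL_DEPTH = 3;

// Hypothetical view of the calling agent's current turn.
interface CallerContext {
  slug: string;
  conversationId: string;
  traceId: string;
  inbound?: AgentCallMetadata; // set when the caller was itself invoked by an agent
}

// Builds the metadata for an outgoing agent-to-agent call, refusing to exceed the cap.
function buildAgentCall(caller: CallerContext): AgentCallMetadata {
  const depth = (caller.inbound?.depth ?? 0) + 1;
  if (depth > MAX_AGENT_CALL_DEPTH) {
    throw new Error(`agent call depth ${depth} exceeds cap ${MAX_AGENT_CALL_DEPTH}`);
  }
  return {
    callerSlug: caller.slug,
    callerConversationId: caller.conversationId,
    depth,
    parentTraceId: caller.traceId,
  };
}
```

Carrying the depth inside the metadata means each hop can enforce the cap locally, without a shared registry of active calls.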

How it works

            inbound side                          outbound side
+-----------------------------------+   +--------------------------------+
| Telegram  (grammY long poll)      |   | Telegram.sendMessage()         |
| Slack     (Bolt Socket Mode)      |   | Slack.chat.postMessage()       |
| WhatsApp  (Twilio webhook)        |   | WhatsApp / Twilio Programmable |
| Web       (POST /v1/chat...)      |   | SSE chunk stream               |
| Room      (Room scheduler tick)   |   | (configured outbound channel)  |
| Scheduled (cron tick)             |   | (configured outbound channel)  |
| Agent     (in-process call)       |   | (returns to caller)            |
+-----------------+-----------------+   +---------------+----------------+
                  |                                     ^
                  v                                     |
      +-----------------------+            +-------------------------+
      | MessageCoordinator    |            | OutgoingMessage emit    |
      | (whatsapp+telegram    |            |                         |
      |  only: debounce)      |            +-------------------------+
      +-----------+-----------+                         ^
                  |                                     |
                  v                                     |
+--------------------------------+                      |
| Pipeline (handleMessage)       |                      |
|  - resolve instance config     |                      |
|  - load history + summary      |                      |
|  - STT if audio attachment     |                      |
|  - supervisor.run()            +----------------------+
+--------------------------------+

Code reference

  • packages/engine/src/channels/types.ts — ChannelAdapter, MessageChannelType, IncomingMessage, OutgoingMessage, AgentCallMetadata.
  • packages/engine/src/channels/channel-manager.ts — Adapter orchestrator, DEBOUNCED_CHANNELS set.
  • packages/engine/src/channels/message-coordinator.ts — Fragment debouncer with cancel-and-restart.
  • packages/engine/src/channels/adapters/telegram/ — Telegram adapter (grammY).
  • packages/engine/src/channels/adapters/slack/ — Slack adapter (Bolt Socket Mode).
  • packages/engine/src/channels/adapters/whatsapp/ — Twilio Programmable Messaging adapter.
  • packages/engine/src/channels/adapters/agent.adapter.ts — In-process agent-to-agent channel.
  • packages/engine/src/channels/audio-transcription.ts — STT routing.
  • packages/engine/src/stt-gateway/providers/openai-whisper.ts — Default Whisper provider.
  • packages/engine/src/instances/channels.store.ts — Encrypted instance_channels CRUD and the narrow ChannelType union.
