Troubleshooting
A reference of common failures, grouped by surface area. Channel-specific issues live in Connect a Channel.
Login
“I cannot log in with Google.”
- Confirm your email’s domain is in the allow-list, if you configured one. The OSS default ships with no allow-list. To enforce one, set the AUTH_ALLOWED_DOMAINS env var (comma-separated domains, e.g. example.com,acme.io); auth.config.ts reads it at startup.
- Confirm GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET are set in packages/web/.env.local (Next.js does not read the monorepo-root .env).
- Confirm the OAuth redirect URI configured in Google Cloud Console matches the admin panel URL.
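To reason about whether a given address would pass the allow-list, a minimal sketch of the check may help. The helper names below are hypothetical and the real logic in auth.config.ts may differ; the only details taken from this page are the comma-separated format and the fact that an unset variable means no allow-list.

```typescript
// Hypothetical helpers; not the actual auth.config.ts implementation.
function parseAllowedDomains(raw: string | undefined): string[] {
  if (!raw) return [];
  return raw
    .split(",")
    .map((d) => d.trim().toLowerCase())
    .filter((d) => d.length > 0);
}

function isEmailAllowed(email: string, raw: string | undefined): boolean {
  const allowed = parseAllowedDomains(raw);
  // No allow-list configured: every domain is accepted (the OSS default).
  if (allowed.length === 0) return true;
  const at = email.lastIndexOf("@");
  if (at < 0) return false;
  const domain = email.slice(at + 1).toLowerCase();
  return allowed.some((d) => d === domain);
}
```

Calling isEmailAllowed("you@example.com", process.env.AUTH_ALLOWED_DOMAINS) in a scratch script is a quick way to rule out a typo or stray whitespace in the variable.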
“The Google sign-in button is missing on /login.”
- GOOGLE_CLIENT_ID and/or GOOGLE_CLIENT_SECRET are unset. The provider auto-disables itself with a stderr warning log when either is missing. Set both and restart the web package.
“Initial admin password not visible — I never set INITIAL_ADMIN_PASSWORD.”
- The engine prints the random password once to its own stdout on the very first boot (when the users table is still empty). Scroll the engine log back to startup time and look for the INITIAL ADMIN CREATED banner. If you missed it and have no other admin, the simplest recovery is to clear the users table and boot once more: the seeder creates the initial admin only when the table is empty and does nothing otherwise.
“Credentials (email/password) login always returns 401.”
- AUTH_INTERNAL_SECRET is missing, or its value differs between packages/web/.env.local and the engine .env. The web package signs every credentials-verification request with it, and the engine rejects mismatches. Set the same value on both sides and restart.
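The following sketch illustrates why a mismatched secret always ends in a rejection. It assumes an HMAC-SHA256 signature over the request body, which is an assumption made for illustration only: this page does not say which signing scheme the web package and engine actually use, and the function names are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical web-side step: sign the request body with the shared secret.
function signBody(body: string, secret: string): string {
  return createHmac("sha256", secret).update(body).digest("hex");
}

// Hypothetical engine-side step: recompute the signature and compare.
function verifyBody(body: string, signature: string, secret: string): boolean {
  const expected = Buffer.from(signBody(body, secret), "hex");
  const received = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so check the length first.
  if (expected.length !== received.length) return false;
  return timingSafeEqual(expected, received);
}
```

Under this model, a request signed with one value and verified with another can never match, which is consistent with every credentials login failing rather than only some.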
“I can log in, but I am bounced back to /login on every page navigation.”
- AUTH_SECRET is different in packages/web/.env.local vs. the engine’s .env. They must match exactly.
- Cookie domain mismatch behind a reverse proxy. Set AUTH_TRUST_HOST=true.
“My session expired without warning.”
- JWT lifetime defaults to ~30 days. The token cannot be revoked early. Re-login.
Channels
“Telegram bot does not reply.”
- Bot token wrong, or two engines running with the same token (Conflict: terminated by other getUpdates request in logs).
- Allowed-user-id whitelist excludes the tester. Try with the field cleared.
“Slack: app cannot connect.”
- App token wrong, or Socket Mode disabled in the Slack app settings.
- Missing scope: read the engine error message and add the listed scope on the Slack side.
“WhatsApp: nothing happens.”
- Webhook URL not reachable from Twilio. Test with curl from the public internet.
- Sandbox session expired (72 h of inactivity). Send join <code> again.
“WhatsApp: Twilio reports 404 on the webhook.”
- The URL must be https://<host>/webhooks/twilio/<instance-slug>/whatsapp. The legacy/intuitive form /webhooks/whatsapp/<slug> is not registered, so Twilio will show a 404 in its console.
- Confirm the instance slug in the URL matches the actual slug (admin panel → Instances → General).
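As a quick sanity check before pasting a URL into the Twilio console, the expected shape can be expressed in a few lines. This is a small sketch with hypothetical helper names; only the path layout comes from the entry above.

```typescript
// Hypothetical helper: build the webhook URL in the registered form.
function buildTwilioWhatsAppWebhookUrl(host: string, instanceSlug: string): string {
  return `https://${host}/webhooks/twilio/${encodeURIComponent(instanceSlug)}/whatsapp`;
}

// Hypothetical helper: true only for the registered path layout.
function isRegisteredWhatsAppPath(path: string): boolean {
  return /^\/webhooks\/twilio\/[^/]+\/whatsapp$/.test(path);
}
```

For example, isRegisteredWhatsAppPath("/webhooks/whatsapp/support") returns false, matching the 404 Twilio reports for the legacy form.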
“Channels won’t start at boot.”
- Channel startup is fire-and-forget: startAllForInstance() is not awaited at boot. A misconfigured channel (most commonly Slack Socket Mode hanging on an invalid app token) fails silently: the boot sequence carries on and the channel never reports ready.
- Check the engine logs for channelManager warnings around startup time. If you don’t see a “started” line for the expected channel within ~10 s of boot, the channel is stuck.
- Workaround: restart the channel from Admin Panel → Channels → Restart. That path runs synchronously and surfaces the error.
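The difference between the boot path and the restart path can be illustrated with a small sketch. Everything here is a hypothetical stand-in (startChannel, startAtBoot, restartChannel are not engine functions, and the failure is modelled as a rejection rather than a hang); it only shows how an unawaited start leaves the caller unaware of the outcome while an awaited one propagates the error.

```typescript
// Hypothetical stand-in for a channel start that fails for "slack".
async function startChannel(name: string): Promise<string> {
  if (name === "slack") throw new Error("invalid app token");
  return `${name} started`;
}

// Fire-and-forget: returns immediately; outcomes only reach the log later.
function startAtBoot(names: string[], log: string[]): void {
  for (const name of names) {
    startChannel(name)
      .then((msg) => log.push(msg))
      .catch((err: Error) => log.push(`warn: ${name}: ${err.message}`));
  }
}

// Awaited: the caller sees the rejection directly.
async function restartChannel(name: string): Promise<string> {
  return await startChannel(name);
}
```

In this model the boot caller has no return value to inspect, so the log is the only place a failure can show up, which is why the log check and the manual restart are the two practical diagnostics.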
Database
“pgvector extension not found.”
- You are not running the docker-compose Postgres image. Use pgvector/pgvector:pg16 (or install the extension in your existing Postgres).
“Migrations fail with permission denied on extension creation.”
- Run the very first migration as a Postgres superuser. Subsequent migrations work as a regular user.
“drizzle-kit generate fails with an ESM resolution error.”
- The bare CLI is broken in this repo because of an ESM/CJS interop bug in drizzle-kit itself. Use the wrapper script: npm run db:generate -w @polyant/engine. It invokes tsx ../../node_modules/drizzle-kit/bin.cjs generate, which works around the issue.
- No snapshot files are committed, so every generate produces a full-schema migration. For schema changes, write incremental migrations by hand in packages/engine/src/database/migrations/ rather than committing the auto-generated full dump.
Tools
“My tool doesn’t appear on the agent.” Walk down the list in order — each step is a separate failure mode:
- File name & path. The file must be at packages/engine/src/agents/tools/*.tool.ts. The .tool.ts suffix is required; loadAllTools() filters by it.
- Registered name. The string passed to registerTool({ name: "..." }) is case-sensitive and must match the name the supervisor expects. Typos here produce silent no-ops.
- Per-instance enablement. Even a registered tool is only available to instances that have a row in instance_tools enabling it. Check via Admin Panel → Instance → Tools or SELECT * FROM instance_tools WHERE instance_id = ....
- Skill remapping. Toggling skills on/off triggers recomputeInstanceTools automatically, which may have disabled a tool the skill no longer requires. Re-enable it manually in the Tools tab.
- requiredEnv missing. Tools that declare requiredEnv are silently excluded if the env vars aren’t set on the instance. Check Admin Panel → Instance → Settings for the missing key.
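The checklist above can be sketched as a single function that reports the first failing step. The types and the function are hypothetical and exist only to make the order of checks explicit; the engine does not expose a diagnostic like this as far as this page states. Skill remapping is folded into the enablement flag, since its effect is a disabled instance_tools entry.

```typescript
// Hypothetical input describing one tool on one instance.
interface ToolCheckInput {
  filePath: string;            // path relative to the repo root
  registeredName: string;      // name passed to registerTool
  expectedName: string;        // name the supervisor expects
  enabledForInstance: boolean; // an enabling instance_tools row exists
  requiredEnv: string[];       // env keys the tool declares
  instanceEnv: Record<string, string | undefined>;
}

// Returns the first failing step, or "ok" when every check passes.
function diagnoseTool(input: ToolCheckInput): string {
  const pathOk = /^packages\/engine\/src\/agents\/tools\/[^/]+\.tool\.ts$/.test(input.filePath);
  if (!pathOk) return "file name or path";
  // Names are compared case-sensitively, as the checklist states.
  if (input.registeredName !== input.expectedName) return "registered name";
  if (!input.enabledForInstance) return "per-instance enablement";
  const missing = input.requiredEnv.filter((key) => !input.instanceEnv[key]);
  if (missing.length > 0) return `requiredEnv missing: ${missing.join(", ")}`;
  return "ok";
}
```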
“OpenAI strict-mode rejected my tool schema.”
- OpenAI strict mode imposes additional Zod constraints (no .optional() without .nullable(), no unions of incompatible primitives, no top-level .union() with .literal() mixed). See the schema-authoring rules in concepts/tools.md.
- Quick test: run the tool against Anthropic (no strict mode) first. If it works there but fails on OpenAI, it’s a strict-mode issue.
Activity feed
“The Activity feed is empty / silent.”
- Usually means the instance has no traffic, not that anything is broken. The bus is bounded in-memory (last 100 events). Send a test message to verify.
- The live indicator in the top-right distinguishes a connection problem (grey) from a quiet system (green). If grey, the SSE connection dropped — the page auto-reconnects, give it a few seconds.
- The feed does not persist across engine restarts. After a restart the buffer starts empty until new events arrive. For historical data use Conversations or Audit Logs.
- Filter state is in localStorage. If you only see “system” events and not instance-scoped ones, check the instance filter (top-right of the page); you may have hidden them in a previous session.
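The bounded, non-persistent behaviour described above can be pictured with a minimal sketch of a fixed-capacity in-memory buffer. The class is hypothetical and is not the engine's bus implementation; only the capacity of 100 and the "empty after restart" behaviour come from this page.

```typescript
// Hypothetical fixed-capacity buffer that keeps only the newest events.
class BoundedEventBuffer<T> {
  private events: T[] = [];
  private readonly capacity: number;

  constructor(capacity: number = 100) {
    this.capacity = capacity;
  }

  push(event: T): void {
    this.events.push(event);
    // Drop the oldest event once the capacity is exceeded.
    if (this.events.length > this.capacity) this.events.shift();
  }

  snapshot(): T[] {
    return this.events.slice();
  }
}
```

Because the state lives only in process memory, a fresh process starts with an empty buffer, and anything older than the newest 100 events is gone for good; that is why historical lookups belong in Conversations or Audit Logs.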
Provider
“Memory extraction throws every time.”
- The instance has memory enabled but no OpenAI key. Add an OpenAI key in Settings → OpenAI even if your chat provider is Anthropic.
“/v1/chat/completions returns 401 Invalid API key.”
- The instance has authEnabled = true but you sent the wrong bearer. The auth API key is the one configured in Settings → Auth API key, not your provider key.
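A small sketch of how such a request might be assembled is shown below. The helper and its return type are hypothetical, the base URL is whatever your engine is served on, and the payload shape is left to the caller because this page does not document it; only the path and the bearer scheme come from the entry above.

```typescript
// Hypothetical description of an outgoing request.
interface ChatRequest {
  url: string;
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Builds the request with the instance's auth API key, not the provider key.
function buildChatRequest(baseUrl: string, authApiKey: string, payload: unknown): ChatRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${authApiKey}`,
    },
    body: JSON.stringify(payload),
  };
}
```

If a 401 persists, logging the Authorization header produced here and comparing it against the value in Settings → Auth API key is usually the fastest way to spot a swapped key.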
Build the manual
“pandoc not found.”
- macOS: brew install pandoc. Linux: apt install pandoc (older versions may need a backports/PPA install).
- For markdown-only verification (no PDF, no LaTeX dependency), use npm run docs:manual:md.
“xelatex not found.”
- macOS: brew install --cask basictex, then in a fresh shell sudo tlmgr update --self && sudo tlmgr install <missing-pkg> if pandoc complains about a missing .sty.
- Linux: install texlive-xetex.
“MDX/JSX components found in shared content.”
- Someone wrote a JSX component (e.g. <Callout>, <Tabs>) in a page under content/. The build script blocks JSX because pandoc cannot render it. Convert to a blockquote (> **Note.** ...) and rebuild. See the README.md of the polyant-ai/docs repo for the full authoring rules.
“PDF builds but tables overflow the page.”
- Use the longest-line-wins layout: pandoc auto-sizes columns to the widest cell. If a single cell breaks layout, move it to its own line.