Memory

The Memory screen (/memory) shows the long-term knowledge the assistants have accumulated. Memory entries are scoped per instance: an entry written by support-bot is invisible to sales-bot.

Retrieval model

Memory retrieval uses hybrid search — pgvector cosine similarity merged with PostgreSQL FTS via Reciprocal Rank Fusion (k=60). See packages/engine/src/memory/hybrid-search.ts. Vector results capture semantic similarity; FTS results capture exact keyword and phrase matches; RRF combines the two ranked lists without requiring per-side score calibration.
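To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of memory ids. It is not the code in hybrid-search.ts: the function name `rrfMerge`, the id-only inputs, and the 1-based rank convention are assumptions made for illustration. Only k=60 comes from the text above.

```typescript
// Reciprocal Rank Fusion over two ranked lists of memory ids.
// Hypothetical helper for illustration; the real implementation may differ.
function rrfMerge(
  vectorIds: string[],
  ftsIds: string[],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of [vectorIds, ftsIds]) {
    list.forEach((id, index) => {
      // Assumed 1-based ranks: the top hit of a list contributes 1 / (k + 1).
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// "m3" appears in both lists, so it outranks ids found by only one side.
const merged = rrfMerge(["m1", "m2", "m3"], ["m3", "m4"]);
console.log(merged.map((m) => m.id));
```

Because only ranks enter the formula, the raw cosine distances and FTS scores never need to be put on a common scale.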

Embeddings always use OpenAI (text-embedding-3-small, 1536 dims) regardless of the instance’s chat provider — Anthropic has no embeddings API. See packages/engine/src/memory/embedder.ts. The Settings tab raises a warning when memoryEnabled is on but no OpenAI key is configured for the instance, surfacing the most common misconfiguration before it produces runtime errors.
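The warning condition described above could look roughly like the following. This is a sketch under assumptions: `memoryEnabled` is the setting named in the text, while the `InstanceSettings` shape, the `openaiApiKey` field, and the function name are hypothetical.

```typescript
// Illustrative settings shape; `openaiApiKey` is a hypothetical field name.
interface InstanceSettings {
  memoryEnabled: boolean;
  openaiApiKey?: string;
}

// True when memory is on but embeddings cannot be produced for lack of a key.
function needsOpenAiKeyWarning(settings: InstanceSettings): boolean {
  const hasKey = (settings.openaiApiKey ?? "").trim() !== "";
  return settings.memoryEnabled && !hasKey;
}

console.log(needsOpenAiKeyWarning({ memoryEnabled: true }));
```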

Categories

Every memory is tagged with one of six categories:

Category	Use for	Example
`general`	Anything not in another category.	“Acme uses Slack as primary comms tool.”
`preference`	A user’s expressed preference.	“Alice prefers tables over prose answers.”
`fact`	A neutral factual statement about the user or domain.	“Bob’s office is in Munich.”
`event`	A specific dated event.	“Demo with Acme on 2026-05-12 at 15:00 CEST.”
`relationship`	A connection between two entities.	“Carol manages Dan’s team.”
`decision`	An explicit decision made.	“Decided to use Postgres for the prototype.”

Categories matter because memory search can be filtered by them, and because the auto-extractor uses them to decide priority.
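As an illustration of category filtering, the six categories can be modelled as a string union. The category names are taken from the table above. The `Memory` shape and the helper names are assumptions for this sketch, not the engine's actual types.

```typescript
// The six categories listed in the table above.
const MEMORY_CATEGORIES = [
  "general",
  "preference",
  "fact",
  "event",
  "relationship",
  "decision",
] as const;

type MemoryCategory = (typeof MEMORY_CATEGORIES)[number];

// Hypothetical record shape, for illustration only.
interface Memory {
  content: string;
  category: MemoryCategory;
  importance: number;
}

// Type guard for untrusted input such as a query-string value.
function isMemoryCategory(value: string): value is MemoryCategory {
  return (MEMORY_CATEGORIES as readonly string[]).includes(value);
}

function filterByCategory(memories: Memory[], category: MemoryCategory): Memory[] {
  return memories.filter((m) => m.category === category);
}

const memories: Memory[] = [
  { content: "Bob’s office is in Munich.", category: "fact", importance: 5 },
  { content: "Carol manages Dan’s team.", category: "relationship", importance: 6 },
];
console.log(filterByCategory(memories, "fact").map((m) => m.content));
```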

Listing and searching

Pagination: uses the same scheme as Conversations.
Search: hybrid. Vector search over OpenAI embeddings and keyword search over PostgreSQL FTS each produce a ranked list, and the two lists are merged with Reciprocal Rank Fusion (k=60).
Filters: by instance and by category.

Each row shows: content, category, importance (1–10), source conversation (link), created at.

The underlying /memories API call is instance-scoped: it requires an instanceId query parameter and returns HTTP 400 "instanceId is required" if called without one. The instance picker in the UI drives this parameter, so there is no “all instances at once” view.
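A client can fail fast on a missing instanceId instead of waiting for the 400. The sketch below only builds the request URL. The /memories path and the instanceId parameter come from the text above, whereas the `category` query parameter name, the base URL, and the helper itself are assumptions.

```typescript
// Hypothetical query shape; only `instanceId` is documented as required.
interface MemoryQuery {
  instanceId: string;
  category?: string; // assumed parameter name for the category filter
}

function buildMemoriesUrl(baseUrl: string, query: MemoryQuery): string {
  // Mirror the server-side check so the mistake surfaces before the request.
  if (!query.instanceId) {
    throw new Error("instanceId is required");
  }
  // Note: a leading slash resolves against the origin of baseUrl.
  const url = new URL("/memories", baseUrl);
  url.searchParams.set("instanceId", query.instanceId);
  if (query.category) {
    url.searchParams.set("category", query.category);
  }
  return url.toString();
}

console.log(buildMemoriesUrl("https://example.test", { instanceId: "support-bot" }));
```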

Manual creation

The New memory button lets a Superadmin or User insert a memory manually. Fill in:

Content (free text)
Category (dropdown)
Importance (1–10)
Optional source conversation id

Manual memories are immediately available to the assistant on the next turn.
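The form fields above map naturally onto a small input type. The following validation sketch is illustrative only: the field names, the error messages, and the assumption that importance is a whole number are not confirmed by this page.

```typescript
// Hypothetical payload for the New memory form.
interface NewMemoryInput {
  content: string;
  category: string;
  importance: number;
  sourceConversationId?: string; // optional, as in the form
}

const ALLOWED_CATEGORIES: string[] = [
  "general",
  "preference",
  "fact",
  "event",
  "relationship",
  "decision",
];

// Returns a list of problems; an empty list means the input looks valid.
function validateNewMemory(input: NewMemoryInput): string[] {
  const errors: string[] = [];
  if (input.content.trim() === "") {
    errors.push("content must not be empty");
  }
  if (!ALLOWED_CATEGORIES.includes(input.category)) {
    errors.push("category must be one of the six known categories");
  }
  // Assumption: importance is an integer in the documented 1 to 10 range.
  if (!Number.isInteger(input.importance) || input.importance < 1 || input.importance > 10) {
    errors.push("importance must be an integer from 1 to 10");
  }
  return errors;
}

console.log(validateNewMemory({ content: "Bob’s office is in Munich.", category: "fact", importance: 5 }));
```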

Automatic extraction

When memoryEnabled is on for an instance, the engine runs an extractor in the background after every assistant response:

The supervisor finishes its reply.
A fire-and-forget background task is started.
The extractor takes the last 15 messages of the conversation, calls the LLM (tier fast) with an extraction prompt that returns structured JSON, and gets back a list of (content, category, importance) triples.
Each triple is embedded with OpenAI.
For each triple, a deduplication check runs: if its cosine similarity to an existing memory exceeds 0.90, that memory is updated (UPDATE); otherwise a new row is inserted (INSERT).
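The deduplication rule in the last step can be sketched in memory as follows. In the engine this comparison presumably happens in a pgvector query rather than in application code. The types, the function names, and the choice to compare against the single closest existing memory are assumptions. Only the 0.90 threshold and the UPDATE/INSERT outcome come from the steps above.

```typescript
interface StoredMemory {
  id: string;
  embedding: number[];
}

type WriteDecision = { action: "update"; id: string } | { action: "insert" };

const DEDUP_THRESHOLD = 0.9;

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Update the closest existing memory if it is similar enough, else insert.
function decideWrite(candidate: number[], existing: StoredMemory[]): WriteDecision {
  let best: { id: string; similarity: number } | null = null;
  for (const memory of existing) {
    const similarity = cosineSimilarity(candidate, memory.embedding);
    if (best === null || similarity > best.similarity) {
      best = { id: memory.id, similarity };
    }
  }
  return best !== null && best.similarity > DEDUP_THRESHOLD
    ? { action: "update", id: best.id }
    : { action: "insert" };
}

const stored: StoredMemory[] = [{ id: "a", embedding: [1, 0] }];
console.log(decideWrite([0.99, 0.01], stored).action);
```

Real embeddings have 1536 dimensions; the two-dimensional vectors here only keep the example readable.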

The user never waits for extraction. If extraction crashes, the conversation is not affected.
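One way to get this isolation is to start the extractor without awaiting it and to catch its errors locally. The sketch below shows that pattern in general terms. `runInBackground` and `handleTurn` are hypothetical names, and the echo reply stands in for the supervisor's real response.

```typescript
// Start a task without awaiting it. Errors go to onError so they cannot
// surface as unhandled rejections or reach the caller.
function runInBackground(
  task: () => Promise<void>,
  onError: (err: unknown) => void,
): void {
  // Promise.resolve().then(...) also captures synchronous throws from task.
  void Promise.resolve().then(task).catch(onError);
}

// Hypothetical turn handler: the reply is returned before extraction runs.
async function handleTurn(userMessage: string): Promise<string> {
  const reply = `echo: ${userMessage}`; // stand-in for the supervisor's reply
  runInBackground(
    async () => {
      // Stand-in for the extractor; a failure here is logged, not rethrown.
      throw new Error("extraction failed");
    },
    (err) => console.warn("memory extraction failed:", err),
  );
  return reply;
}

handleTurn("hello").then((reply) => console.log(reply));
```

The trade-off is that failures are visible only in logs, so extraction errors need monitoring of their own.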

Deletion

Delete a memory from its row (trash icon) or bulk-delete from the list. Memories are not soft-deleted; the row is gone immediately.