Memory System
The Memory System enables AI mentors to remember information about users across conversations. It stores user preferences, learning progress, knowledge gaps, and personal context, making interactions more personalized and contextual.
Overview
The memory system supports two scopes of memory:
| Memory Type | Scope | Description |
|---|---|---|
| Global User Memories | All mentors | Facts about the user that apply everywhere (name, profession, preferences) |
| Mentor-Specific Memories | Single mentor | Context specific to interactions with a particular mentor |
Default Memory Categories (Mentor-Specific):
| Category | Slug | Description |
|---|---|---|
| Knowledge Gaps | knowledge_gaps | Topics or concepts the user struggles with |
| Learning Goals | learning_goals | Goals and objectives the user wants to achieve |
| Preferences | preferences | Learning style, pace, or content preferences |
| Progress Milestones | progress_milestones | Achievements and completed learning milestones |
| Personal Context | personal_context | Relevant personal information shared by the user |
Architecture
System Overview
The memory system uses PGVector for semantic search and consists of four main components that work together to extract, store, and retrieve memories.
The data flows through two paths:
- Extraction path (write): a chat session feeds into a background Celery task containing the Extraction Service, which processes conversations and writes to the Memory Store (PGVector), storing global memories, mentor memories, and their embeddings.
- Retrieval path (read): when generating a new chat response, the Context Service queries the Memory Store via semantic search (cosine distance), retrieves relevant memories, and injects them into the AI's context to produce a personalized response.
Components
| Component | Location | Purpose |
|---|---|---|
| MemoryExtractionService | services/memory_extraction.py | Extracts memories from conversations using LLM |
| MemoryStore | services/memory_store.py | Handles storage, deduplication, and retrieval |
| MemoryContextService | services/memory_context.py | Retrieves and formats memories for chat injection |
| Celery Tasks | tasks.py | Background processing for memory extraction |
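The components above can be wired together roughly as follows. This is a sketch only: the class `MemoryPipeline` and the method names `extract`, `save`, `search`, and `format` are illustrative assumptions, not the actual code in the listed modules.

```python
# Illustrative wiring of the write and read paths; class and method names
# beyond those in the component table are assumptions, not the actual code.
class MemoryPipeline:
    def __init__(self, extraction, store, context):
        self.extraction = extraction  # MemoryExtractionService
        self.store = store            # MemoryStore (PGVector-backed)
        self.context = context        # MemoryContextService

    def on_message_processed(self, user_msg, ai_response):
        """Write path: runs inside a background Celery task."""
        for memory in self.extraction.extract(user_msg, ai_response):
            self.store.save(memory)  # dedup, embed, persist

    def build_context(self, user_msg):
        """Read path: runs inline while generating a response."""
        hits = self.store.search(user_msg, limit=5)
        return self.context.format(hits)
```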
Memory Extraction Flow
When a user sends a message, the system automatically extracts relevant memories in the background.
┌──────────────────┐
│ User sends       │
│ message          │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ AI generates     │
│ response         │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Celery task      │
│ triggered        │
│ (background)     │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐      ┌─────────────────┐
│ Check settings   │─────▶│ SKIP            │
│ - Tenant enabled?│  No  │ (not enabled)   │
│ - Mentor enabled?│      └─────────────────┘
│ - User enabled?  │
└────────┬─────────┘
         │ Yes
         ▼
┌──────────────────┐
│ Get existing     │
│ memories summary │
│ (for context)    │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ SINGLE LLM CALL  │
│                  │
│ Input:           │
│ - User message   │
│ - AI response    │
│ - Categories     │
│ - Existing mem   │
│                  │
│ Output:          │
│ - has_memories   │
│ - global_memories│
│ - mentor_memories│
└────────┬─────────┘
         │
         ▼
┌──────────────────┐      ┌─────────────────┐
│ has_memories?    │─────▶│ SKIP            │
│                  │  No  │ (nothing to     │
│                  │      │  extract)       │
└────────┬─────────┘      └─────────────────┘
         │ Yes
         ▼
┌──────────────────┐
│ Deduplication    │
│                  │
│ 1. Hash check    │───▶ Skip if exact match
│ 2. Semantic check│───▶ Skip if similar (cosine < 0.15)
└────────┬─────────┘
         │ Unique
         ▼
┌──────────────────┐
│ Generate         │
│ embedding        │
│ (1536 dimensions)│
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Store in         │
│ PostgreSQL       │
│ with PGVector    │
└──────────────────┘
Extraction Details
The extraction service uses a single LLM call both to decide whether extraction is needed and to extract the memories; combining the two steps reduces cost and latency.
LLM Input:
Categories:
- knowledge_gaps: Topics or concepts the user struggles with
- learning_goals: Goals and objectives the user wants to achieve
...
Existing memories (avoid duplicates):
Global:
- The user is a software engineer
Mentor-specific:
- [learning_goals] The user wants to learn Python
Latest Exchange:
User: I'm having trouble understanding recursion
Assistant: Let me explain recursion step by step...
LLM Output:
{
"has_memories": true,
"global_memories": [],
"mentor_memories": {
"knowledge_gaps": ["The user struggles with understanding recursion"]
}
}
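The extraction output above can be handled along these lines. The field names follow the documented schema, but the parsing function itself is an illustrative sketch, not the actual MemoryExtractionService code.

```python
import json

# Illustrative handling of the extraction output. Field names follow the
# documented schema; the validation logic is a sketch, not the real service.
def parse_extraction_output(raw):
    data = json.loads(raw)
    if not data.get("has_memories"):
        return [], {}
    global_memories = [m for m in data.get("global_memories", []) if m]
    mentor_memories = {
        slug: [m for m in items if m]
        for slug, items in data.get("mentor_memories", {}).items()
        if items
    }
    return global_memories, mentor_memories
```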
Deduplication Strategy
The system uses a 3-layer deduplication approach to prevent duplicate memories:
| Layer | Method | Purpose |
|---|---|---|
| 1 | SHA-256 Hash | Fast check for exact duplicates |
| 2 | Semantic Similarity | PGVector cosine distance (threshold: 0.15) for near-duplicates |
| 3 | LLM Context | Existing memories shown to LLM to inform extraction |
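The first two layers can be sketched as follows. This is illustrative only: the real MemoryStore runs the distance check inside PGVector rather than in Python, and the hash normalization (strip/lowercase) shown here is an assumption.

```python
import hashlib
import math

# Sketch of deduplication layers 1 (hash) and 2 (semantic similarity).
# Normalizing before hashing is an assumption; the real store computes
# cosine distance in PGVector, not in Python.
def content_hash(text):
    return hashlib.sha256(text.strip().lower().encode("utf-8")).hexdigest()

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def is_duplicate(new_text, new_emb, existing, threshold=0.15):
    """existing: list of (text, embedding) pairs already stored."""
    new_hash = content_hash(new_text)
    for text, emb in existing:
        if content_hash(text) == new_hash:             # layer 1: exact match
            return True
        if cosine_distance(new_emb, emb) < threshold:  # layer 2: near-duplicate
            return True
    return False
```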
Memory Injection Flow
When a user starts a new conversation, relevant memories are retrieved and injected into the AI's context.
┌──────────────────┐
│ User sends       │
│ new message      │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐      ┌─────────────────┐
│ Check mentor     │─────▶│ NO INJECTION    │
│ memory enabled?  │  No  │                 │
└────────┬─────────┘      └─────────────────┘
         │ Yes
         ▼
┌──────────────────┐      ┌─────────────────┐
│ Check user       │─────▶│ NO INJECTION    │
│ use_memory       │  No  │                 │
│ enabled?         │      └─────────────────┘
└────────┬─────────┘
         │ Yes
         ▼
┌──────────────────┐
│ Generate query   │
│ embedding from   │
│ user message     │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Semantic search  │
│                  │
│ - Top 5 global   │
│ - Top 5 mentor   │
│                  │
│ (ordered by      │
│  cosine distance)│
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Format as        │
│ markdown context │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Inject into      │
│ system prompt    │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ AI generates     │
│ personalized     │
│ response         │
└──────────────────┘
Injected Context Format
The AI receives memories formatted as:
## User Information
- The user is a software engineer with 5 years of experience
- The user prefers visual explanations with diagrams
## Relevant Context from Previous Conversations
- [Knowledge Gaps] The user struggled with understanding recursion
- [Learning Goals] The user wants to master system design patterns
- [Preferences] The user prefers Python code examples over pseudocode
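The formatting step can be sketched as a small pure function. The function name and input shapes are illustrative assumptions; only the output format mirrors the documented context block.

```python
# Sketch of how retrieved memories might be rendered into the markdown block
# shown above; the function name and inputs are illustrative, not the actual
# MemoryContextService API.
def format_memory_context(global_memories, mentor_memories):
    """mentor_memories: list of (category_name, content) pairs."""
    lines = []
    if global_memories:
        lines.append("## User Information")
        lines += [f"- {m}" for m in global_memories]
    if mentor_memories:
        lines.append("## Relevant Context from Previous Conversations")
        lines += [f"- [{category}] {content}" for category, content in mentor_memories]
    return "\n".join(lines)
```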
Configuration Hierarchy
Memory features require enablement at three levels:
┌─────────────────────────────────────────────────────────────┐
│                        TENANT LEVEL                         │
│                                                             │
│  Tenant Profile → Memory Tab                                │
│  ┌───────────────────────────────────────────────────────┐  │
│  │ Memory Configuration: [ENABLED/DISABLED]              │  │
│  └───────────────────────────────────────────────────────┘  │
│                            │                                │
│                            │ If disabled, stops here        │
│                            ▼                                │
│  ┌───────────────────────────────────────────────────────┐  │
│  │                     MENTOR LEVEL                      │  │
│  │                                                       │  │
│  │  Mentor Settings → Memory Tab                         │  │
│  │  ┌─────────────────────────────────────────────┐      │  │
│  │  │ Enable Memory: [ON/OFF]                     │      │  │
│  │  │ Memory Categories: [Configure...]           │      │  │
│  │  └─────────────────────────────────────────────┘      │  │
│  │                         │                             │  │
│  │                         │ If disabled, stops here     │  │
│  │                         ▼                             │  │
│  │  ┌─────────────────────────────────────────────┐      │  │
│  │  │                  USER LEVEL                 │      │  │
│  │  │                                             │      │  │
│  │  │  User Profile → Memory Settings             │      │  │
│  │  │  ┌─────────────────────────────────────┐    │      │  │
│  │  │  │ Auto-capture memories: [ON/OFF]     │    │      │  │
│  │  │  │ Use memories in responses: [ON/OFF] │    │      │  │
│  │  │  └─────────────────────────────────────┘    │      │  │
│  │  └─────────────────────────────────────────────┘      │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
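The hierarchy reduces to a simple conjunction of three toggles. The sketch below illustrates the gating logic for extraction; the dictionary field names are assumptions, not the actual model attributes.

```python
# Sketch of the three-level enablement check for memory extraction.
# Field names are illustrative, not the actual settings models.
def memory_extraction_enabled(tenant, mentor, user):
    return (
        tenant.get("memory_enabled", False)          # tenant master switch
        and mentor.get("memory_enabled", False)      # per-mentor toggle
        and user.get("auto_capture_enabled", False)  # user opt-in
    )
```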
SPA Configuration Guide
Enabling Memory for a Tenant
To enable the memory feature for your entire platform:
- Navigate to Tenant Profile page
- Select the Memory tab
- Toggle the memory configuration switch to enabled
Note: This is the master switch. Memory features will not work for any mentor or user until this is enabled.
Enabling Memory for a Mentor
Each mentor can have memory features enabled or disabled individually:
- Navigate to Mentor Settings for the specific mentor
- Select the Memory tab
- Toggle memory to enabled
Once enabled, the mentor will:
- Automatically extract memories from conversations
- Use stored memories to personalize responses
Managing Mentor Memory Categories:
From the Mentor Settings → Memory tab, administrators can:
- View default memory categories
- Create custom categories
- Edit category names and descriptions
- Deactivate categories (memories in that category will no longer be extracted)
Viewing User Memories for a Mentor:
The Mentor Settings → Memory tab also displays all memories stored for users interacting with that mentor, organized by category.
User Memory Settings
Users control their own memory preferences from their profile:
- Navigate to User Profile page
- Locate the memory settings section
- Configure the following options:
| Setting | Description |
|---|---|
| Auto-capture memories | When enabled, the system automatically extracts and saves memories from conversations |
| Use memories in responses | When enabled, stored memories are used to personalize AI responses |
Note: Users can disable memory features entirely for privacy, even if the tenant and mentor have memory enabled.
Managing Global User Memories
Global memories are facts about the user that apply across all mentors (e.g., "The user is a software engineer").
Location: User Profile page → Global Memories section
User Actions:
| Action | Description |
|---|---|
| View memories | See all automatically captured and manually added global memories |
| Add memory | Manually add a new global memory |
| Delete memory | Remove a memory that is no longer relevant or accurate |
Managing Mentor-Specific Memories
Mentor memories are specific to a user's interactions with a particular mentor.
Location: Mentor Settings page → Memory tab → User Memories section
User/Admin Actions:
| Action | Description |
|---|---|
| View memories | See all memories organized by category |
| Filter by category | View memories for a specific category (e.g., Knowledge Gaps) |
| Add memory | Manually add a memory to a specific category |
| Edit memory | Update the content of an existing memory |
| Delete memory | Remove a memory |
API Reference
Base URL: /api/ai-mentor/
Global Memories API
Endpoints: /orgs/{org}/users/{user_id}/global-memories/
List Global Memories
curl -X GET \
"https://api.ibl.ai/api/ai-mentor/orgs/acme/users/user123/global-memories/" \
-H "Authorization: Token "
Response:
{
"count": 2,
"results": [
{
"id": 1,
"content": "The user is a software engineer with 5 years of experience",
"is_auto_generated": true,
"created_at": "2024-01-15T10:30:00Z"
},
{
"id": 2,
"content": "The user prefers detailed code examples",
"is_auto_generated": true,
"created_at": "2024-01-14T09:15:00Z"
}
]
}
Create Global Memory
curl -X POST \
"https://api.ibl.ai/api/ai-mentor/orgs/acme/users/user123/global-memories/" \
-H "Authorization: Token " \
-H "Content-Type: application/json" \
-d '{
"content": "The user is preparing for a job interview"
}'
Response:
{
"id": 3,
"content": "The user is preparing for a job interview",
"is_auto_generated": false,
"created_at": "2024-01-16T14:00:00Z"
}
Delete Global Memory
curl -X DELETE \
"https://api.ibl.ai/api/ai-mentor/orgs/acme/users/user123/global-memories/3/" \
-H "Authorization: Token "
Response: 204 No Content
Mentor Memories API
Endpoints: /orgs/{org}/users/{user_id}/mentor-memories/ (all mentors) and /orgs/{org}/users/{user_id}/mentors/{mentor_id}/mentor-memories/ (single mentor)
List All Mentor Memories (All Mentors)
curl -X GET \
"https://api.ibl.ai/api/ai-mentor/orgs/acme/users/user123/mentor-memories/" \
-H "Authorization: Token "
Response:
{
"count": 3,
"results": [
{
"id": 1,
"mentor": "python-tutor",
"category": {
"id": 1,
"name": "Knowledge Gaps",
"slug": "knowledge_gaps"
},
"content": "The user struggled with understanding recursion",
"is_auto_generated": true,
"created_at": "2024-01-15T11:00:00Z"
},
{
"id": 2,
"mentor": "python-tutor",
"category": {
"id": 2,
"name": "Learning Goals",
"slug": "learning_goals"
},
"content": "The user wants to master data structures",
"is_auto_generated": true,
"created_at": "2024-01-14T16:30:00Z"
}
]
}
List Memories for Specific Mentor
curl -X GET \
"https://api.ibl.ai/api/ai-mentor/orgs/acme/users/user123/mentors/python-tutor/mentor-memories/" \
-H "Authorization: Token "
Filter by Category
curl -X GET \
"https://api.ibl.ai/api/ai-mentor/orgs/acme/users/user123/mentors/python-tutor/mentor-memories/?category=knowledge_gaps" \
-H "Authorization: Token "
Create Mentor Memory
curl -X POST \
"https://api.ibl.ai/api/ai-mentor/orgs/acme/users/user123/mentors/python-tutor/mentor-memories/" \
-H "Authorization: Token " \
-H "Content-Type: application/json" \
-d '{
"content": "The user completed the Python basics course",
"category_slug": "progress_milestones"
}'
Response:
{
"id": 4,
"mentor": "python-tutor",
"category": {
"id": 4,
"name": "Progress Milestones",
"slug": "progress_milestones"
},
"content": "The user completed the Python basics course",
"is_auto_generated": false,
"created_at": "2024-01-16T15:00:00Z"
}
Update Mentor Memory
curl -X PATCH \
"https://api.ibl.ai/api/ai-mentor/orgs/acme/users/user123/mentors/python-tutor/mentor-memories/4/" \
-H "Authorization: Token " \
-H "Content-Type: application/json" \
-d '{
"content": "The user completed Python basics and intermediate courses"
}'
Delete Mentor Memory
curl -X DELETE \
"https://api.ibl.ai/api/ai-mentor/orgs/acme/users/user123/mentors/python-tutor/mentor-memories/4/" \
-H "Authorization: Token "
Response: 204 No Content
User Memory Settings API
Endpoints: /orgs/{org}/users/{user_id}/memsearch-settings/
Get User Settings
curl -X GET \
"https://api.ibl.ai/api/ai-mentor/orgs/acme/users/user123/memsearch-settings/" \
-H "Authorization: Token "
Response:
{
"auto_capture_enabled": true,
"use_memory_in_responses": true,
"updated_at": "2024-01-15T10:00:00Z"
}
Update User Settings
curl -X PUT \
"https://api.ibl.ai/api/ai-mentor/orgs/acme/users/user123/memsearch-settings/" \
-H "Authorization: Token " \
-H "Content-Type: application/json" \
-d '{
"auto_capture_enabled": true,
"use_memory_in_responses": false
}'
Response:
{
"auto_capture_enabled": true,
"use_memory_in_responses": false,
"updated_at": "2024-01-16T16:00:00Z"
}
Memory Categories API
Endpoints: /orgs/{org}/mentors/{mentor_id}/memory-categories/
List Memory Categories
curl -X GET \
"https://api.ibl.ai/api/ai-mentor/orgs/acme/mentors/python-tutor/memory-categories/" \
-H "Authorization: Token "
Response:
{
"count": 5,
"results": [
{
"id": 1,
"name": "Knowledge Gaps",
"slug": "knowledge_gaps",
"description": "Topics or concepts the user struggles with",
"is_active": true
},
{
"id": 2,
"name": "Learning Goals",
"slug": "learning_goals",
"description": "Goals and objectives the user wants to achieve",
"is_active": true
}
]
}
Create Custom Category
curl -X POST \
"https://api.ibl.ai/api/ai-mentor/orgs/acme/mentors/python-tutor/memory-categories/" \
-H "Authorization: Token " \
-H "Content-Type: application/json" \
-d '{
"name": "Project Ideas",
"slug": "project_ideas",
"description": "Project ideas the user has expressed interest in"
}'
Update Category
curl -X PATCH \
"https://api.ibl.ai/api/ai-mentor/orgs/acme/mentors/python-tutor/memory-categories/6/" \
-H "Authorization: Token " \
-H "Content-Type: application/json" \
-d '{
"description": "Project ideas and coding challenges the user wants to try"
}'
Deactivate Category
curl -X DELETE \
"https://api.ibl.ai/api/ai-mentor/orgs/acme/mentors/python-tutor/memory-categories/6/" \
-H "Authorization: Token "
Note: Deleting a category deactivates it (is_active: false). Existing memories in that category are preserved, but no new memories will be extracted for it.
API Endpoints Summary
| Method | Endpoint | Description |
|---|---|---|
| GET | /orgs/{org}/users/{user_id}/global-memories/ | List global memories |
| POST | /orgs/{org}/users/{user_id}/global-memories/ | Create global memory |
| DELETE | /orgs/{org}/users/{user_id}/global-memories/{id}/ | Delete global memory |
| GET | /orgs/{org}/users/{user_id}/mentor-memories/ | List all mentor memories |
| GET | /orgs/{org}/users/{user_id}/mentors/{mentor}/mentor-memories/ | List mentor-specific memories |
| POST | /orgs/{org}/users/{user_id}/mentors/{mentor}/mentor-memories/ | Create mentor memory |
| PATCH | /orgs/{org}/users/{user_id}/mentors/{mentor}/mentor-memories/{id}/ | Update mentor memory |
| DELETE | /orgs/{org}/users/{user_id}/mentors/{mentor}/mentor-memories/{id}/ | Delete mentor memory |
| GET | /orgs/{org}/users/{user_id}/memsearch-settings/ | Get user settings |
| PUT | /orgs/{org}/users/{user_id}/memsearch-settings/ | Update user settings |
| GET | /orgs/{org}/mentors/{mentor}/memory-categories/ | List categories |
| POST | /orgs/{org}/mentors/{mentor}/memory-categories/ | Create category |
| PATCH | /orgs/{org}/mentors/{mentor}/memory-categories/{id}/ | Update category |
| DELETE | /orgs/{org}/mentors/{mentor}/memory-categories/{id}/ | Deactivate category |
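The mentor-memory URLs in the table can be assembled with a small helper like the one below. This is a sketch, not an official client: the function name is illustrative, and requests should carry the same "Authorization: Token ..." header shown in the curl examples.

```python
from urllib.parse import urlencode

# Base URL taken from the curl examples above; adjust for your deployment.
BASE = "https://api.ibl.ai/api/ai-mentor"

# Illustrative URL builder for the mentor-memory endpoints in the summary
# table. Pair it with any HTTP client and the Token authorization header.
def mentor_memories_url(org, user_id, mentor=None, memory_id=None, **filters):
    parts = [BASE, "orgs", org, "users", user_id]
    if mentor:
        parts += ["mentors", mentor]
    parts.append("mentor-memories")
    if memory_id is not None:
        parts.append(str(memory_id))
    url = "/".join(parts) + "/"
    if filters:
        url += "?" + urlencode(filters)
    return url
```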
Technical Details
Embedding Specifications
| Property | Value |
|---|---|
| Dimensions | 1536 |
| Provider | OpenAI / Azure OpenAI |
| Storage | PostgreSQL with PGVector extension |
| Search Method | Cosine Distance |
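The cosine-distance search maps onto PGVector's "<=>" operator. The query shape below is illustrative only: the table and column names are assumptions, not the actual schema.

```python
# Illustrative shape of the semantic-search query. In pgvector, "<=>" computes
# cosine distance; table and column names here are assumptions.
SEARCH_SQL = """
SELECT content, embedding <=> %(query_embedding)s AS distance
FROM mentor_memories
WHERE user_id = %(user_id)s
ORDER BY distance ASC
LIMIT 5;
"""
```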
Deduplication Thresholds
| Check | Threshold | Description |
|---|---|---|
| Hash Match | Exact | SHA-256 content hash comparison |
| Semantic Similarity | 0.15 | Cosine distance threshold (lower = more similar) |
Background Task Configuration
| Task | Queue | Timeout |
|---|---|---|
| process_message_for_memory | ai_agent | 60s soft / 90s hard |
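A Celery registration matching the table above might look like this. The decorator arguments mirror the documented queue and timeouts, but the task body and its wiring to the extraction service are a sketch, not the actual tasks.py.

```python
# Illustrative Celery task registration; queue and timeouts mirror the table
# above, the body is a sketch.
from celery import shared_task
from celery.exceptions import SoftTimeLimitExceeded

@shared_task(queue="ai_agent", soft_time_limit=60, time_limit=90)
def process_message_for_memory(session_id):
    try:
        ...  # run memory extraction for the latest exchange in this session
    except SoftTimeLimitExceeded:
        pass  # exit cleanly before the hard 90s kill
```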
LLM Usage
Memory extraction uses a small, inexpensive model (e.g., gpt-4o-mini) to keep cost and latency low while maintaining extraction quality.
Troubleshooting
| Issue | Possible Cause | Solution |
|---|---|---|
| Memories not being captured | Tenant memory not enabled | Enable in Tenant Profile → Memory tab |
| Memories not being captured | Mentor memory not enabled | Enable in Mentor Settings → Memory tab |
| Memories not being captured | User auto-capture disabled | User enables in their Profile settings |
| Memories not used in responses | User disabled "use memories" | User enables in their Profile settings |
| Duplicate memories appearing | Rare hash collision | Delete duplicate via API or SPA |
| Extraction taking too long | LLM provider latency | Check provider status, consider fallback |
| No categories showing | Categories not seeded | Categories auto-seed on first extraction |