Memory System Feature: Add Embedded Postgres implementation for PR #564 #589
Conversation
…582/airi into fix/hemisphere-mode-rendering
✅ Deploy Preview for airi-vtuber ready!
❌ Deploy Preview for airi-docs failed. Why did it fail? →
Summary of Changes
Hello @gg582, I'm Gemini Code Assist[^1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a foundational memory system, enabling the application to retain and utilize past interactions and generated insights. It provides a robust backend for persistent storage and intelligent retrieval of conversational data, enhancing the AI's ability to maintain context and provide more informed responses. The user interface has been updated to allow full control over the memory service configuration, ensuring flexibility and transparency.
Highlights
- New Memory Service Backend: Introduces a dedicated memory service with PostgreSQL and pgvector for storing and managing chat messages, AI responses, and extracted memories. This service runs as a separate backend component (a schema sketch follows this list).
- Comprehensive Memory Management: The memory service supports ingestion of chat messages, storage of AI completions, LLM-powered extraction of memory fragments, entities, goals, and ideas, and vector-based semantic search for context building.
- Dynamic Embedding Regeneration: Implemented a system to regenerate embeddings for all stored content when embedding model settings are changed, with progress tracking and dynamic batch sizing to optimize performance.
- Integrated Settings UI: A new settings page allows users to configure the memory service connection, including URL, API keys, and selection of LLM and embedding providers (OpenAI, Google Gemini, local Ollama models) with various dimension options.
- Enhanced Chat History: The chat interface now loads initial conversation history from the memory service and includes a 'Load More History' feature, improving the user experience for long conversations.
- Automated Context Building: The system automatically builds a rich context for LLM queries by retrieving relevant recent messages, AI responses, semantic memories, structured knowledge (entities), and goal-oriented information from the memory service.
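
The PR's actual schema is not reproduced in this thread, but the review excerpts further down quote `content_vector_1024` and `content_vector_768` columns. As orientation only, here is a minimal sketch of what such a table could look like with Drizzle ORM and pgvector; apart from the `content_vector_*` names, every identifier below is an assumption rather than the PR's real schema.

```ts
// Orientation-only sketch: a Drizzle ORM table with one pgvector column per
// supported embedding dimension. Only the content_vector_* names are taken
// from the review excerpts below; everything else here is an assumption.
import { bigint, pgTable, text, vector } from 'drizzle-orm/pg-core'

export const chatMessages = pgTable('chat_messages', {
  id: bigint('id', { mode: 'number' }).primaryKey().generatedAlwaysAsIdentity(),
  role: text('role').notNull(), // 'user' | 'assistant'
  content: text('content').notNull(),
  createdAt: bigint('created_at', { mode: 'number' }).notNull(), // epoch millis
  // Nullable so rows can be ingested first and embedded asynchronously,
  // and re-embedded when the embedding model settings change.
  contentVector1536: vector('content_vector_1536', { dimensions: 1536 }),
  contentVector1024: vector('content_vector_1024', { dimensions: 1024 }),
  contentVector768: vector('content_vector_768', { dimensions: 768 }),
})
```

One nullable vector column per supported dimension would line up with both the dimension options in the settings UI and the embedding-regeneration highlight above.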
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. The supported commands are summarized below.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes

[^1]: Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Code Review
This pull request introduces a significant and well-architected memory service feature, including a backend service, database schema, and a comprehensive frontend configuration UI. The implementation is impressive, with good use of Docker, Drizzle ORM, and a clear separation of concerns. My review focuses on improving code quality by removing temporary development artifacts, enhancing robustness in data handling, and addressing potential bugs in the settings UI logic. Overall, this is a fantastic addition to the project.
```ts
embeddingProvider.value = settings.embeddingProvider || ''
embeddingModel.value = settings.embeddingModel || ''
embeddingApiKey.value = settings.embeddingApiKey || ''
embeddingDim.value = settings.embeddingDimensions || 0
```
Defaulting embeddingDim.value to 0 is problematic as 0 is not a valid dimension according to the available options. This could lead to an invalid state if the server doesn't provide a value. It's safer to default to a valid dimension, such as 1536.
```ts
embeddingDim.value = settings.embeddingDimensions || 1536
```
```ts
embeddingProvider.value = ''
embeddingModel.value = ''
embeddingApiKey.value = ''
embeddingDim.value = 0
```

The same concern applies to this reset path: embeddingDim falls back to 0, which is not one of the valid dimension options.
```ts
const oldestMessage = messages.value
  .filter(msg => msg.role !== 'system')
  .sort((a, b) => ((a as any).created_at || 0) - ((b as any).created_at || 0))[0]
```
The logic to find the oldest message is unsafe. The messages array can contain ErrorMessage objects, which lack a created_at property. Accessing (a as any).created_at could lead to runtime errors or incorrect sorting. You should filter for messages that are guaranteed to have created_at and use a type guard to ensure type safety.
```diff
- const oldestMessage = messages.value
-   .filter(msg => msg.role !== 'system')
-   .sort((a, b) => ((a as any).created_at || 0) - ((b as any).created_at || 0))[0]
+ const oldestMessage = messages.value
+   .filter((msg): msg is UserMessage | ChatAssistantMessage => (msg.role === 'user' || msg.role === 'assistant') && 'created_at' in msg)
+   .sort((a, b) => (a.created_at || 0) - (b.created_at || 0))[0]
```
```diff
  const { messages, sending, streamingMessage, loadingInitialHistory, isLoadingHistory, hasMoreHistory } = storeToRefs(useChatStore())
+ const { onBeforeMessageComposed, onTokenLiteral, loadInitialHistory, loadMoreHistory } = useChatStore()
- const { onBeforeMessageComposed, onTokenLiteral } = useChatStore()
+ // Patch for eslint lintern
+ console.warn(!!loadingInitialHistory)
```
The variable loadingInitialHistory is imported on line 11 but remains unused. The console.warn on line 15 is a temporary patch to suppress the linter warning. For cleaner code, you should remove the unused import and the patch.
```ts
const { messages, sending, streamingMessage, isLoadingHistory, hasMoreHistory } = storeToRefs(useChatStore())
const { onBeforeMessageComposed, onTokenLiteral, loadInitialHistory, loadMoreHistory } = useChatStore()
```
```diff
  const { messages, sending, streamingMessage, isLoadingHistory, hasMoreHistory } = storeToRefs(useChatStore())
+ const { onBeforeMessageComposed, onTokenLiteral, loadInitialHistory, loadMoreHistory } = useChatStore()
- const { onBeforeMessageComposed, onTokenLiteral } = useChatStore()
+ // Patch for eslint lintern
+ console.warn(!!sending, !!streamingMessage)
```
The variables sending and streamingMessage are imported on line 11 but are unused. The console.warn on line 15 is a temporary patch for the linter. Please remove the unused imports and the patch for cleaner code.
```ts
const { messages, isLoadingHistory, hasMoreHistory } = storeToRefs(useChatStore())
const { onBeforeMessageComposed, onTokenLiteral, loadInitialHistory, loadMoreHistory } = useChatStore()
```
The remaining review comments quote leftover development TODOs in the new code, which should be cleaned up before merging:

```ts
 * - Managing API key authentication
 */

// TODO [lucas-oma]: remove console.debug before merging (eslint)
```
```ts
  content: string
}

// TODO [lucas-oma]: remove console.debug and console.log before merging (eslint)
```
```ts
 * - Managing processing intervals
 */

// TODO [lucas-oma]: remove console.log comments
```
```ts
  content_vector_1024: number[] | null
  content_vector_768: number[] | null
}> {
  // TODO [lucas-oma]: debug, this might be triggered twiced per message
```
```ts
 * - Updating memory tables with structured data
 */

// TODO [lucas-oma]: remove console.log comments
```
…582/airi into experimental/rag-custom-build
I merged PR #486 temporarily because I could not enter the chatroom; you can revert it if you want to merge those two separately.
Reverted #486
I suspect the memory that currently works is not the Embedded Postgres based long-term memory; I have only prepared some bare-bones code so far. It is odd that memory worked at all, so I assume it is backed by IndexedDB. Also, if a user turns on long-term memory, the short-term memories in the current IndexedDB store should be pushed into the database as well.
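
A rough sketch of what pushing the existing IndexedDB history into the memory service could look like; the database name, object-store name, message shape, and ingestion route below are all illustrative assumptions, not the project's actual identifiers.

```ts
// Hypothetical one-shot migration from the browser's IndexedDB chat store
// to the memory service. Names and routes are assumptions for illustration.
interface StoredMessage {
  role: 'user' | 'assistant'
  content: string
  created_at: number
}

export async function migrateIndexedDbToMemoryService(serviceUrl: string, apiKey: string): Promise<void> {
  const db = await new Promise<IDBDatabase>((resolve, reject) => {
    const req = indexedDB.open('airi-chat') // assumed database name
    req.onsuccess = () => resolve(req.result)
    req.onerror = () => reject(req.error)
  })

  const messages = await new Promise<StoredMessage[]>((resolve, reject) => {
    const store = db.transaction('messages', 'readonly').objectStore('messages') // assumed store name
    const req = store.getAll()
    req.onsuccess = () => resolve(req.result as StoredMessage[])
    req.onerror = () => reject(req.error)
  })

  // Push oldest-first so created_at ordering is preserved server-side.
  for (const msg of messages.sort((a, b) => a.created_at - b.created_at)) {
    await fetch(`${serviceUrl}/api/messages`, { // assumed ingestion route
      method: 'POST',
      headers: { 'Content-Type': 'application/json', 'Authorization': `Bearer ${apiKey}` },
      body: JSON.stringify(msg),
    })
  }
}
```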
I added a PR on a different branch named
Description
Using Embedded Postgres can be a good solution for non-technical users, since it requires no separate database setup.
Goal: the user can turn it on and off by toggling a boolean in the Settings UI (a sketch of such a toggle follows).
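
A minimal sketch of how that toggle could drive an embedded server, assuming the `embedded-postgres` npm package is used; the data directory, port, credentials, and function name are illustrative assumptions, not necessarily what this PR does.

```ts
// Sketch only: assumes the `embedded-postgres` npm package, which bundles
// PostgreSQL binaries and manages the server as a child process.
import EmbeddedPostgres from 'embedded-postgres'

const pg = new EmbeddedPostgres({
  databaseDir: './data/memory-pg', // assumed location for the bundled cluster
  user: 'postgres',
  password: 'postgres',
  port: 5433,
  persistent: true, // keep data across restarts so long-term memory survives
})

let running = false

// Called whenever the Settings UI boolean flips.
export async function setLongTermMemoryEnabled(enabled: boolean): Promise<void> {
  if (enabled && !running) {
    await pg.initialise() // creates the cluster; real code should skip this if the data dir already exists
    await pg.start()
    running = true
  }
  else if (!enabled && running) {
    await pg.stop()
    running = false
  }
}
```

Keeping the on/off decision in one place like this means the rest of the app only has to check whether the memory service connection is available.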
Linked Issues
#564