fix: validate and sanitize vectors before sending to Qdrant #9267
- Replace null/undefined/NaN values with 0 in vectors
- Add dimension validation to prevent mismatches
- Add comprehensive tests for vector validation
- Fixes #9266: Error 400 when HTTP PUT to Qdrant with null values
Review completed. Found 1 issue that should be addressed:
Mention @roomote in a comment to request specific changes to this pull request or fix all unresolved issues.

    const sanitizedVector = point.vector.map((value) => {
        if (value === null || value === undefined || isNaN(value)) {
            console.warn(
                `[QdrantVectorStore] Found invalid value in vector for point ${point.id}, replacing with 0`,
            )
            return 0
        }
        return value
    })
The console.warn is called inside the map function for every invalid value. If a vector has many null/undefined/NaN values (as shown in the issue example with hundreds of nulls), this will generate one warning per invalid value, flooding the console and impacting performance. Consider collecting invalid indices first and logging a single summary warning per point (e.g., "Found N invalid values in vector for point X at indices: [...]" or "Found N invalid values in vector for point X, replaced with 0").
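A minimal sketch of the suggestion above, assuming the sanitization is factored into a standalone helper. The names `sanitizeVector`, `formatInvalidSummary`, and `maxShown` are hypothetical and do not appear in the PR. The helper collects the invalid indices while mapping, so the caller can emit at most one warning per point:

```typescript
type VectorValue = number | null | undefined

interface SanitizeResult {
  vector: number[]
  invalidIndices: number[]
}

// Replace null/undefined/NaN with 0 and remember where the replacements happened.
function sanitizeVector(vector: readonly VectorValue[]): SanitizeResult {
  const invalidIndices: number[] = []
  const sanitized = vector.map((value, index) => {
    if (value === null || value === undefined || Number.isNaN(value)) {
      invalidIndices.push(index)
      return 0
    }
    return value
  })
  return { vector: sanitized, invalidIndices }
}

// Build a single summary message per point; returns null when nothing was replaced.
// Only the first `maxShown` indices are listed so the message stays short.
function formatInvalidSummary(
  pointId: string | number,
  invalidIndices: readonly number[],
  maxShown = 10,
): string | null {
  if (invalidIndices.length === 0) return null
  const shown = invalidIndices.slice(0, maxShown).join(", ")
  const suffix = invalidIndices.length > maxShown ? ", ..." : ""
  return (
    `[QdrantVectorStore] Found ${invalidIndices.length} invalid values in vector ` +
    `for point ${pointId}, replaced with 0 (indices: ${shown}${suffix})`
  )
}

// Usage: one warning for the whole vector instead of one per invalid value.
const demo = sanitizeVector([0.1, null, NaN, undefined, 0.5])
const demoSummary = formatInvalidSummary("point-1", demo.invalidIndices)
if (demoSummary !== null) console.warn(demoSummary)
```

Truncating the index list is a design choice for the case described in the issue, where a vector may contain hundreds of nulls.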
This PR attempts to address Issue #9266.
Problem
Qdrant was rejecting vectors containing null values with the error: "Format error in JSON body: data did not match any variant of untagged enum VectorStruct"
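One plausible way such values end up in the request body (an assumption; the thread does not say how the body is built): when an array is serialized with `JSON.stringify`, `NaN` and `undefined` elements are written as `null`, so a vector that looks numeric in memory can still arrive at Qdrant with nulls in it. A small illustration:

```typescript
// NaN and undefined array elements are serialized as null by JSON.stringify.
const rawVector: Array<number | undefined> = [0.25, NaN, undefined, 0.75]
const requestBody = JSON.stringify({ vector: rawVector })
console.log(requestBody) // {"vector":[0.25,null,null,0.75]}
```

This is why sanitizing `NaN` and `undefined` alongside `null` covers the same failure.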
Solution
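Added validation and sanitization in the upsertPoints method to:
- Replace null/undefined/NaN values with 0 in vectors
- Validate vector dimensions against the expected size
Changes
- Updated src/services/code-index/vector-store/qdrant-client.ts to add vector validation
Testing
- Added comprehensive tests for vector validation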
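The dimension check mentioned in this description could look roughly like the sketch below. The function name `assertVectorDimension`, the `expectedSize` parameter, and the error wording are hypothetical; the actual code in qdrant-client.ts is not shown in this thread.

```typescript
// Throw if the vector length differs from the size the collection expects.
function assertVectorDimension(
  pointId: string | number,
  vector: readonly number[],
  expectedSize: number,
): void {
  if (vector.length !== expectedSize) {
    throw new Error(
      `Vector dimension mismatch for point ${pointId}: expected ${expectedSize}, got ${vector.length}`,
    )
  }
}

// Usage: a 3-element vector checked against an expected size of 4 is rejected
// before any request would be sent.
try {
  assertVectorDimension("point-1", [0.1, 0.2, 0.3], 4)
} catch (error) {
  console.error((error as Error).message)
}
```

Failing fast on the client side gives a clearer message than a generic 400 from the server.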
Fixes #9266
Feedback and guidance are welcome!
Important
Sanitize and validate vectors in upsertPoints to prevent Qdrant rejections due to null values and dimension mismatches.
- upsertPoints in qdrant-client.ts now sanitizes vectors by replacing null/undefined/NaN values with 0.
- Validates vector dimensions in upsertPoints to match the expected size; throws an error if mismatched.
- Adds tests in qdrant-client.spec.ts for vector sanitization and dimension validation.
- Addresses "Format error in JSON body: data did not match any variant of untagged enum VectorStruct" (#9266).
This description was generated automatically for 518bc83.