roomote bot (Contributor) commented Nov 14, 2025

This PR attempts to address Issue #9266.

Problem

Qdrant was rejecting vectors containing null values with the error: "Format error in JSON body: data did not match any variant of untagged enum VectorStruct"
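
As background, and not as the confirmed root cause in Issue #9266: `JSON.stringify` serializes `NaN` as `null`, so a `NaN` that survives until the request body is built reaches the server as `null`. This would explain why `NaN` needs the same treatment as `null`. The `payload` name and the sample values below are illustrative.

```typescript
// Illustrative only: a NaN inside a vector becomes null once serialized to JSON.
const payload = JSON.stringify({ id: 1, vector: [0.1, NaN, 0.3] })

console.log(payload) // {"id":1,"vector":[0.1,null,0.3]}
```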

Solution

Added validation and sanitization in the upsertPoints method to:

  • Replace null/undefined/NaN values with 0 in vectors
  • Validate vector dimensions match expected size
  • Log warnings when invalid values are found
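
A minimal sketch of the three steps listed above, under stated assumptions: `sanitizeVector`, `SanitizeResult`, and `expectedSize` are hypothetical names, and throwing on a dimension mismatch is an assumption, since the description does not say how the PR reacts to a mismatch. It is not the code from qdrant-client.ts.

```typescript
// Hypothetical helper; names and error handling are assumptions, not the PR's code.
interface SanitizeResult {
	vector: number[]
	replacedCount: number
}

function sanitizeVector(raw: ReadonlyArray<number | null | undefined>, expectedSize: number): SanitizeResult {
	// Step 1: reject vectors whose length differs from the expected dimension.
	if (raw.length !== expectedSize) {
		throw new Error(`Vector dimension mismatch: expected ${expectedSize}, got ${raw.length}`)
	}

	// Step 2: replace null, undefined, and NaN with 0, counting the replacements.
	let replacedCount = 0
	const vector = raw.map((value) => {
		if (typeof value !== "number" || Number.isNaN(value)) {
			replacedCount++
			return 0
		}
		return value
	})

	return { vector, replacedCount }
}

// Step 3: the caller decides how to report the replacements.
const sanitized = sanitizeVector([0.5, null, undefined, NaN], 4)
if (sanitized.replacedCount > 0) {
	console.warn(`Replaced ${sanitized.replacedCount} invalid values with 0`)
}
```

Returning the count instead of logging inside the helper leaves the logging policy to the caller.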

Changes

  • Modified src/services/code-index/vector-store/qdrant-client.ts to add vector validation
  • Added comprehensive tests for the new validation logic
  • Updated existing tests to properly handle vector dimensions

Testing

  • Added 3 new test cases for vector validation
  • All existing tests pass
  • Verified that invalid values are properly sanitized

Fixes #9266

Feedback and guidance are welcome!


Important

Sanitize and validate vectors in upsertPoints to prevent Qdrant rejections due to null values and dimension mismatches.

This description was created by Ellipsis for 518bc83.

- Replace null/undefined/NaN values with 0 in vectors
- Add dimension validation to prevent mismatches
- Add comprehensive tests for vector validation
- Fixes #9266: Error 400 when HTTP PUT to Qdrant with null values
roomote bot requested review from cte, jr and mrubens as code owners November 14, 2025 14:27
dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files) and bug (Something isn't working) labels Nov 14, 2025

roomote bot (Contributor, Author) commented Nov 14, 2025

Review completed. Found 1 issue that should be addressed:

  • Console flooding: The warning in the vector sanitization map function will log once per invalid value, potentially flooding the console with hundreds of warnings for vectors with many null values. Refactor to log a single summary per point.

Comment on lines +348 to +356
const sanitizedVector = point.vector.map((value) => {
    if (value === null || value === undefined || isNaN(value)) {
        console.warn(
            `[QdrantVectorStore] Found invalid value in vector for point ${point.id}, replacing with 0`,
        )
        return 0
    }
    return value
})

The console.warn is called inside the map function for every invalid value. If a vector has many null/undefined/NaN values (as shown in the issue example with hundreds of nulls), this will generate one warning per invalid value, flooding the console and impacting performance. Consider collecting invalid indices first and logging a single summary warning per point (e.g., "Found N invalid values in vector for point X at indices: [...]" or "Found N invalid values in vector for point X, replaced with 0").
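
One way the suggestion above could look, as a sketch only: `sanitizeWithSummary`, `RawVector`, and the injectable `warn` parameter are hypothetical names introduced for illustration, and the message wording follows the second example in the comment. The PR's actual fix may differ.

```typescript
// Hypothetical sketch of the reviewer's suggestion; not the PR's code.
type RawVector = ReadonlyArray<number | null | undefined>

function sanitizeWithSummary(
	pointId: string,
	raw: RawVector,
	warn: (message: string) => void = console.warn,
): number[] {
	// Collect the positions of invalid values instead of warning for each one.
	const invalidIndices: number[] = []
	const sanitizedVector = raw.map((value, index) => {
		if (typeof value !== "number" || Number.isNaN(value)) {
			invalidIndices.push(index)
			return 0
		}
		return value
	})

	// Emit at most one warning per point, however many values were replaced.
	if (invalidIndices.length > 0) {
		warn(
			`[QdrantVectorStore] Found ${invalidIndices.length} invalid values in vector for point ${pointId}, replaced with 0`,
		)
	}
	return sanitizedVector
}
```

Passing `warn` as a parameter is a design choice made here so that a test can count the warnings without spying on the console.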

Labels

  • bug (Something isn't working)
  • Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.)
  • size:L (This PR changes 100-499 lines, ignoring generated files.)

Projects

Status: Triage

Development

Successfully merging this pull request may close these issues.

[BUG] Error 400 when HTTP PUT to Qdrant: Format error in JSON body: data did not match any variant of untagged enum VectorStruct

3 participants