junaid-shirur (Collaborator) commented Oct 29, 2025

Description

Removed the index property from the participant fields to enable partial (substring) search using matches.

Even without the index property, the field remains searchable via userInput and our existing rank profiles.

For more details, refer to the Vespa documentation.

Testing

Additional Notes

coderabbitai bot (Contributor) commented Oct 29, 2025

Walkthrough

Three Vespa search schema files were updated to remove the "index" directive from the indexing configuration of several fields. The affected fields span the event, mail, and user schemas; the change removes BM25 full-text search capability while keeping attribute and summary indexing.

Changes

Cohort: Schema field indexing updates
File(s): server/vespa/schemas/event.sd, server/vespa/schemas/mail.sd, server/vespa/schemas/user.sd
Summary: Removed "index" directive from 7 fields across 3 schemas. Fields affected: attendeesNames (event), from/to/cc/bcc (mail), name/email (user). All changes follow the pattern index | attribute | summary → attribute | summary, removing BM25 full-text indexing while preserving attribute and summary capabilities.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

  • Low complexity due to homogeneous, repetitive pattern applied consistently across all files
  • Changes are isolated to indexing configuration with no logic modifications
  • Review focus: validate that removing index directive aligns with search/query requirements and does not break existing search functionality that may depend on BM25 scoring for these fields

Suggested reviewers

  • zereraz
  • shivamashtikar
  • kalpadhwaryu
  • devesh-juspay

Poem

🐰 Seven fields hop and skip away,
Their index removed today,
BM25 takes a rest so sweet,
While attributes and summaries still compete,
Three schemas shine in simpler light! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Resolution: You can run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (2 passed)

  • Title Check (✅ Passed): The PR title "fix(vespa): fix search on participant fields" is partially related to the changeset. The changes specifically remove the BM25 index capability from multiple fields across three schema files, affecting fields like attendeesNames, from, to, cc, bcc, name, and email, all of which can be reasonably classified as "participant-related fields" in the context of events and communications. The title correctly identifies that this is a Vespa schema change affecting search indexing on participant fields. However, the title does not explicitly convey that the fix involves removing index capabilities rather than adding or enhancing them, and the PR description is empty with no substantive explanation of the intent or rationale for this change.
  • Description Check (✅ Passed): Check skipped - CodeRabbit’s high-level summary is enabled.

@gemini-code-assist (Contributor)

Summary of Changes

Hello @junaid-shirur, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refines the indexing strategy for various participant-related fields within the Vespa schemas for events, mail, and users. By removing the index property from the indexing clause of these fields, the change aims to correct search functionality and improve query efficiency. The affected fields will now primarily serve as attributes for filtering and sorting, and for summary retrieval, rather than being directly searchable via full-text indexing.

Highlights

  • Vespa Schema Updates: The index property has been removed from the indexing clause for several participant-related fields across the event.sd, mail.sd, and user.sd schemas.
  • Event Schema (event.sd): The attendeesNames field no longer includes index in its indexing definition, meaning it will primarily be used as an attribute and for summary retrieval.
  • Mail Schema (mail.sd): The from, to, cc, and bcc fields have had index removed from their indexing definitions, streamlining their use for attribute-based operations and summaries.
  • User Schema (user.sd): The name and email fields have been updated to remove index from their indexing definitions, focusing their utility on attribute and summary access.
  • Search Optimization: These changes are intended to fix search behavior and potentially optimize performance by ensuring that these specific fields are not used for direct full-text indexing, but rather for their attribute and summary capabilities.

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request aims to fix search behavior on participant fields by modifying Vespa schemas. The changes involve removing the index property from the indexing configuration for several fields in event.sd, mail.sd, and user.sd. While this correctly alters the indexing strategy, it leaves behind orphaned index configuration blocks and properties. My review highlights these redundant configurations, which should be removed to ensure schema clarity, improve maintainability, and prevent potential deployment issues.
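
As an illustration of the cleanup described above, a field could look like the following once both the index token and its now-unused index setting are gone. This is only a sketch: it uses the from field from mail.sd as quoted elsewhere in this conversation, and it assumes that attribute: fast-search should stay and that no rank profile still needs BM25 scores for this field, neither of which this thread confirms.

```
# Sketch, not the merged change: `from` in mail.sd with the leftover index setting dropped.
field from type string {
    # Attribute-only pipeline: no text index is built for this field.
    indexing: attribute | summary
    attribute: fast-search
    # `index: enable-bm25` is omitted because the pipeline no longer contains `index`
    # (assumption: bm25(from) is not required by any rank profile).
}
```

If a rank profile does still reference bm25 for one of these fields, that expression would likely need to be revisited in the same change, since BM25 depends on a text index.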

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/vespa/schemas/user.sd (1)

15-20: Critical: Inconsistent indexing configuration will break search functionality.

The index directive has been removed from the indexing line, but the index { enable-bm25 } block is still present. Without the index directive in the indexing pipeline, no text index will be created, rendering the BM25 configuration non-functional.

This field is actively used in rank profiles:

  • Line 205: bm25(email) in combined_bm25()
  • Line 216: nativeRank(email) in combined_nativeRank()
  • Multiple match-features references (lines 230, 246, 300, 330, 341)

Apply this diff to restore the index directive:

 field email type string {
-    indexing: attribute | summary
+    indexing: index | attribute | summary
     index {
         enable-bm25
     }
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c98fe31 and 71e6958.

📒 Files selected for processing (3)
  • server/vespa/schemas/event.sd (1 hunks)
  • server/vespa/schemas/mail.sd (1 hunks)
  • server/vespa/schemas/user.sd (1 hunks)

Comment on lines 82 to 85
 field attendeesNames type array<string> {
-    indexing: index | attribute | summary
+    indexing: attribute | summary
     index: enable-bm25
 }

⚠️ Potential issue | 🔴 Critical

Critical: Inconsistent indexing configuration will break search functionality.

The index directive has been removed from the indexing line, but index: enable-bm25 is still present. In Vespa, the index directive in the indexing pipeline is required to create a text index. Without it, the field won't be indexed for text search, making the enable-bm25 setting non-functional.

This field is actively used in multiple rank profiles:

  • Line 195: bm25(attendeesNames) in combined_bm25()
  • Line 216: nativeRank(attendeesNames) in combined_nativeRank()
  • Lines 245, 304, 338, 367, 381: Match features

Removing the index directive will cause these rank expressions to fail or return zero scores.

Apply this diff to restore the index directive:

 field attendeesNames type array<string> {
-    indexing: attribute | summary
+    indexing: index | attribute | summary
     index: enable-bm25
 }
🤖 Prompt for AI Agents
In server/vespa/schemas/event.sd around lines 82 to 85, the field attendeesNames
has lost the "index" directive from its indexing pipeline but still declares
"index: enable-bm25", which leaves the field without a text index; restore
"index" in the indexing line (e.g., indexing: index | attribute | summary) so
the field is actually text-indexed and the existing bm25/nativeRank/match
features using attendeesNames continue to work.

Comment on lines 60 to 64
 field from type string {
-    indexing: attribute | summary | index
+    indexing: attribute | summary
     attribute: fast-search
     index: enable-bm25
 }

⚠️ Potential issue | 🔴 Critical

Critical: Inconsistent indexing configuration will break search functionality.

The index directive has been removed from the indexing line, but index: enable-bm25 is still present. Without the index directive in the indexing pipeline, no text index will be created, rendering the BM25 configuration non-functional.

This field is actively used in rank profiles:

  • Line 175: matches(from) in matchedFieldCount()
  • Lines 180, 184: nativeRank(from) in peopleRank() and simplePeopleRank()

Apply this diff to restore the index directive:

 field from type string {
-    indexing: attribute | summary
+    indexing: index | attribute | summary
     attribute: fast-search
     index: enable-bm25
 }
🤖 Prompt for AI Agents
In server/vespa/schemas/mail.sd around lines 60-64, the field `from` currently
omits the `index` directive in its `indexing:` entry while still specifying
`index: enable-bm25`, which prevents creation of a text index and affects the
rank-profile features that reference the field (matches(from), nativeRank(from)).
Restore the `index` directive in the `indexing:` line (e.g., `indexing: index |
attribute | summary`), keeping `index: enable-bm25` as-is; ensure whitespace
and order match project conventions.

Comment on lines 66 to 70
 field to type array<string> {
-    indexing: attribute | summary | index
+    indexing: attribute | summary
     attribute: fast-search
     index: enable-bm25
 }

⚠️ Potential issue | 🔴 Critical

Critical: Inconsistent indexing configuration will break search functionality.

The index directive has been removed from the indexing line, but index: enable-bm25 is still present. Without the index directive in the indexing pipeline, no text index will be created, rendering the BM25 configuration non-functional.

This field is actively used in rank profiles:

  • Line 175: matches(to) in matchedFieldCount()
  • Lines 180, 184: nativeRank(to) in peopleRank() and simplePeopleRank()

Apply this diff to restore the index directive:

 field to type array<string> {
-    indexing: attribute | summary
+    indexing: index | attribute | summary
     attribute: fast-search
     index: enable-bm25
 }
🤖 Prompt for AI Agents
In server/vespa/schemas/mail.sd around lines 66 to 70, the field definition
removed the text index from the indexing directive but left `index:
enable-bm25`, which is inconsistent and prevents creation of a text index needed
by BM25 and rank profiles; restore a text index by adding the appropriate
`index` token back into the `indexing` line (e.g., include `index` alongside
`attribute | summary` or replace the `indexing` value with `index | attribute |
summary`) so the `index: enable-bm25` setting has a corresponding text index to
apply to.

Comment on lines 72 to 76
 field cc type array<string> {
-    indexing: attribute | summary | index
+    indexing: attribute | summary
     attribute: fast-search
     index: enable-bm25
 }

⚠️ Potential issue | 🔴 Critical

Critical: Inconsistent indexing configuration will break search functionality.

The index directive has been removed from the indexing line, but index: enable-bm25 is still present. Without the index directive in the indexing pipeline, no text index will be created, rendering the BM25 configuration non-functional.

This field is actively used in rank profiles:

  • Line 175: matches(cc) in matchedFieldCount()
  • Lines 180, 184: nativeRank(cc) in peopleRank() and simplePeopleRank()

Apply this diff to restore the index directive:

 field cc type array<string> {
-    indexing: attribute | summary
+    indexing: index | attribute | summary
     attribute: fast-search
     index: enable-bm25
 }
🤖 Prompt for AI Agents
In server/vespa/schemas/mail.sd around lines 72 to 76, the field "cc" has
removed the text index from its indexing pipeline but still keeps "index:
enable-bm25", which has no effect without a text index; restore the text index
by adding "index" back to the indexing line (e.g., "indexing: index | attribute
| summary") so a text index is created and rank features like matches(cc) and
nativeRank(cc) work again.

Comment on lines 78 to 82
 field bcc type array<string> {
-    indexing: attribute | summary | index
+    indexing: attribute | summary
     attribute: fast-search
     index: enable-bm25
 }

⚠️ Potential issue | 🔴 Critical

Critical: Inconsistent indexing configuration will break search functionality.

The index directive has been removed from the indexing line, but index: enable-bm25 is still present. Without the index directive in the indexing pipeline, no text index will be created, rendering the BM25 configuration non-functional.

This field is actively used in rank profiles:

  • Line 175: matches(bcc) in matchedFieldCount()
  • Lines 180, 184: nativeRank(bcc) in peopleRank() and simplePeopleRank()

Apply this diff to restore the index directive:

 field bcc type array<string> {
-    indexing: attribute | summary
+    indexing: index | attribute | summary
     attribute: fast-search
     index: enable-bm25
 }
🤖 Prompt for AI Agents
In server/vespa/schemas/mail.sd around lines 78–82, the bcc field's indexing
pipeline accidentally dropped the index directive so the existing "index:
enable-bm25" has no effect; restore the indexing pipeline to include the index
directive (so a text index is created) while keeping the attribute and summary
options and preserving "index: enable-bm25" and "attribute: fast-search" so
matches(bcc) and nativeRank(bcc) used in the rank profiles continue to work.

Comment on lines 7 to 12
 field name type string {
-    indexing: index | attribute | summary
+    indexing: attribute | summary
     index {
         enable-bm25
     }
 }

⚠️ Potential issue | 🔴 Critical

Critical: Inconsistent indexing configuration will break search functionality.

The index directive has been removed from the indexing line, but the index { enable-bm25 } block is still present. Without the index directive in the indexing pipeline, no text index will be created, rendering the BM25 configuration non-functional.

This field is actively used in rank profiles:

  • Line 205: bm25(name) in combined_bm25()
  • Line 216: nativeRank(name) in combined_nativeRank()
  • Multiple match-features references (lines 230, 246, 299, 329, 341)

Apply this diff to restore the index directive:

 field name type string {
-    indexing: attribute | summary
+    indexing: index | attribute | summary
     index {
         enable-bm25
     }
 }
🤖 Prompt for AI Agents
In server/vespa/schemas/user.sd around lines 7 to 12 the field "name" has an
indexing pipeline that omits the "index" directive while an index { enable-bm25
} block remains; restore the text index by updating the indexing line to include
"index" (e.g., indexing: index | attribute | summary) so the BM25 configuration
becomes effective and rank profile calls like bm25(name) / nativeRank(name) and
other match-feature references will work correctly.
