@bencap bencap commented Nov 7, 2025

This pull request refactors the variant counting logic in src/mavedb/routers/statistics.py to improve performance and fix the variant count. The main changes are: counting explicitly over distinct variant identifiers; a fast path for total counts, which skips grouping and aggregation when no grouping is requested; and a clearer separation between grouped and ungrouped queries.

Performance and Query Logic Improvements:

  • Explicitly query against distinct variant identifiers to avoid double counting variants that are materialized as more than one row.
  • Added a fast path to both the variant_counts and mapped_variant_counts functions that returns the total distinct count directly when no grouping is requested, reducing query complexity and improving performance.
  • Refactored grouped queries to first materialize per-date distinct counts, then aggregate by month or year as needed, ensuring correct results and simplifying the code.
  • Added defensive fallback logic so that the functions return a valid count even in unexpected cases.
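
The changes listed above can be illustrated with a minimal sketch. This is not the code from statistics.py: it uses the standard-library sqlite3 module rather than the project's own query layer, and the table `variant_rows`, its columns, the helper `make_demo_db`, and the `group` parameter are all hypothetical names chosen for the example. It also assumes each variant is associated with exactly one date, so that summing per-date distinct counts does not count a variant twice.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory table where some variants appear as more than one row.

    The schema is hypothetical and exists only for this illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE variant_rows (variant_id INTEGER, published_date TEXT)"
    )
    rows = [
        (1, "2025-01-05"),
        (1, "2025-01-05"),  # variant 1 materialized twice
        (2, "2025-01-20"),
        (3, "2025-02-03"),
        (3, "2025-02-03"),  # variant 3 materialized twice
        (4, "2024-12-31"),
    ]
    conn.executemany("INSERT INTO variant_rows VALUES (?, ?)", rows)
    return conn


def variant_counts(conn, group=None):
    """Count distinct variants, optionally grouped by 'month' or 'year'."""
    if group is None:
        # Fast path: a single COUNT(DISTINCT ...) with no grouping at all.
        row = conn.execute(
            "SELECT COUNT(DISTINCT variant_id) FROM variant_rows"
        ).fetchone()
        # Defensive fallback: always hand back an integer.
        return (row[0] if row else 0) or 0

    if group not in ("month", "year"):
        raise ValueError("group must be None, 'month', or 'year'")

    # Grouped path, step 1: per-date distinct counts.
    per_date = conn.execute(
        "SELECT published_date, COUNT(DISTINCT variant_id) "
        "FROM variant_rows GROUP BY published_date ORDER BY published_date"
    ).fetchall()

    # Step 2: roll the per-date counts up to 'YYYY-MM' or 'YYYY' buckets.
    width = 7 if group == "month" else 4
    counts = {}
    for date, n in per_date:
        key = date[:width]
        counts[key] = counts.get(key, 0) + n
    return counts
```

In this toy data a plain `COUNT(*)` would report six rows, while the distinct count reports four variants, which is the kind of discrepancy the linked issue describes.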

…tinct variants

- Clarifies distinct variant IDs for variants and mapped variants endpoints
- Adds distinct fast path query for both endpoints when grouping is not requested
@bencap bencap requested review from jstone-dev and sallybg November 7, 2025 17:59
@bencap bencap linked an issue Nov 7, 2025 that may be closed by this pull request
@bencap bencap merged commit 9876327 into release-2025.5.0 Nov 10, 2025
6 checks passed
@bencap bencap deleted the bugfix/bencap/511/statistics-variant-effect-measurement-double-counting branch November 10, 2025 21:57
@bencap bencap mentioned this pull request Nov 13, 2025

Development

Successfully merging this pull request may close these issues.

Statistics Page Variant Effect Measurements Count is Incorrect