Skip to content

Conversation

@codesome
Copy link
Contributor

@codesome codesome commented Sep 16, 2025

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Adds thanos_query_endpoints metric to track healthy and unhealthy endpoints.

Use case: when not using strict endpoints, the endpoints will silently go stale and there is no way of knowing if queries/rules are getting wrong/no results (wrong results because the data can be partial with stale endpoints). This metric can be useful to alert on if endpoints become unhealthy (for example load-balancers being down).

If there is some other way to identify such silently failing queries, this PR can be closed in favor of that.

Verification

TestEndpointSetUpdate was panicking locally even without this change, so I have not written unit tests for this yet.

Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
Signed-off-by: Ganesh Vernekar <ganesh.vernekar@reddit.com>
@codesome codesome closed this Nov 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant