feat: add Prometheus metric for tracking unprocessed tasks #1459
Add a new Prometheus Gauge metric flower_unprocessed_tasks_in_window to track tasks that have been processing (received but not completed) for longer than a configurable time window.
Implementation details:
--unprocessed_tasks_window_minutes option (any value > 0)
Data structures:
time_buckets: Hash map of buckets containing task IDs
task_to_bucket: Hash map for O(1) task-to-bucket lookup
Performance:
Use cases:
Detect unprocessed or stalled tasks: Identify tasks that have been started but remain incomplete for longer than the configured time window, indicating potential processing delays or scheduling bottlenecks.
Detect message loss in the broker or queue: Highlight scenarios where tasks disappear from the broker or queue system (e.g., RabbitMQ) but never reach the worker, suggesting possible message delivery or routing issues.
Detect worker termination during processing: Monitor for tasks that remain “in-progress” because their worker was unexpectedly killed or crashed mid-execution, helping isolate stability or resource exhaustion problems.
Example:
flower --unprocessed_tasks_window_minutes=60
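
The two data structures listed under the implementation details can be illustrated with a small sketch. This is not the code from this PR: the class name, method names, the one-minute bucket size, and the rule that a bucket counts only once it lies entirely outside the window are all assumptions made for illustration. Only the names time_buckets and task_to_bucket come from the description above.

```python
from collections import defaultdict


class UnprocessedTaskTracker:
    """Hypothetical sketch of time-bucketed tracking of unfinished tasks."""

    def __init__(self, window_minutes, bucket_seconds=60):
        if window_minutes <= 0:
            raise ValueError("window_minutes must be > 0")
        self.window_seconds = window_minutes * 60
        self.bucket_seconds = bucket_seconds  # assumed bucket width
        # bucket start time (seconds) -> set of task IDs received in that bucket
        self.time_buckets = defaultdict(set)
        # task ID -> bucket start time, for O(1) removal on completion
        self.task_to_bucket = {}

    def _bucket_for(self, timestamp):
        return int(timestamp // self.bucket_seconds) * self.bucket_seconds

    def task_received(self, task_id, timestamp):
        if task_id in self.task_to_bucket:
            return  # already tracked; keep the original receive time
        bucket = self._bucket_for(timestamp)
        self.time_buckets[bucket].add(task_id)
        self.task_to_bucket[task_id] = bucket

    def task_finished(self, task_id):
        bucket = self.task_to_bucket.pop(task_id, None)
        if bucket is None:
            return  # unknown task, nothing to remove
        ids = self.time_buckets.get(bucket)
        if ids is not None:
            ids.discard(task_id)
            if not ids:
                del self.time_buckets[bucket]  # drop empty buckets

    def count_older_than_window(self, now):
        # Count tasks in buckets that ended at or before the window cutoff.
        cutoff = now - self.window_seconds
        return sum(
            len(ids)
            for start, ids in self.time_buckets.items()
            if start + self.bucket_seconds <= cutoff
        )


tracker = UnprocessedTaskTracker(window_minutes=60)
tracker.task_received("a", timestamp=0)
tracker.task_received("b", timestamp=30)
tracker.task_received("c", timestamp=3000)
print(tracker.count_older_than_window(now=3660))  # 2
tracker.task_finished("a")
print(tracker.count_older_than_window(now=3660))  # 1
```

In a sketch like this, the value returned by count_older_than_window would be what gets written to the gauge. Counting per bucket instead of per task keeps the scan proportional to the number of buckets, at the cost of reporting a task up to one bucket width late.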