
Conversation

@jasonyuezhang
Owner

PR Details

  • Reworking my approach to updating this function for triage-signals: I am removing the cache-based check (added a few days ago) and adding an if/else in this function for the new behaviour.
  • Logic that is common to both approaches is being moved into helper functions so that it can be reused.

Copied from original PR: getsentry#103345

@propel-test-bot

Refactor Seer automation kickoff & split permission / rate-limit checks

Refactors `kick_off_seer_automation` in `src/sentry/tasks/post_process.py` by extracting the long inline guard logic into two reusable helpers: `seer_automation_permission_and_type_check` and `seer_automation_rate_limit_check`. The cache-based debounce previously used to avoid duplicate executions (`seer_automation_queued:<group_id>`) is removed; duplication avoidance now relies on an existing short-lived Redis lock plus the new helpers. Associated unit tests in `tests/sentry/tasks/test_post_process.py` are cleaned up: old cache/atomic-add scenarios are deleted, and focused tests for the new helpers are added. Mixin wiring is updated to include the new test mixin.

Key Changes

• Added `seer_automation_permission_and_type_check` and `seer_automation_rate_limit_check` to `src/sentry/tasks/post_process.py`
• Simplified `kick_off_seer_automation` to rely on the new helpers and the issue-summary lock; removed the cache debounce and the redundant budget/flag checks now inside the helpers
• Deleted the cache / atomic-add based tests; added `SeerAutomationHelperFunctionsTestMixin` with exhaustive unit coverage for both helpers
• Updated import lists and mixin registration to include the new tests
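Both helpers take the group and return a boolean gate. A minimal, hypothetical sketch of their shape — the `Group` dataclass, the `automation_enabled` field, and `SUPPORTED_CATEGORIES` below are illustrative stand-ins, not the real Sentry models:

```python
from dataclasses import dataclass

# Illustrative stand-ins; the real helpers in src/sentry/tasks/post_process.py
# consult feature flags, org/project settings, and quota state.
SUPPORTED_CATEGORIES = {"error", "performance"}

@dataclass
class Group:
    issue_category: str
    automation_enabled: bool

def seer_automation_permission_and_type_check(group: Group) -> bool:
    # Gate on project opt-in and a supported issue category.
    return group.automation_enabled and group.issue_category in SUPPORTED_CATEGORIES

def seer_automation_rate_limit_check(group: Group) -> bool:
    # Placeholder always-allow; the real helper checks rate-limit/budget state.
    return True

group = Group(issue_category="error", automation_enabled=True)
print(seer_automation_permission_and_type_check(group))  # True
print(seer_automation_rate_limit_check(group))  # True
```

Splitting the guards this way lets the caller exit early on the cheap permission/type check before paying for the rate-limit lookup.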

Affected Areas

src/sentry/tasks/post_process.py
tests/sentry/tasks/test_post_process.py

This summary was automatically generated by @propel-code-bot

Comment on lines +1677 to 1700
```python
def kick_off_seer_automation(job: PostProcessJob) -> None:
    from sentry.seer.autofix.issue_summary import get_issue_summary_lock_key
    from sentry.tasks.autofix import start_seer_automation

    event = job["event"]
    group = event.group

    # Only run on issues with no existing scan - TODO: Update condition for triage signals V0
    if group.seer_fixability_score is not None:
        return

    if seer_automation_permission_and_type_check(group) is False:
        return

    # Don't run if there's already a task in progress for this issue
    lock_key, lock_name = get_issue_summary_lock_key(group.id)
    lock = locks.get(lock_key, duration=1, name=lock_name)
    if lock.locked():
        return

    if seer_automation_rate_limit_check(group) is False:
        return

    start_seer_automation.delay(group.id)
```

[CriticalError]

Critical: The removal of the cache-based deduplication mechanism (lines checking `cache_key`, `cache.get(cache_key)`, `group.seer_autofix_last_triggered`, and `cache.add(cache_key, True, timeout=600)`) introduces a potential race condition.

The old code used `cache.add()` which is atomic (Redis SETNX) to ensure only one process proceeds. Without this, if multiple events for the same group are processed concurrently, `start_seer_automation.delay(group.id)` could be called multiple times for the same group.

The lock check on lines 1693-1695 only prevents concurrent execution *of this function*, but doesn't prevent the same group from being queued multiple times if this function completes and is called again before the async task starts.

Consider: Was this deduplication moved elsewhere, or is the `seer_fixability_score` check on line 1685 intended to be sufficient? If so, document why multiple task queuing is now acceptable.

File: src/sentry/tasks/post_process.py
Line: 1700
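The atomic check-and-set pattern the comment describes can be illustrated with a minimal stand-in for Django's `cache.add` (which maps to Redis SETNX on a Redis backend). `FakeCache` and `maybe_queue` below are illustrative, not Sentry code:

```python
import threading

class FakeCache:
    """Illustrative in-memory stand-in for a cache backend whose add() is atomic."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def add(self, key, value, timeout=None):
        # Atomically set the key only if it is absent; return True on success.
        # This mirrors the SETNX semantics the review comment relies on.
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True

cache = FakeCache()
queued = []

def maybe_queue(group_id):
    # Only the first caller wins the check-and-set race and queues the task.
    if not cache.add(f"seer_automation_queued:{group_id}", True, timeout=600):
        return
    queued.append(group_id)

threads = [threading.Thread(target=maybe_queue, args=(123,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(queued))  # 1 — every other thread lost the race and bailed out
```

A plain `cache.get` followed by `cache.set` would not give this guarantee: two processes can both observe the key as absent before either writes it, which is exactly the window the atomic `add` closes.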

Comment on lines 1628 to 1639
```python
# check currently supported issue categories for Seer
if group.issue_category not in [
    GroupCategory.ERROR,
    GroupCategory.PERFORMANCE,
    GroupCategory.MOBILE,
    GroupCategory.FRONTEND,
    GroupCategory.DB_QUERY,
    GroupCategory.HTTP_CLIENT,
] or group.issue_category in [
    GroupCategory.REPLAY,
    GroupCategory.FEEDBACK,
]:
```

[BestPractice]

The `seer_automation_permission_and_type_check` function checks the issue category with both an inclusion check and an exclusion check that overlap. Lines 1629-1639 check if the category is in a list that includes `REPLAY` and `FEEDBACK`, then immediately checks if it's in `[REPLAY, FEEDBACK]` and returns False. This logic is confusing.

The condition will always return False for REPLAY and FEEDBACK due to the second check, making their inclusion in the first list unnecessary. Consider simplifying:

```python
if group.issue_category not in [
    GroupCategory.ERROR,
    GroupCategory.PERFORMANCE,
    GroupCategory.MOBILE,
    GroupCategory.FRONTEND,
    GroupCategory.DB_QUERY,
    GroupCategory.HTTP_CLIENT,
]:
    return False
```
File: src/sentry/tasks/post_process.py
Line: 1639

```python
if group.seer_fixability_score is not None:
    return

if seer_automation_permission_and_type_check(group) is False:
```

[BestPractice]

The comparison `is False` is used here and on line 1697. While technically correct, it's more Pythonic to use `not seer_automation_permission_and_type_check(group)` and `not seer_automation_rate_limit_check(group)` since these functions return boolean values, not None/False ambiguity.

Suggestion:

```python
if not seer_automation_permission_and_type_check(group):
    return
```
File: src/sentry/tasks/post_process.py
Line: 1688
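The practical difference shows up if a guard ever falls through and implicitly returns `None`: `is False` treats that as a pass, while `not` still triggers the early exit. A small illustrative example (`guard_returning_none` is hypothetical):

```python
def guard_returning_none():
    # A guard that forgot an explicit return on some code path,
    # so it implicitly returns None instead of True/False.
    return None

# `is False` only matches the exact False object, so a None return
# silently skips the early exit; `not` rejects every falsy value.
print(guard_returning_none() is False)  # False -> early return NOT taken
print(not guard_returning_none())       # True  -> early return taken
```

This is also the style PEP 8 recommends: don't compare boolean values to True or False using `is` or `==`.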

3 participants