-
Notifications
You must be signed in to change notification settings - Fork 0
Organize kick off automation #129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Refactor Seer automation kickoff & split permission / rate-limit checks Refactors Key Changes• Added Affected Areas• This summary was automatically generated by @propel-code-bot |
| def kick_off_seer_automation(job: PostProcessJob) -> None: | ||
| from sentry.seer.autofix.issue_summary import get_issue_summary_lock_key | ||
| from sentry.tasks.autofix import start_seer_automation | ||
|
|
||
| event = job["event"] | ||
| group = event.group | ||
|
|
||
| # Only run on issues with no existing scan - TODO: Update condition for triage signals V0 | ||
| if group.seer_fixability_score is not None: | ||
| return | ||
|
|
||
| if seer_automation_permission_and_type_check(group) is False: | ||
| return | ||
|
|
||
| # Don't run if there's already a task in progress for this issue | ||
| lock_key, lock_name = get_issue_summary_lock_key(group.id) | ||
| lock = locks.get(lock_key, duration=1, name=lock_name) | ||
| if lock.locked(): | ||
| return | ||
|
|
||
| # cache.add uses Redis SETNX which atomically sets the key only if it doesn't exist | ||
| # Returns False if another process already set the key, ensuring only one process proceeds | ||
| if not cache.add(cache_key, True, timeout=600): # 10 minute | ||
| if seer_automation_rate_limit_check(group) is False: | ||
| return | ||
|
|
||
| start_seer_automation.delay(group.id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[CriticalError]
Critical: The removal of the cache-based deduplication mechanism (lines checking cache_key, cache.get(cache_key), group.seer_autofix_last_triggered, and cache.add(cache_key, True, timeout=600)) introduces a potential race condition.
The old code used cache.add() which is atomic (Redis SETNX) to ensure only one process proceeds. Without this, if multiple events for the same group are processed concurrently, start_seer_automation.delay(group.id) could be called multiple times for the same group.
The lock check on line 1693-1695 only prevents concurrent execution of this function, but doesn't prevent the same group from being queued multiple times if this function completes and is called again before the async task starts.
Consider: Was this deduplication moved elsewhere, or is the seer_fixability_score check on line 1685 intended to be sufficient? If so, document why multiple task queuing is now acceptable.
Context for Agents
[**CriticalError**]
Critical: The removal of the cache-based deduplication mechanism (lines checking `cache_key`, `cache.get(cache_key)`, `group.seer_autofix_last_triggered`, and `cache.add(cache_key, True, timeout=600)`) introduces a potential race condition.
The old code used `cache.add()` which is atomic (Redis SETNX) to ensure only one process proceeds. Without this, if multiple events for the same group are processed concurrently, `start_seer_automation.delay(group.id)` could be called multiple times for the same group.
The lock check on line 1693-1695 only prevents concurrent execution *of this function*, but doesn't prevent the same group from being queued multiple times if this function completes and is called again before the async task starts.
Consider: Was this deduplication moved elsewhere, or is the `seer_fixability_score` check on line 1685 intended to be sufficient? If so, document why multiple task queuing is now acceptable.
File: src/sentry/tasks/post_process.py
Line: 1700| # check currently supported issue categories for Seer | ||
| if group.issue_category not in [ | ||
| GroupCategory.ERROR, | ||
| GroupCategory.PERFORMANCE, | ||
| GroupCategory.MOBILE, | ||
| GroupCategory.FRONTEND, | ||
| GroupCategory.DB_QUERY, | ||
| GroupCategory.HTTP_CLIENT, | ||
| ] or group.issue_category in [ | ||
| GroupCategory.REPLAY, | ||
| GroupCategory.FEEDBACK, | ||
| ]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[BestPractice]
The seer_automation_permission_and_type_check function checks the issue category with both an inclusion check and an exclusion check that overlap. Lines 1629-1639 check if the category is in a list that includes REPLAY and FEEDBACK, then immediately checks if it's in [REPLAY, FEEDBACK] and returns False. This logic is confusing.
The condition will always return False for REPLAY and FEEDBACK due to the second check, making their inclusion in the first list unnecessary. Consider simplifying:
if group.issue_category not in [
GroupCategory.ERROR,
GroupCategory.PERFORMANCE,
GroupCategory.MOBILE,
GroupCategory.FRONTEND,
GroupCategory.DB_QUERY,
GroupCategory.HTTP_CLIENT,
]:
return FalseContext for Agents
[**BestPractice**]
The `seer_automation_permission_and_type_check` function checks the issue category with both an inclusion check and an exclusion check that overlap. Lines 1629-1639 check if the category is in a list that includes `REPLAY` and `FEEDBACK`, then immediately checks if it's in `[REPLAY, FEEDBACK]` and returns False. This logic is confusing.
The condition will always return False for REPLAY and FEEDBACK due to the second check, making their inclusion in the first list unnecessary. Consider simplifying:
```python
if group.issue_category not in [
GroupCategory.ERROR,
GroupCategory.PERFORMANCE,
GroupCategory.MOBILE,
GroupCategory.FRONTEND,
GroupCategory.DB_QUERY,
GroupCategory.HTTP_CLIENT,
]:
return False
```
File: src/sentry/tasks/post_process.py
Line: 1639| if group.seer_fixability_score is not None: | ||
| return | ||
|
|
||
| if seer_automation_permission_and_type_check(group) is False: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[BestPractice]
The comparison is False is used here and on line 1697. While technically correct, it's more Pythonic to use not seer_automation_permission_and_type_check(group) and not seer_automation_rate_limit_check(group) since these functions return boolean values, not None/False ambiguity.
Suggestion:
if not seer_automation_permission_and_type_check(group):
returnContext for Agents
[**BestPractice**]
The comparison `is False` is used here and on line 1697. While technically correct, it's more Pythonic to use `not seer_automation_permission_and_type_check(group)` and `not seer_automation_rate_limit_check(group)` since these functions return boolean values, not None/False ambiguity.
Suggestion:
```python
if not seer_automation_permission_and_type_check(group):
return
```
File: src/sentry/tasks/post_process.py
Line: 1688
PR Details
Copied from getsentry#103345
Original PR: getsentry#103345