-
-
Notifications
You must be signed in to change notification settings - Fork 461
[ANR] Update Connection Status cache in the background #4832
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…d, so lock is not acquired indefinitely
Performance metrics 🚀
|
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| fcec2f2 | 328.91 ms | 387.75 ms | 58.84 ms |
| ee747ae | 382.73 ms | 435.41 ms | 52.68 ms |
| 7314dbe | 437.83 ms | 505.64 ms | 67.81 ms |
| ee747ae | 386.94 ms | 431.43 ms | 44.49 ms |
| 27d7cf8 | 309.43 ms | 364.27 ms | 54.85 ms |
| b77456b | 393.26 ms | 441.10 ms | 47.84 ms |
| 17a0955 | 372.53 ms | 446.70 ms | 74.17 ms |
| 27d7cf8 | 314.17 ms | 347.00 ms | 32.83 ms |
| b750b96 | 408.98 ms | 480.32 ms | 71.34 ms |
| d217708 | 355.34 ms | 381.39 ms | 26.05 ms |
App size
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| fcec2f2 | 1.58 MiB | 2.12 MiB | 551.50 KiB |
| ee747ae | 1.58 MiB | 2.10 MiB | 530.95 KiB |
| 7314dbe | 1.58 MiB | 2.10 MiB | 533.45 KiB |
| ee747ae | 1.58 MiB | 2.10 MiB | 530.95 KiB |
| 27d7cf8 | 1.58 MiB | 2.12 MiB | 549.42 KiB |
| b77456b | 1.58 MiB | 2.12 MiB | 548.11 KiB |
| 17a0955 | 1.58 MiB | 2.10 MiB | 533.20 KiB |
| 27d7cf8 | 1.58 MiB | 2.12 MiB | 549.42 KiB |
| b750b96 | 1.58 MiB | 2.10 MiB | 533.19 KiB |
| d217708 | 1.58 MiB | 2.10 MiB | 532.97 KiB |
Previous results on branch: stefanosiano/fix/anr-connection-status
Startup times
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 241e3d5 | 316.48 ms | 362.29 ms | 45.81 ms |
| 4a6f9a1 | 333.42 ms | 406.39 ms | 72.97 ms |
| f891c2e | 385.49 ms | 461.89 ms | 76.40 ms |
| f208591 | 320.37 ms | 351.59 ms | 31.22 ms |
App size
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 241e3d5 | 1.58 MiB | 2.12 MiB | 549.40 KiB |
| 4a6f9a1 | 1.58 MiB | 2.12 MiB | 551.90 KiB |
| f891c2e | 1.58 MiB | 2.12 MiB | 551.70 KiB |
| f208591 | 1.58 MiB | 2.12 MiB | 549.41 KiB |
avoid concurrent cache updates
avoid concurrent cache updates
avoid concurrent cache updates
… status synchronously
# Conflicts: # CHANGELOG.md
|
@sentry review |
...core/src/main/java/io/sentry/android/core/internal/util/AndroidConnectionStatusProvider.java
Show resolved
Hide resolved
| return getConnectionStatusFromCache(); | ||
| } | ||
|
|
||
| @Override |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The refactored updateCache() method is now asynchronous (submits work to a background executor), but callers like getConnectionStatus() and getConnectionType() expect synchronous behavior. These methods call updateCache() when the cache is invalid, then immediately return getConnectionStatusFromCache() without waiting for the background task to complete. This means the returned value will still be based on stale cache since the background update hasn't had time to execute. Consider either (1) making the callers wait for the background task, (2) documenting this async behavior clearly, or (3) providing a synchronous fallback path for initial cache population.
Severity: MEDIUM
🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location:
sentry-android-core/src/main/java/io/sentry/android/core/internal/util/AndroidConnectionStatusProvider.java#L427-L435
Potential issue: The refactored `updateCache()` method is now asynchronous (submits work
to a background executor), but callers like `getConnectionStatus()` and
`getConnectionType()` expect synchronous behavior. These methods call `updateCache()`
when the cache is invalid, then immediately return `getConnectionStatusFromCache()`
without waiting for the background task to complete. This means the returned value will
still be based on stale cache since the background update hasn't had time to execute.
Consider either (1) making the callers wait for the background task, (2) documenting
this async behavior clearly, or (3) providing a synchronous fallback path for initial
cache population.
Did we get this right? 👍 / 👎 to inform future reviews.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stefanosiano this is my concern as well -- I think many things won't work as we expect them to with this change of behavior.
I think we should probably specifically solve this for Replay (and maybe continuous profiling?). We could reuse your isUpdatingCache flag, and then bail without acquiring the lock if it's true. Then when the cache is finally updated, it'd call onConnectionStatusChanged and notify replay/cont.profiling about the latest status. There's (potentially) a chance replay would read a stale status and continue recording for a short period of time, but I'm hoping this chance is small and wouldn't result in a bigger problem than the current number of ANRs caused by it. Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@romtsn So you're suggesting to return UNKNOWN when the cache is being updated in the background, right?
Because I think the issue of stale cache status happens just when the app goes to foreground, as that's the only time where we don't have the network listener updating the cache directly.
So in updateCache() I can check the current thread, and if it's on main it schedules in the background, otherwise (or the first time the cache is updated) it executes it right away
This way the onForeground() will update the cache and after that call the observers with the new status
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the last known cached status, and if it's not available then UNKNOWN. which may (or may not) be the correct one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in updateCache() I can check the current thread, and if it's on main it schedules in the background, otherwise (or the first time the cache is updated) it executes it right away
This way the onForeground() will update the cache and after that call the observers with the new status
I think this sounds good, too. Just need to make sure we don't acquire the lock from resumeInternal (or getConnectionStatus() on the main thread)
while cache is updated, if no cache data exists, connection status is unknown and type is null
# Conflicts: # CHANGELOG.md
| } | ||
| } | ||
| isUpdatingCache.set(false); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Infinite flag: caching functions return unknowns forever
If an exception occurs in updateCacheFromConnectivityManager() before reaching line 438, the isUpdatingCache flag remains true forever. This causes all subsequent calls to getConnectionStatus() and getConnectionType() to return UNKNOWN or null indefinitely. The flag should be reset in a finally block or the outer catch block should reset it.
📜 Description
AndroidConnectionStatusProvider cache is now updated in the background, so lock is not acquired indefinitely
💡 Motivation and Context
There are a lot of ANRs in our Play console and in our own SDKCD tool, that report ANRs due to a lock in
AndroidConnectionStatusProvider.We were doing some IPC calls inside the lock, which may cause the ANR:
The problem happens when the app goes to the foreground, as our
AndroidConnectionStatusProvideronForegroundis called, which runsupdateCachein the background. At the same time, though, theReplayIntegrationrunsupdateCacheon the main thread at the same moment (when the app goes to the foreground).The ANR happens because of the mainThread awaiting the lock (it shouldn't happen, but maybe the IPC calls done in
updateCacheare the culprit)💚 How did you test it?
📝 Checklist
sendDefaultPIIis enabled.🔮 Next steps