Skip to content

Conversation

@stefanosiano
Copy link
Member

@stefanosiano stefanosiano commented Oct 23, 2025

📜 Description

AndroidConnectionStatusProvider cache is now updated in the background, so lock is not acquired indefinitely

💡 Motivation and Context

There are a lot of ANRs in our Play console and in our own SDKCD tool, that report ANRs due to a lock in AndroidConnectionStatusProvider.
We were doing some IPC calls inside the lock, which may cause the ANR:

  • connectivityManager.getActiveNetwork()
  • connectivityManager.getNetworkCapabilities(activeNetwork)

The problem happens when the app goes to the foreground, as our AndroidConnectionStatusProvider onForeground is called, which runs updateCache in the background. At the same time, though, the ReplayIntegration runs updateCache on the main thread at the same moment (when the app goes to the foreground).
The ANR happens because of the mainThread awaiting the lock (it shouldn't happen, but maybe the IPC calls done in updateCache are the culprit)

💚 How did you test it?

📝 Checklist

  • I added GH Issue ID & Linear ID
  • I added tests to verify the changes.
  • No new PII added or SDK only sends newly added PII if sendDefaultPII is enabled.
  • I updated the docs if needed.
  • I updated the wizard if needed.
  • Review from the native team if needed.
  • No breaking change or entry added to the changelog.
  • No breaking change for hybrid SDKs or communicated to hybrid SDKs.

🔮 Next steps

@stefanosiano stefanosiano marked this pull request as ready for review October 23, 2025 15:29
@github-actions
Copy link
Contributor

github-actions bot commented Oct 23, 2025

Performance metrics 🚀

  Plain With Sentry Diff
Startup time 362.62 ms 420.72 ms 58.10 ms
Size 1.58 MiB 2.12 MiB 552.00 KiB

Baseline results on branch: main

Startup times

Revision Plain With Sentry Diff
fcec2f2 328.91 ms 387.75 ms 58.84 ms
ee747ae 382.73 ms 435.41 ms 52.68 ms
7314dbe 437.83 ms 505.64 ms 67.81 ms
ee747ae 386.94 ms 431.43 ms 44.49 ms
27d7cf8 309.43 ms 364.27 ms 54.85 ms
b77456b 393.26 ms 441.10 ms 47.84 ms
17a0955 372.53 ms 446.70 ms 74.17 ms
27d7cf8 314.17 ms 347.00 ms 32.83 ms
b750b96 408.98 ms 480.32 ms 71.34 ms
d217708 355.34 ms 381.39 ms 26.05 ms

App size

Revision Plain With Sentry Diff
fcec2f2 1.58 MiB 2.12 MiB 551.50 KiB
ee747ae 1.58 MiB 2.10 MiB 530.95 KiB
7314dbe 1.58 MiB 2.10 MiB 533.45 KiB
ee747ae 1.58 MiB 2.10 MiB 530.95 KiB
27d7cf8 1.58 MiB 2.12 MiB 549.42 KiB
b77456b 1.58 MiB 2.12 MiB 548.11 KiB
17a0955 1.58 MiB 2.10 MiB 533.20 KiB
27d7cf8 1.58 MiB 2.12 MiB 549.42 KiB
b750b96 1.58 MiB 2.10 MiB 533.19 KiB
d217708 1.58 MiB 2.10 MiB 532.97 KiB

Previous results on branch: stefanosiano/fix/anr-connection-status

Startup times

Revision Plain With Sentry Diff
241e3d5 316.48 ms 362.29 ms 45.81 ms
4a6f9a1 333.42 ms 406.39 ms 72.97 ms
f891c2e 385.49 ms 461.89 ms 76.40 ms
f208591 320.37 ms 351.59 ms 31.22 ms

App size

Revision Plain With Sentry Diff
241e3d5 1.58 MiB 2.12 MiB 549.40 KiB
4a6f9a1 1.58 MiB 2.12 MiB 551.90 KiB
f891c2e 1.58 MiB 2.12 MiB 551.70 KiB
f208591 1.58 MiB 2.12 MiB 549.41 KiB

stefanosiano and others added 2 commits October 23, 2025 17:56
avoid concurrent cache updates
@cursor
Copy link

cursor bot commented Oct 23, 2025

Bug: Cache Update Lock Exception Handling

The isUpdatingCache flag might not reset if an exception occurs during the cache update, such as when acquiring the lock. This can permanently block all future connection status cache updates.

Fix in Cursor Fix in Web

avoid concurrent cache updates
cursor[bot]

This comment was marked as outdated.

avoid concurrent cache updates
cursor[bot]

This comment was marked as outdated.

@getsentry getsentry deleted a comment from cursor bot Nov 4, 2025
@getsentry getsentry deleted a comment from cursor bot Nov 4, 2025
@getsentry getsentry deleted a comment from cursor bot Nov 4, 2025
@getsentry getsentry deleted a comment from cursor bot Nov 4, 2025
@getsentry getsentry deleted a comment from cursor bot Nov 4, 2025
@getsentry getsentry deleted a comment from cursor bot Nov 4, 2025
@romtsn
Copy link
Member

romtsn commented Nov 6, 2025

@sentry review

return getConnectionStatusFromCache();
}

@Override
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The refactored updateCache() method is now asynchronous (submits work to a background executor), but callers like getConnectionStatus() and getConnectionType() expect synchronous behavior. These methods call updateCache() when the cache is invalid, then immediately return getConnectionStatusFromCache() without waiting for the background task to complete. This means the returned value will still be based on stale cache since the background update hasn't had time to execute. Consider either (1) making the callers wait for the background task, (2) documenting this async behavior clearly, or (3) providing a synchronous fallback path for initial cache population.
Severity: MEDIUM

🤖 Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location:
sentry-android-core/src/main/java/io/sentry/android/core/internal/util/AndroidConnectionStatusProvider.java#L427-L435

Potential issue: The refactored `updateCache()` method is now asynchronous (submits work
to a background executor), but callers like `getConnectionStatus()` and
`getConnectionType()` expect synchronous behavior. These methods call `updateCache()`
when the cache is invalid, then immediately return `getConnectionStatusFromCache()`
without waiting for the background task to complete. This means the returned value will
still be based on stale cache since the background update hasn't had time to execute.
Consider either (1) making the callers wait for the background task, (2) documenting
this async behavior clearly, or (3) providing a synchronous fallback path for initial
cache population.

Did we get this right? 👍 / 👎 to inform future reviews.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stefanosiano this is my concern as well -- I think many things won't work as we expect them to with this change of behavior.

I think we should probably specifically solve this for Replay (and maybe continuous profiling?). We could reuse your isUpdatingCache flag, and then bail without acquiring the lock if it's true. Then when the cache is finally updated, it'd call onConnectionStatusChanged and notify replay/cont.profiling about the latest status. There's (potentially) a chance replay would read a stale status and continue recording for a short period of time, but I'm hoping this chance is small and wouldn't result in a bigger problem than the current number of ANRs caused by it. Wdyt?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@romtsn So you're suggesting to return UNKNOWN when the cache is being updated in the background, right?

Because I think the issue of stale cache status happens just when the app goes to foreground, as that's the only time where we don't have the network listener updating the cache directly.

So in updateCache() I can check the current thread, and if it's on main it schedules in the background, otherwise (or the first time the cache is updated) it executes it right away
This way the onForeground() will update the cache and after that call the observers with the new status

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the last known cached status, and if it's not available then UNKNOWN. which may (or may not) be the correct one

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in updateCache() I can check the current thread, and if it's on main it schedules in the background, otherwise (or the first time the cache is updated) it executes it right away
This way the onForeground() will update the cache and after that call the observers with the new status

I think this sounds good, too. Just need to make sure we don't acquire the lock from resumeInternal (or getConnectionStatus() on the main thread)

while cache is updated, if no cache data exists, connection status is unknown and type is null
}
}
isUpdatingCache.set(false);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Infinite flag: caching functions return unknowns forever

If an exception occurs in updateCacheFromConnectivityManager() before reaching line 438, the isUpdatingCache flag remains true forever. This causes all subsequent calls to getConnectionStatus() and getConnectionType() to return UNKNOWN or null indefinitely. The flag should be reset in a finally block or the outer catch block should reset it.

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants