Sharing memory through copying
SQLite requires all connections to a WAL mode database to share memory for the WAL-Index.
This can be problematic in a Wasm environment, where each connection lives in its own sandbox, and access to memory sharing primitives might not be available.
Given those restrictions, I devised a scheme that shares the WAL-index memory through copying at appropriate safe points, which has been used successfully by this project's Windows and dotlk VFSes.
The general idea is that I have a memory area that's shared by all connections (but inaccessible to them), and then each connection owns two copies of it: private (the copy handed out to SQLite) and shadow. I keep these in sync through two operations which I call acquire and release.
Acquire finds changes between shared and shadow, and copies them to private (and shadow); at the end shadow is an exact copy of shared. This gets updates from the shared memory to the private copy.
Release finds changes between private and shadow, and copies them to shared (and shadow); at the end shadow is an exact copy of private. This publishes private changes to the shared memory.
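As a rough illustration, the two operations could be sketched like this in Go. The region type, its fields and newRegion are names made up for this example, and the sketch ignores whatever synchronization the shared area needs in practice; it is not the project's actual code.

```go
package main

import "fmt"

// region holds the three copies for one connection.
// All names here are hypothetical, chosen for this sketch only.
type region struct {
	shared  []uint32 // common to all connections, never handed to SQLite
	private []uint32 // the copy handed out to SQLite
	shadow  []uint32 // what this connection last saw in, or wrote to, shared
}

func newRegion(shared []uint32) *region {
	return &region{
		shared:  shared,
		private: make([]uint32, len(shared)),
		shadow:  make([]uint32, len(shared)),
	}
}

// acquire copies words that differ between shared and shadow
// into private and shadow; afterwards shadow equals shared.
func (r *region) acquire() {
	for i, s := range r.shared {
		if r.shadow[i] != s {
			r.shadow[i] = s
			r.private[i] = s
		}
	}
}

// release copies words that differ between private and shadow
// into shared and shadow; afterwards shadow equals private.
func (r *region) release() {
	for i, p := range r.private {
		if r.shadow[i] != p {
			r.shadow[i] = p
			r.shared[i] = p
		}
	}
}

func main() {
	shared := make([]uint32, 4)
	a, b := newRegion(shared), newRegion(shared)

	a.private[1] = 42 // connection A writes to its private copy
	a.release()       // A publishes the change
	b.acquire()       // B picks it up

	fmt.Println(b.private[1]) // prints 42
}
```

Note that acquire only overwrites private words whose shared value changed, so private words that have not been released yet are left alone.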
Given these primitives:
- after we map new memory (xShmMap), we acquire it
- before we unmap memory (xShmUnmap), we release it
- after we acquire a lock (xShmLock), we acquire the memory
- before we release a lock (xShmLock), we release the memory
- to issue a memory barrier (xShmBarrier), we first acquire, then release
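The ordering in these rules can be sketched as follows. The conn type and its methods are hypothetical stand-ins for the VFS callbacks; they only record the order of operations, rather than touching real memory or locks.

```go
package main

import "fmt"

// conn is a hypothetical connection that records each step it takes.
type conn struct{ trace []string }

func (c *conn) step(s string) { c.trace = append(c.trace, s) }

// Stand-ins for the copy operations.
func (c *conn) acquire() { c.step("acquire") }
func (c *conn) release() { c.step("release") }

// shmMap: map the memory first, then pull the current shared state.
func (c *conn) shmMap() {
	c.step("map")
	c.acquire()
}

// shmUnmap: publish pending changes, then unmap.
func (c *conn) shmUnmap() {
	c.release()
	c.step("unmap")
}

// shmLock covers both directions of xShmLock.
func (c *conn) shmLock(unlock bool) {
	if unlock {
		c.release() // publish before others can take the lock
		c.step("unlock")
		return
	}
	c.step("lock")
	c.acquire() // refresh only once the lock is held
}

// shmBarrier: first acquire, then release.
func (c *conn) shmBarrier() {
	c.acquire()
	c.release()
}

func main() {
	c := &conn{}
	c.shmMap()
	c.shmLock(false)
	c.shmBarrier()
	c.shmLock(true)
	c.shmUnmap()
	fmt.Println(c.trace)
}
```

The point of the ordering is that copies always happen while the relevant lock is held: changes are fetched after locking and published before unlocking.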
This works (I assume…) because SQLite needs to support platforms where reads/writes go to a cache and are not (necessarily) immediately visible to other connections.
Finding and applying differences is based on 32-bit words, for reasons explained in this comment. The WAL index has, in order:
1. two 48-byte, checksummed copies of the same information, only changed between barriers;
2. checkpoint information, in 32-bit aligned words;
3. data that only changes under an exclusive lock.
Number (2) is the most sensitive, and why I use 32-bit words. I need to ensure I don't corrupt those values with partial reads/writes. These are also the fields SQLite uses atomics for (I assume to ensure no partial reads/writes). From testing, it doesn't seem required that they go straight to memory, but I may be gravely mistaken.
Most of the data in (1) is also 32-bit aligned. It doesn't worry me much because SQLite already has an additional private copy of this, keeps checksums, uses memcpy/memcmp, and barriers. ISTM any corruption I might introduce here only causes a performance degradation (synchronization fails and gets retried).
Everything in (3) is modified under the exclusive WAL_WRITE_LOCK, so doesn't worry me at all.
Doing all these back and forth copies can get slow. Not terribly slow, but slow. So there are a few additional optimizations, also detailed in the comment:
1. assume that, if the WAL-index header doesn't change, neither do the hash tables;
2. assume that SQLite doesn't change the WAL-index without acquiring an exclusive lock.
Number (1) means that, if the first 136 bytes are equal, I don't need to look at the rest of the 32KB (or more) of memory. There's an additional possible optimization that I'm not doing because I don't have a benchmark that shows it's worth it: if the first 16KB of any 32KB page did not change, the other 16KB hasn't changed either.
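A minimal sketch of that header shortcut, assuming the memory is viewed as a slice of 32-bit words. The function names and the returned word count are inventions for this example (the count is only there to make the saving visible), not the project's code.

```go
package main

import "fmt"

const (
	headerBytes = 136             // WAL-index header size quoted in the text
	headerWords = headerBytes / 4 // 34 32-bit words
)

// equalWords reports whether two equally long word slices match.
func equalWords(a, b []uint32) bool {
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// acquire copies words that differ between shared and shadow into
// private and shadow, and returns how many words it had to examine.
func acquire(shared, private, shadow []uint32) int {
	// Assumption from the text: an unchanged header implies
	// unchanged hash tables, so the rest can be skipped.
	if equalWords(shared[:headerWords], shadow[:headerWords]) {
		return headerWords
	}
	for i, s := range shared {
		if shadow[i] != s {
			shadow[i] = s
			private[i] = s
		}
	}
	return len(shared)
}

func main() {
	const words = 32 * 1024 / 4 // one 32KB region
	shared := make([]uint32, words)
	private := make([]uint32, words)
	shadow := make([]uint32, words)

	fmt.Println(acquire(shared, private, shadow)) // header unchanged: 34

	shared[0], shared[5000] = 1, 99 // a writer changed header and hash table
	fmt.Println(acquire(shared, private, shadow)) // full scan: 8192
	fmt.Println(private[5000])                    // 99
}
```

The shortcut is only as good as the assumption behind it: a change past the header that leaves the header untouched would be missed.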
Number (2) means that the rule "before we release a lock, we release the memory" only needs to apply for exclusive locks of any kind. If it's not an exclusive lock, we didn't change the memory, so there are no changes to publish.
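In terms of the flags SQLite passes to xShmLock, the decision could look like the sketch below. The constant values are the ones defined in sqlite3.h; the two helper functions are hypothetical names for this example.

```go
package main

import "fmt"

// Flag values passed to xShmLock, as defined in sqlite3.h.
const (
	shmUnlock    = 1 // SQLITE_SHM_UNLOCK
	shmLock      = 2 // SQLITE_SHM_LOCK
	shmShared    = 4 // SQLITE_SHM_SHARED
	shmExclusive = 8 // SQLITE_SHM_EXCLUSIVE
)

// releaseBeforeUnlock reports whether memory must be published before
// dropping the lock: only exclusive locks can have changed anything.
func releaseBeforeUnlock(flags int) bool {
	return flags&shmUnlock != 0 && flags&shmExclusive != 0
}

// acquireAfterLock reports whether memory must be refreshed after
// taking the lock: this applies to shared and exclusive locks alike.
func acquireAfterLock(flags int) bool {
	return flags&shmLock != 0
}

func main() {
	fmt.Println(releaseBeforeUnlock(shmUnlock | shmShared))    // false
	fmt.Println(releaseBeforeUnlock(shmUnlock | shmExclusive)) // true
	fmt.Println(acquireAfterLock(shmLock | shmShared))         // true
}
```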
I tested these assumptions through asserts for quite some time before coming to believe them.