SharedArrayBuffer & Atomics
Shared-memory concurrency lets multiple workers read and write the same bytes without copying data across the thread boundary. The result is sub-millisecond coordination and zero serialization overhead, but it demands careful use of the Atomics API to avoid data races.
This page is part of the Web Workers Architecture & Communication reference. If you are new to Web Workers generally, start there for an overview of the thread model before diving into the shared-memory path covered here.
Why Shared Memory Bypasses Structured Clone
Every postMessage call runs the structured-clone algorithm on the payload, copies all reachable bytes into a new heap allocation in the target context, and then reconstructs the object graph. For a 10 MB typed array that means 10 MB copied twice (once out of source, once into destination) and a clone cost of roughly 12–18 ms on a mid-range laptop.
SharedArrayBuffer sidesteps this entirely. The operating system maps the same physical memory pages into every thread that holds a reference. No copy occurs: all threads read and write the same bytes in place. The trade-off is that the programmer must manage visibility and ordering explicitly using the Atomics API, because the CPU and JIT compiler are free to reorder plain memory operations for performance.
Browsers disabled SharedArrayBuffer after the Spectre disclosure in 2018 because a shared timer (the implicit clock provided by shared memory) can be used to mount a side-channel attack. It was re-enabled in 2020 only when the page is cross-origin isolated. Your server must send both of these headers on every response for the document:
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
Without them, typeof SharedArrayBuffer is "undefined" in all contexts on the page, including workers. See Debugging SharedArrayBuffer Cross-Origin Errors for a step-by-step diagnostic guide.
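As a minimal sketch of the header requirement above, the two values can be kept in one place and applied through whatever header-setting call your server exposes. The names `ISOLATION_HEADERS` and `applyIsolationHeaders` are hypothetical, and the `setHeader` callback is an assumption standing in for your framework's own API.

```typescript
// Hypothetical helper: the two headers needed for cross-origin isolation.
const ISOLATION_HEADERS: Readonly<Record<string, string>> = {
  'Cross-Origin-Opener-Policy': 'same-origin',
  'Cross-Origin-Embedder-Policy': 'require-corp',
};

// `setHeader` is an assumption: pass in whatever your server provides,
// for example a wrapper around Node's `res.setHeader(name, value)`.
function applyIsolationHeaders(
  setHeader: (name: string, value: string) => void,
): void {
  for (const [name, value] of Object.entries(ISOLATION_HEADERS)) {
    setHeader(name, value);
  }
}
```

Calling `applyIsolationHeaders` on every document response keeps the pair together, which matters because either header alone does not make the page cross-origin isolated.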
Prerequisites
Before working through the steps below, verify:
- Your server delivers Cross-Origin-Opener-Policy: same-origin and Cross-Origin-Embedder-Policy: require-corp.
- You are targeting Chrome 92+, Firefox 79+, or Safari 15.2+.
- Your worker scripts are module workers ({ type: 'module' }) or classic scripts. Both work, but module workers give you static import.
- You understand basic postMessage / onmessage patterns (see Message Passing Strategies if not).
Step 1: Allocate the SharedArrayBuffer
Allocate on the main thread and distribute via postMessage. Unlike a plain ArrayBuffer, sending a SharedArrayBuffer does not transfer ownership: all recipients share the same physical memory.
// main.ts
const SAB_BYTES = 4 * 1024; // 4 KB, multiple of Int32 (4 bytes)
const sab = new SharedArrayBuffer(SAB_BYTES);
const workerA = new Worker(new URL('./worker-a.ts', import.meta.url), { type: 'module' });
const workerB = new Worker(new URL('./worker-b.ts', import.meta.url), { type: 'module' });
// Both workers receive a reference to the SAME backing memory.
workerA.postMessage({ type: 'INIT', sab });
workerB.postMessage({ type: 'INIT', sab });
Sending a SharedArrayBuffer via postMessage is O(1): only a reference to the shared memory is passed, not the data. The cost of the call does not grow with buffer size, and it typically completes in under 0.05 ms.
Step 2: Create Typed-Array Views
Wrap the SharedArrayBuffer in a typed array to address individual elements. Every view over the same SharedArrayBuffer is a live window into the same bytes: mutations made through one view are immediately visible through another (subject to memory ordering).
// worker-a.ts
let view: Int32Array;
self.onmessage = ({ data }) => {
  if (data.type === 'INIT') {
    view = new Int32Array(data.sab);
    // view[0] is the same four bytes as view[0] in worker-b.ts
    console.log('Worker A initialized, buffer length:', view.length); // 1024 Int32 slots
  }
};
Choose your typed-array element type based on what you store. Atomics.wait and Atomics.notify require an Int32Array (or a BigInt64Array). Float64Array gives you 64-bit precision, but Atomics operations accept only integer typed arrays, so you can neither block-wait on it nor access it atomically. Uint8Array is useful for byte-level flag arrays.
Allocate SharedArrayBuffer sizes as multiples of the largest element type you intend to use (8 bytes for Float64Array, 4 for Int32Array). Misaligned views throw a RangeError at construction time.
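The sizing and alignment rules above can be checked directly. The following is a small sketch, not a prescribed API: `alignedByteLength` and `isMisaligned` are hypothetical helper names, and the 13-byte request is an arbitrary illustration.

```typescript
// Round a byte count up to a multiple of the widest element size in use.
function alignedByteLength(byteCount: number, elementSize: number): number {
  return Math.ceil(byteCount / elementSize) * elementSize;
}

// 13 requested bytes rounded up for Float64Array (8-byte elements).
const shared = new SharedArrayBuffer(alignedByteLength(13, 8));
const ints = new Int32Array(shared);     // Int32 view over the buffer
const byteView = new Uint8Array(shared); // byte view over the SAME memory

// A write through one view is visible through the other.
ints[0] = 0x01020304;
// byteView[0..3] now hold 1, 2, 3, 4 in platform byte order.

// A byte offset that is not a multiple of the element size is rejected.
function isMisaligned(buffer: SharedArrayBuffer, byteOffset: number): boolean {
  try {
    new Int32Array(buffer, byteOffset);
    return false;
  } catch (err) {
    return err instanceof RangeError;
  }
}
```

Rounding up at allocation time is cheaper than discovering a RangeError later, when a wider view is added to an existing layout.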
Step 3: Atomics.load and Atomics.store
Never read or write a shared slot with plain index syntax when other threads may be accessing the same slot concurrently. Use Atomics.load and Atomics.store instead: they issue sequentially consistent operations that the CPU cannot reorder past each other.
// worker-a.ts (producer)
// Slot 0 is a status flag: 0 = idle, 1 = data ready
Atomics.store(view, 0, 1); // Publish: data is ready
// worker-b.ts (consumer)
const status = Atomics.load(view, 0); // Guaranteed to see 1 if A stored 1 before this
if (status === 1) {
  processData(view);
}
The key mental model: every Atomics.* call is a fence. Stores before the fence are visible to loads after the fence in other threads, as long as both sides use atomic operations. Plain array reads (view[0]) carry no such guarantee and may read stale values from CPU cache lines.
Atomics.load and Atomics.store on a cold cache line cost roughly 10–30 ns on x86 (one cache-coherence round-trip). On M-series ARM they are slightly cheaper. This is at least 100× faster than a postMessage round-trip.
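The flag-then-data pattern from this step can be sketched as a pair of functions. The slot layout (`FLAG`, `PAYLOAD`) and the names `publish` and `tryConsume` are assumptions made for illustration; the point is that the payload is written first and the atomic store on the flag is what publishes it.

```typescript
const FLAG = 0;    // 0 = idle, 1 = data ready (assumed layout)
const PAYLOAD = 1; // payload starts at slot 1 (assumed layout)

// Producer: fill the payload, then publish with one atomic store.
function publish(view: Int32Array, values: number[]): void {
  values.forEach((value, i) => {
    view[PAYLOAD + i] = value; // plain write, sequenced before the store below
  });
  Atomics.store(view, FLAG, 1);
}

// Consumer: read the payload only after an atomic load observes the flag.
function tryConsume(view: Int32Array, count: number): number[] | null {
  if (Atomics.load(view, FLAG) !== 1) return null; // nothing published yet
  const out = Array.from(view.subarray(PAYLOAD, PAYLOAD + count));
  Atomics.store(view, FLAG, 0); // hand the buffer back to the producer
  return out;
}
```

This sketch assumes one producer and one consumer that alternate. With several producers the flag alone is not enough and a lock or queue is needed.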
Step 4: Atomics.add and compareExchange
Atomics.add performs a read-modify-write atomically. It is useful for lock-free counters, ring-buffer indices, and reference counts. Atomics.compareExchange is the building block for all lock-free algorithms: it atomically reads a slot, compares it against an expected value, and writes a new value only if the comparison succeeds, returning the old value either way.
// Atomic counter: multiple workers can safely increment without a mutex
const COUNTER_SLOT = 0;
// In any worker:
const prev = Atomics.add(view, COUNTER_SLOT, 1);
// prev is the value BEFORE the add; view[COUNTER_SLOT] is now prev + 1
// CAS (compare-and-swap) to implement a simple spinlock acquire:
function spinLockAcquire(view: Int32Array, lockSlot: number): void {
  while (Atomics.compareExchange(view, lockSlot, 0, 1) !== 0) {
    // 0 = unlocked, 1 = locked
    // Keep spinning until we successfully swapped 0 → 1
  }
}
function spinLockRelease(view: Int32Array, lockSlot: number): void {
  Atomics.store(view, lockSlot, 0);
}
A tight spin-lock loop burns CPU for as long as the lock is held. On a single-core machine (or in a heavily loaded browser tab) the spinner can starve the very thread that holds the lock, so nothing makes progress until the scheduler preempts it. Prefer Atomics.wait / Atomics.notify (Step 5) for any wait that may last more than a few microseconds.
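Beyond locks, the usual shape of a compareExchange algorithm is a retry loop: read the current value, compute a replacement, and swap only if nobody changed the slot in between. The sketch below uses that loop to keep a shared maximum. `atomicMax` is a hypothetical helper name, not part of the Atomics API.

```typescript
// Raise the value in `slot` to `candidate` if candidate is larger.
// Returns the value the slot holds once this call has finished.
function atomicMax(view: Int32Array, slot: number, candidate: number): number {
  let observed = Atomics.load(view, slot);
  while (candidate > observed) {
    const previous = Atomics.compareExchange(view, slot, observed, candidate);
    if (previous === observed) {
      return candidate; // our swap won
    }
    observed = previous; // another thread got in first; retry against its value
  }
  return observed; // the slot already held a value >= candidate
}
```

Unlike the spinlock, this loop retries only when another thread has actually made progress, so it cannot stall behind a descheduled lock holder.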
Step 5: Atomics.wait, Atomics.waitAsync, and Atomics.notify
Atomics.wait parks the calling thread until another thread calls Atomics.notify on the same slot. It is modeled after the POSIX futex system call. The fourth argument is an optional timeout in milliseconds.
Atomics.waitAsync is the non-blocking Promise-based equivalent for the main thread (or any context where blocking is prohibited).
// worker-b.ts: consumer, parks until the producer signals
self.onmessage = ({ data }) => {
  if (data.type === 'INIT') {
    const view = new Int32Array(data.sab);
    const SIGNAL_SLOT = 1;
    // Block this worker thread while slot 1 is still 0.
    // Returns "ok" (woken by notify), "not-equal" (slot was already non-zero,
    // so the signal arrived before we waited), or "timed-out".
    const result = Atomics.wait(view, SIGNAL_SLOT, 0, 5000 /* ms timeout */);
    if (result !== 'timed-out') {
      const value = Atomics.load(view, 0);
      self.postMessage({ type: 'RESULT', value });
    }
  }
};
// worker-a.ts: producer, writes data then wakes the consumer
function produceAndSignal(view: Int32Array, value: number): void {
  Atomics.store(view, 0, value); // Write the data
  Atomics.store(view, 1, 1); // Set signal flag
  Atomics.notify(view, 1, 1); // Wake at most 1 waiter on slot 1
}
// main.ts: non-blocking wait with waitAsync (main thread cannot call Atomics.wait)
async function waitForSignal(view: Int32Array, slot: number): Promise<void> {
  // .value is a Promise while the wait is pending, or a plain string when the
  // call resolves synchronously; await handles both.
  const result = await Atomics.waitAsync(view, slot, 0).value;
  if (result !== 'timed-out') {
    console.log('Signal received, value:', Atomics.load(view, 0));
  }
}
Calling Atomics.wait from the main thread throws a TypeError: Atomics.wait cannot be called from the main thread. Always use Atomics.waitAsync in main-thread code. Worker threads may call either form.
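Because Atomics.wait can return for three different reasons, a common way to use it is inside a loop that re-checks the slot and tracks a deadline. The sketch below is one way to write that; `waitUntilNonZero` is a hypothetical helper name, and it assumes the calling context is allowed to block (a worker in the browser; Node.js also permits it on its main thread).

```typescript
// Block until `slot` holds a non-zero value or `timeoutMs` elapses.
// Returns the non-zero value, or null on timeout.
function waitUntilNonZero(
  view: Int32Array,
  slot: number,
  timeoutMs: number,
): number | null {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const current = Atomics.load(view, slot);
    if (current !== 0) return current; // signal already present, no need to park
    const remaining = deadline - Date.now();
    if (remaining <= 0) return null;   // gave up waiting
    // Parks only if the slot still holds 0; otherwise returns "not-equal" at once.
    Atomics.wait(view, slot, 0, remaining);
  }
}
```

Re-checking after every return means the caller never has to distinguish "ok" from "not-equal", and a wake-up that arrives before the wait is not lost.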
Step 6: Memory Ordering and Happens-Before
The ECMAScript memory model defines a sequentially consistent order for atomic operations. Every Atomics.* call participates in a total order that all agents observe consistently. However, there are subtle rules to keep in mind:
- Atomics only order other atomics. Plain array reads and writes (view[i], view[i] = x) get no ordering of their own. If you need a non-atomic write to be visible in another thread, use a sequencing pattern: write all data first, then issue an atomic store on a flag slot as the release. A reader that observes the flag with Atomics.load then sees the earlier writes.
- Data written before Atomics.notify is visible after Atomics.wait returns. The notify/wait pair establishes a happens-before edge.
- SAB-backed typed arrays share cache lines. Two independently modified slots that happen to fall on the same 64-byte cache line will cause false sharing between cores, degrading performance. Pad frequently written slots to 64-byte boundaries when benchmarking reveals cache-line contention.
Two workers writing to different slots on the same 64-byte cache line can cause 3–5× throughput degradation compared to slots on separate lines. Align per-worker counters to 64-byte boundaries (slot = workerIndex * 16 for Int32Array) in high-throughput designs.
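The padding rule slot = workerIndex * 16 can be wrapped in a few helpers so the layout lives in one place. This is a sketch under two assumptions: a 64-byte cache line, and a buffer whose start is itself line-aligned. All names here are hypothetical.

```typescript
const CACHE_LINE_BYTES = 64; // assumption: typical line size, not queried from hardware
const SLOTS_PER_LINE = CACHE_LINE_BYTES / Int32Array.BYTES_PER_ELEMENT; // 16

// One full cache line per worker, so no two counters share a line.
function createPaddedCounters(workerCount: number): Int32Array {
  return new Int32Array(new SharedArrayBuffer(workerCount * CACHE_LINE_BYTES));
}

// Index of the counter owned by a given worker.
function counterSlot(workerIndex: number): number {
  return workerIndex * SLOTS_PER_LINE;
}

// Atomically bump a worker's counter and return the new value.
function incrementCounter(counters: Int32Array, workerIndex: number): number {
  return Atomics.add(counters, counterSlot(workerIndex), 1) + 1;
}
```

The cost is memory: 64 bytes per counter instead of 4. That trade is usually only worth making for slots that are written on a hot path.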
[Figure: shared memory region diagram]
Data-Transfer Strategy for This Pattern
How should you move data between workers? The right answer depends on buffer lifetime and ownership semantics:
| Strategy | Mechanism | Copy cost | Ordering guarantee | Best for |
|---|---|---|---|---|
| Structured clone | postMessage(obj) | Full copy (~12–18 ms / 10 MB) | N/A: each side has its own copy | One-off payloads, complex object graphs |
| Transfer | postMessage(buf, [buf]) | Zero (ownership moves) | N/A: only one owner at a time | Large one-shot buffers; producer → consumer pipelines |
| Shared memory | SharedArrayBuffer + Atomics | None (shared pages) | Requires Atomics.* calls | Concurrent read/write; ring buffers; low-latency signalling |
Use Transferable Objects & Zero-Copy when you need to move a buffer from one owner to another with no copy but without simultaneous access. Reach for SharedArrayBuffer + Atomics when multiple workers need to operate on the same data concurrently, or when you need coordination latency below the ~0.1 ms floor of postMessage.
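The selection rule in the paragraph above can be written down as a small function. This is only an illustrative encoding of that rule: the `PayloadTraits` fields and the `chooseStrategy` name are hypothetical, and a real project may weigh other factors.

```typescript
type Strategy = 'structured-clone' | 'transfer' | 'shared-memory';

// Hypothetical description of a payload and its environment.
interface PayloadTraits {
  isBinaryBuffer: boolean;         // ArrayBuffer-backed data rather than an object graph
  needsConcurrentAccess: boolean;  // several workers touch the data at the same time
  needsSubMessageLatency: boolean; // coordination must beat the ~0.1 ms postMessage floor
  crossOriginIsolated: boolean;    // COOP/COEP are in place
}

function chooseStrategy(traits: PayloadTraits): Strategy {
  const wantsShared =
    traits.needsConcurrentAccess || traits.needsSubMessageLatency;
  if (wantsShared && traits.crossOriginIsolated) return 'shared-memory';
  if (traits.isBinaryBuffer) return 'transfer'; // zero-copy, one owner at a time
  return 'structured-clone';                    // general-purpose fallback
}
```

Note the fallback when shared memory is wanted but the page is not isolated: the sketch degrades to transfer or clone rather than failing.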
Verification & Measurement
Once your shared-memory setup is running, verify correctness and measure performance with the following techniques.
Check cross-origin isolation at runtime:
function assertCrossOriginIsolated(): void {
  if (!crossOriginIsolated) {
    throw new Error(
      'Page is not cross-origin isolated. ' +
      'Add COOP: same-origin and COEP: require-corp headers.'
    );
  }
}
Measure atomic round-trip latency:
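// main.ts: time a notify → wake → reply round-trip
const sab = new SharedArrayBuffer(4);
const view = new Int32Array(sab);
const worker = new Worker('./echo-worker.js');
let t0 = 0;
// Register the handler before signalling so the reply cannot be missed.
worker.onmessage = () => {
  const roundTrip = performance.now() - t0;
  console.log(`Notify → wake → reply round-trip: ${roundTrip.toFixed(3)} ms`);
};
worker.postMessage({ sab });
// Note: the first sample includes worker startup. Repeat the measurement once
// the worker is already parked in Atomics.wait for a steady-state figure.
t0 = performance.now();
Atomics.store(view, 0, 1);
Atomics.notify(view, 0, 1);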
Typical results on a 2023 laptop: 0.005–0.02 ms for an Atomics notify/wait pair versus 0.1–0.5 ms for a postMessage round-trip. The difference is most pronounced when messages are queued behind other messages in the worker's event loop.
Failure Modes & Error Handling
SharedArrayBuffer is undefined: missing COOP/COEP. Add the headers. Check with crossOriginIsolated in the browser console.
TypeError: Atomics.wait cannot be called from the main thread. Replace Atomics.wait with Atomics.waitAsync in any main-thread code path.
RangeError when constructing a typed-array view: the buffer byte length is not a multiple of the element size, or the byte offset is misaligned.
Deadlock: Worker A waits for slot 1 to become 0 while Worker B also waits for slot 1 to become 0, so neither can proceed. Design wait conditions so at least one thread always makes progress (avoid circular wait). Use timeouts in Atomics.wait to detect a stall: a result of "timed-out" tells you that the expected signal never arrived.
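Lost wake: a notify fires before the waiter calls wait, so a wait that ignores the slot value parks forever. Atomics.wait guards against this itself, because it compares the slot against the expected value atomically and returns "not-equal" without parking if the value has already changed. Pass the "not yet signalled" value as the expected argument and re-check in a loop:
// Safe wait idiom: re-check the condition around every wait
while (Atomics.load(view, SLOT) === EXPECTED) {
  Atomics.wait(view, SLOT, EXPECTED); // parks only if the slot still holds EXPECTED
}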
Data races on non-atomic reads: if any code path reads a slot with view[i] while another thread writes with Atomics.store, you have undefined visibility. Audit every access path and replace bare reads on shared slots with Atomics.load.
Lock-Free vs Lock-Based Patterns
Lock-free algorithms (built with Atomics.compareExchange and careful ordering) guarantee system-wide progress: if one thread stalls, others continue. The ring buffer recipe is the canonical example.
Lock-based algorithms (built with Atomics.wait / Atomics.notify as a futex) are easier to reason about for complex critical sections but can deadlock if the thread holding the lock is terminated. For browser workers this is usually acceptable because worker crashes are observable and restartable.
Use lock-free for:
- Single-producer / single-consumer ring buffers
- Shared counters and reference counts
- Flag polling loops with known fast paths
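To make the first item concrete, here is a minimal single-producer / single-consumer ring buffer sketch. It is not the ring buffer recipe referenced above. The slot layout (`HEAD`, `TAIL`, `DATA`) and the function names are assumptions, and it is only safe with exactly one pushing thread and one popping thread.

```typescript
const HEAD = 0; // next index to read; written only by the consumer
const TAIL = 1; // next index to write; written only by the producer
const DATA = 2; // first data slot; one data slot always stays empty

// Producer side. Returns false when the buffer is full.
function tryPush(view: Int32Array, value: number): boolean {
  const size = view.length - DATA;
  const tail = Atomics.load(view, TAIL);
  const next = (tail + 1) % size;
  if (next === Atomics.load(view, HEAD)) return false; // full
  Atomics.store(view, DATA + tail, value);
  Atomics.store(view, TAIL, next); // publish the new element
  return true;
}

// Consumer side. Returns undefined when the buffer is empty.
function tryPop(view: Int32Array): number | undefined {
  const size = view.length - DATA;
  const head = Atomics.load(view, HEAD);
  if (head === Atomics.load(view, TAIL)) return undefined; // empty
  const value = Atomics.load(view, DATA + head);
  Atomics.store(view, HEAD, (head + 1) % size); // release the slot
  return value;
}
```

Each index has a single writer, which is why no compareExchange is needed here. A multi-producer queue would need a CAS on the tail index.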
Use mutex-style locks (futex) for:
- Multi-step critical sections spanning several slots
- Complex state machines requiring exclusive access
- Anything where a CAS loop would be difficult to reason about
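For the lock-based side, a basic futex-style mutex can be sketched with compareExchange for the acquire and wait/notify for parking. This is a deliberately simple variant that always calls notify on release; the names are hypothetical, and `lock` must only be called where blocking is permitted (a worker, in the browser).

```typescript
const UNLOCKED = 0;
const LOCKED = 1;

// Try once, without blocking.
function tryLock(view: Int32Array, slot: number): boolean {
  return Atomics.compareExchange(view, slot, UNLOCKED, LOCKED) === UNLOCKED;
}

// Block until the lock is acquired.
function lock(view: Int32Array, slot: number): void {
  while (!tryLock(view, slot)) {
    // Parks only while the slot still reads LOCKED; returns at once otherwise.
    Atomics.wait(view, slot, LOCKED);
  }
}

// Release the lock and wake one parked waiter, if any.
function unlock(view: Int32Array, slot: number): void {
  Atomics.store(view, slot, UNLOCKED);
  Atomics.notify(view, slot, 1);
}

// Run a critical section with the lock held, releasing even on throw.
function withLock<T>(view: Int32Array, slot: number, criticalSection: () => T): T {
  lock(view, slot);
  try {
    return criticalSection();
  } finally {
    unlock(view, slot);
  }
}
```

The try/finally in `withLock` covers exceptions inside the critical section. It does not help if the worker holding the lock is terminated, which is the deadlock risk described above.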
The Coordinating Workers with Atomics.wait and notify page covers the mutex pattern in detail.
Browser Compatibility
| Feature | Chrome | Firefox | Safari | Edge |
|---|---|---|---|---|
| SharedArrayBuffer (COOP/COEP required) | 92 | 79 | 15.2 | 92 |
| Atomics (basic: load/store/add/sub/and/or/xor/exchange/compareExchange) | 60 | 57 | 10.1 (no SAB until 15.2) | 79 |
| Atomics.wait (workers only) | 60 | 57 | 15.2 | 79 |
| Atomics.notify | 60 | 57 | 15.2 | 79 |
| Atomics.waitAsync | 87 | 100 | 16.4 | 87 |
| crossOriginIsolated property | 87 | 72 | 15.2 | 87 |
Because Safari shipped SharedArrayBuffer only with COOP/COEP, starting in 15.2, your lowest common denominator for full support is Safari 15.2, released December 2021. Older Safari versions (10.1–15.1) shipped the Atomics object but without SharedArrayBuffer, making it effectively unusable for shared-memory patterns.
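Version tables go stale, so a runtime check is often more dependable than a version sniff. The sketch below probes for the features in the table. `detectSharedMemorySupport` and its result shape are hypothetical, and a `true` for `atomicsWait` only means the function exists, not that the current thread may block.

```typescript
interface SharedMemorySupport {
  sharedArrayBuffer: boolean;
  atomicsWait: boolean;
  atomicsWaitAsync: boolean;
  crossOriginIsolated: boolean;
}

function detectSharedMemorySupport(): SharedMemorySupport {
  const hasAtomics = typeof Atomics !== 'undefined';
  return {
    sharedArrayBuffer: typeof SharedArrayBuffer !== 'undefined',
    atomicsWait: hasAtomics && typeof Atomics.wait === 'function',
    atomicsWaitAsync: hasAtomics && 'waitAsync' in Atomics,
    // Read via Reflect so the code also compiles where the DOM typings are absent.
    crossOriginIsolated: Reflect.get(globalThis, 'crossOriginIsolated') === true,
  };
}
```

A caller can use the result to fall back to postMessage with transferable buffers when `sharedArrayBuffer` is false.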
When to Choose Shared Memory Over Message Passing
Reach for SharedArrayBuffer when:
- Multiple workers need concurrent access to the same data, e.g., two decoders writing into separate halves of a shared output buffer that a third worker reads continuously.
- Coordination latency matters. A postMessage round-trip is 0.1–0.5 ms. An Atomics notify/wait round-trip is 0.005–0.02 ms. For real-time audio processing (128-sample Web Audio blocks at 48 kHz = 2.67 ms per block) the difference is meaningful.
- You are implementing a ring buffer or lock-free queue. These data structures require atomic index updates that are impractical over postMessage.
Stick with Message Passing Strategies and transferable buffers when:
- You are moving a buffer from one owner to another with no concurrent access needed.
- The payload is sent once, not repeatedly polled.
- COOP/COEP headers are not feasible (e.g., embedding third-party iframes that do not set crossorigin correctly).
- Code simplicity outweighs the latency win. Structured clone or transfers are far easier to debug.
For a detailed decision rubric with a comparison table, see postMessage vs SharedArrayBuffer: When to Choose Each.
For deeper Worker Pool Management patterns that combine pools with shared memory, see the pool management guide.