[JSC][WASM][Debugger] Fix STW deadlocks when VM blocks in memory.atomic.wait or WebCore operations

da1f31b

Source/WebCore/workers/WorkerSTWParticipation.h

+void waitWithSTWParticipation(BinarySemaphore& semaphore, VM& vm)
+{
+    static constexpr auto kPollInterval = 50_ms;
+    while (!semaphore.waitFor(kPollInterval)) {
+        if (vm.needStopTheWorld())
+            notifyVMStop(vm, VMStopped);
+    }
+}

Source/JavaScriptCore/runtime/WaiterListManager.cpp

-    result = waiter.wait(timeout);
+    while (!waiter.waitFor(kDebuggerSTWCheckInterval)) {
+        if (vm.needStopTheWorld())
+            notifyVMStop(vm, WasmAtomicsWaitBlocked); // skips clearStop()
+        if (MonotonicTime::now() >= deadline)
+            break;
+    }
+    clearStop(vm); // only cleared on exit from waitForSync()

The WASM debugger halts all VMs using a Stop-The-World (STW) protocol: a global NeedStopTheWorld flag is set, and the debugger thread waits until every participating VM has decremented an active count by calling notifyVMStop(). JSC normally achieves this through trap check points embedded in the interpreter loop. Worker threads blocked inside memory.atomic.wait or synchronous WebCore operations (which drive their own run loop internally via BinarySemaphore::wait()) never reach a trap check point, so they never call notifyVMStop(). As a result, the active count never reaches zero and the debugger hangs forever.
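The handshake can be modelled in a few lines. The sketch below is a toy, not JSC code: `ToyStopTheWorld`, `requestStop()`, `checkPoint()`, and `worldStopped()` are hypothetical names introduced only for illustration, and each VM is assumed to check in at most once per stop request.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical model of the stop-the-world handshake. None of these names
// are taken from JavaScriptCore.
struct ToyStopTheWorld {
    std::atomic<bool> needStopTheWorld { false };
    std::atomic<int> activeCount { 0 };

    // Debugger side: publish the stop request for N participating VMs.
    void requestStop(int participatingVMs)
    {
        activeCount.store(participatingVMs);
        needStopTheWorld.store(true);
    }

    // VM side: called at a check point. Does nothing when no stop is pending.
    // The toy assumes each VM calls this at most once per stop request.
    void checkPoint()
    {
        if (needStopTheWorld.load())
            activeCount.fetch_sub(1);
    }

    // Debugger side: the world is stopped once every VM has checked in.
    bool worldStopped() const
    {
        return needStopTheWorld.load() && !activeCount.load();
    }
};

// Two VMs participate, but the second is blocked in a wait and never
// reaches a check point, so the stop never completes.
bool demoBlockedVMPreventsStop()
{
    ToyStopTheWorld stw;
    stw.requestStop(2);
    stw.checkPoint(); // VM 1 reaches a trap check point
    return stw.worldStopped();
}

// Both VMs reach a check point, so the stop completes.
bool demoAllVMsCheckIn()
{
    ToyStopTheWorld stw;
    stw.requestStop(2);
    stw.checkPoint(); // VM 1
    stw.checkPoint(); // VM 2
    return stw.worldStopped();
}
```

In the first demo the count stays at one, which corresponds to the hang described above: the debugger thread has nothing to wake it.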

This commit adds polling-based STW participation at each blocking site via a new waitWithSTWParticipation() helper, and modifies WaiterListManager::waitForSync() to poll every 50 ms and call notifyVMStop() when NeedStopTheWorld is set. A new WasmAtomicsWaitBlocked callback type preserves stop state rather than clearing it on each check-in, because a single atomics wait may span multiple STW cycles before it completes.
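The polling pattern can be exercised outside WebKit with standard-library primitives. This is a minimal sketch under stated assumptions: `ToySemaphore` and `ToyVM` are hypothetical stand-ins for BinarySemaphore and the VM, the 5 ms interval is arbitrary, and the toy "debugger" resumes the VM as soon as it checks in.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a binary semaphore; not the WTF class.
class ToySemaphore {
public:
    void signal()
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_signaled = true;
        }
        m_condition.notify_one();
    }

    // Returns true if signaled within `interval`, false on timeout.
    bool waitFor(std::chrono::milliseconds interval)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        if (!m_condition.wait_for(lock, interval, [this] { return m_signaled; }))
            return false;
        m_signaled = false;
        return true;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_condition;
    bool m_signaled { false };
};

// Hypothetical stand-in for the VM-side stop state.
struct ToyVM {
    std::atomic<bool> needStopTheWorld { false };
    std::atomic<int> stopNotifications { 0 };

    void notifyStop()
    {
        stopNotifications.fetch_add(1);
        needStopTheWorld.store(false); // toy debugger resumes immediately
    }
};

// Blocks until the semaphore is signaled, checking in with the debugger
// after every poll interval in which a stop request is pending.
void waitWithParticipation(ToySemaphore& semaphore, ToyVM& vm, std::chrono::milliseconds pollInterval)
{
    while (!semaphore.waitFor(pollInterval)) {
        if (vm.needStopTheWorld.load())
            vm.notifyStop();
    }
}

// Runs one blocked worker, requests a single stop, then completes the wait.
// Returns how many times the worker checked in.
int demoStopRequestsObserved()
{
    ToySemaphore semaphore;
    ToyVM vm;
    std::thread worker([&] {
        waitWithParticipation(semaphore, vm, std::chrono::milliseconds(5));
    });

    vm.needStopTheWorld.store(true);     // debugger requests a stop
    while (!vm.stopNotifications.load()) // wait for the blocked worker to check in
        std::this_thread::yield();

    semaphore.signal();                  // the blocking operation completes
    worker.join();
    return vm.stopNotifications.load();
}
```

The trade-off is latency against wakeups: a blocked thread may take up to one poll interval to acknowledge a stop request, and it wakes once per interval even when no debugger is attached.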

The WasmAtomicsWaitBlocked callback deliberately skips clearStop() so that stop data persists until waitForSync() exits. Any error path that fails to call the compensating clearStop() on exit leaves the debugger looking at stale PC/CFR/stack data for a live, running thread.
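One common way to make the compensating call hard to miss is a scope guard whose destructor performs the cleanup. The sketch below is not what the commit does (the diff shows an explicit clearStop(vm) call after the loop); `ToyVM`, `StopDataScope`, and `waitForSyncSketch()` are hypothetical names used only to illustrate the idea.

```cpp
#include <cassert>

// Hypothetical per-VM stop snapshot; the single field is illustrative.
struct ToyVM {
    bool hasStopData { false };
    int stoppedPC { 0 };

    void recordStop(int pc)
    {
        hasStopData = true;
        stoppedPC = pc;
    }

    void clearStop()
    {
        hasStopData = false;
        stoppedPC = 0;
    }
};

// Clears stop data when the scope exits, whichever return path is taken.
class StopDataScope {
public:
    explicit StopDataScope(ToyVM& vm)
        : m_vm(vm)
    {
    }

    ~StopDataScope() { m_vm.clearStop(); }

    StopDataScope(const StopDataScope&) = delete;
    StopDataScope& operator=(const StopDataScope&) = delete;

private:
    ToyVM& m_vm;
};

enum class WaitOutcome { Signaled, Failed };

// Simulates a blocked wait that is stopped `stopCycles` times by the
// debugger and then either completes or bails out on an error path.
WaitOutcome waitForSyncSketch(ToyVM& vm, int stopCycles, bool failEarly)
{
    StopDataScope scope(vm);

    for (int cycle = 0; cycle < stopCycles; ++cycle)
        vm.recordStop(100 + cycle); // stop data persists while still blocked

    if (failEarly)
        return WaitOutcome::Failed; // guard still clears the stop data

    return WaitOutcome::Signaled;
}
```

With the guard in place, both the normal return and the early error return leave the VM without stop data, so the debugger cannot observe a stale snapshot for a thread that has resumed.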

New STW participation paths at multiple blocking sites introduce edge cases in state lifetime and epoch ordering.