
[Wasm] Use a lazy restore frame when returning from tail calls

5223ee5

JSTests/wasm/stress/tail-call-cross-instance-gc.js

+// Test that GC during a cross-instance tail call chain doesn't corrupt the
+// RestoreInstanceCallee thunk frame. The thunk's Callee slot holds the
+// RestoreInstanceCallee singleton and CodeBlock holds a wasmInstance pointer;
+// both must survive GC scanning.
+ function triggerGC() { $vm.gc(); }
+ assert.eq(instPing.exports.start(1), 255);
+ assert.eq(instPing.exports.start(11), 255);
+ for (let i = 0; i < wasmTestLoopCount; i++) {
+     assert.eq(instPing.exports.start(1), 255);
+     assert.eq(instPing.exports.start(11), 255);
+ }

Each WebAssembly instance pins instance-specific data — memory base, memory bounds, and the instance pointer — in dedicated registers. When a tail call crosses instance boundaries (e.g., via return_call_indirect to a different module), the callee's instance clobbers those registers, and if the chain eventually returns to the original non-tail caller, those registers must be restored before it resumes.

This commit replaces the previous compile-time transitive tail call clobbering analysis with a runtime "restore frame" mechanism. When a Wasm tail call crosses instance boundaries for the first time, a 32-byte frame is lazily inserted just above the caller's argument area, capturing the caller's instance pointer and original return address; the rest of the frame is shifted down 32 bytes. On return, a dedicated thunk (_wasm_restore_frame_return) reloads the instance and memory registers and then jumps to the original return address. The patch removes callCanClobberInstance/computeTransitiveTailCalls and the inlining restriction they imposed, and adds ARM64E PAC re-signing plus a JIT cage gate thunk. A reuse check, which compares the current return PC against the thunk address, prevents restore frames from accumulating on repeated cross-instance hops. That check is required for correctness and is more than an optimization: tail calls are expected to run in constant stack space, so a chain of cross-instance hops must not grow the stack by 32 bytes each time.

After (runtime restore frame, lazy insertion):
  Caller frame (inst A)
    arg area
    [32-byte restore frame inserted above args, first cross-inst call only]
      slot 0: original CallerFrameAndPC
      slot 1: saved instance A
      slot 2: RestoreFrameCallee
    callee-saves  <- entire frame shifted down 32 bytes

  return -> new cfr -> _wasm_restore_frame_return -> reload inst A regs -> jump to original retPC

  Reuse check: if retPC == thunk addr, skip insertion (no accumulation)
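The lazy insertion and reuse check can be illustrated with a toy model. This is a minimal sketch in plain JavaScript, not JavaScriptCore code: `ToyMachine`, `RESTORE_THUNK`, and every field name are hypothetical stand-ins, the pinned registers are collapsed into a single `pinnedInstance` value, and restore frames are kept in an array instead of on a real stack.

```javascript
// Hypothetical stand-in for the address of the restore thunk.
const RESTORE_THUNK = Symbol("restore-frame-thunk");

class ToyMachine {
  constructor() {
    this.pinnedInstance = null; // models the pinned instance/memory registers
    this.returnPC = null;       // return address of the running frame
    this.restoreFrames = [];    // models restore frames inserted on the stack
  }

  // Ordinary (non-tail) call: the caller expects `instance` to still be
  // pinned when control comes back to `retPC`.
  enter(instance, retPC) {
    this.pinnedInstance = instance;
    this.returnPC = retPC;
  }

  // Tail call into `targetInstance`, reusing the current frame.
  tailCall(targetInstance) {
    const crossesInstance = targetInstance !== this.pinnedInstance;
    // Reuse check: if we already return through the thunk, a restore frame
    // exists for this chain, so nothing new is inserted.
    if (crossesInstance && this.returnPC !== RESTORE_THUNK) {
      this.restoreFrames.push({
        originalReturnPC: this.returnPC,
        savedInstance: this.pinnedInstance,
      });
      this.returnPC = RESTORE_THUNK;
    }
    this.pinnedInstance = targetInstance; // the callee pins its own instance
  }

  // Return from the end of the chain; yields the PC execution resumes at.
  ret() {
    if (this.returnPC !== RESTORE_THUNK) return this.returnPC;
    const frame = this.restoreFrames.pop();
    this.pinnedInstance = frame.savedInstance; // thunk reloads the registers
    return frame.originalReturnPC;             // then jumps to the original PC
  }
}

// A -> B -> A -> B chain: only the first cross-instance hop inserts a frame.
const machine = new ToyMachine();
machine.enter("A", "caller-resume-point");
machine.tailCall("B");
machine.tailCall("A");
machine.tailCall("B");
const framesAfterChain = machine.restoreFrames.length;
const resumedAt = machine.ret();
const instanceAfterReturn = machine.pinnedInstance;

// Same-instance tail calls never insert a restore frame.
const sameInstance = new ToyMachine();
sameInstance.enter("A", "caller-resume-point");
sameInstance.tailCall("A");
const framesSameInstance = sameInstance.restoreFrames.length;
```

In this model `framesAfterChain` is 1 regardless of how many hops follow the first one, and after `ret()` the pinned instance is back to "A". The real mechanism additionally has to shift the frame, re-sign the saved return PC on ARM64E, and stay visible to the GC and unwinder, none of which the sketch attempts.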

Cross-instance tail call chains can now be inlined freely, but ABI correctness depends on a new runtime frame type that the GC, stack unwinder, sampling profiler, and all three Wasm execution tiers (IPInt, BBQ, OMG) must handle consistently, including ARM64E PAC re-signing of the saved return PC. The dedicated GC stress test suggests that the authors treated correctness here as a non-trivial risk.


New runtime Wasm stack frame mechanism with cross-backend ABI and pointer-auth implications — several security audit directions included.
