SIMD shuffle strength-reduction in JSC's B3 backend
Source/JavaScriptCore/jit/SIMDShuffle.h
Source/JavaScriptCore/b3/B3ReduceSIMDShuffle.cpp
B3 (Bare Bones Backend) is JSC's optimizing JIT backend. Its SSA-based IR sits between the Wasm/JS frontends and Air, the assembly-level IR from which machine code is emitted. WebAssembly's i8x16.shuffle selects 16 output bytes from two 128-bit input vectors using a 16-byte index mask, where indices 0-15 read the first input and 16-31 read the second; B3 represents this as VectorSwizzle. The naive lowering emits ARM64's tbl instruction (vector table lookup), which is general but slow. ARM64 NEON has specialized instructions for common permutation patterns: UZP (deinterleave), ZIP (interleave), TRN (transpose), EXT (extract/rotate), and REV (reverse). Each of these generally executes in fewer cycles than a table lookup.
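To make the mask-to-instruction correspondence concrete, here is a minimal reference model of the shuffle semantics together with the byte masks that match ZIP1, UZP1, and EXT on 8-bit lanes. The names Vec, shuffle, zip1Mask, uzp1Mask, and extMask are illustrative and are not taken from the commit.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Vec = std::array<uint8_t, 16>;

// Reference semantics of Wasm i8x16.shuffle:
// mask values 0..15 read from a, 16..31 read from b.
Vec shuffle(const Vec& a, const Vec& b, const Vec& mask)
{
    Vec out{};
    for (int i = 0; i < 16; ++i) {
        assert(mask[i] < 32); // Wasm validation rejects larger indices
        out[i] = mask[i] < 16 ? a[mask[i]] : b[mask[i] - 16];
    }
    return out;
}

// ZIP1 on bytes: a0 b0 a1 b1 ... a7 b7 (interleave the low halves).
Vec zip1Mask()
{
    Vec m{};
    for (int i = 0; i < 8; ++i) {
        m[2 * i] = static_cast<uint8_t>(i);
        m[2 * i + 1] = static_cast<uint8_t>(16 + i);
    }
    return m;
}

// UZP1 on bytes: the even-indexed bytes of the concatenation a:b.
Vec uzp1Mask()
{
    Vec m{};
    for (int i = 0; i < 16; ++i)
        m[i] = static_cast<uint8_t>(2 * i);
    return m;
}

// EXT #n on bytes: 16 consecutive bytes of a:b starting at offset n.
Vec extMask(int n)
{
    assert(n >= 0 && n <= 16);
    Vec m{};
    for (int i = 0; i < 16; ++i)
        m[i] = static_cast<uint8_t>(n + i);
    return m;
}
```

Any VectorSwizzle whose mask equals one of these tables computes the same result as the corresponding single NEON instruction, which is what makes the substitution safe.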
This commit adds a new B3ReduceSIMDShuffle phase that inspects VectorSwizzle byte-index patterns at compile time and substitutes the equivalent NEON instruction. Pattern matchers (tryMatchCanonicalBinary, tryMatchCanonicalUnary) test the 16-byte mask against known instruction semantics. The phase also handles VectorSwizzle chains: when a VectorSwizzle feeds another VectorSwizzle, composeShuffle algebraically folds the two permutation tables into one (result[i] = lhsMask[rhsMask[i]]) and re-runs pattern recognition on the composed result. A separate ARM64 SHA3 XAR lowering rule is introduced for XOR-rotate patterns, gated on isARM64_SHA3().
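The matching step can be sketched as follows. This is a simplified stand-in, not the commit's tryMatchCanonicalBinary: the function name matchBinaryBytes, the Permute enum, and the Match struct are hypothetical, and the sketch only recognizes UZP, ZIP, and EXT on 8-bit lanes, whereas a real matcher would presumably also check wider lane sizes and the remaining patterns.

```cpp
#include <array>
#include <cstdint>
#include <optional>

using Mask = std::array<uint8_t, 16>;

enum class Permute { Uzp1, Uzp2, Zip1, Zip2, Ext };

struct Match {
    Permute op;
    int imm; // byte offset for EXT; unused (0) for the other opcodes
};

// Hypothetical matcher: compares the mask against the index sequence
// that each instruction produces on 8-bit lanes.
std::optional<Match> matchBinaryBytes(const Mask& m)
{
    // True when every mask byte equals expected(i).
    auto all = [&](auto expected) {
        for (int i = 0; i < 16; ++i) {
            if (m[i] != expected(i))
                return false;
        }
        return true;
    };

    if (all([](int i) { return 2 * i; }))
        return Match{Permute::Uzp1, 0};
    if (all([](int i) { return 2 * i + 1; }))
        return Match{Permute::Uzp2, 0};
    if (all([](int i) { return i / 2 + (i % 2 ? 16 : 0); }))
        return Match{Permute::Zip1, 0};
    if (all([](int i) { return 8 + i / 2 + (i % 2 ? 16 : 0); }))
        return Match{Permute::Zip2, 0};

    // EXT #n reads 16 consecutive bytes of a:b starting at n.
    // n == 0 is the identity on a, so it is not treated as EXT here.
    int n = m[0];
    if (n >= 1 && n <= 15 && all([n](int i) { return n + i; }))
        return Match{Permute::Ext, n};

    return std::nullopt; // no pattern matched; the caller keeps the generic path
}
```

Returning std::nullopt on a miss keeps the fallback explicit: a mask that matches nothing simply stays a VectorSwizzle and is lowered to tbl as before.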
Wasm i8x16.shuffle
    │
    ▼
B3 VectorSwizzle(a, b, mask[16])
    │
    ├─► B3ReduceSIMDShuffle (new phase):
    │     ├─ VectorSwizzle(VectorSwizzle(a,b,m1), c, m2)
    │     │     └─► composeShuffle(m1, m2) → single VectorSwizzle
    │     └─► tryMatchCanonical*(mask) → specialized opcode
    │           UZP1/2, ZIP1/2, TRN1/2, EXT, REV, DupElement
    │
    └─► B3LowerToAir → ARM64 NEON instruction
          tbl (generic fallback) or uzp1/zip1/ext/rev*/...
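The chain-folding branch of the diagram can be sketched as below. This is an illustration under stated assumptions rather than the commit's composeShuffle: the names applyShuffle and composeMasks are hypothetical, the inner mask is assumed to play the role of lhsMask and the outer mask that of rhsMask, and the sketch declines to fold whenever the outer mask reads the third operand c.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

using Bytes = std::array<uint8_t, 16>;

// Reference two-input byte shuffle: 0..15 reads x, 16..31 reads y.
Bytes applyShuffle(const Bytes& x, const Bytes& y, const Bytes& mask)
{
    Bytes out{};
    for (int i = 0; i < 16; ++i) {
        assert(mask[i] < 32);
        out[i] = mask[i] < 16 ? x[mask[i]] : y[mask[i] - 16];
    }
    return out;
}

// Hypothetical fold of outer(inner(a, b), c) into a single mask over (a, b),
// using result[i] = inner[outer[i]].
std::optional<Bytes> composeMasks(const Bytes& inner, const Bytes& outer)
{
    Bytes result{};
    for (int i = 0; i < 16; ++i) {
        if (outer[i] >= 16)
            return std::nullopt; // lane reads c; this sketch does not fold it
        result[i] = inner[outer[i]];
    }
    return result;
}
```

For example, if the inner mask interleaves a and b byte by byte and the outer mask deinterleaves that result, the folded mask is a0..a7 followed by b0..b7. Neither original mask is removed on its own, but the composed mask is a simple pattern that a second round of matching can recognize.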
Significance
This adds substantial new JIT code generation paths for WebAssembly SIMD on ARM64, replacing generic table lookups with specialized NEON instructions. Pattern-matching JIT transforms have historically been a rich source of miscompilation bugs, which can yield incorrect computation or security-relevant type confusion.