[12] RegExp::byteCodeCompileIfNecessary not thread-safe vs concurrent compiler thread
Severity: Medium | Component: JSC Yarr regex engine | dcf25ed
Rated Medium because the diff fixes a TOCTOU race between the mutator thread and a JSC concurrent compiler thread on a std::unique_ptr<BytecodePattern> field of RegExp. Under the wrong interleaving the race produces undefined behavior, but the trigger requires a very narrow timing window that, according to the commit message, is not reachable without artificial sleeps.
RegExp::matchConcurrently shouldn't cause any compilation, but bail out if JIT code for the regexp doesn't already exist. However, it is possible that a RegExp has JIT code but no bytecode, in which case matchConcurrently can incorrectly racily attempt to compile bytecode.
Source/JavaScriptCore/runtime/RegExp.cpp
 void RegExp::byteCodeCompileIfNecessary(VM* vm)
 {
+    Locker locker { cellLock() };
+
     if (m_regExpBytecode)
         return;
Source/JavaScriptCore/runtime/RegExpInlines.h
     if (result == static_cast<int>(Yarr::JSRegExpResult::JITCodeFailure)) {
-        // JIT'ed code couldn't handle expression, so punt back to the interpreter.
-        byteCodeCompileIfNecessary(&vm);
-        if (m_state == ParseError)
-            return throwError();
+        // Only the mutator may compile bytecode; the compiler thread must use the
+        // bytecode that already exists, and bails out if there is no bytecode.
+        if constexpr (matchFrom == Yarr::MatchFrom::VMThread) {
+            byteCodeCompileIfNecessary(&vm);
+            if (m_state == ParseError)
+                return throwError();
+        }
+        if (!m_regExpBytecode)
+            return -1;
Patch Details
Two changes. First, byteCodeCompileIfNecessary now takes the cell lock (Locker locker { cellLock() }) before checking and writing m_regExpBytecode. Second, in both overloads of matchInline, the JIT-failure fallback path is restructured: the call to byteCodeCompileIfNecessary is now gated by if constexpr (matchFrom == Yarr::MatchFrom::VMThread), and an additional if (!m_regExpBytecode) return -1; makes the compiler thread bail out when no bytecode has been produced. The commit message states that the check inside matchInline intentionally does not take the cell lock, because the caller matchConcurrently already holds it.
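The shape of the two changes can be sketched outside JSC. This is a minimal model, not the WebKit code: SketchRegExp, compileBytecodeIfNecessary, fallbackAfterJITFailure and compileCount are hypothetical names, std::mutex stands in for the cell lock, and the MatchFrom enum here is a local copy rather than Yarr::MatchFrom. It assumes C++17 for if constexpr.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-ins; none of these are JSC's real types.
enum class MatchFrom { VMThread, CompilerThread };

struct BytecodePattern { };

class SketchRegExp {
public:
    // Same shape as the patched byteCodeCompileIfNecessary: the null check
    // and the write to the member both happen while the lock is held.
    void compileBytecodeIfNecessary()
    {
        std::lock_guard<std::mutex> locker(m_lock); // stand-in for cellLock()
        if (m_bytecode)
            return;
        m_bytecode = std::make_unique<BytecodePattern>();
        ++m_compileCount;
    }

    // Models only the JIT-failure fallback path of matchInline.
    template<MatchFrom matchFrom>
    int fallbackAfterJITFailure()
    {
        // Only the VM thread instantiation contains the compile call at all.
        if constexpr (matchFrom == MatchFrom::VMThread)
            compileBytecodeIfNecessary();

        // Unlocked read: per the commit message, the real compiler-thread
        // caller already holds the cell lock. This sketch is single-threaded.
        if (!m_bytecode)
            return -1; // compiler thread bails out without mutating anything
        return 0;      // placeholder for running the bytecode interpreter
    }

    int compileCount() const { return m_compileCount; }

private:
    std::mutex m_lock;
    std::unique_ptr<BytecodePattern> m_bytecode;
    int m_compileCount = 0;
};
```

In this model a CompilerThread call on a fresh object returns -1 and leaves compileCount() at zero. A VMThread call compiles once, after which a CompilerThread call can use the existing bytecode without compiling again.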
TOCTOU race on a lazily-compiled cell member between the mutator thread and a JSC concurrent compiler thread that was supposed to be read-only.
Background
JSC compiles a regular expression lazily into two possible forms: Yarr JIT machine code (m_regExpJITCode) for patterns the JIT can handle, and a Yarr bytecode pattern (m_regExpBytecode) interpreted by Yarr::interpret. At runtime, JIT-compiled regex code can return Yarr::JSRegExpResult::JITCodeFailure for inputs it cannot handle, at which point execution punts to the bytecode interpreter. RegExp::matchConcurrently is used by JSC's concurrent compiler threads (DFG/FTL): when the optimising compiler tries to constant-fold or speculate on a regex match during compilation, it may invoke matching from a thread other than the main VM thread. The convention is that concurrent compiler threads observe heap state without mutating it. cellLock() is a per-cell lock used to synchronise the rare cases where mutator-side mutation must be visible to compiler-thread reads. MatchFrom is a template parameter that distinguishes the VMThread and CompilerThread calling contexts.
Analysis
RegExp::matchConcurrently runs on a JSC concurrent compiler thread and is intended to read pre-existing JIT code only, not to mutate the RegExp cell. However, a RegExp can be in a state where JIT code exists but the bytecode does not. If the JIT returns JSRegExpResult::JITCodeFailure, the previous code unconditionally called byteCodeCompileIfNecessary on whichever thread was running — including the compiler thread. Meanwhile, the mutator thread executing a regular regex match on the same RegExp could be entering the same path. Two threads could thus check if (m_regExpBytecode) simultaneously, both see null, and both proceed to allocate and assign a new BytecodePattern to the std::unique_ptr member.
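The check-then-act interleaving described above can be replayed deterministically on a single thread. The sketch below is a logical model only: LazySlot, needsCompile, install, replayUnlockedInterleaving and replaySerialized are hypothetical names, and the code does not reproduce the actual data race, which in the real engine is an unsynchronised concurrent write to a std::unique_ptr and therefore undefined behavior.

```cpp
#include <cassert>
#include <memory>

struct BytecodePattern { };

// Hypothetical model of a lazily compiled slot. The check and the write are
// separate steps so that an interleaving can be replayed by hand.
struct LazySlot {
    std::unique_ptr<BytecodePattern> bytecode;
    int allocations = 0;

    bool needsCompile() const { return !bytecode; } // the "check"

    void install() // the "act"
    {
        bytecode = std::make_unique<BytecodePattern>();
        ++allocations;
    }
};

// Unlocked ordering: both actors run the check before either one installs.
inline int replayUnlockedInterleaving()
{
    LazySlot slot;
    bool mutatorSawNull = slot.needsCompile();
    bool compilerSawNull = slot.needsCompile();
    if (mutatorSawNull)
        slot.install();
    if (compilerSawNull)
        slot.install(); // destroys and replaces the first pattern
    return slot.allocations;
}

// Serialized ordering: each check+act pair completes before the next starts,
// which is what holding a lock across both steps enforces.
inline int replaySerialized()
{
    LazySlot slot;
    if (slot.needsCompile())
        slot.install();
    if (slot.needsCompile())
        slot.install(); // not reached: the second check sees the pattern
    return slot.allocations;
}
```

In the unlocked replay both actors allocate, and the second assignment frees the pattern the first one installed. In the serialized replay only one allocation happens. In the real engine the two writes may additionally overlap in time, which this single-threaded model cannot show.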