
[JSC] Implement `String#split` in C++

1a01603

Source/JavaScriptCore/builtins/BuiltinNames.h

- macro(stringSplitFast) \

Source/JavaScriptCore/builtins/StringPrototype.js

-function split(separator, limit)
-{
- "use strict";
- // ... entire JS builtin deleted; replaced by C++ host function
-}

Source/JavaScriptCore/runtime/RegExpObjectInlines.h

+inline bool RegExpObject::isSymbolSplitFastAndNonObservable()
+{
+ // mirrors isSymbolReplaceFastAndNonObservable:
+ // checks structure to rule out per-instance @@split override
+ // callers must also validate m_stringSymbolSplitWatchpointSet
+ // and m_regExpSpeciesWatchpointSet are still intact
+}

JSC uses two tiers of optimization for ECMAScript builtins: self-hosted JS builtins (compiled to bytecode, then JIT'd generically) and DFG intrinsics (type-specialized nodes emitting direct C++ calls). Watchpoints are invalidation hooks: optimized code assumes invariants (e.g., RegExp.prototype[@@split] is unmodified) and registers a watchpoint that fires when those invariants break.
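To make the invariant concrete, here is a small sketch in plain ECMAScript, not JSC code, of what "RegExp.prototype[@@split] is modified" looks like from script. It relies only on standard behavior: String.prototype.split delegates to the separator's Symbol.split method. The names originalSplit, calls, and parts are illustrative.

```javascript
// Standard behavior: "str".split(sep) looks up sep[Symbol.split] and calls it.
const originalSplit = RegExp.prototype[Symbol.split];
const calls = [];

// Break the invariant: replace the primordial @@split method with a wrapper.
RegExp.prototype[Symbol.split] = function (string, limit) {
  calls.push(string); // record that script-visible code ran
  return originalSplit.call(this, string, limit);
};

const parts = "a,b,c".split(/,/);

// Restore the primordial method so later code sees the default behavior.
RegExp.prototype[Symbol.split] = originalSplit;

console.log(parts, calls);
```

Any engine fast path that bypasses this lookup has to be certain no such replacement has happened, which is the kind of assumption a watchpoint is meant to track.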

This commit moves String.prototype.split from a self-hosted JS builtin to a C++ host function and introduces a new StringSplit DFG node. The node dispatches to operationStringSplit for string separators and to operationStringSplitRegExp for primordial RegExp separators. Three watchpoint guards protect the fast paths: the new sets m_stringSymbolSplitWatchpointSet and m_regExpSpeciesWatchpointSet, plus splitSymbol, which is added to regExpPrimordialPropertiesWatchpointSet. Per-instance overrides are caught by a structure check inside the new isSymbolSplitFastAndNonObservable, which mirrors the existing isSymbolReplaceFastAndNonObservable.
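Two script-level situations show why a prototype-wide guard alone is not enough. The sketch below uses only standard ECMAScript behavior: an own @@split property on a single RegExp instance, and a RegExp subclass whose species constructor is picked up by the default RegExp.prototype[@@split]. How the commit maps each case onto a specific guard is as described above and is not verified here. CountingRegExp and the variable names are illustrative.

```javascript
// Case 1: an own @@split property on one instance shadows the prototype method.
const perInstance = /,/;
Object.defineProperty(perInstance, Symbol.split, {
  value: () => ["per-instance override"],
});
const viaInstance = "a,b".split(perInstance);

// Case 2: a subclass inherits the default Symbol.species getter, so the
// standard split algorithm constructs its internal splitter via the subclass.
let constructed = 0;
class CountingRegExp extends RegExp {
  constructor(pattern, flags) {
    super(pattern, flags);
    constructed++; // script-visible side effect of each construction
  }
}
const viaSpecies = "a,b".split(new CountingRegExp(","));

console.log(viaInstance, viaSpecies, constructed);
```

In the first case the prototype is untouched, so only a check on the instance itself can notice the override. In the second, the subclass constructor runs once for the explicit `new` and again inside the split algorithm, so skipping the species lookup would be observable.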

String#split(sep) call
        |
  [DFG StringSplit node]
        |
  watchpoints valid?
   yes / no
   |
   sep type check
   |- String  --> operationStringSplit
   |- RegExp? --> isSymbolSplitFastAndNonObservable
                  |- yes --> operationStringSplitRegExp (C++ fast)
                  |- no  --> RegExp.prototype[@@split]  (JS, observable)

The new fast-path gate adds three watchpoints and a structure check, the same shape as the analogous C++ migration of String#replace, which has repeatedly produced watchpoint-race and structure-check-bypass bugs. The headline is an 11–23% speedup on split-heavy workloads, but every new watchpoint-guarded fast path opens a fresh window for observable side-effect leaks if the gate's invariants are not fully preserved.
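As one illustration of an observable side effect, the standard RegExp.prototype[@@split] algorithm looks up and calls exec on its internal splitter, so a replaced RegExp.prototype.exec must still run during a split. This is a sketch of spec-level behavior, not of JSC internals. originalExec, execCalls, and pieces are illustrative names.

```javascript
const originalExec = RegExp.prototype.exec;
let execCalls = 0;

// Wrap exec so every call made by the split algorithm is counted.
RegExp.prototype.exec = function (string) {
  execCalls++;
  return originalExec.call(this, string);
};

const pieces = "a,b".split(/,/);

// Restore the primordial exec so later code is unaffected.
RegExp.prototype.exec = originalExec;

console.log(pieces, execCalls > 0);
```

A fast path that ran the match natively while this wrapper was installed would return the same array but leave execCalls at zero, and script can detect that difference.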


New fast-path gates and watchpoint sets for a security-sensitive string operation: the guard boundaries and invalidation ordering have audit-worthy edge cases.
