
XML Parser Billion-Laughs Mitigation via Deferred Entity Registration

d093eba

Source/WebCore/xml/parser/XMLDocumentParserLibxml2.cpp

+void XMLDocumentParser::entityDecl(const xmlChar* name, int type, const xmlChar* publicId, const xmlChar* systemId, xmlChar* content)
+{
+    auto contentString = toString(content);
+    if (type == XML_INTERNAL_GENERAL_ENTITY && contentString.contains('&')) {
+        m_deferredEntityDeclarations.append({...});
+        m_deferredEntityDeclarationsFlushed = false;
+        return;
+    }
+    xmlSAX2EntityDecl(context(), name, type, publicId, systemId, content);
+}
+
+void XMLDocumentParser::flushDeferredEntityDeclarations()
+{
+    m_deferredEntityDeclarationsFlushed = true;
+    m_entityTransitiveReferenceCounts.clear();
+    ...
+    for (auto& decl : m_deferredEntityDeclarations) {
+        HashSet<AtomString> visiting;
+        if (computeTransitiveEntityReferenceCount(entityName, entityContents, visiting) > m_maxEntityExpansionCount) {
+            handleError(XMLErrors::Type::Fatal, "Entity reference expansion limit reached", textPosition());
+            return;
+        }
+        xmlSAX2EntityDecl(context(), ...);
+    }
+}

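This fixes a billion-laughs (exponential entity expansion) hang in WebKit's XML parser. Instead of registering every entity with libxml2 immediately via xmlSAX2EntityDecl, the fix defers internal general entities whose content contains '&', since only those can reference other entities. The deferred declarations are flushed when an entity reference is encountered: for each one, the parser computes a transitive entity reference count via memoized recursion and registers the entity only if that count does not exceed 512. The first declaration over the limit raises a fatal parse error ("Entity reference expansion limit reached") and ends the flush, so neither it nor any remaining deferred entity is ever registered. This was necessary because libxml2's own mitigation API (xmlCtxtSetMaxAmplification) is newer than the libxml2 version shipped with macOS.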
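A minimal sketch of what a memoized transitive count could look like, using only the standard library. The patch's actual computeTransitiveEntityReferenceCount is not shown in the excerpt, so the counting rule here (each reference contributes 1 plus the count of the entity it names), the cycle handling, and every name in the block (EntityTable, CountCache, kLimit, referencedEntities, transitiveCount) are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins for the parser's state; none of these names come from the patch.
using EntityTable = std::unordered_map<std::string, std::string>; // name -> replacement text
using CountCache = std::unordered_map<std::string, uint64_t>;     // memoized results

constexpr uint64_t kLimit = 512; // assumed to mirror the limit of 512 described above

// Collects the names of general entity references (&name;) in replacement text.
// Character references such as &#65; are skipped because they cannot recurse.
static std::vector<std::string> referencedEntities(const std::string& content)
{
    std::vector<std::string> names;
    std::size_t pos = 0;
    while ((pos = content.find('&', pos)) != std::string::npos) {
        std::size_t end = content.find(';', pos + 1);
        if (end == std::string::npos)
            break;
        std::string name = content.substr(pos + 1, end - pos - 1);
        if (!name.empty() && name[0] != '#')
            names.push_back(name);
        pos = end + 1;
    }
    return names;
}

// Counts the entity references met while fully expanding `name`.
// The result saturates at kLimit + 1 so the arithmetic cannot overflow.
static uint64_t transitiveCount(const std::string& name, const EntityTable& table,
    CountCache& cache, std::unordered_set<std::string>& visiting)
{
    auto cached = cache.find(name);
    if (cached != cache.end())
        return cached->second;

    auto decl = table.find(name);
    if (decl == table.end())
        return 0; // unknown entity: contributes no further references in this sketch

    if (!visiting.insert(name).second)
        return kLimit + 1; // already on the recursion stack: treat a cycle as over the limit

    uint64_t total = 0;
    for (const auto& ref : referencedEntities(decl->second)) {
        total += 1 + transitiveCount(ref, table, cache, visiting);
        if (total > kLimit) {
            total = kLimit + 1;
            break;
        }
    }

    visiting.erase(name);
    cache[name] = total;
    return total;
}
```

Under this counting rule, a chain in which each level references the previous level ten times grows as 10, 110, 1110, so the third level already exceeds 512 and would be rejected long before the expansion becomes expensive.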

Before (synchronous registration):
  DTD parsing
    └─► entityDecl SAX callback
          └─► xmlSAX2EntityDecl() ──► registered, libxml2 can expand recursively

After (deferred with transitive count):
  DTD parsing → append to m_deferredEntityDeclarations[]
  Entity reference → flushDeferredEntityDeclarations()
    └─► computeTransitiveEntityReferenceCount() [memoized]
          ├── count ≤ 512: register with xmlSAX2EntityDecl()
          └── count > 512: fatal parse error
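The "after" flow can be modeled as a small toy class. This is a sketch under stated assumptions, not WebKit's implementation: EntityDecl, DeferredRegistrar, and the injected counting function are hypothetical, and the "registry" is just a vector of names standing in for entities handed to xmlSAX2EntityDecl.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy model of the deferred flow; every name here is hypothetical.
struct EntityDecl {
    std::string name;
    std::string content;
};

class DeferredRegistrar {
public:
    // The transitive count is injected so the flow can be shown on its own.
    using CountFn = std::function<uint64_t(const EntityDecl&)>;

    DeferredRegistrar(uint64_t limit, CountFn count)
        : m_limit(limit)
        , m_count(std::move(count))
    {
    }

    // Mirrors the split in entityDecl: content containing '&' is held back,
    // anything else is registered immediately.
    void declare(EntityDecl decl)
    {
        if (decl.content.find('&') != std::string::npos) {
            m_deferred.push_back(std::move(decl));
            return;
        }
        m_registered.push_back(std::move(decl.name));
    }

    // Intended to run before an entity reference is resolved. Returns false at
    // the first declaration over the limit; that declaration and all later
    // ones stay unregistered, like the early return in the diff above.
    // A real parser would stop parsing at this point.
    bool flush()
    {
        std::vector<EntityDecl> pending;
        pending.swap(m_deferred);
        for (const auto& decl : pending) {
            if (m_count(decl) > m_limit)
                return false;
            m_registered.push_back(decl.name);
        }
        return true;
    }

    const std::vector<std::string>& registered() const { return m_registered; }

private:
    uint64_t m_limit;
    CountFn m_count;
    std::vector<EntityDecl> m_deferred;
    std::vector<std::string> m_registered;
};
```

Injecting the counting function is a design choice of this sketch only; it keeps the deferral logic separate from however the transitive count is actually computed.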

The critical complication: libxml2's xmlParseBalancedChunkMemoryInternal creates nested parser contexts that bypass the SAX getEntity callback and call xmlGetDocEntity directly. Any guard placed solely in the SAX layer is therefore circumvented once the entity is registered with the document. The deferred registration approach ensures dangerous entities are never registered at all.
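The bypass can be illustrated with a deliberately simplified toy model. Nothing below reproduces libxml2's data structures or control flow: ToyDocument, guardedGetEntity, and nestedContextLookup are hypothetical stand-ins for the document's entity table, a guard installed in the SAX getEntity callback, and a nested context that consults the document directly.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Toy model only; these types are hypothetical and are not libxml2's.
struct ToyDocument {
    std::unordered_map<std::string, std::string> entities; // registered entities

    const std::string* find(const std::string& name) const
    {
        auto it = entities.find(name);
        return it == entities.end() ? nullptr : &it->second;
    }
};

// Stand-in for a guard in the SAX-layer callback: refuses blocked names.
const std::string* guardedGetEntity(const ToyDocument& doc,
    const std::unordered_set<std::string>& blocked, const std::string& name)
{
    if (blocked.count(name))
        return nullptr;
    return doc.find(name);
}

// Stand-in for a nested context that never invokes the callback and goes
// straight to the document, so the guard above is not consulted.
const std::string* nestedContextLookup(const ToyDocument& doc, const std::string& name)
{
    return doc.find(name);
}
```

In this model, a blocked entity that is present in the document is still returned by nestedContextLookup, while an entity that was never added cannot be found by either path, which is the property the deferred approach relies on.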

This was a DoS-class hang reachable via any XML document (XHTML, SVG, MathML) loaded in WebKit, with no interaction required beyond loading the page.


Several edge cases in the new entity counting logic and its interaction with libxml2's own expansion paths are worth security investigation.
