XML Parser Billion-Laughs Mitigation via Deferred Entity Registration
d093eba
Source/WebCore/xml/parser/XMLDocumentParserLibxml2.cpp
+void XMLDocumentParser::entityDecl(const xmlChar* name, int type, const xmlChar* publicId, const xmlChar* systemId, xmlChar* content)
+{
+    auto contentString = toString(content);
+    if (type == XML_INTERNAL_GENERAL_ENTITY && contentString.contains('&')) {
+        m_deferredEntityDeclarations.append({...});
+        m_deferredEntityDeclarationsFlushed = false;
+        return;
+    }
+    xmlSAX2EntityDecl(context(), name, type, publicId, systemId, content);
+}
+
+void XMLDocumentParser::flushDeferredEntityDeclarations()
+{
+    m_deferredEntityDeclarationsFlushed = true;
+    m_entityTransitiveReferenceCounts.clear();
+    ...
+    for (auto& decl : m_deferredEntityDeclarations) {
+        HashSet<AtomString> visiting;
+        if (computeTransitiveEntityReferenceCount(entityName, entityContents, visiting) > m_maxEntityExpansionCount) {
+            handleError(XMLErrors::Type::Fatal, "Entity reference expansion limit reached", textPosition());
+            return;
+        }
+        xmlSAX2EntityDecl(context(), ...);
+    }
+}
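This commit fixes a billion-laughs (exponential entity expansion) hang in WebKit's XML parser. Previously, every entity declaration was registered with libxml2 immediately via xmlSAX2EntityDecl. With the fix, internal general entities whose content contains '&' are queued in m_deferredEntityDeclarations instead, and the queue is flushed when an entity reference is encountered. At flush time, the parser computes a transitive entity reference count for each queued entity using memoized recursion. An entity whose count exceeds the limit (m_maxEntityExpansionCount, 512) is never registered, and the parser raises a fatal "Entity reference expansion limit reached" error instead. The workaround was necessary because libxml2's own mitigation API (xmlCtxtSetMaxAmplification) is too new for the libxml2 version shipped with macOS.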
Before (synchronous registration):
DTD parsing
└─► entityDecl SAX callback
    └─► xmlSAX2EntityDecl() ──► registered, libxml2 can expand recursively
After (deferred with transitive count):
DTD parsing → append to m_deferredEntityDeclarations[]
Entity reference → flushDeferredEntityDeclarations()
└─► computeTransitiveEntityReferenceCount() [memoized]
    ├── count ≤ 512: register with xmlSAX2EntityDecl()
    └── count > 512: fatal parse error
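The flush step in the diagram above can be sketched in standalone C++. This is a minimal illustration, not WebKit's implementation: Decl, referencedNames, transitiveCount, and flushDeferred are hypothetical names, and the counting rule is an assumption (each reference contributes 1 plus the count of the entity it names, an undeclared entity contributes no nested references, and a cycle saturates the count so that it always exceeds the limit).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins; none of these names come from WebKit.
struct Decl { std::string name; std::string content; };
using EntityTable = std::unordered_map<std::string, std::string>;
using Memo = std::unordered_map<std::string, std::uint64_t>;
constexpr std::uint64_t kSaturated = UINT64_MAX; // "too many to count" or a cycle

// Collects the names of general entity references (&name;) in content.
// Character references such as &#38; are skipped.
std::vector<std::string> referencedNames(const std::string& content)
{
    std::vector<std::string> names;
    std::size_t pos = 0;
    while ((pos = content.find('&', pos)) != std::string::npos) {
        std::size_t end = content.find(';', pos + 1);
        if (end == std::string::npos)
            break;
        if (end > pos + 1 && content[pos + 1] != '#')
            names.push_back(content.substr(pos + 1, end - pos - 1));
        pos = end + 1;
    }
    return names;
}

// Assumed rule: count(E) is the sum over references R in E of 1 + count(R).
std::uint64_t transitiveCount(const std::string& name, const EntityTable& table,
    Memo& memo, std::unordered_set<std::string>& visiting)
{
    auto cached = memo.find(name);
    if (cached != memo.end())
        return cached->second;
    auto decl = table.find(name);
    if (decl == table.end())
        return 0; // not a deferred entity: no nested references to add
    if (!visiting.insert(name).second)
        return kSaturated; // already on the recursion stack: cycle
    std::uint64_t total = 0;
    for (const auto& ref : referencedNames(decl->second)) {
        std::uint64_t nested = transitiveCount(ref, table, memo, visiting);
        if (nested == kSaturated || total > kSaturated - 1 - nested) {
            total = kSaturated; // saturate instead of overflowing
            break;
        }
        total += 1 + nested;
    }
    visiting.erase(name);
    memo[name] = total;
    return total;
}

// Returns false at the first entity over the limit, where the caller would
// raise a fatal parse error. Accepted names are appended to registered.
bool flushDeferred(const std::vector<Decl>& deferred, std::uint64_t limit,
    std::vector<std::string>& registered)
{
    EntityTable table;
    for (const auto& d : deferred)
        table.emplace(d.name, d.content); // the first declaration of a name wins
    Memo memo;
    for (const auto& d : deferred) {
        std::unordered_set<std::string> visiting;
        if (transitiveCount(d.name, table, memo, visiting) > limit)
            return false;
        registered.push_back(d.name);
    }
    return true;
}
```

Under this assumed rule, a classic chain in which each level references the previous level ten times yields counts of 10, 110, and 1110 for the first three levels, so with a limit of 512 the third level is the first to be rejected. Memoization keeps the work linear in the total size of the declarations even though the count itself grows exponentially.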
The critical complication is that libxml2's xmlParseBalancedChunkMemoryInternal creates nested parser contexts that bypass the SAX getEntity callback and call xmlGetDocEntity directly. Any guard placed solely in the SAX layer is therefore circumvented once the entity is registered with the document. Deferred registration avoids this because dangerous entities are never registered at all.
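The bypass described above can be illustrated with a toy model. Nothing here is libxml2 code: ToyDocument, ToyParserContext, and the two factory functions are hypothetical, and the model only assumes what the text states, namely that an outer context consults a callback while a nested context looks entities up in the document directly.

```cpp
#include <functional>
#include <string>
#include <unordered_map>

// Toy model only: these types are hypothetical and do not mirror libxml2's structures.
struct ToyDocument {
    std::unordered_map<std::string, std::string> entities; // entities registered with the document

    // Plays the role of a direct document lookup (the xmlGetDocEntity path).
    const std::string* lookupDirect(const std::string& name) const
    {
        auto it = entities.find(name);
        return it == entities.end() ? nullptr : &it->second;
    }
};

struct ToyParserContext {
    const ToyDocument* document;
    // Plays the role of the SAX getEntity callback; empty in nested contexts.
    std::function<const std::string*(const std::string&)> getEntityHook;

    const std::string* resolve(const std::string& name) const
    {
        if (getEntityHook)
            return getEntityHook(name); // guarded path
        return document->lookupDirect(name); // unguarded path
    }
};

// Builds an outer context whose hook refuses to resolve one blocked entity name.
// The document must outlive the returned context.
ToyParserContext makeGuardedContext(const ToyDocument& document, const std::string& blockedName)
{
    ToyParserContext context { &document, nullptr };
    context.getEntityHook = [&document, blockedName](const std::string& name) -> const std::string* {
        if (name == blockedName)
            return nullptr;
        return document.lookupDirect(name);
    };
    return context;
}

// Builds a nested context with no hook, so every lookup goes straight to the document.
ToyParserContext makeNestedContext(const ToyDocument& document)
{
    return ToyParserContext { &document, nullptr };
}
```

In this model, once an entity is stored in ToyDocument::entities, the guarded context refuses to resolve it but the nested context still finds it. Only leaving the entity out of the document makes both lookups fail, which is the property the deferred registration relies on.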
Significance
This was a DoS-class hang reachable via any XML document (XHTML, SVG, MathML) loaded in WebKit, with no user interaction required beyond loading the page.
Audit directions
Several edge cases in the new entity counting logic and its interaction with libxml2's own expansion paths are worth security investigation.