[5] ANGLE Metal uint32_t Overflow in Compute Dispatch

Severity: Medium | Component: ANGLE Metal backend | c8f9e83

Rated Medium: web content can reliably crash the GPU process by tripping Metal validation, which is a denial of service. Escalation beyond DoS is blocked because Metal's own validation layer catches the oversized dispatch before any GPU memory corruption can occur.

The patch adds SafeThreadgroupWidth() in mtl_render_utils.mm. Starting from the thread count clamped to [1, maxWidth], it halves the threadgroup width until the rounded dispatch size, numGroups * threadgroupWidth, no longer exceeds UINT32_MAX or the width reaches 1. The existing 1D DispatchCompute overload previously used the clamped width as-is and never checked whether rounding up to a threadgroup boundary pushed the total past the 32-bit limit.

Source/ThirdParty/ANGLE/src/libANGLE/renderer/metal/mtl_render_utils.mm

+NSUInteger SafeThreadgroupWidth(size_t numThreads, NSUInteger maxWidth)
+{
+    NSUInteger threadgroupWidth = std::clamp<NSUInteger>(numThreads, 1u, maxWidth);
+    while (threadgroupWidth > 1)
+    {
+        size_t numGroups = (numThreads + threadgroupWidth - 1) / threadgroupWidth;
+        size_t roundedThreads = numGroups * threadgroupWidth;
+        if (roundedThreads <= std::numeric_limits<uint32_t>::max())
+            break;
+        threadgroupWidth >>= 1;
+    }
+    return threadgroupWidth;
+}

-    NSUInteger w = std::clamp<NSUInteger>(numThreads, 1u, pipelineState.threadExecutionWidth);
-    MTLSize threadsPerThreadgroup = MTLSizeMake(w, 1, 1);
+    NSUInteger safeWidth = SafeThreadgroupWidth(numThreads, pipelineState.threadExecutionWidth);
+    MTLSize threadsPerThreadgroup = MTLSizeMake(safeWidth, 1, 1);

Source/ThirdParty/ANGLE/src/tests/gl_tests/VertexAttributeTest.cpp

+TEST_P(VertexAttributeTestES3, MaxSizeBufferWithByteConversionDoesNotCrash)
+{
+    ...
+    const GLuint maxSize = 0xFFFFFFFC; // Near UINT32_MAX, aligned to 4 bytes
+    ...
+    glVertexAttribPointer(positionLocation, 4, GL_BYTE, GL_FALSE, 1, nullptr);
+    glDrawArrays(GL_TRIANGLES, 0, 3);
+    ASSERT_GL_NO_ERROR();
+}

Unsigned integer overflow in threadgroup-rounded dispatch size passed to Metal GPU API.

ANGLE's Metal backend converts vertex attributes between formats using Metal compute shaders. When a vertex attribute uses a format or stride that doesn't match the native Metal vertex format, ANGLE dispatches a compute shader to perform the conversion. The compute dispatch divides work into threadgroups: the total thread count equals the number of vertices to convert, and the dispatch rounds up to threadgroup-width boundaries. Metal's thread_position_in_grid shader builtin is typed as uint32_t (per the code comments), and the Metal runtime validates that the total rounded grid size fits within this 32-bit limit.
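
The rounding described above can be sketched in portable C++. This is an illustration only, not ANGLE code: RoundedGridSize and FitsGridLimit are hypothetical helper names, uint64_t stands in for NSUInteger/size_t, and the width of 32 is an assumed value for threadExecutionWidth.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: thread count after rounding up to whole threadgroups.
// 64-bit arithmetic keeps the product exact so it can be compared to the limit.
constexpr uint64_t RoundedGridSize(uint64_t numThreads, uint64_t threadgroupWidth)
{
    const uint64_t numGroups = (numThreads + threadgroupWidth - 1) / threadgroupWidth;
    return numGroups * threadgroupWidth;
}

// Hypothetical helper: true if the rounded grid stays within the 32-bit range
// (the same comparison the patch uses).
constexpr bool FitsGridLimit(uint64_t roundedThreads)
{
    return roundedThreads <= std::numeric_limits<uint32_t>::max();
}

// 100 threads at width 32 become 4 groups, i.e. 128 dispatched threads.
static_assert(RoundedGridSize(100, 32) == 128, "rounds up to a multiple of 32");

// 0xFFFFFFE0 is already a multiple of 32, so it is the largest count that fits.
static_assert(FitsGridLimit(RoundedGridSize(0xFFFFFFE0ull, 32)), "exact multiple fits");

// One more thread rounds up to 2^32, which no longer fits in uint32_t.
static_assert(RoundedGridSize(0xFFFFFFE1ull, 32) == 0x100000000ull, "rounds to 2^32");
static_assert(!FitsGridLimit(RoundedGridSize(0xFFFFFFE1ull, 32)), "exceeds the limit");
```

Under the assumed width of 32, every thread count above 0xFFFFFFE0 rounds up past the limit even though the count itself still fits in 32 bits.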

Before the fix, DispatchCompute computed threadgroup width as std::clamp<NSUInteger>(numThreads, 1u, pipelineState.threadExecutionWidth) and then calculated the number of groups as (numThreads + w - 1) / w. For very large numThreads values (near UINT32_MAX), the product numGroups * threadgroupWidth exceeded UINT32_MAX. The vertex format conversion compute path derives numThreads from the vertex count, which is determined by buffer size divided by stride. With GL_BYTE format and stride=1 on a near-max-size buffer, numThreads approaches UINT32_MAX, triggering the overflow.
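
To make the before/after concrete, here is a minimal portable sketch that compares the pre-fix width with a stand-in for the patched helper. It rests on assumptions not stated in the patch: numThreads is taken to equal the test's buffer size (0xFFFFFFFC at a stride of 1), threadExecutionWidth is taken to be 32, and uint64_t replaces NSUInteger/size_t. PreFixWidth, RoundedThreads, and SafeThreadgroupWidthSketch are illustrative names.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Pre-fix behaviour: clamp the width and use it unchecked.
// Assumes maxWidth >= 1, as std::clamp requires lo <= hi.
constexpr uint64_t PreFixWidth(uint64_t numThreads, uint64_t maxWidth)
{
    return std::clamp<uint64_t>(numThreads, 1, maxWidth);
}

// Thread count after rounding up to whole threadgroups.
constexpr uint64_t RoundedThreads(uint64_t numThreads, uint64_t width)
{
    return ((numThreads + width - 1) / width) * width;
}

// Stand-in for the patched helper: halve the width until the rounded
// thread count fits in uint32_t or the width reaches 1.
constexpr uint64_t SafeThreadgroupWidthSketch(uint64_t numThreads, uint64_t maxWidth)
{
    uint64_t width = std::clamp<uint64_t>(numThreads, 1, maxWidth);
    while (width > 1)
    {
        if (RoundedThreads(numThreads, width) <= std::numeric_limits<uint32_t>::max())
            break;
        width >>= 1;
    }
    return width;
}

constexpr uint64_t kNumThreads = 0xFFFFFFFCull;  // assumed: buffer size / stride of 1

// Before: width 32 rounds the dispatch up to 2^32, one past the 32-bit range.
static_assert(PreFixWidth(kNumThreads, 32) == 32, "clamped to the assumed max width");
static_assert(RoundedThreads(kNumThreads, 32) == 0x100000000ull, "overflows the limit");

// After: widths 32, 16, and 8 all round to 2^32; width 4 divides the count exactly.
static_assert(SafeThreadgroupWidthSketch(kNumThreads, 32) == 4, "settles on width 4");
static_assert(RoundedThreads(kNumThreads, 4) == kNumThreads, "no rounding at width 4");

// Ordinary dispatches keep the full width.
static_assert(SafeThreadgroupWidthSketch(100, 32) == 32, "small counts are unaffected");
```

The trade-off in this approach is narrower threadgroups for near-limit dispatches, which only affects sizes that previously failed validation.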
