[5] ANGLE Metal uint32_t Overflow in Compute Dispatch
Severity: Medium | Component: ANGLE Metal backend | c8f9e83
The observable effect is a GPU process crash triggered by Metal validation, which amounts to a DoS reachable from web content. Metal's own validation layer detects the overflow and stops the dispatch before any GPU memory corruption can occur, so escalation beyond denial-of-service is not possible. For this reason the issue is rated Medium.
Patch Details
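SafeThreadgroupWidth() was added to mtl_render_utils.mm. Starting from the clamped width, the function halves the threadgroup width until numGroups * threadgroupWidth no longer exceeds the uint32_t range; when the starting width is a power of two, this yields the largest power-of-two threadgroup width that does not overflow. The previous 1D DispatchCompute overload computed the threadgroup width with a plain std::clamp and performed no overflow check after rounding up to the threadgroup boundary.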
Source/ThirdParty/ANGLE/src/libANGLE/renderer/metal/mtl_render_utils.mm
+NSUInteger SafeThreadgroupWidth(size_t numThreads, NSUInteger maxWidth)
+{
+ NSUInteger threadgroupWidth = std::clamp<NSUInteger>(numThreads, 1u, maxWidth);
+ while (threadgroupWidth > 1)
+ {
+ size_t numGroups = (numThreads + threadgroupWidth - 1) / threadgroupWidth;
+ size_t roundedThreads = numGroups * threadgroupWidth;
+ if (roundedThreads <= std::numeric_limits<uint32_t>::max())
+ break;
+ threadgroupWidth >>= 1;
+ }
+ return threadgroupWidth;
+}
- NSUInteger w = std::clamp<NSUInteger>(numThreads, 1u, pipelineState.threadExecutionWidth);
- MTLSize threadsPerThreadgroup = MTLSizeMake(w, 1, 1);
+ NSUInteger safeWidth = SafeThreadgroupWidth(numThreads, pipelineState.threadExecutionWidth);
+ MTLSize threadsPerThreadgroup = MTLSizeMake(safeWidth, 1, 1);
Source/ThirdParty/ANGLE/src/tests/gl_tests/VertexAttributeTest.cpp
+TEST_P(VertexAttributeTestES3, MaxSizeBufferWithByteConversionDoesNotCrash)
+{
+ ...
+ const GLuint maxSize = 0xFFFFFFFC; // Near UINT32_MAX, aligned to 4 bytes
+ ...
+ glVertexAttribPointer(positionLocation, 4, GL_BYTE, GL_FALSE, 1, nullptr);
+ glDrawArrays(GL_TRIANGLES, 0, 3);
+ ASSERT_GL_NO_ERROR();
+}
The bug is an unsigned integer overflow in the dispatch size passed to the Metal GPU API, occurring after the size is rounded up to a threadgroup multiple.
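To make the halving loop easier to trace, here is a minimal portable sketch of the patched helper. It is not the ANGLE source: the name SafeThreadgroupWidthSketch is hypothetical, uint64_t stands in for NSUInteger and size_t (assuming both are 64-bit), and std::min/std::max replace std::clamp so that the function can be evaluated at compile time.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical, portable stand-in for the patched helper.
// Assumption: NSUInteger and size_t are 64-bit, so uint64_t models both.
constexpr uint64_t SafeThreadgroupWidthSketch(uint64_t numThreads, uint64_t maxWidth)
{
    // Same starting point as the patch: clamp numThreads into [1, maxWidth].
    uint64_t width = std::max<uint64_t>(1, std::min(numThreads, maxWidth));
    while (width > 1)
    {
        // Ceiling division: number of threadgroups needed to cover numThreads.
        uint64_t numGroups = (numThreads + width - 1) / width;
        // Total grid size once rounded up to a whole number of threadgroups.
        uint64_t roundedThreads = numGroups * width;
        if (roundedThreads <= std::numeric_limits<uint32_t>::max())
            break;    // the rounded grid fits in 32 bits
        width >>= 1;  // otherwise halve the width and retry
    }
    return width;
}

// With an assumed maximum width of 32, a thread count of 0xFFFFFFFC
// only fits once the width has been halved down to 4.
static_assert(SafeThreadgroupWidthSketch(0xFFFFFFFCull, 32) == 4, "halved 32 -> 16 -> 8 -> 4");
// Ordinary sizes keep the full width.
static_assert(SafeThreadgroupWidthSketch(100, 32) == 32, "no overflow, width unchanged");
```

For 0xFFFFFFFC threads, widths 32, 16, and 8 all round the grid up to exactly 2^32, while width 4 divides the count evenly, so the loop stops there.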
Background
ANGLE's Metal backend uses Metal compute shaders to convert vertex attribute formats. When a vertex attribute's format or stride does not match a native Metal vertex format, ANGLE dispatches a compute shader to perform the conversion. The compute dispatch splits the work into threadgroups: the total thread count corresponds to the number of vertices to convert, and the dispatch rounds that count up to a threadgroup-width boundary. Metal's thread_position_in_grid shader builtin is defined as uint32_t (according to the code comments), and the Metal runtime validates that the full rounded-up grid size fits within this 32-bit limit.
Analysis
Before the patch, DispatchCompute computed the threadgroup width as std::clamp<NSUInteger>(numThreads, 1u, pipelineState.threadExecutionWidth) and then derived the group count as (numThreads + w - 1) / w. When numThreads is very large and close to UINT32_MAX, the product numGroups * threadgroupWidth exceeds UINT32_MAX. The vertex format conversion compute path takes numThreads from the vertex count, which is derived by dividing the buffer size by the stride. Using a near-max-size buffer with the GL_BYTE format and stride=1 pushes numThreads close to UINT32_MAX, and the overflow follows.
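The pre-patch arithmetic can be checked with a short sketch. The helper names PrePatchWidth and RoundedGridSize are hypothetical, the thread count 0xFFFFFFFC is an illustrative value near UINT32_MAX (the exact vertex count produced by the regression test may differ), and 32 is an assumed threadExecutionWidth.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical helper mirroring the removed std::clamp line.
constexpr uint64_t PrePatchWidth(uint64_t numThreads, uint64_t executionWidth)
{
    return std::max<uint64_t>(1, std::min(numThreads, executionWidth));
}

// Hypothetical helper: grid size after rounding numThreads up to a whole
// number of threadgroups, i.e. numGroups * width.
constexpr uint64_t RoundedGridSize(uint64_t numThreads, uint64_t width)
{
    return ((numThreads + width - 1) / width) * width;
}

constexpr uint64_t kNumThreads     = 0xFFFFFFFCull;  // assumed vertex count near UINT32_MAX
constexpr uint64_t kExecutionWidth = 32;             // assumed threadExecutionWidth
constexpr uint64_t kRounded =
    RoundedGridSize(kNumThreads, PrePatchWidth(kNumThreads, kExecutionWidth));

// 0xFFFFFFFC is not a multiple of 32, so it rounds up to exactly 2^32,
// one past the largest value a uint32_t grid position can express.
static_assert(kRounded == 0x100000000ull, "rounded grid size is 2^32");
static_assert(kRounded > std::numeric_limits<uint32_t>::max(), "exceeds the 32-bit limit");
```

Under these assumptions, any thread count above 0xFFFFFFE0 (the largest multiple of 32 below 2^32) rounds up past the 32-bit limit, which is the condition the patched helper avoids by shrinking the width.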