GPU texture atlas pipeline for Skia painting on GTK/WPE

48bf47f

Source/WebCore/platform/graphics/skia/SkiaGPUAtlas.cpp

+bool SkiaGPUAtlas::uploadImages()
+{
+    auto pixelDataInSRGB = [&conversionBuffer](const sk_sp<SkImage>& image) -> std::optional<std::pair<const void*, size_t>> {
+        SkPixmap pixmap;
+        if (!image->peekPixels(&pixmap))
+            return std::nullopt;
+        if (auto* colorSpace = image->colorSpace(); !colorSpace || colorSpace->isSRGB())
+            return std::pair { pixmap.addr(), pixmap.rowBytes() };
+        // Convert to sRGB...
+        ...
+    };
+#if USE(GBM)
+    if (auto* gpuBuffer = m_atlasTexture->memoryMappedGPUBuffer()) {
+        if (gpuBuffer->isLinear() || gpuBuffer->isVivanteSuperTiled()) {
+            // DMA-buf fast path
+            ...
+        }
+    }
+#endif
+    // GL fallback
+    for (const auto& entry : m_layout.entries()) {
+        if (auto pixels = pixelDataInSRGB(entry.rasterImage))
+            m_atlasTexture->updateContents(pixels->first, entry.atlasRect, IntPoint::zero(), pixels->second, PixelFormat::BGRA8);
+    }
+}

The Skia painting engine on GTK/WPE uses a record/replay model: drawing commands are first recorded into a picture, then replayed on worker threads using GPU acceleration. This commit introduces a batching optimization: raster (CPU-side) images encountered during recording are packed into a single GPU texture atlas and uploaded once, and replay then draws them from atlas sub-regions, avoiding per-image GPU uploads.
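To make the batching idea concrete, here is a minimal sketch of one common way to pack rectangles into an atlas, a naive shelf packer. The commit excerpt does not show which packing algorithm it uses, so `Rect`, `ShelfPacker`, and `place()` are hypothetical names chosen for illustration, not WebKit types.

```cpp
#include <algorithm>
#include <optional>

// Hypothetical type for illustration; not WebKit's IntRect.
struct Rect {
    int x, y, width, height;
};

// Naive shelf packer (an assumed algorithm, not necessarily the commit's):
// fills a row left to right and opens a new row below when the image does not fit.
class ShelfPacker {
public:
    ShelfPacker(int atlasWidth, int atlasHeight)
        : m_width(atlasWidth), m_height(atlasHeight) { }

    // Returns the atlas sub-rectangle reserved for an image,
    // or std::nullopt if the image does not fit. State is unchanged on failure.
    std::optional<Rect> place(int width, int height)
    {
        if (width <= 0 || height <= 0 || width > m_width)
            return std::nullopt;

        int x = m_cursorX;
        int y = m_cursorY;
        int shelfHeight = m_shelfHeight;
        if (x + width > m_width) {
            // Current shelf is full: start a new one directly below it.
            y += shelfHeight;
            x = 0;
            shelfHeight = 0;
        }
        if (y + height > m_height)
            return std::nullopt;

        m_cursorX = x + width;
        m_cursorY = y;
        m_shelfHeight = std::max(shelfHeight, height);
        return Rect { x, y, width, height };
    }

private:
    int m_width;
    int m_height;
    int m_cursorX { 0 };
    int m_cursorY { 0 };
    int m_shelfHeight { 0 };
};
```

Each returned rectangle plays the role of an entry's atlasRect: the image's pixels are copied there once, and later draws sample from that sub-region.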

The pipeline has three stages. During recording, collectRasterImage() packs images into atlas layouts. After recording, createAtlas() uploads pixel data to a GPU texture through one of two paths: DMA-buf, which memory-maps a GBM buffer for direct CPU writes and is dispatched to a worker thread, or the GL fallback, which calls updateContents() synchronously on the main thread. During replay, the SkiaReplayCanvas virtual overrides (onDrawImage2, onDrawImageRect2) intercept raster image draws and redirect them to atlas texture draws with remapped coordinates. Synchronization between the upload worker and the replay workers is handled by AtlasUploadCondition, a custom countdown latch built on Lock and Condition.
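The coordinate remapping in the replay stage can be sketched as a simple translation, assuming each image is stored unscaled in its atlas entry. `RectF` and `remapToAtlas()` are hypothetical names for illustration; the actual onDrawImageRect2 override is not shown in the excerpt and may handle scaling, sampling, and constraints differently.

```cpp
#include <algorithm>

// Hypothetical float rectangle for illustration; not SkRect or FloatRect.
struct RectF {
    float x, y, width, height;
};

// Maps a source rect given in image-local coordinates to atlas coordinates.
// Assumes the image occupies entryInAtlas at a 1:1 scale. The source rect is
// clamped to the entry so sampling cannot spill into a neighbouring image.
// Clamping here is a simplification: a real draw would also have to shrink
// the destination rect proportionally.
RectF remapToAtlas(const RectF& srcInImage, const RectF& entryInAtlas)
{
    float left = std::clamp(srcInImage.x, 0.0f, entryInAtlas.width);
    float top = std::clamp(srcInImage.y, 0.0f, entryInAtlas.height);
    float right = std::clamp(srcInImage.x + srcInImage.width, 0.0f, entryInAtlas.width);
    float bottom = std::clamp(srcInImage.y + srcInImage.height, 0.0f, entryInAtlas.height);
    return RectF { entryInAtlas.x + left, entryInAtlas.y + top, right - left, bottom - top };
}
```

With this mapping, a draw of the whole image becomes a draw of the entry's rectangle from the atlas texture, and a partial draw becomes the correspondingly offset sub-rectangle.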

Recording (main)          Upload (main or worker)     Replay workers
│                              │                           │
├─ collectRasterImage() ──────►│                           │
├─ finalize() → AtlasLayouts   │                           │
├─ createAtlas() ─────────────►│                           │
│   DMA-buf: mmap GPU buf      ├─ addPending()/signal() ──►│
│   GL: updateContents()       │                           │
│                              ├─ GL fence inserted ───────►│
│                              │                           ├─ wait() [latch]
│                          complete                        ├─ onDrawImage2()
│                                                          │  └─ atlas draw
│                                                          │     (remapped coords)
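The latch in the diagram can be approximated with standard-library primitives. This is a sketch only: the real AtlasUploadCondition is built on WTF's Lock and Condition and its code is not shown in the excerpt. The method names addPending(), signal(), and wait() are taken from the diagram; the class name `CountdownLatch` and the `pendingCount()` accessor are hypothetical.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for a countdown latch, using std::mutex and
// std::condition_variable instead of WTF's Lock and Condition.
class CountdownLatch {
public:
    // Registers one more upload that waiters must wait for.
    void addPending()
    {
        std::lock_guard<std::mutex> locker(m_mutex);
        ++m_pending;
    }

    // Marks one upload as finished and wakes all waiters when none remain.
    // Notifying while the mutex is held avoids touching the condition
    // variable after a woken waiter could have destroyed the latch.
    void signal()
    {
        std::lock_guard<std::mutex> locker(m_mutex);
        if (m_pending > 0)
            --m_pending;
        if (!m_pending)
            m_condition.notify_all();
    }

    // Blocks until every pending upload has been signalled.
    // Returns immediately if nothing is pending.
    void wait()
    {
        std::unique_lock<std::mutex> locker(m_mutex);
        m_condition.wait(locker, [this] { return !m_pending; });
    }

    unsigned pendingCount() const
    {
        std::lock_guard<std::mutex> locker(m_mutex);
        return m_pending;
    }

private:
    mutable std::mutex m_mutex;
    std::condition_variable m_condition;
    unsigned m_pending { 0 };
};
```

In the flow above, the thread that dispatches an upload would call addPending() before handing work to the upload worker, the worker would call signal() once the pixels are written, and each replay worker would call wait() before its first atlas draw. The ordering of addPending() relative to the first wait() is exactly the kind of detail worth checking in the real implementation.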

This is a substantial new rendering pipeline. GPU texture batching, cross-thread synchronization, DMA-buf memory-mapped writes, and atlas coordinate substitution all interact under concurrent workers, a combination that has historically been fertile ground for race conditions and lifetime bugs.

The new cross-thread upload path and atlas coordinate substitution have several edge cases in the synchronization and image lookup logic that are worth auditing.