Master's Thesis · TU Munich · 2026

ConvexAvatars

ConvexAvatars models the human head directly with dynamic 3D convex primitives rigged to a FLAME expression model. Surface normals are computed at render time via ray-triangle intersection, giving geometry-grounded normals that enable physically-based relighting.

RGB Render: final relit output
Primitive Geometry: convex hull structure per frame
Surface Normals: computed via ray-triangle intersection

Primitive Normals via Ray-Triangle Intersection

For each pixel, a camera ray is cast and intersected with the convex hull of each contributing primitive. The face normals of the hit triangles are accumulated across primitives as an opacity-weighted sum. The resulting normals are grounded in geometry, need no view-direction conditioning, and let gradients flow back to the vertex positions during training.

Primitive normals diagram: ray intersection, per-primitive rendering passes, and normal accumulation
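
The per-pixel procedure can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the function names, the use of the Möller-Trumbore test, the choice of plain opacity as the weight (no transmittance term), and the final renormalization are all assumptions made for illustration. Triangles are assumed to be wound so that face normals point outward. NumPy carries no gradients, so a differentiable framework would be needed to reproduce the training-time behavior.

```python
import numpy as np

def ray_triangle_hit(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore test: distance t along the ray, or None on a miss."""
    e1, e2 = v1 - v0, v2 - v0
    p = np.cross(direction, e2)
    det = np.dot(e1, p)
    if abs(det) < eps:                 # ray is parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = origin - v0
    u = np.dot(s, p) * inv             # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = np.cross(s, e1)
    v = np.dot(direction, q) * inv     # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = np.dot(e2, q) * inv
    return t if t > eps else None      # discard hits behind the ray origin

def face_normal(v0, v1, v2):
    """Unit normal of a triangle; outward if the winding is outward."""
    n = np.cross(v1 - v0, v2 - v0)
    return n / np.linalg.norm(n)

def first_hit_normal(origin, direction, vertices, faces):
    """Face normal of the nearest triangle hit on one convex hull, or None."""
    best_t, best_n = np.inf, None
    for i, j, k in faces:
        t = ray_triangle_hit(origin, direction, vertices[i], vertices[j], vertices[k])
        if t is not None and t < best_t:
            best_t, best_n = t, face_normal(vertices[i], vertices[j], vertices[k])
    return best_n

def accumulate_normal(origin, direction, primitives):
    """Opacity-weighted sum of per-primitive hit normals for one pixel ray.

    primitives: iterable of (vertices, faces, opacity) tuples (hypothetical layout).
    Returns a unit vector, or zeros if the ray misses every primitive.
    """
    total = np.zeros(3)
    for vertices, faces, opacity in primitives:
        n = first_hit_normal(origin, direction, vertices, faces)
        if n is not None:
            total += opacity * n
    norm = np.linalg.norm(total)
    return total / norm if norm > 0 else total

# Toy primitive: a unit tetrahedron with outward-wound faces.
TETRA_VERTICES = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                           [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
TETRA_FACES = np.array([[0, 2, 1], [0, 1, 3], [0, 3, 2], [1, 2, 3]])

pixel_normal = accumulate_normal(
    origin=np.array([0.2, 0.2, -1.0]),
    direction=np.array([0.0, 0.0, 1.0]),
    primitives=[(TETRA_VERTICES, TETRA_FACES, 0.75)],
)
```

In this toy setup the ray enters the tetrahedron through its bottom face, so the accumulated normal points along the negative z axis. With several overlapping primitives, each one contributes the normal of its own nearest hit, scaled by its opacity.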

Robust to Limited Training Data

Training on a single expression sequence (SEN-01) instead of the full dataset costs ConvexAvatars only 0.16 dB of PSNR, versus 1.20 dB for RGCA and 0.99 dB for BecomingLit. Because normals come from geometry rather than a learned decoder, the network does not need to see a wide distribution of lighting conditions to produce meaningful surface information. This suggests the approach is well-suited for low-data capture settings where multi-view one-light-at-a-time (OLAT) rigs are unavailable.

Method                  Training   PSNR↑    SSIM↑     LPIPS↓
RGCA [Sai+24]           Full       24.79    0.8394    0.1644
                        SEN-01     23.59    0.8022    0.1964
                        Δ          -1.20    -0.037    +0.032
BecomingLit [SGN25]     Full       27.10    0.8545    0.1524
                        SEN-01     26.11    0.8363    0.1578
                        Δ          -0.99    -0.018    +0.005
Ours (ConvexAvatars)    Full       26.31    0.8718    0.2133
                        SEN-01     26.15    0.8643    0.1983
                        Δ          -0.16    -0.008    -0.015
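
Each Δ row is the SEN-01 score minus the Full score for the same method. The short Python check below recomputes these differences from the reported values; the dictionary layout and the function name are illustrative only, and the numbers are copied from the table.

```python
# (PSNR, SSIM, LPIPS) per training setting, copied from the table.
SCORES = {
    "RGCA": ((24.79, 0.8394, 0.1644), (23.59, 0.8022, 0.1964)),
    "BecomingLit": ((27.10, 0.8545, 0.1524), (26.11, 0.8363, 0.1578)),
    "ConvexAvatars": ((26.31, 0.8718, 0.2133), (26.15, 0.8643, 0.1983)),
}

def deltas(full, single_sequence):
    """Per-metric change when going from full training to SEN-01 only."""
    return tuple(s - f for f, s in zip(full, single_sequence))

for method, (full, sen01) in SCORES.items():
    d_psnr, d_ssim, d_lpips = deltas(full, sen01)
    print(f"{method}: dPSNR={d_psnr:+.2f} dB, dSSIM={d_ssim:+.4f}, dLPIPS={d_lpips:+.4f}")
```

Negative PSNR and SSIM deltas indicate a loss in quality, while a negative LPIPS delta indicates an improvement, since lower LPIPS is better.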

[Sai+24] Saito et al., "Relightable Gaussian Codec Avatars," CVPR 2024.
[SGN25] Schmidt, Giebenhain & Niessner, "BecomingLit," NeurIPS 2025.