Dynamic Novel View Synthesis

Given synchronized multi-view videos from 13 cameras, the task is to re-render the same facial performance from 3 hold-out viewpoints. This requires reconstructing a 4D representation that plausibly models both geometry and complex motion. The challenge is conducted on 5 short sequences, each from a different individual. The sequences cover complex dynamic effects such as topological changes when the tongue sticks out, flying hair, dynamic wrinkle changes, and light refraction and reflection on glasses. To capture even subtle movements, the sequences were recorded at 73 fps.

Please see the NeRSemble Benchmark Toolkit for instructions on obtaining the data to participate in the benchmark and preparing a submission.

Evaluation and Metrics

We evaluate the similarity between the ground-truth and generated RGB videos. Our image-wise evaluation metrics are peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and learned perceptual image patch similarity (LPIPS). Additionally, we measure temporal consistency with FovVideoVDP, reported in JOD units, which is sensitive to flickering, noise, and other temporal artifacts. For each pair of generated and ground-truth videos, we compute these four metrics and average the results over all 3 hold-out cameras and all 5 individuals. To save compute during metric evaluation, the video metric JOD is evaluated at half resolution and 24.3 fps, and the image metrics are evaluated on every 10th frame.

Evaluation is carried out on GT images with resolution 1604x1100.
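
As an illustration of the image-wise protocol described above, the following is a minimal sketch of how PSNR could be computed on every 10th frame and then averaged over all hold-out cameras and individuals. It is not the benchmark's official evaluation code. The function names, the dictionary layout keyed by (individual, camera), the assumption that images are floats in [0, 1], and the choice to average per video first are all hypothetical.

```python
import numpy as np


def psnr(pred, gt, max_val=1.0):
    """PSNR in dB between two images with values in [0, max_val] (assumed range)."""
    diff = pred.astype(np.float64) - gt.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))


def video_psnr(pred_video, gt_video, frame_step=10):
    """Mean PSNR over every `frame_step`-th frame of one video pair.

    Both videos are arrays of shape (T, H, W, 3).
    """
    scores = [
        psnr(p, g)
        for p, g in zip(pred_video[::frame_step], gt_video[::frame_step])
    ]
    return float(np.mean(scores))


def benchmark_psnr(pred, gt, frame_step=10):
    """Average the per-video PSNR over all (individual, camera) pairs.

    `pred` and `gt` are dicts mapping a hypothetical key
    (individual_id, camera_id) to a video array of shape (T, H, W, 3).
    """
    per_video = [video_psnr(pred[key], gt[key], frame_step) for key in gt]
    return float(np.mean(per_video))
```

With 5 individuals and 3 hold-out cameras, `gt` would hold 15 videos. SSIM and LPIPS could be plugged into the same loop in place of `psnr`, while JOD operates on whole videos and would need the FovVideoVDP implementation instead.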

Results

Methods	PSNR	SSIM	LPIPS	JOD
TaoAvatar	27.827	0.875	0.195	7.613
Jianchuan Chen, Jingchuan Hu, Gaige Wang, Zhonghua Jiang, Tiansong Zhou, Zhiwen Chen, Chengfei Lv. TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting. CVPR 2025
4DGaussians	27.711	0.865	0.270	7.613
Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, Xinggang Wang. 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering. CVPR 2024
HeteroAvatar	26.461	0.854	0.254	7.167
Ahmet Cagatay Seker, Jun Seok Kang, Sang Chul Ahn. HeteroAvatar: Generation of Gaussian Head Avatars With Correct Geometry Using Hetero Rendering. IEEE Access
Deformable 3DGS	26.377	0.858	0.293	7.081
Ziyi Yang, Xinyu Gao, Wen Zhou, Shaohui Jiao, Yuqing Zhang, Xiaogang Jin. Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction. CVPR 2024
TaylorGaussian	25.660	0.845	0.340	6.767
Bingbing Hu, Yanyan Li, Rui Xie, Bo Xu, Haoye Dong, Junfeng Yao, Gim Hee Lee. Learnable Infinite Taylor Gaussian for Dynamic View Rendering. CVPR 2025

Please refer to the submission instructions before making a submission.
