ScanNet++ Novel View Synthesis and 3D Semantic Understanding Challenge

CVPR 2025 Workshop, Nashville TN

Date to be decided



Introduction

Recent advances in generative modeling and semantic understanding have spurred significant interest in the synthesis and understanding of 3D scenes. The potential applications are broad: augmented and virtual reality, computational photography, interior design, and autonomous mobile robots all require a deep understanding of 3D scene spaces. We propose to offer the first benchmark challenge for novel view synthesis in large-scale 3D scenes, along with high-fidelity, large-vocabulary 3D semantic scene understanding, where very complete, high-fidelity ground truth scene data is available. This is enabled by the new ScanNet++ dataset, which offers 1mm resolution laser scan geometry, high-quality DSLR image capture, and dense semantic annotations over 1000 class categories. In particular, existing view synthesis evaluation relies on data captured from a single continuous trajectory, which makes it impossible to evaluate novel views outside the original capture trajectory. In contrast, our novel view synthesis challenge uses test images captured intentionally outside the training image trajectory, allowing comprehensive evaluation of state-of-the-art methods in new, challenging scenarios.

📢 New this year 📢 ScanNet++ v2 released with 1000+ scenes, more scene types, improved annotations and poses. Check it out!


Schedule

Welcome and Introduction 13:30 - 13:45
Invited Talk 1 13:45 - 14:20
Invited Talk 2 14:20 - 14:55
Winner Talks: TBD 14:55 - 15:40
Invited Talk 3 15:40 - 16:15
Invited Talk 4 16:15 - 16:50
Panel Discussion and Conclusion 16:50 - 17:30


Invited Speakers

Cordelia Schmid is a research director at Inria. She holds an M.S. degree in Computer Science from the University of Karlsruhe and a Doctorate, also in Computer Science, from the Institut National Polytechnique de Grenoble (INPG). Her doctoral thesis on "Local Greyvalue Invariants for Image Matching and Retrieval" received the best thesis award from INPG in 1996. She received the Habilitation degree in 2001 for her thesis entitled "From Image Matching to Learning Visual Models". Dr. Schmid is a member of the German National Academy of Sciences, Leopoldina and a fellow of IEEE and the ELLIS society. She was awarded the Longuet-Higgins prize in 2006, 2014 and 2016, the Koenderink prize in 2018 and the Helmholtz prize in 2023, all for fundamental contributions in computer vision that have withstood the test of time. She received an ERC advanced grant in 2013, the Humboldt research award in 2015, the Inria & French Academy of Science Grand Prix in 2016, the Royal Society Milner award in 2020 and the PAMI distinguished researcher award in 2021. In 2023 she received the Körber European Science Prize. Dr. Schmid has been an Associate Editor for IEEE PAMI (2001–2005) and for IJCV (2004–2012), an editor-in-chief for IJCV (2013–2018), a program chair of IEEE CVPR 2005 and ECCV 2012 as well as a general chair of IEEE CVPR 2015, ECCV 2020 and ICCV 2023. Since 2018 she has held a joint appointment with Google Research.

Andrea Vedaldi is a Professor of Computer Vision and Machine Learning and a co-lead of the VGG group at the Engineering Science department of the University of Oxford. He researches computer vision and machine learning methods to understand the content of images and videos automatically, with little to no manual supervision, in terms of semantics and 3D geometry.

Gordon Wetzstein is an Associate Professor of Electrical Engineering and, by courtesy, of Computer Science at Stanford University. He is the leader of the Stanford Computational Imaging Lab and a faculty co-director of the Stanford Center for Image Systems Engineering. At the intersection of computer graphics and vision, artificial intelligence, computational optics, and applied vision science, Prof. Wetzstein's research has a wide range of applications in next-generation imaging, wearable computing, and neural rendering systems. Prof. Wetzstein is a Fellow of Optica and the recipient of numerous awards, including an IEEE VGTC Virtual Reality Technical Achievement Award, an NSF CAREER Award, an Alfred P. Sloan Fellowship, an ACM SIGGRAPH Significant New Researcher Award, a Presidential Early Career Award for Scientists and Engineers (PECASE), an SPIE Early Career Achievement Award, an Electronic Imaging Scientist of the Year Award, an Alain Fournier Ph.D. Dissertation Award as well as many Best Paper and Demo Awards.

Andrea Tagliasacchi is an associate professor at Simon Fraser University (Vancouver, Canada) where he holds the appointment of Visual Computing Research Chair within the School of Computing Science. He is also a part-time (20%) staff research scientist at Google DeepMind (Toronto, Canada), as well as an associate professor (status only) in the computer science department at the University of Toronto. Before joining SFU, he spent four wonderful years as a full-time researcher at Google (mentored by Paul Lalonde, Geoffrey Hinton, and David Fleet). Before joining Google, he was an assistant professor at the University of Victoria (2015-2017), where he held the Industrial Research Chair in 3D Sensing (jointly sponsored by Google and Intel). His alma maters include EPFL (postdoc), SFU (PhD, NSERC Alexander Graham Bell fellow) and Politecnico di Milano (MSc, gold medalist). Several of his papers have received best-paper award nominations at top-tier graphics and vision conferences, and he is the recipient of the 2015 SGP best paper award, the 2020 CVPR best student paper award, and the 2024 CVPR best paper award (honorable mention). His research focuses on 3D visual perception, which lies at the intersection of computer vision, computer graphics and machine learning.

Angjoo Kanazawa is an Assistant Professor in the Department of Electrical Engineering and Computer Sciences at the University of California, Berkeley. She leads the Kanazawa AI Research (KAIR) lab under BAIR. She also serves on the advisory board of Wonder Dynamics. Previously, Angjoo was a Research Scientist at Google Research and a BAIR postdoc at UC Berkeley advised by Jitendra Malik, Alexei A. Efros and Trevor Darrell. She completed her PhD in Computer Science at the University of Maryland, College Park with her advisor David Jacobs.

Organizers

Angela Dai
Technical University of Munich
Yueh-Cheng Liu
Technical University of Munich
Chandan Yeshwanth
Technical University of Munich
Matthias Niessner
Technical University of Munich