Professor
University of Cambridge, UK
Bio:Brian Sheil is the Director of the Centre for Smart Infrastructure and Construction at the University of Cambridge. He previously held academic positions at the University of Oxford and was awarded a Royal Academy of Engineering Research Fellowship before moving to Cambridge in 2022 to take up the Laing O’Rourke Associate Professorship in Construction Engineering. In 2024, he was awarded an EPSRC Open Fellowship for his work on Digital Underground Construction. He is a co-founder and Chief Scientist of the startup InfraMind, which develops AI-based “digital inspectors” for infrastructure, currently being trialled by organisations including National Highways, Network Rail, and Transport for London. He serves on the editorial boards of several leading journals and has received multiple awards for research excellence and commercialisation. His research puts physics first in developing trustworthy AI for infrastructure, combining computer vision, multimodal sensing, physics-informed learning, and multi-scale simulation to enable digital twins, inspection, early warning, and lifecycle management of underground and civil infrastructure.
Abstract:Critical infrastructure in the industrialised world is deteriorating faster than manual inspection can track. This talk describes our journey in building an AI-based 'Digital Inspector' for infrastructure using LiDAR (for surface assessment) and Ground Penetrating Radar (GPR) (for subsurface characterisation). A key challenge across both modalities is labelled data scarcity: the talk will describe how we tackle this through physics-informed simulation, which enables large-scale synthetic pre-training without costly field campaigns. We focus on two canonical use cases, namely railway and metro tunnels and buried utility networks, where the consequences of missed defects are severe and inspection access is limited. I will show how these technologies are now being deployed at scale on major infrastructure projects across the UK. I will close by discussing our open problems in sim-to-real transfer, cross-modal fusion, and uncertainty quantification: challenges the CVPR community is well-placed to address.
Principal Research Manager
Microsoft, USA
Bio:Dr. Sarah Parisot is a Principal Research Manager in the People-Centric AI group at Microsoft Research Cambridge. Her research interests include data-efficient learning, large vision-language models, and controllability of visual generative models, including interactive world models and their application to support creative ideation. Prior to joining Microsoft, she led the London AI Theory team at Huawei’s Noah’s Ark Lab. Earlier in her career, she was a Research Associate at Imperial College London, where she developed graph neural network methodologies for medical imaging. She earned her PhD in Applied Mathematics from INRIA and École Centrale Paris in 2013.
Abstract:World models offer a path toward interactive, co‑creative systems that support iteration, exploration, and sustained creative control. To be useful to creators, such models must balance expressiveness with practical constraints such as data efficiency, responsiveness, and inference cost. This talk explores the interplay between model design and creative intent, including how representation choices, efficiency techniques, and data strategy can shape creative use.
Professor
Technical University of Munich, Germany
Bio:Daniel Cremers is the Chair of Computer Vision and Artificial Intelligence at TU Munich. He is Director of the Munich Center for Machine Learning (MCML), the Munich Data Science Institute, and ELLIS Munich, and he serves as President of the European Computer Vision Association. His pioneering research, which has earned him the Gottfried Wilhelm Leibniz Prize (Germany’s highest research award) and five grants from the European Research Council (ERC), focuses on 3D reconstruction, visual SLAM, and neural scene representations. His landmark algorithms, including LSD-SLAM—which received the ECCV 2024 Koenderink Test of Time Award—and DSO, have significantly advanced the field of spatial AI. Beyond academia, Prof. Cremers is an active co-founder and advisor to numerous deep-tech startups, including Artisense, SE3 Labs, and Skydio.
Abstract:This keynote traces the evolution of motion estimation and 3D reconstruction over the past decade, beginning with foundational direct methods like LSD-SLAM and Direct Sparse Odometry (DSO). We examine how these methods, which minimize photometric rather than geometric errors, laid the groundwork for robust spatial AI. The talk details the transition to "deep" direct methods, such as DVSO and D3VO, which integrate self-supervised depth and uncertainty priors into the direct formulation, and MonoRec, which enables high-quality reconstruction in dynamic scenes. Finally, we introduce ViSTA-SLAM and its symmetric two-view association, illustrating how the integration of geometric principles with deep learning leads to state-of-the-art methods that compute dense and undistorted monocular reconstructions at 80 fps.