Media Summary: Disentangle-then-Align: Non-Iterative Hybrid Multimodal Image Registration via Cross-Scale Feature Disentanglement. MUST: Modality-Specific Representation-Aware Transformer for Diffusion-Enhanced Survival Prediction with Missing Modality. Learning to Drive is a Free Gift: Large-Scale Label-Free Autonomy Pretraining from Unposed In-The-Wild Videos.
CVPR 2026 Processmaker - Detailed Analysis & Overview
Hakyeong Kim, Ruicheng Wang, Chengtang Yao, Jiaolong Yang, Min H. Kim ( ... An overview of our paper, "SketchDeco: Training-Free Latent Composition for Precise Sketch Colourisation". Accepted in ... Tuna: Taming Unified Visual Representations for Native Unified Multimodal Models.
GOR-IS presents a 3D Gaussian object removal framework that edits scenes in the intrinsic space, enabling physically consistent ... Joonki Min, Chaeyun Kim, Hyungwook Choi, Yejin Kim, Kihyun Kim, Yohan Jo, Joonseok Lee. Fine-Grained Multi-Image Object ... How much do video diffusion models know about the 4D world? By introducing a 4D VAE, we jointly estimate geometry and ... Presentation for the paper: Raphael Maser*, Siddhartha Gairola*, Sukrut Rao, Bernt Schiele: Align Once to Explain: Feature ...