CVPR 2026 ProcessMaker

[CVPR 2026] ProcessMaker

Disentangle-then-Align: Non-Iterative Hybrid Multimodal Image Registration via Cross-Scale Feature Disentanglement.

Project Page: https://aka.ms/task-transfer-vlms
Paper: https://arxiv.org/abs/2511.18787

MUST: Modality-Specific Representation-Aware Transformer for Diffusion-Enhanced Survival Prediction with Missing Modality.

Learning to Drive is a Free Gift: Large-Scale Label-Free Autonomy Pretraining from Unposed In-The-Wild Videos.

Video Presentation of ...

Hakyeong Kim, Ruicheng Wang, Chengtang Yao, Jiaolong Yang, Min H. Kim ( ...

An overview of our paper, "SketchDeco: Training-Free Latent Composition for Precise Sketch Colourisation". Accepted in ...

Tuna: Taming Unified Visual Representations for Native Unified Multimodal Models.

GOR-IS presents a 3D Gaussian object removal framework that edits scenes in the intrinsic space, enabling physically consistent ...

Joonki Min, Chaeyun Kim, Hyungwook Choi, Yejin Kim, Kihyun Kim, Yohan Jo, Joonseok Lee. Fine-Grained Multi-Image Object ...

How much do video diffusion models know about the 4D world? By introducing a 4D VAE, we jointly estimate geometry and ...

Presentation for the paper: Raphael Maser*, Siddhartha Gairola*, Sukrut Rao, Bernt Schiele: Align Once to Explain: Feature ...