C-GenReg

C-GenReg: Training-Free 3D Point Cloud Registration by Multi-View-Consistent Geometry-to-Image Generation with Probabilistic Modalities Fusion

Ben-Gurion University, Beer-Sheva, Israel

CVPR 2026

C-GenReg enables training-free 3D registration by transforming point clouds into multi-view consistent RGB images, allowing Vision Foundation Model features to augment conventional 3D geometric features for more robust registration.

Abstract

We introduce C-GenReg, a training-free framework for 3D point cloud registration that leverages the complementary strengths of world-scale generative priors and registration-oriented Vision Foundation Models (VFMs). Current learning-based 3D point cloud registration methods struggle to generalize across sensing modalities, sampling differences, and environments. C-GenReg augments a conventional geometric registration branch by transferring the matching problem into an auxiliary image domain, where VFMs excel: a World Foundation Model synthesizes multi-view-consistent RGB representations directly from the input geometry.

This generative transfer preserves spatial coherence across source and target views without any fine-tuning. From these generated views, a VFM pretrained for dense correspondences extracts matches, which are lifted back to 3D via the original depth maps. To further enhance robustness, we introduce a Match-then-Fuse probabilistic cold-fusion scheme that combines the generated-RGB and geometric correspondence posteriors. This principled fusion preserves each modality's inductive bias and provides calibrated confidence without any additional learning. C-GenReg is zero-shot and plug-and-play, and experiments on 3DMatch, ScanNet, and Waymo demonstrate strong zero-shot performance, superior cross-domain generalization, and successful operation on real outdoor LiDAR data where imagery is unavailable.

C-GenReg teaser figure showing the generated-RGB branch, geometric branch, and probabilistic fusion.
C-GenReg operates with a generated-RGB branch, a geometric branch, and a probabilistic fusion stage that combines both correspondence sources before estimating the final rigid transformation.

Method

Overview of the C-GenReg pipeline.
Method overview. The generated-RGB branch and the geometric branch produce independent correspondence posteriors that are fused by the Match-then-Fuse module before pose estimation.

1. Generated-RGB Branch

Source and target point clouds are represented as depth-frame sequences, temporally concatenated, and processed by a frozen World Foundation Model to generate RGB views that are geometrically coherent with the input depth and cross-view consistent. A subset of K frames per domain is then fed to a frozen, task-specific VFM, and the resulting dense pixel features are lifted back to 3D using the original depth maps.
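The final lifting step can be illustrated with standard pinhole back-projection. The sketch below is a minimal, hypothetical implementation (the paper's exact procedure and calibration are not reproduced here): each valid depth pixel is unprojected to a 3D point using assumed intrinsics `K`, and the VFM feature at that pixel is attached to the resulting point.

```python
import numpy as np

def lift_features_to_3d(depth, features, K, valid_min=1e-3):
    """Back-project dense per-pixel features to 3D via a depth map.

    depth:    (H, W) depth in meters
    features: (H, W, C) dense VFM features
    K:        (3, 3) pinhole intrinsics (hypothetical calibration)
    Returns (N, 3) 3D points and (N, C) features for valid pixels.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))  # pixel grids, shape (H, W)
    valid = depth > valid_min                       # mask out empty pixels
    z = depth[valid]
    x = (u[valid] - K[0, 2]) * z / K[0, 0]          # X = (u - cx) * Z / fx
    y = (v[valid] - K[1, 2]) * z / K[1, 1]          # Y = (v - cy) * Z / fy
    points = np.stack([x, y, z], axis=-1)
    return points, features[valid]
```

The same depth map that conditioned the RGB generation serves as the lifting geometry, so the lifted features stay aligned with the original point cloud by construction.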

2. Geometric Branch

In parallel, the raw point clouds are processed by a pretrained geometric feature extractor that produces dense geometric descriptors directly in 3D. This branch preserves the registration-oriented geometric cues that complement the generated-image representation.

3. Match-then-Fuse Probabilistic Fusion

Each modality produces a posterior correspondence map, p_img and p_geo, from its similarity scores. These are fused by the proposed Match-then-Fuse probabilistic module into a unified posterior p_fuse, from which the final rigid transformation is estimated.
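One simple way to combine two row-stochastic correspondence posteriors is a log-linear (product-of-experts) mixture. The sketch below is a hypothetical stand-in for the paper's Match-then-Fuse rule, whose exact form is not reproduced here; `w` is an assumed modality weight.

```python
import numpy as np

def fuse_posteriors(p_img, p_geo, w=0.5, eps=1e-12):
    """Log-linear fusion of two correspondence posteriors.

    p_img, p_geo: (N, M) row-stochastic matchability matrices
    w:            assumed weight on the image modality
    Returns a fused row-stochastic posterior of shape (N, M).
    """
    # Weighted sum in log space = weighted geometric mean of the posteriors.
    log_p = w * np.log(p_img + eps) + (1.0 - w) * np.log(p_geo + eps)
    # Subtract the row max before exponentiating for numerical stability.
    p = np.exp(log_p - log_p.max(axis=1, keepdims=True))
    return p / p.sum(axis=1, keepdims=True)
```

A geometric-mean fusion suppresses correspondences that only one modality supports, which is the qualitative behavior a cold fusion of independent posteriors is meant to achieve.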

Virtual camera projection for adapting LiDAR scans into the depth-image input format.
LiDAR adaptation. For outdoor LiDAR data, C-GenReg first converts each raw point cloud into a depth-image representation by projecting it onto a virtual camera. This depth image is then fed to the same World Foundation Model used for indoor data, so the method can generate an aligned RGB view and run the unchanged downstream registration pipeline.
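The virtual-camera step can be sketched as a pinhole projection with z-buffering. The intrinsics `K` and image size below are assumptions for illustration, not the paper's actual virtual-camera parameters.

```python
import numpy as np

def project_to_depth_image(points, K, H, W):
    """Render a point cloud into a virtual-camera depth image.

    points: (N, 3) points in the virtual camera frame (z forward)
    K:      (3, 3) hypothetical pinhole intrinsics
    Keeps the nearest point per pixel (z-buffering).
    """
    depth = np.zeros((H, W))
    front = points[:, 2] > 1e-3          # drop points behind the camera
    p = points[front]
    u = np.round(K[0, 0] * p[:, 0] / p[:, 2] + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * p[:, 1] / p[:, 2] + K[1, 2]).astype(int)
    inb = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    u, v, z = u[inb], v[inb], p[inb, 2]
    # z-buffer: write far-to-near so nearer points overwrite farther ones
    order = np.argsort(-z)
    depth[v[order], u[order]] = z[order]
    return depth
```

Sparse LiDAR returns leave many empty pixels; in practice such depth images are typically densified or masked before being passed to a generative model.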

Results

3DMatch Benchmark

C-GenReg achieves the best overall performance across most reported indoor metrics while remaining fully zero-shot.

Table 1 from the paper showing the 3DMatch benchmark results.

ScanNet Benchmarks

Cross-dataset generalization on unseen indoor scenes, including the ScanNet Hard and ScanNet SuperGlue splits.

Table 2 from the paper showing ScanNet benchmark results.

Waymo Outdoor Benchmark

Real outdoor LiDAR registration where C-GenReg substantially outperforms geometric baselines trained on KITTI.

Table 3 from the paper showing Waymo outdoor benchmark results.

Ablation Studies

Task-specific VFMs outperform general-purpose ones, and probabilistic fusion consistently improves over simple concatenation across geometric backbones.

Table 4 from the paper showing ablation study results on 3DMatch.

Prompt Robustness

Replacing a detailed scene description with a general one results in negligible degradation, while even a minimal prompt remains reasonably strong; in contrast, a semantically incorrect prompt substantially degrades registration accuracy.

Prompt robustness figure from the paper.

Effect of View Selection

Registration performance measured by Relative Rotation Error and Relative Translation Error as a function of the number of selected views K. Performance saturates for K ≥ 4, indicating that only a few representative views are sufficient for stable registration.

Figure 6 from the paper showing the effect of view selection K.

Citation

@article{haitman2026cgenreg,
  title     = {C-GenReg: Training-Free 3D Point Cloud Registration by Multi-View-Consistent Geometry-to-Image Generation with Probabilistic Modalities Fusion},
  author    = {Haitman, Yuval and Efraim, Amit and Francos, Joseph M.},
  journal   = {arXiv preprint arXiv:2604.16680},
  year      = {2026},
  doi       = {10.48550/arXiv.2604.16680},
  url       = {https://arxiv.org/abs/2604.16680}
}