Abstract
Autonomous high-fidelity object reconstruction is fundamental for creating digital assets
and bridging the simulation-to-reality gap in robotics. We present ObjSplat, an
active reconstruction framework that leverages Gaussian surfels as a unified representation
to progressively reconstruct unknown objects with both photorealistic appearance and
accurate geometry. Addressing the limitations of conventional opacity- or depth-based cues,
we introduce a geometry-aware viewpoint evaluation pipeline that explicitly models back-face
visibility and occlusion-aware multi-view covisibility, reliably identifying
under-reconstructed regions even on geometrically complex objects. Furthermore, to overcome
the limitations of greedy planning strategies, ObjSplat employs a next-best-path (NBP)
planner that performs multi-step lookahead on a dynamically constructed spatial graph. By
jointly optimizing information gain and movement cost, this planner generates globally
efficient trajectories. Extensive experiments in simulation and on real-world cultural
artifacts demonstrate that ObjSplat produces physically consistent models within minutes,
achieving superior reconstruction fidelity and surface completeness while significantly
reducing scan time and path length compared to state-of-the-art approaches.
System Overview
ObjSplat progressively reconstructs unknown objects from RGB-D frames using Gaussian surfels as a unified representation. (Top left) Incoming frames are fused into the global model, where geometry–texture joint optimization enforces both photometric and geometric consistency. (Right) A geometry-aware view evaluation pipeline renders an uncertainty map by integrating occlusion-aware covisibility, surfel-wise confidence, and back-face detection to quantify surface quality and completeness. (Bottom left) Guided by this dense uncertainty, the next-best-path (NBP) planner performs multi-step lookahead on a spatial topology, generating globally efficient trajectories that balance information gain and movement cost for active reconstruction.
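To make the view-evaluation step concrete, below is a minimal sketch of how the three cues (back-face detection, occlusion-aware covisibility, and surfel confidence) might be fused into a per-pixel uncertainty map. The function name, weights, and covis_target are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def view_uncertainty(normals, view_dir, covis_count, confidence,
                     covis_target=5, w_backface=1.0, w_covis=0.5):
    """Hypothetical per-pixel uncertainty for one candidate view.

    normals:     (H, W, 3) rendered surfel normals in the world frame
    view_dir:    (3,) unit vector from the surface toward the camera
    covis_count: (H, W) number of prior views observing each pixel
                 after occlusion testing
    confidence:  (H, W) per-surfel optimization confidence in [0, 1]
    """
    # Back-face term: a surfel whose normal points away from the camera
    # is being seen from its unobserved side -> high uncertainty.
    cos = np.einsum('hwc,c->hw', normals, view_dir)
    backface = np.clip(-cos, 0.0, 1.0)

    # Covisibility term: pixels seen by too few unoccluded views
    # are under-constrained.
    covis_deficit = np.clip(1.0 - covis_count / covis_target, 0.0, 1.0)

    # Low-confidence surfels are still under-reconstructed.
    uncert = (w_backface * backface
              + w_covis * covis_deficit
              + (1.0 - confidence))
    return uncert  # (H, W); its mean serves as the view's information gain
```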
Simulation Experiments
We conducted extensive experiments on 16 objects of varying geometric and textural complexity
from the Google Scanned Objects (GSO) dataset. The online reconstruction and offline refinement
process (shown at 5× speed) is as follows:
Reconstruction Process
Online Reconstruction: ObjSplat progressively reconstructs unknown objects by
jointly refining geometry and appearance while autonomously selecting informative viewpoints. Guided
by geometry-aware evaluation, it accurately identifies under-reconstructed and incomplete regions
and plans the next viewpoints accordingly.
Offline Refinement: We further refine the model offline using the keyframes collected
during online reconstruction, applying joint geometry–texture optimization (sketched below) to
improve reconstruction quality.
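As referenced above, here is a hedged sketch of what a joint geometry–texture objective could look like, assuming a PyTorch-style renderer that returns RGB, depth, and normal maps. The loss terms and weights (lambda_d, lambda_n) are illustrative, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def joint_loss(render, frame, lambda_d=0.5, lambda_n=0.1):
    """Illustrative per-keyframe loss for geometry-texture optimization.

    render / frame: dicts with 'rgb' (3,H,W), 'depth' (H,W), and
    'normal' (3,H,W) tensors; all names are assumptions.
    """
    # Photometric consistency between the splatted image and the keyframe.
    l_rgb = F.l1_loss(render['rgb'], frame['rgb'])

    # Geometric consistency: depth agreement where the sensor is valid.
    valid = frame['depth'] > 0
    l_depth = F.l1_loss(render['depth'][valid], frame['depth'][valid])

    # Normal agreement encourages smooth, well-oriented surfels.
    l_normal = (1.0 - F.cosine_similarity(
        render['normal'], frame['normal'], dim=0))[valid].mean()

    return l_rgb + lambda_d * l_depth + lambda_n * l_normal
```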
Reconstruction Results
For each object, we present 360° rendered views of the reconstructed model, including RGB (left) and
corresponding depth maps (right).
ObjSplat produces accurate object geometry with clear boundaries while maintaining photorealistic
appearance.
Comparison with Other Methods
Quantitative Evaluation of Reconstruction Quality. We report visual quality (PSNR, SSIM, LPIPS) and geometric accuracy (Depth L1, Chamfer Distance, F-Score) on both training and novel test views.
Quantitative Analysis of Reconstruction Completeness and Efficiency. We report the completion ratio (CR), completion (CE), and efficiency metrics, i.e., movement cost (MC) and online/offline time, at three distinct reconstruction phases.
Quantitative comparison of reconstruction progress over the number of views (top row) and path length (bottom row). We report novel view PSNR, Chamfer Distance (CD), F1-Score, and Completion Ratio across 16 objects. Shaded areas indicate standard deviation.
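For readers reproducing the geometric metrics, the following is a standard way to compute Chamfer Distance and F-Score from sampled point clouds; the threshold tau and its units are assumptions, and the paper's exact evaluation protocol may differ.

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_and_fscore(pred_pts, gt_pts, tau=0.01):
    """Chamfer Distance and F-Score between (N,3) and (M,3) point sets.

    tau is the distance threshold for precision/recall (here 1 cm,
    assuming metric units); the paper's threshold may differ.
    """
    d_pred_to_gt = cKDTree(gt_pts).query(pred_pts)[0]   # accuracy term
    d_gt_to_pred = cKDTree(pred_pts).query(gt_pts)[0]   # completeness term

    chamfer = d_pred_to_gt.mean() + d_gt_to_pred.mean()
    precision = (d_pred_to_gt < tau).mean()
    recall = (d_gt_to_pred < tau).mean()
    fscore = 2 * precision * recall / max(precision + recall, 1e-8)
    return chamfer, fscore
```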
Visual comparison of reconstruction completeness and
exploration efficiency on the Mario object. Top
row: online Gaussian models with
camera trajectories and frustums. Metrics denote Surface Coverage (↑) | Path Length (↓). Middle
row: the uncertainty map at the final selected viewpoint. Bottom row: final meshes
extracted via TSDF fusion with zoomed-in details. Metrics denote Chamfer Distance (↓) | F-Score (↑).
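The caption above mentions meshes extracted via TSDF fusion; a minimal Open3D sketch of that step follows, assuming posed RGB-D keyframes rendered from the final Gaussian model. The intrinsics, voxel size, and the keyframes container are placeholders, not the system's actual configuration.

```python
import open3d as o3d

# Placeholder calibration; replace with the real camera intrinsics.
intrinsic = o3d.camera.PinholeCameraIntrinsic(640, 480, 600.0, 600.0, 320.0, 240.0)

volume = o3d.pipelines.integration.ScalableTSDFVolume(
    voxel_length=0.002,   # 2 mm voxels, assuming small tabletop objects
    sdf_trunc=0.01,       # truncation distance
    color_type=o3d.pipelines.integration.TSDFVolumeColorType.RGB8)

# Fill with (color: o3d.geometry.Image, depth: o3d.geometry.Image,
# extrinsic: 4x4 world-to-camera matrix) tuples from the keyframe set.
keyframes = []

for color, depth, extrinsic in keyframes:
    rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
        color, depth, depth_trunc=1.0, convert_rgb_to_intensity=False)
    volume.integrate(rgbd, intrinsic, extrinsic)

mesh = volume.extract_triangle_mesh()
mesh.compute_vertex_normals()
```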
Real-World Experiments
Reconstruction Process
To validate the effectiveness and practical utility of ObjSplat in
real-world scenarios, we deployed the system on a robotic arm and turntable platform. We show four
snapshots of the NBP planning execution. In each
step, the robot executes a path to a sub-goal with high estimated uncertainty (visualized in the
inset). The final column shows the complete trajectory, total path length, and the final PSNR (dB)
of the training views.
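The NBP execution described above can be illustrated with a toy bounded-horizon search over a spatial graph, trading accumulated information gain against travel cost. The graph and gain structures, horizon, and lam weight are illustrative assumptions rather than the paper's planner.

```python
def next_best_path(graph, gains, start, horizon=3, lam=0.2):
    """Bounded-depth search for the path maximizing gain minus travel cost.

    graph: dict node -> {neighbor: edge_cost}  (the spatial topology)
    gains: dict node -> estimated information gain at that viewpoint
    All structures are stand-ins for the paper's dynamically built graph.
    """
    best_path, best_util = [start], float('-inf')

    def dfs(node, path, gain, cost, depth):
        nonlocal best_path, best_util
        util = gain - lam * cost
        if len(path) > 1 and util > best_util:
            best_path, best_util = list(path), util
        if depth == 0:
            return
        for nbr, c in graph[node].items():
            if nbr in path:            # simple cycle avoidance
                continue
            path.append(nbr)
            dfs(nbr, path, gain + gains[nbr], cost + c, depth - 1)
            path.pop()

    dfs(start, [start], 0.0, 0.0, horizon)
    return best_path
```

In practice, the first node of the returned path would become the next sub-goal, and the search repeats as the uncertainty map updates.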
Reconstruction Results
We display the rendered images from the final Gaussian surfels (left) and the extracted surface
meshes (right). Our method recovers high-fidelity texture and geometry even for objects with
complex topology and surface patterns.
We present 360° rendered views of the reconstructed Gaussian model, including RGB (left) and
corresponding depth maps (right).
ObjSplat recovers accurate object geometry and fine-grained details, such as the skin texture of the
Sika Deer, the intricate geometric carvings on the Fu Hao Owl Zun, and the rich patterns on the Pottery Figure.
The high-fidelity geometry and appearance reconstructed by ObjSplat allow reliable mesh extraction
for downstream applications. We visualize 360° rendered views of the extracted mesh model, including
RGB (left) and corresponding normal maps (right).
BibTeX
To be added soon.