Abstract: Light Detection and Ranging (LiDAR) point cloud semantic segmentation is an essential task for autonomous vehicles, enabling precise environment perception for safe navigation. Existing ...
Abstract: This paper introduces GS-Pose, a unified framework for localizing and estimating the 6D pose of novel objects. GS-Pose begins with a set of posed RGB images of a previously unseen object and ...
Previous research has investigated the application of Multimodal Large Language Models (MLLMs) in understanding 3D scenes by interpreting them as videos. These approaches generally depend on ...