We introduce a novel diffusion model incorporating the new concept of negative prompt guidance learning to tackle the task of 6-DoF grasp detection in cluttered point clouds.
We introduce HabiCrowd, a new dataset and benchmark for crowd-aware visual navigation that surpasses other benchmarks in terms of human diversity and computational utilization.
We address the task of language-driven affordance-pose detection in 3D point clouds. Our method simultaneously detects open-vocabulary affordances and generates affordance-specific 6-DoF poses.
Open-Vocabulary Affordance Detection using Knowledge Distillation and Text-Point Correlation
Tuan Vo, Minh Nhat Vu, Baoru Huang, Tien Toan Nguyen, Ngan Le, Thieu Vo, Anh Nguyen
IEEE International Conference on Robotics and Automation (ICRA), 2024
arXiv | Code
We introduce a new open-vocabulary affordance detection method using knowledge distillation and text-point correlation.
We introduce the Language-Driven Scene Synthesis task, which uses human-input text prompts to generate physically plausible and semantically reasonable objects.