Abstract: During the deep learning era, innovations in remote sensing (RS) vision models primarily focused on optimizing network architectures for specific tasks and conducting end-to-end training.
Abstract: For long-horizon multi-task robotic manipulation, hierarchical approaches provide an effective way to combine high-level language-based task planning with low-level vision-language based sub ...