Agriculture
Augmented Reality
Computer Vision
Dialog Navigation
Dialog Systems
- Seamlessly Integrating Factual Information and Social Content with Persuasive Dialogue
- How to Build User Simulators to Train RL-based Dialog Systems
- A Network-based End-to-End Trainable Task-oriented Dialogue System
- Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback
- ACE: A LLM-based Negotiation Coaching System
Guide Dog Robot
Human-Robot Interaction
- Towards Robotic Companions: Understanding Handler-Guide Dog Interactions for Informed Guide Dog Robot Design
- Reimagining RViz: Multidimensional Augmented Reality Robot Signal Design
- DialFRED: Dialogue-Enabled Agents for Embodied Instruction Following
- Descriptive and Prescriptive Visual Guidance to Improve Shared Situational Awareness in Human-Robot Teaming
- Seamlessly Integrating Factual Information and Social Content with Persuasive Dialogue
- Unwinding Rotations Improves User Comfort with Immersive Telepresence Robots
- Outracing champion Gran Turismo drivers with deep reinforcement learning
- Flight, Camera, Action! Using Natural Language and Mixed Reality to Control a Drone
- Explanation Augmented Feedback in Human-in-the-Loop Reinforcement Learning
- Virtual Reality for Robots
- RMM: A Recursive Mental Model for Dialogue Navigation
- Improving Grounded Natural Language Understanding through Human-Robot Dialog
- RoomShift: Room-scale Dynamic Haptics for VR with Furniture-moving Swarm Robots
- That and There: Judging the Intent of Pointing Actions with Robotic Arms
- Communicating Robot Motion Intent with Augmented Reality
- TeleMoMa: A Modular and Versatile Teleoperation System for Mobile Manipulation
- Advancing Humanoid Locomotion: Mastering Challenging Terrains with Denoising World Model Learning
Humanoid Robots
Imitation Learning
Knowledge-based Sequential Decision Making
- Visual Semantic Navigation Using Scene Priors
- Continual Learning of Knowledge Graph Embeddings
- Ethically Compliant Sequential Decision Making
- Semantic Linking Maps for Active Visual Object Search
- Commonsense Reasoning and Knowledge Acquisition to Guide Deep Learning on Robots
- Learning Pipelines with Limited Data and Domain Knowledge
LLM
- BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models
- LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers
- True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning
- Universal and Transferable Adversarial Attacks on Aligned Language Models
- An LLM can Fool Itself: A Prompt-Based Adversarial Attack
- VIMA: General Robot Manipulation with Multimodal Prompts
Learning
- Learned Visual Navigation for Under-Canopy Agricultural Robots
- Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation
- Practice Makes Perfect: Planning to Learn Skill Parameter Policies
- SayCanPay: Heuristic Planning with Large Language Models using Learnable Domain Knowledge
- VIMA: General Robot Manipulation with Multimodal Prompts
- NOIR: Neural Signal Operated Intelligent Robots for Everyday Activities
- Eureka: Human-Level Reward Design via Coding Large Language Models
- Video Language Planning
- Learning to Navigate Sidewalks in Outdoor Environments
- Open X-Embodiment: Robotic Learning Datasets and RT-X Models
- Robot Parkour Learning
- LEAGUE: Guided Skill Learning and Abstraction for Long-Horizon Manipulation
- Language Reward Modulation for Pretraining Reinforcement Learning
- Transforming a Quadruped into a Guide Robot for the Visually Impaired: Formalizing Wayfinding, Interaction Modeling, and Safety Mechanism
- Neural Volumetric Memory for Visual Locomotion Control
- Legs as Manipulator: Pushing Quadrupedal Agility Beyond Locomotion
- Embodied Amodal Recognition: Learning to Move to Perceive Objects
- MimicPlay: Long-Horizon Imitation Learning by Watching Human Play
- Guiding Pretraining in Reinforcement Learning with Large Language Models
- System Configuration and Navigation of a Guide Dog Robot: Toward Animal Guide Dog-Level Guiding Work
- DM2: Decentralized Multi-Agent Reinforcement Learning for Distribution Matching
- Robotic Guide Dog: Leading a Human with Leash-Guided Hybrid Physical Interaction
- Deep Variational Reinforcement Learning for POMDPs
- Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
- Discovering Generalizable Skills via Automated Generation of Diverse Tasks
- A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
- Continual Learning of Knowledge Graph Embeddings
- Learning When to Quit: Meta-Reasoning for Motion Planning
- Temporal-Logic-Based Reward Shaping for Continuing Reinforcement Learning Tasks
- Joint Inference of Reward Machines and Policies for Reinforcement Learning
- Human-like Planning for Reaching in Cluttered Environments
- Simultaneously Learning Transferable Symbols and Language Groundings from Perceptual Data for Instruction Following
- SAIL: Simulation-Informed Active In-the-Wild Learning
- Improving Grounded Natural Language Understanding through Human-Robot Dialog
- Proximal Policy Optimization Algorithms
- Imagination-Augmented Agents for Deep Reinforcement Learning
- Learning from Interventions using Hierarchical Policies for Safety Learning
- Deep Imitation Learning for Autonomous Driving in Generic Urban Scenarios with Enhanced Safety
- Learning to Teach in Cooperative Multiagent Reinforcement Learning
- Using Natural Language for Reward Shaping in Reinforcement Learning
- Agile Autonomous Driving using End-to-End Deep Imitation Learning
- Adversarial Actor-Critic Method for Task and Motion Planning Problems Using Planning Experience
- Learning Pipelines with Limited Data and Domain Knowledge
- Behavioral Cloning from Observation
Learning and Planning
- Using Commonsense Knowledge to Answer Why-Questions
- Learning Multi-Object Dynamics with Compositional Neural Radiance Fields
- Learning and Deploying Robust Locomotion Policies with Minimal Dynamics Randomization
- Deep Whole-Body Control: Learning a Unified Policy for Manipulation and Locomotion
- Asynchronous Actor-Critic for Multi-Agent Reinforcement Learning
- Detect, Understand, Act: A Neuro-Symbolic Hierarchical Reinforcement Learning Framework (Extended Abstract)
- Florence: A New Foundation Model for Computer Vision
- Object Goal Navigation using Goal-Oriented Semantic Exploration
- Learning Feasibility to Imitate Demonstrators with Different Dynamics
- Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines
- Reward Machines for Vision-Based Robotic Manipulation
- Decision Transformer: Reinforcement Learning via Sequence Modeling
- Symbolic Knowledge Distillation: from General Language Models to Commonsense Models
- Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World
- ObjectFolder: A Dataset of Objects with Implicit Visual, Auditory, and Tactile Representations
- Galileo: Perceiving Physical Object Properties by Integrating a Physics Engine with Deep Learning
- Advice-Guided Reinforcement Learning in a non-Markovian Environment
- Spatial Intention Maps for Multi-Agent Mobile Manipulation
- What Does BERT with Vision Look At?
- A formal methods approach to interpretable reinforcement learning for robotic planning
Logical Reasoning
Mobile Robots
NLP
- Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback
- ACE: A LLM-based Negotiation Coaching System
Neurosymbolic
Open-World Generalization
Planning
- Plug in the Safety Chip: Enforcing Constraints for LLM-driven Robot Agents
- Learning to Bridge the Gap: Efficient Novelty Recovery with Planning and Reinforcement Learning
- Practice Makes Perfect: Planning to Learn Skill Parameter Policies
- SayCanPay: Heuristic Planning with Large Language Models using Learnable Domain Knowledge
- Video Language Planning
- Human-like Planning for Reaching in Cluttered Environments
- Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks
- Reasoning About Physical Interactions with Object-Oriented Prediction and Planning
- SAIL: Simulation-Informed Active In-the-Wild Learning
- Adversarial Actor-Critic Method for Task and Motion Planning Problems Using Planning Experience
- Behavioral Cloning from Observation
Quadruped Robot
- Understanding Expectations for a Robotic Guide Dog for Visually Impaired People
- Towards Robotic Companions: Understanding Handler-Guide Dog Interactions for Informed Guide Dog Robot Design
- Practice Makes Perfect: Planning to Learn Skill Parameter Policies
- Learning to See Physical Properties with Active Sensing Motor Policies
Reinforcement Learning
- Leveraging Constraint Violation Signals For Action-Constrained Reinforcement Learning
- FlowPG: Action-constrained Policy Gradient with Normalizing Flows
- True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning
- Learning to See Physical Properties with Active Sensing Motor Policies
Robotic Manipulation
Robotics
- Learned Visual Navigation for Under-Canopy Agricultural Robots
- Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation
Safety
Security
- Can We Trust Embodied Agents? Exploring Backdoor Attacks against Embodied LLM-based Decision-Making Systems
- Characterizing Physical Adversarial Attacks on Robot Motion Planners
- BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models
- Universal and Transferable Adversarial Attacks on Aligned Language Models
- An LLM can Fool Itself: A Prompt-Based Adversarial Attack
State Estimation
Task and Motion Planning
- GraphEQA: Using 3D Semantic Scene Graphs for Real-time Embodied Question Answering
- Open-World Task and Motion Planning via Vision-Language Model Inferred Constraints
- LEAGUE: Guided Skill Learning and Abstraction for Long-Horizon Manipulation
- Code as Policies: Language Model Programs for Embodied Control
- Using Deep Learning to Bootstrap Abstractions for Hierarchical Robot Planning
- Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
- Pre-Trained Language Models for Interactive Decision-Making
- Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
- Online Replanning in Belief Space for Partially Observable Task and Motion Problems
- Elephants Don't Pack Groceries: Robot Task Planning for Low Entropy Belief States
- Planning with Learned Object Importance in Large Problem Instances using Graph Neural Networks
- Learning When to Quit: Meta-Reasoning for Motion Planning
- Hierarchical Planning for Long-Horizon Manipulation with Geometric and Symbolic Scene Graphs
- Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks
VLA
VLM
- SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation
- GraphEQA: Using 3D Semantic Scene Graphs for Real-time Embodied Question Answering
- Open-World Task and Motion Planning via Vision-Language Model Inferred Constraints