The Curious Robot: Generalized Affordance Reasoning with Physics-based and LLM-enhanced Imagination

1Department of Mechanical Engineering, National University of Singapore, Singapore
2Department of Mechanical Engineering, University of Delaware, USA

*Indicates Equal Contribution

Video Presentation

Abstract

This paper presents an automatic affordance reasoning paradigm that enables robots to determine whether an object possesses a requested affordance, requiring only the functionality name as input. By leveraging Large Language Models (LLMs) for high-level reasoning and physics-based simulations for physical validation, the proposed approach emulates human-like cognitive processes to analyze and evaluate novel affordances. This framework translates functionality names into interaction-based definitions, simulates virtual scenarios, and predicts the interaction that the object affords. Extensive evaluations on synthetic and real datasets demonstrate that our method surpasses learning-based methods in classifying functionality and predicting functional poses across novel object classes. Real-world robotic experiments further validate its effectiveness in conceptualizing novel affordances and interfacing with unfamiliar objects in everyday scenarios.

Framework


Given an unseen object model in a random pose and a novel affordance request, the algorithm first recovers the object's stable poses via stable pose imagination. The Imagination Analyzer then interprets the requested affordance and generates an imagination profile. Looping over all stable poses, the algorithm simulates the imagination profile with the object, and the Imagination Evaluator determines whether the object possesses the requested affordance and, if so, identifies the functional pose and feasible agent trajectories. If the object is functional, the robot manipulates a real agent following the imagined trajectory.
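The pipeline above can be sketched as a simple loop over imagined stable poses. This is a minimal illustrative sketch, not the authors' implementation: every class, function, and value below (`ImaginationProfile`, `stable_pose_imagination`, `analyze_affordance`, `simulate`, the score threshold) is a hypothetical placeholder standing in for the LLM modules and the physics engine.

```python
from dataclasses import dataclass

# NOTE: all names and return values here are illustrative placeholders,
# not the paper's actual API. Real versions would call an LLM and a
# physics simulator.

@dataclass
class ImaginationProfile:
    """Interaction-based definition of an affordance."""
    affordance: str
    agent: str           # virtual agent placed in the simulated scenario
    success_metric: str  # quantity the evaluator checks after simulation

def stable_pose_imagination(object_model):
    # Placeholder: a physics engine would drop the object repeatedly
    # and cluster its resting poses.
    return ["pose_0", "pose_1"]

def analyze_affordance(affordance_name):
    # Placeholder for the LLM-based Imagination Analyzer, which turns a
    # functionality name into an interaction-based definition.
    return ImaginationProfile(
        affordance=affordance_name,
        agent="virtual_spheres",
        success_metric="containment_ratio",
    )

def simulate(object_model, pose, profile):
    # Placeholder physics rollout; returns (score, agent_trajectory).
    score = 1.0 if pose == "pose_0" else 0.0
    return score, [pose, profile.agent]

def affordance_reasoning(object_model, affordance_name, threshold=0.5):
    """Loop over stable poses; return (is_functional, pose, trajectory)."""
    profile = analyze_affordance(affordance_name)
    for pose in stable_pose_imagination(object_model):
        score, trajectory = simulate(object_model, pose, profile)
        if score >= threshold:  # Imagination Evaluator decision
            return True, pose, trajectory
    return False, None, None
```

If the returned flag is true, the functional pose and the imagined agent trajectory would then be handed to the robot for real-world execution.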

Example Output from LLM Modules

Experiments on Real Objects
