This chapter highlights the gap between human and robot manipulation skills and the societal need for adaptable, service-oriented robots. It discusses how demographic changes and the rise of collaborative robotics drive interest in more capable machines. Drawing on neuroscience and biomimicry, the chapter introduces a framework for contact-driven manipulation that integrates action-phase controllers, contact event detection, sensor fusion, and adaptive control. The approach is validated through experiments in unstructured scenarios and sets the stage for the thesis, which covers topics from human-inspired control to system integration.
Chapter 2: Human sensorimotor control of manipulation
This chapter reviews neuroscience studies of human grasping and manipulation, providing the biological inspiration for the contact-based robotic manipulation framework developed in the thesis. It details experimental methodologies, such as grasping of instrumented objects, and analyzes how humans adapt grip force in response to unexpected changes in friction, weight, and shape. Key findings include the organization of manipulation into action-phase controllers, the central role of contact events, and the reliance on sensor fusion, contact event prediction, and corrective actions. These elements are mapped to the building blocks of the robotic framework, motivating reactive, sensor-driven control strategies that mimic human adaptability and robustness in unstructured environments.
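To make the mapping concrete, the sketch below illustrates the action-phase controller idea in code: a feedforward command runs while the controller waits for the sensory event predicted to terminate the phase, and a missing or wrong event triggers a corrective action. This is a minimal illustration of the concept, not the thesis implementation; all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ContactEvent:
    kind: str          # e.g. "finger-object" or "object-support"
    timestamp: float   # seconds since phase start

class ActionPhaseController:
    """One phase of a manipulation task: a feedforward command plus the
    contact event predicted to terminate the phase (hypothetical API)."""

    def __init__(self, name: str, command: Callable[[], None],
                 expected_event: str, timeout: float):
        self.name = name
        self.command = command            # feedforward motor command
        self.expected_event = expected_event
        self.timeout = timeout            # latest predicted event arrival

    def run(self, wait_for_event: Callable[[float], Optional[ContactEvent]]) -> bool:
        """Execute the phase; True if the terminating event was confirmed."""
        self.command()
        event = wait_for_event(self.timeout)
        if event is None or event.kind != self.expected_event:
            self.correct(event)           # mismatch triggers a correction
            return False
        return True

    def correct(self, observed: Optional[ContactEvent]) -> None:
        # Placeholder corrective action, e.g. probing for the missed contact.
        print(f"{self.name}: expected '{self.expected_event}', observed {observed}")
```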
Preliminary experiments on human corrective movements
Building on these neuroscience insights, this chapter introduces the manipulation primitives paradigm: a vocabulary of atomic, sensor-based actions that can be composed to perform complex manipulation tasks. Each primitive is a parametrizable, reactive controller (e.g., grasp, transport, place, release, slide, explore, unscrew) designed for embodiment independence. The chapter details the implementation of these primitives, their control and sensor requirements, and their integration into higher-level tasks using a graph-based task description. Experimental validation demonstrates the robustness and adaptability of sensor-based primitives, particularly in uncertain or unstructured scenarios, and highlights the benefits of reactive corrections inspired by human behavior.
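As a rough illustration of the paradigm, the following sketch models a primitive as a parametrizable controller that returns a symbolic outcome, and a task as a graph mapping (primitive, outcome) pairs to successor primitives. The robot interface and primitive names are hypothetical, not the thesis API.

```python
class Primitive:
    """A parametrizable, reactive controller with a symbolic outcome."""

    def __init__(self, name, **params):
        self.name, self.params = name, params

    def execute(self, robot) -> str:
        # A concrete primitive closes its sensor-based control loop here and
        # returns an outcome such as "done", "contact", or "failed".
        raise NotImplementedError

class Grasp(Primitive):
    def execute(self, robot) -> str:
        # `robot` is a hypothetical hardware interface.
        robot.close_hand(force=self.params.get("force", 5.0))
        return "done" if robot.object_in_hand() else "failed"

def run_task(start, transitions, robot):
    """Walk the task graph: transitions[(name, outcome)] -> next primitive,
    stopping when no transition is defined."""
    current = start
    while current is not None:
        outcome = current.execute(robot)
        current = transitions.get((current.name, outcome))

# Example graph: transport after a successful grasp, retry after a failed one
# (a real graph would bound the retries):
#   run_task(grasp, {("grasp", "done"): transport,
#                    ("grasp", "failed"): grasp}, robot)
```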
Robust grasping primitive
Empty the box: Contact-based blind grasping of unknown objects (2012 version)
Empty the box: Contact-based blind grasping of unknown objects (2013 version)
Dual arm sensor-based controller for the cap unscrewing task
This chapter presents a sensor fusion framework for contact detection and localization, addressing the challenge of integrating heterogeneous sensory data (tactile, force-torque, proprioception, vision, simulation) into a unified hypothesis space. The framework generates, fuses, and condenses contact hypotheses using probabilistic methods, enabling robust detection of contact events even in the presence of sensor noise or partial observability. The system supports regular, support, and null hypotheses, allowing the integration of predictions and contextual information. Experimental results on multiple robotic platforms validate the approach, showing improved detection rates and localization accuracy through multi-modal fusion.
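The sketch below shows one way the fuse-and-condense step could look, assuming each hypothesis is a 3-D contact point with an isotropic variance and that nearby hypotheses are merged by a precision-weighted (inverse-variance) average; the actual probabilistic machinery in the chapter may differ.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class ContactHypothesis:
    position: np.ndarray    # (3,) estimated contact point, metres
    variance: float         # isotropic position uncertainty
    kind: str = "regular"   # "regular", "support", or "null"

def fuse(hypotheses, radius=0.03):
    """Greedily group hypotheses closer than `radius` and condense each
    group with a precision-weighted (inverse-variance) average."""
    fused, pool = [], list(hypotheses)
    while pool:
        seed, rest, group = pool[0], [], [pool[0]]
        for h in pool[1:]:
            if np.linalg.norm(h.position - seed.position) < radius:
                group.append(h)
            else:
                rest.append(h)
        pool = rest
        precisions = np.array([1.0 / h.variance for h in group])
        positions = np.stack([h.position for h in group])
        mean = precisions @ positions / precisions.sum()
        fused.append(ContactHypothesis(mean, 1.0 / precisions.sum(), seed.kind))
    return fused
```

Merging by inverse-variance weighting means that a precise tactile hypothesis dominates a vague vision-based one at the same location, while the fused variance shrinks as independent sensors agree.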
Multi-sensor and prediction fusion for contact detection and localization
Contact hypothesis generation through robot and object motion from point clouds
This chapter discusses the implementation of a contact event prediction engine using dynamic simulation. The simulator (OpenRAVE with ODE physics) models the robot, environment, and sensor feedback to predict when and where contact events should occur during manipulation. The chapter details the selection and integration of simulation components, the challenges of achieving real-time, high-fidelity predictions, and the validation of simulation accuracy through comparative experiments with real robot executions. The simulation is used both as a surrogate for hardware and as a prediction engine to trigger corrective actions when mismatches between predicted and actual contact events are detected.
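A minimal sketch of the mismatch-detection idea follows: predicted contact events are paired with observed ones of the same kind within a time tolerance, and unmatched predictions flag the need for a corrective action. The event representation and tolerance are assumptions; the real engine obtains its predictions from the OpenRAVE/ODE simulation.

```python
from dataclasses import dataclass

@dataclass
class Event:
    kind: str         # e.g. "finger-object"
    timestamp: float  # seconds since task start

def find_mismatches(predicted, observed, dt_tol=0.15):
    """Pair each predicted event with an observed event of the same kind
    within `dt_tol` seconds; return the unmatched predictions."""
    unmatched, remaining = [], list(observed)
    for pred in predicted:
        match = next((obs for obs in remaining
                      if obs.kind == pred.kind
                      and abs(obs.timestamp - pred.timestamp) <= dt_tol), None)
        if match is not None:
            remaining.remove(match)
        else:
            unmatched.append(pred)   # expected contact never observed
    return unmatched

# Any unmatched prediction would trigger a corrective action in the framework:
#   for miss in find_mismatches(sim_events, sensed_events): correct(miss)
```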
Simulation of robot dynamics for grasping and manipulation tasks
This chapter describes a modular, four-layer software architecture that supports the integration of services, primitives, and tasks in both real and simulated environments. The architecture uses ROS for communication and enforces a clear separation between hardware interfaces, low-level services, mid-level primitives, and high-level tasks. Built-in pipelines for visual perception, contact detection, and grasp planning are provided, along with tools for automatic code generation and configuration. The architecture enables portability across platforms and supports complex, multi-step manipulation tasks through the composition of primitives and perceptual actions.
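As an illustration of the layer separation, the sketch below shows a mid-level primitive node that reaches hardware only through a low-level ROS service. The node and service names are hypothetical; the std_srvs/Trigger service type is a standard ROS type chosen here for simplicity.

```python
import rospy
from std_srvs.srv import Trigger

def close_gripper() -> bool:
    """Primitive-layer call into the service layer; never touches hardware."""
    rospy.wait_for_service("/services/gripper/close")          # hypothetical name
    close = rospy.ServiceProxy("/services/gripper/close", Trigger)
    return close().success

if __name__ == "__main__":
    rospy.init_node("grasp_primitive")                         # mid-level node
    if not close_gripper():
        rospy.logwarn("gripper close failed; primitive reports failure")
```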
Amazon Picking Challenge 2015: UJI Team shelf exploration
To enable hardware-independent task execution, this chapter introduces an abstraction mechanism that decouples task descriptions from specific robot embodiments. Abstract primitives and events are defined, and a translation process maps these to embodiment-specific controllers and sensor events via a factory pattern. The approach allows the same high-level task (e.g., pick-and-place) to be executed on different robots with varying kinematics and sensor capabilities, as demonstrated experimentally. The abstraction framework supports knowledge transfer, plan sharing, and the integration of learning-based or simulation-based skill acquisition.
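A minimal sketch of the factory idea, with hypothetical embodiment and primitive names: registering embodiment-specific controllers under abstract primitive names lets the same task step instantiate a different controller on each robot.

```python
class PrimitiveFactory:
    """Maps (embodiment, abstract primitive name) to a concrete controller."""
    registry = {}

    @classmethod
    def register(cls, embodiment, name):
        def wrap(controller_cls):
            cls.registry[(embodiment, name)] = controller_cls
            return controller_cls
        return wrap

    @classmethod
    def make(cls, embodiment, name, **params):
        return cls.registry[(embodiment, name)](**params)

@PrimitiveFactory.register("parallel_gripper", "grasp")
class ParallelJawGrasp:
    def __init__(self, **params):
        self.params = params

@PrimitiveFactory.register("three_finger_hand", "grasp")
class ThreeFingerGrasp:
    def __init__(self, **params):
        self.params = params

# The same abstract task step "grasp" yields a different controller per robot:
g1 = PrimitiveFactory.make("parallel_gripper", "grasp", force=10.0)
g2 = PrimitiveFactory.make("three_finger_hand", "grasp", preshape="spherical")
```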
Embodiment independent manipulation through action abstraction
Inspired by the ventral visual stream in primates, this chapter presents a hierarchical object recognition system for robots. The system performs shape classification (using features such as contour curvedness) and object recognition (using geometric, color, and weight features), and supports incremental learning. The approach integrates dorsal-stream information (e.g., affordances, SOS/AOS activations) to improve recognition reliability. Experiments validate the system's ability to classify and recognize objects in a biologically plausible, confidence-graded manner and highlight the benefits of multi-modal integration and continuous learning.
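To illustrate the confidence-graded, two-stage structure, the sketch below classifies a coarse shape from a single curvedness feature and then searches for a specific object only within that shape class, returning a confidence with every answer. The features, thresholds, and memory layout are illustrative assumptions, not the chapter's implementation.

```python
import numpy as np

def classify_shape(curvedness: float):
    """Coarse stage: contour curvedness separates box-like from round shapes
    (the threshold is an illustrative assumption)."""
    if curvedness < 0.2:
        return "box-like", 1.0 - curvedness
    return "round", min(curvedness, 1.0)

def recognize(features, memory, shape, min_conf=0.6):
    """Fine stage: nearest stored exemplar within the coarse shape class;
    returns (name, confidence), with name=None below `min_conf`."""
    candidates = {k: v for k, v in memory.items() if v["shape"] == shape}
    if not candidates:
        return None, 0.0
    name, entry = min(candidates.items(),
                      key=lambda kv: np.linalg.norm(features - kv[1]["features"]))
    conf = float(np.exp(-np.linalg.norm(features - entry["features"])))
    return (name, conf) if conf >= min_conf else (None, conf)

# Memory entries mix geometric, color, and weight features, e.g.:
#   memory["mug"] = {"shape": "round", "features": np.array([0.4, 0.8, 0.35])}
```

Gating the fine stage on the coarse result keeps the answer confidence-graded: when recognition falls below threshold, the system can still act on the coarse shape class instead of failing outright.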
Mind the gap - robotic grasping under incomplete observation
The thesis synthesizes neuroscience-inspired principles, sensor-based control, sensor fusion, simulation-based prediction, and software architecture into a comprehensive framework for robust, adaptive robotic manipulation in unstructured environments. Key contributions include the manipulation primitives paradigm, multi-modal contact detection, simulation-based prediction, hardware abstraction, and biologically inspired object recognition. The chapter outlines open questions and future directions, such as incorporating learning at multiple levels, improving sensor fusion and prediction, automating high-level task planning, and further validating the system in complex, real-world scenarios.