UMass Research Infrastructure:
Sensorimotor Development in Humans and Machines

Roderic Grupen
Andrew Barto
Carole Beal
Neil Berthier
Paul Cohen
Andrew Fagg
Rachel Keen

Departments of Computer Science and Psychology
University of Massachusetts Amherst


Grasping as a Haptic Control Problem

One class of controllers that we have been developing is aimed at the formation of stable grasps. Rather than starting with a detailed model of the object (e.g., one derived from a vision system), our approach begins by haptically exploring the object to be grasped. At each contact with the object, the controller estimates the total force and torque applied to the object by the set of contacts. Given a simple model of the local object geometry, the controller computes movements of the fingers and arm that attempt to reduce the total force and torque. The power of this approach to grasp formation is that the controller can be assigned a variety of different physical resources, including fingertips, palms, multiple hands, and even "virtual contacts," such as gravity.
  • Platt, Jr., R., Fagg, A. H., Grupen, R. A. (2004), Manipulation Gaits: Sequences of Grasp Control Tasks, to appear in the Proceedings of the International Conference on Robotics and Automation (ICRA'04)

  • Platt, Jr., R., Fagg, A. H., Grupen, R. A. (2003), Extending Fingertip Grasping to Whole Body Grasping, Proceedings of the International Conference on Robotics and Automation (ICRA'03), pp. 2677-2682

  • Platt, Jr., R., Fagg, A. H., Grupen, R. A. (2002), Nullspace Composition of Control Laws for Grasping, Proceedings of the International Conference on Intelligent Robots and Systems (IROS'02), electronically published
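The idea of moving contacts so as to reduce the net force and torque can be illustrated with a minimal planar sketch. It is not the controller from the papers above. It assumes unit-magnitude, frictionless contact forces along the inward surface normal, a circle centred at the origin as the "simple model of the local object geometry," and plain finite-difference gradient descent on the squared wrench residual. The names `net_wrench`, `wrench_error`, `circle_contacts`, and `grasp_descent` are hypothetical.

```python
import math

def net_wrench(contacts):
    """Sum unit contact forces; each contact is ((px, py), (nx, ny))."""
    fx = sum(n[0] for _, n in contacts)
    fy = sum(n[1] for _, n in contacts)
    # The planar cross product p x n is the scalar torque about the origin.
    torque = sum(p[0] * n[1] - p[1] * n[0] for p, n in contacts)
    return fx, fy, torque

def wrench_error(contacts):
    """Squared magnitude of the net force plus squared net torque."""
    fx, fy, torque = net_wrench(contacts)
    return fx * fx + fy * fy + torque * torque

def circle_contacts(angles, radius=1.0):
    """Assumed local geometry: contacts on a circle, forces along inward normals."""
    return [((radius * math.cos(a), radius * math.sin(a)),
             (-math.cos(a), -math.sin(a))) for a in angles]

def grasp_descent(angles, step=0.1, iters=200, h=1e-5):
    """Slide contacts along the surface, down the gradient of the wrench error."""
    angles = list(angles)
    for _ in range(iters):
        grad = []
        for i in range(len(angles)):
            up = angles[:i] + [angles[i] + h] + angles[i + 1:]
            down = angles[:i] + [angles[i] - h] + angles[i + 1:]
            # Central finite difference of the error with respect to contact i.
            grad.append((wrench_error(circle_contacts(up))
                         - wrench_error(circle_contacts(down))) / (2 * h))
        angles = [a - step * g for a, g in zip(angles, grad)]
    return angles

# Two fingers that start close together drift toward opposite sides of the circle.
final_angles = grasp_descent([0.3, 1.0])
```

On a circle every normal passes through the centre, so the torque term is identically zero and only the force residual drives the motion. The torque term matters for other geometries: two opposing forces on offset lines of action cancel in force but leave a net torque, which `wrench_error` penalizes.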

Movies

Grasping Cylinders: Top Approach
grasp_top.mov, grasp_top.mp4, grasp_top.avi

Grasping Cylinders: Side Approach
2 fingers form a virtual finger
grasp_peanutbutter.mov, grasp_peanutbutter.mp4, grasp_peanutbutter.avi

Whole Body Grasping
wbg_move_left.mov, wbg_move_left.mp4, wbg_move_left.avi

Learning Grasp Location Affordances
grasp_afford.mov, grasp_afford.mp4, grasp_afford.avi


Learning to Prospectively Select Grasps

The problem of grasping an object and moving it to another location has long been studied in robotics. One approach to this problem is to explicitly compute pick-and-place constraints and to perform a search within the constrained space. In contrast, humans are capable of robustly planning and executing grasps of objects about which their knowledge is incomplete. Furthermore, it appears that grasping strategies in humans are acquired incrementally as a function of experience with different objects.

Inspired by our infant development work, we have applied a reinforcement learning technique to the problem of discovering an appropriate sequence of grasp and place actions. Rather than starting with a model of which grasp was appropriate for a given final object configuration, the robot learned through interaction with the environment to select a grip in anticipation of how the grasped object was to be used in future actions. The behavior that the robot exhibited during learning was qualitatively similar to the development of grip selection seen in children performing a similar task.
  • McCarty, M. E., Clifton, R. K., Collard, R. R. (1999), Problem solving in infancy: The emergence of an action plan, Developmental Psychology, 35(4):1091-1101

  • McCarty, M. E., Clifton, R. K., Collard, R. R. (2001), The beginnings of tool use by infants and toddlers, Infancy, 2:233-256

  • Wheeler, D. S., Fagg, A. H., Grupen, R. A. (2002), Learning Prospective Pick and Place Behavior, Proceedings of the International Conference on Development and Learning (ICDL'02), electronically published

Staged Learning

The robot is presented with a jar in one of two orientations. The task is to place the jar vertically with the top facing upwards. Through the course of interacting with the jar, the robot must discover 1) the sequence of actions that will accomplish the task and 2) the visual features that will allow the robot to select the appropriate sequence for a given situation. The only feedback the robot receives is a signal when the task has been completed properly.
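The structure of this learning problem can be sketched with tabular Q-learning. This is an illustrative toy, not the system used on the robot. It assumes the visual input reduces to a binary orientation label, two hypothetical grasp actions (`reach_left`, `reach_right`), an arbitrary mapping from orientation to the grasp that permits an upright placement, and a reward given only when the final placement succeeds.

```python
import random
from collections import defaultdict

GRASPS = ("reach_left", "reach_right")
# Assumed for illustration only: which grasp allows an upright placement.
GOOD_GRASP = {0: "reach_left", 1: "reach_right"}

def actions(state):
    return GRASPS if state[0] == "presented" else ("place",)

def step(state, action):
    """Return (next_state, reward, done); reward arrives only on success."""
    if state[0] == "presented":
        return ("held", state[1], action), 0.0, False
    _, orientation, grasp = state
    return None, (1.0 if grasp == GOOD_GRASP[orientation] else 0.0), True

def greedy(q, state, rng=None):
    """Highest-valued action; ties are broken randomly when rng is given."""
    best = max(q[(state, a)] for a in actions(state))
    ties = [a for a in actions(state) if q[(state, a)] == best]
    return rng.choice(ties) if rng else ties[0]

def run_trial(q, orientation, rng, alpha=0.5, gamma=0.9, epsilon=0.2):
    state, done = ("presented", orientation), False
    while not done:
        if rng.random() < epsilon:
            action = rng.choice(actions(state))   # explore
        else:
            action = greedy(q, state, rng)        # exploit
        next_state, reward, done = step(state, action)
        future = 0.0 if done else max(q[(next_state, a)]
                                      for a in actions(next_state))
        # One-step Q-learning update; the delayed reward propagates back
        # to the grasp choice through the discounted future value.
        q[(state, action)] += alpha * (reward + gamma * future
                                       - q[(state, action)])
        state = next_state
    return reward

def staged_learning(seed=0, easy_trials=30, hard_trials=300):
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(easy_trials):      # stage 1: a single orientation
        run_trial(q, 0, rng)
    for _ in range(hard_trials):      # stage 2: random orientation
        run_trial(q, rng.randrange(2), rng)
    return q

q_table = staged_learning()
```

Because the reward is delivered only after the placement, the value of a grasp is learned from the value of the state it leads to, which is what makes the grip selection prospective. The table here conditions on orientation from the start, so it does not reproduce the orientation-independent "reflex" that the robot develops in the first stage.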

The following movies show individual trials during the learning process. In the first few trials, the robot is presented with the jar in only one orientation; this leads to the development of a "reflexive" response of reaching with the left arm (independent of the visual inputs). In the remaining trials, the jar is oriented randomly, requiring the robot to integrate the visual inputs into its decision-making process.

Left Presentation Only: Prior to Learning
applesauce3-easy-other-final.mov, applesauce3-easy-other-final.mp4, applesauce3-easy-other-final.avi

Left Presentation Only: Strategy After Learning
applesauce1-easy-final.mov, applesauce1-easy-final.mp4, applesauce1-easy-final.avi

Both Orientations Presented: Before Learning
applesauce4-hard-late-final.mov, applesauce4-hard-late-final.mp4, applesauce4-hard-late-final.avi

Both Orientations Presented: During Learning
applesauce8-hard-early-final.mov, applesauce8-hard-early-final.mp4, applesauce8-hard-early-final.avi

Both Orientations Presented: After Learning
applesauce7-hard-optimal-final.mov, applesauce7-hard-optimal-final.mp4, applesauce7-hard-optimal-final.avi


Learning Task Sequences from Demonstration

The remote teleoperation of robots is one of the dominant modes of robot control in applications involving hazardous environments, including space. Here, a user is equipped with an interface that conveys the sensory information being collected by the robot and allows the user to command the robot's actions. The main difficulty with this form of interface is user fatigue, which often sets in within a short period of time. To alleviate this problem, we are working with our colleagues at the NASA Johnson Space Center to develop user interfaces that anticipate the actions of the user, allowing the robot to perform parts of the task on the user's behalf, or even to learn how to perform entire tasks autonomously.

Our approach is to use our automatic control techniques to aid in the recognition of the user's actions. Prior to the user demonstration, the control system enumerates the different grasping actions that can be used for each object in the workspace (essentially, the robot "imagines" what it would feel like to pick up every object). The movements produced by the user are then compared against each of these imagined actions. The one action that best matches the user-driven movement is considered to be the explanation of that movement. Using this technique, we are able to recognize entire sequences of actions.
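The matching step can be sketched as follows. This is a simplified illustration, not the recognizer used with the robot. It assumes planar hand paths, a straight-line reach as a stand-in for each "imagined" controller rollout, a mean point-to-point distance as the match score, and a demonstration that has already been segmented into individual movements. All function names and object labels are hypothetical.

```python
import math

def imagined_reach(start, goal, n=20):
    """Hypothetical stand-in for a controller rollout: a straight-line reach."""
    return [(start[0] + (goal[0] - start[0]) * k / (n - 1),
             start[1] + (goal[1] - start[1]) * k / (n - 1)) for k in range(n)]

def resample(path, n=20):
    """Linearly interpolate a path (two or more points) to n points."""
    out = []
    for k in range(n):
        x = k * (len(path) - 1) / (n - 1)
        lo = min(int(x), len(path) - 2)
        w = x - lo
        out.append(((1 - w) * path[lo][0] + w * path[lo + 1][0],
                    (1 - w) * path[lo][1] + w * path[lo + 1][1]))
    return out

def mismatch(observed, imagined):
    """Mean distance between corresponding points of two equal-length paths."""
    return sum(math.dist(a, b) for a, b in zip(observed, imagined)) / len(observed)

def recognize(observed, objects, n=20):
    """Return the object whose imagined reach best explains the observed path."""
    path = resample(observed, n)
    scores = {name: mismatch(path, imagined_reach(path[0], goal, n))
              for name, goal in objects.items()}
    return min(scores, key=scores.get), scores

def recognize_sequence(segments, objects):
    """Label each pre-segmented movement with its best-matching object."""
    return [recognize(segment, objects)[0] for segment in segments]

# Hypothetical workspace and a two-movement demonstration.
objects = {"blue_ball": (1.0, 0.0), "yellow_ball": (0.0, 1.0)}
demo = [[(0.0, 0.0), (0.3, 0.05), (0.6, 0.02), (0.95, 0.0)],
        [(1.0, 0.0), (0.7, 0.4), (0.3, 0.8), (0.05, 1.0)]]
labels = recognize_sequence(demo, objects)
```

The essential design choice is that recognition reuses the same models that would generate the robot's own actions: each candidate explanation is produced by the control system, and the user's movement is scored against it rather than against a separately trained classifier.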

Movies

Demonstration of a sequence by a user through a teleoperation interface. In this example, the extracted sequence is: pick up the blue ball, place it on the pink target, pick up the yellow ball, and place it on the orange target.
sequence_learn_v2_demo.mov, sequence_learn_v2_demo.mp4, sequence_learn_v2_demo.avi, sequence_learn_v2_demo_small.avi

Automated replay of the same action sequence in a novel situation. Note that the movements are smoother and are executed more quickly than when the user is in control.
sequence_learn_v2_D.mov, sequence_learn_v2_D.mp4, sequence_learn_v2_D.avi, sequence_learn_v2_D_small.avi


Last modified: Fri Mar 26 10:51:00 2004