Categorization and Abstraction in Sensorimotor Schema


The supervisory policy illustrated in Figure 1 is a procedural knowledge structure describing the characteristic behavior of agent resources engaged in certain tasks in a variety of contexts. It is also a policy for interacting with a variety of possibly unknown plants. The term ``plant'' refers to the dynamical system formed by a particular combination of controllable resources and environmental constraints.

Figure 2 presents a closed-loop control scheme used to form stable grasps on the basis of instantaneous contact information. The plant consists of a -contact grasp configuration on an unknown object geometry. Grasping is formulated as a robust control problem in which variations in object geometry and contact friction properties are modeled as disturbances. The grasp controller is designed to be robust to these variations in the plant.

The grasp controller is really a combination of two independent controllers (see [105][106][103][104] for the definitions) that yield statically stable contact configurations for and contact grasps on any convex, regular polygon. Moreover, the history of errors in the two controllers reveals some of the parameters of the plant and can be used to generalize the grasping behavior to other objects. A model of the interaction between the two controllers, called the preimage, can be constructed by plotting the system trajectory on the error plane as a grasp unfolds. Furthermore, grasp quality can be backed up from equilibrium grasp configurations into precursor states and used to predict the eventual quality of a grasp on an unknown object.
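The backing-up step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the error plane is discretized into square cells, the quality measure is treated as a cost (larger is worse), and the names `to_cell`, `back_up_quality`, and the sample trajectories are hypothetical.

```python
import math

def to_cell(errors, cell_size):
    """Map a continuous pair of controller errors to a grid cell on the error plane."""
    return tuple(math.floor(e / cell_size) for e in errors)

def back_up_quality(trajectories, cell_size=0.25):
    """Return {cell: worst-case quality} backed up from equilibria.

    Each trajectory is (points, final_quality): the error-plane points visited
    during one grasp, and the quality measured at the equilibrium it reached.
    Quality is assumed to be a cost (larger is worse), so the backup keeps the maximum.
    """
    preimage = {}
    for points, final_quality in trajectories:
        for point in points:
            cell = to_cell(point, cell_size)
            preimage[cell] = max(preimage.get(cell, final_quality), final_quality)
    return preimage

# Hypothetical data: two recorded grasps that pass through a shared state.
trajectories = [
    ([(0.6, 0.6), (0.3, 0.3), (0.1, 0.1)], 0.2),
    ([(0.6, 0.1), (0.3, 0.3), (0.1, 0.4)], 0.5),
]
preimage = back_up_quality(trajectories)
```

Every cell visited on the way to an equilibrium inherits the quality eventually obtained there; where trajectories with different outcomes share a cell, the worse outcome is kept, so the table gives a conservative prediction.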

For example, take the coefficient of friction required to stabilize the equilibrium grasp configuration as the grasp quality measure. The preimage can be consulted to determine whether the ``worst-case'' expected friction coefficient exceeds the contact friction specified for the task. A simple but effective compensation policy superimposes a small, random control action onto the controller. When the state of the system enters the preimage of a suitable attractor, the random control action is deactivated and the controller achieves this equilibrium grasp. This procedure is represented schematically in Figure 3.
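The compensation policy might look like the following sketch. It assumes a preimage stored as a table from discretized error-plane cells to predicted required friction; the function name `control_action`, the noise amplitude, and the sample values are hypothetical and chosen only for illustration.

```python
import random

def control_action(cell, preimage, mu_task, nominal, rng, noise=0.05):
    """Return the nominal action, perturbed unless the state lies in a suitable preimage.

    cell     -- current discretized state on the error plane
    preimage -- {cell: predicted worst-case required friction} (assumed representation)
    mu_task  -- contact friction specified for the task
    nominal  -- control action proposed by the grasp controller (tuple of floats)
    """
    predicted = preimage.get(cell, float("inf"))  # unseen states count as unsuitable
    if predicted <= mu_task:
        return nominal  # inside a suitable preimage: random action deactivated
    return tuple(u + rng.uniform(-noise, noise) for u in nominal)

# Hypothetical usage with a seeded generator for repeatability.
rng = random.Random(0)
table = {(0, 0): 0.2, (1, 1): 0.5}
inside = control_action((0, 0), table, 0.3, (1.0, -1.0), rng)
outside = control_action((1, 1), table, 0.3, (1.0, -1.0), rng)
```

Treating unvisited cells as unsuitable is a design choice of this sketch: it keeps the perturbation active until the state reaches a region whose outcome has actually been observed.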

When multiple candidate objects are possible, a composite preimage is defined that predicts the worst-case coefficient of friction required over all candidate objects. Figure 4 illustrates a typical grasp on a trapezoid for which . The left column depicts the composite preimage for the objects in the right column, given the current number of contacts. The initial grasp configuration has two contacts. At , the observed state is inaccessible in the preimages of the triangles, and so they are eliminated from the list of candidate objects.
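A minimal sketch of the composite preimage and of candidate elimination follows. It assumes one preimage table per object, keyed by discretized error-plane cells; an object is dropped when the observed cell does not appear in its table. The function names and the numeric values are hypothetical.

```python
def prune_candidates(cell, preimages, candidates):
    """Keep only the candidates whose preimage contains the observed state."""
    return [name for name in candidates if cell in preimages[name]]

def composite_prediction(cell, preimages, candidates):
    """Worst-case required friction at `cell` over the given candidates.

    Returns None when no candidate's preimage contains the cell.
    """
    values = [preimages[name][cell] for name in candidates if cell in preimages[name]]
    return max(values) if values else None

# Hypothetical per-object preimages: {cell: predicted required friction}.
preimages = {
    "square": {(0, 0): 0.20, (1, 1): 0.30},
    "trapezoid": {(0, 0): 0.25, (1, 1): 0.40},
    "triangle": {(2, 2): 0.60},
}
candidates = prune_candidates((1, 1), preimages, ["square", "trapezoid", "triangle"])
worst_case = composite_prediction((1, 1), preimages, candidates)
```

In this toy data the observed cell never occurs in the triangle's preimage, so the triangle is eliminated, and the composite prediction is the larger of the two remaining values.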

The preimage can be used to trigger the addition and removal of resources. Grasp configurations that are expected to lead to unsatisfactory grasps for all candidate plants cause resources (additional contacts) to be introduced. Consequently, at , the controller allocates a third contact (shown in white in Figure 4), at . If all contact configurations satisfy grasp friction constraints, then one of the contacts can be removed. At , the controller removes contact 0 at (marked with a circle in Figure 4), and subsequently eliminates the pentagon, hexagon, leaf, and bulb object interpretations. The set of candidate objects now consists of the square and the trapezoid; the controller converges at to a suitable grasp configuration for either of these, requiring a friction coefficient .
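The resource-allocation rule can be summarized in a short sketch. It rests on assumptions not stated in the text: predictions are given per candidate object for the current configuration, a grasp keeps at least two contacts, and the function name `resource_decision` and its return labels are hypothetical.

```python
def resource_decision(predictions, mu_task, n_contacts, min_contacts=2):
    """Decide whether to add a contact, remove one, or keep the current set.

    predictions  -- {candidate: predicted required friction} for the current state
    mu_task      -- contact friction specified for the task
    n_contacts   -- number of contacts currently engaged
    min_contacts -- assumed lower bound on the number of contacts
    """
    if not predictions:
        raise ValueError("at least one candidate is required")
    if all(mu > mu_task for mu in predictions.values()):
        return "add_contact"  # unsatisfactory for every candidate plant
    if all(mu <= mu_task for mu in predictions.values()) and n_contacts > min_contacts:
        return "remove_contact"  # constraints met everywhere: free a resource
    return "keep"

# Hypothetical usage.
decision_bad = resource_decision({"square": 0.6, "trapezoid": 0.7}, 0.5, n_contacts=2)
decision_good = resource_decision({"square": 0.2, "trapezoid": 0.3}, 0.5, n_contacts=3)
decision_mixed = resource_decision({"square": 0.2, "trapezoid": 0.7}, 0.5, n_contacts=3)
```

When the candidates disagree, the sketch keeps the current contacts and lets the grasp continue, which gives the evolving state a chance to eliminate further candidates before resources change.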

This demonstration illustrates that the evolving state of a control task can be a powerful perceptual cue regarding the ultimate quality of the motor policy. The preimage delineates regions where the system behavior depends on the particular plant, and it therefore affects both recognition and control objectives. The preimage is an abstraction whose cost is amortized over a lifetime of more efficient grasping tasks.


grupen@tigger.cs.umass.edu
Wed Apr 16 00:53:15 EDT 1997