Action SpaceΒΆ

In the Meta-World benchmark, the agent must simultaneously solve multiple tasks that could be individually defined by their own Markov decision processes. As this is solved by current approaches using a single policy/model, it requires the action space for all tasks to have a constant size, hence sharing a common structure.

The action space of the Sawyer robot is a Box(-1.0, 1.0, (4,), float32). An action represents the Cartesian displacement dx, dy, and dz of the end-effector, and an additional action for gripper control.

For tasks that do not require the gripper, actions along those dimensions can be masked or ignored and set to a constant value that permanently closes the fingers.

Num

Action

Control Min

Control Max

Name (in XML file)

Joint

Unit

0

Displacement of the end-effector in x direction (dx)

-1

1

mocap

N/A

position (m)

1

Displacement of the end-effector in y direction (dy)

-1

1

mocap

N/A

position (m)

2

Displacement of the end-effector in z direction (dz)

-1

1

mocap

N/A

position (m)

3

Gripper adjustment (closing/opening)

-1

1

rightclaw, leftclaw

r_close, l_close

position (normalized)