Action Space
In the Meta-World benchmark, the agent must solve multiple tasks simultaneously, each of which could be defined by its own Markov decision process. Because current approaches solve this with a single policy/model, the action space must have the same size for all tasks, and therefore share a common structure.
The action space of the Sawyer robot is a Box(-1.0, 1.0, (4,), float32). An action represents the Cartesian displacement dx, dy, and dz of the end-effector, and an additional action for gripper control.
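As a minimal sketch of what such an action looks like in practice, the snippet below packs the four components into one float32 vector and clips it to the stated bounds. It uses plain NumPy rather than the environment itself; the helper name `make_action` and the constants are hypothetical and chosen only for illustration.

```python
import numpy as np

# Bounds taken from the Box(-1.0, 1.0, (4,), float32) description.
ACTION_LOW, ACTION_HIGH = -1.0, 1.0


def make_action(dx, dy, dz, gripper):
    """Pack the four components into one float32 vector, clipped to the bounds.

    `make_action` is a hypothetical helper, not part of Meta-World.
    """
    action = np.array([dx, dy, dz, gripper], dtype=np.float32)
    return np.clip(action, ACTION_LOW, ACTION_HIGH)


# dz = 2.0 lies outside [-1, 1], so it is clipped to the upper bound.
action = make_action(0.5, -0.25, 2.0, 1.0)
```

Clipping before stepping the environment keeps every component inside the declared range, whatever the policy outputs.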
For tasks that do not require the gripper, the gripper action can be masked or ignored, and set to a constant value that keeps the fingers permanently closed.
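The masking described above could be sketched as follows. This is an illustration under stated assumptions, not Meta-World's own code: the helper `mask_gripper` is hypothetical, and the text does not say which constant closes the fingers, so the value `1.0` used here is an assumption that should be checked against the environment.

```python
import numpy as np

GRIPPER_INDEX = 3     # the gripper is the fourth action component (Num 3)
GRIPPER_CLOSED = 1.0  # ASSUMPTION: the sign that closes the fingers is not given in the text


def mask_gripper(action, closed_value=GRIPPER_CLOSED):
    """Return a copy of `action` with the gripper component overwritten.

    `mask_gripper` is a hypothetical helper, not part of Meta-World.
    """
    masked = np.array(action, dtype=np.float32)  # np.array copies by default
    masked[GRIPPER_INDEX] = closed_value
    return masked


policy_output = np.array([0.1, 0.2, -0.3, -0.7], dtype=np.float32)
masked = mask_gripper(policy_output)  # dx, dy, dz unchanged; gripper forced to the constant
```

Working on a copy leaves the policy output untouched, which is convenient if the raw action is also logged or used in a loss.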
Num | Action | Control Min | Control Max | Name (in XML file) | Joint | Unit |
---|---|---|---|---|---|---|
0 | Displacement of the end-effector in x direction (dx) | -1 | 1 | mocap | N/A | position (m) |
1 | Displacement of the end-effector in y direction (dy) | -1 | 1 | mocap | N/A | position (m) |
2 | Displacement of the end-effector in z direction (dz) | -1 | 1 | mocap | N/A | position (m) |
3 | Gripper adjustment (closing/opening) | -1 | 1 | rightclaw, leftclaw | r_close, l_close | position (normalized) |