Observation Space

General

The observation space provided to the agent consists of:

All sensors attached to the vehicle instance
1. RoarRLSimEnv takes an instance of RoarPyActor, and the observation space of the environment is a superset of that RoarPyActor's get_gym_observation_spec()
2. In Roar's internal RL code, we added the following sensors to the actor
  1. local coordinate velocimeter
  2. gyroscope (angular velocity sensor)
Information about waypoints relative to the current position of the vehicle
1. This is discussed in more detail in the next section

Waypoint Information Observation

Instead of inputting an entire occupancy map into the network, we directly feed numerical information about waypoints near the vehicle as the observation provided to the agent. This is how it works:

During initialization of the environment, we specify an array of relative distances (floating point values, in meters) that we want to trace for waypoint information observations.
Then, in each step, we trace these distances one by one and store the results under the waypoints_information key of the final observation dict.
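As a rough illustration of the two steps above, here is a minimal sketch of how the per-step observation could be assembled. Only the waypoints_information key comes from this page; build_observation, trace_fn, and the sensor key used in the usage note are hypothetical names chosen for the example, not part of the ROAR API.

```python
def build_observation(sensor_observation, trace_distances, trace_fn):
    """Assemble one step's observation dict (illustrative sketch only).

    sensor_observation: dict of sensor readings for this step.
    trace_distances: relative distances fixed at environment initialization.
    trace_fn: hypothetical callable mapping one distance to one feature vector.
    """
    # Trace each configured distance in order, one row per distance.
    rows = [trace_fn(d) for d in trace_distances]

    # Copy the sensor readings so the caller's dict is left untouched,
    # then add the traced rows under the key named on this page.
    observation = dict(sensor_observation)
    observation["waypoints_information"] = rows
    return observation
```

For example, calling build_observation({"gyroscope": [0.0, 0.0, 0.1]}, [2.0, 5.0], trace_fn) would return a dict with the original sensor entry plus a two-row waypoints_information entry.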

How we trace those relative distances is described below.

Recall from the Reward Function page that we treat the list of waypoints provided to us as a circular path and, in each step, trace the closest point $\hat{\vec{p}}_v$ on the path to the position of the vehicle $\vec{p}_v$.

For each relative distance $d$, we trace $d$ meters forward along the circular path from $\hat{\vec{p}}_v$ and find an interpolated waypoint $\vec{w}$, which contains the global position, the rotation, and the road width at that point. Then we concatenate:

Relative 2D (x, y) position of the waypoint in the vehicle's local coordinate frame
Yaw of that waypoint in the vehicle's local coordinate frame
Road width of the interpolated waypoint

into a vector as the observation. We repeat this for every relative distance defined during environment initialization and stack all vectors into a matrix. That matrix is then fed to the agent so that it can perceive the road layout around the vehicle.
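The tracing and concatenation described above can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the ROAR implementation: waypoints are reduced to hypothetical (x, y, road_width) tuples, the yaw of the interpolated waypoint is taken from the direction of its path segment rather than interpolated from stored rotations, and the vehicle frame is assumed to be x forward, y left, with counter-clockwise yaw. The function names trace_forward and waypoint_feature are made up for this example.

```python
import math

def trace_forward(waypoints, seg_index, seg_fraction, distance):
    """Walk `distance` meters (>= 0) forward along a closed loop of waypoints.

    waypoints: list of (x, y, road_width); the last one connects back to the first.
    seg_index, seg_fraction: the start point, given as the segment it lies on
        and the fraction (0 to 1) along that segment.
    Returns (x, y, yaw, road_width) of the interpolated waypoint.
    """
    n = len(waypoints)
    total = sum(
        math.hypot(waypoints[(k + 1) % n][0] - waypoints[k][0],
                   waypoints[(k + 1) % n][1] - waypoints[k][1])
        for k in range(n)
    )
    if total <= 0.0:
        raise ValueError("path has zero length")

    i, t, remaining = seg_index % n, seg_fraction, distance
    while True:
        x0, y0, w0 = waypoints[i]
        x1, y1, w1 = waypoints[(i + 1) % n]
        seg_len = math.hypot(x1 - x0, y1 - y0)
        left_on_seg = seg_len * (1.0 - t)
        if seg_len > 0.0 and remaining <= left_on_seg:
            # The target lies on this segment: interpolate linearly.
            t += remaining / seg_len
            yaw = math.atan2(y1 - y0, x1 - x0)  # assumption: yaw follows the segment
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0), yaw, w0 + t * (w1 - w0))
        # Otherwise consume the rest of this segment and wrap to the next one.
        remaining -= left_on_seg
        i, t = (i + 1) % n, 0.0

def waypoint_feature(vehicle_xy, vehicle_yaw, waypoint):
    """Return [local_x, local_y, local_yaw, road_width] for one traced waypoint."""
    dx = waypoint[0] - vehicle_xy[0]
    dy = waypoint[1] - vehicle_xy[1]
    c, s = math.cos(vehicle_yaw), math.sin(vehicle_yaw)
    # Rotate the global offset by -vehicle_yaw to express it in the vehicle frame.
    local_x = c * dx + s * dy
    local_y = -s * dx + c * dy
    # Relative yaw, wrapped into [-pi, pi].
    dyaw = waypoint[2] - vehicle_yaw
    local_yaw = math.atan2(math.sin(dyaw), math.cos(dyaw))
    return [local_x, local_y, local_yaw, waypoint[3]]

# Usage: a 10 m square loop, vehicle near the middle of the first edge.
loop = [(0.0, 0.0, 4.0), (10.0, 0.0, 4.0), (10.0, 10.0, 6.0), (0.0, 10.0, 6.0)]
vehicle_xy, vehicle_yaw = (5.0, 0.5), 0.0
# The closest point on the path is assumed to be halfway along segment 0.
matrix = [
    waypoint_feature(vehicle_xy, vehicle_yaw, trace_forward(loop, 0, 0.5, d))
    for d in (2.0, 10.0)
]
```

Each row of matrix corresponds to one relative distance, so the result has shape (number of distances, 4).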
