Utilities
Miscellaneous utilities.
- class seals.util.AbsorbAfterDoneWrapper(env, absorb_reward=0.0, absorb_obs=None)[source]
Bases: Wrapper
Transition into absorbing state instead of episode termination.
When the environment being wrapped returns done=True, we return an absorbing observation. This wrapper always returns done=False.
A convenient way to add absorbing states to environments like MountainCar.
- __init__(env, absorb_reward=0.0, absorb_obs=None)[source]
Initialize AbsorbAfterDoneWrapper.
- Parameters
env – The wrapped Env.
absorb_reward – The reward returned at the absorb state.
absorb_obs – The observation returned at the absorb state. If None, the final observation seen before entering the absorb state is repeated.
- step(action)[source]
Advance the environment by one step.
This wrapped step() always returns done=False.
After the first done is returned by the underlying Env, we enter an artificial absorb state.
In this artificial absorb state, we stop calling self.env.step(action) (i.e. the action argument is entirely ignored) and we return fixed values for obs, rew, done, and info. The values of obs and rew depend on initialization arguments. info is always an empty dictionary.
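The behaviour described above can be illustrated with a minimal, dependency-free sketch. It is not the seals implementation: `CountdownEnv` and `AbsorbAfterDoneSketch` are hypothetical names, the environment uses the 4-tuple `step()` convention implied by the text, and the sketch assumes that the step on which the underlying env first reports done=True still passes through the real observation, reward, and info.

```python
class CountdownEnv:
    """Hypothetical stand-in for a gym-style Env; the episode ends after 2 steps."""

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= 2
        return self.t, 1.0, done, {"t": self.t}


class AbsorbAfterDoneSketch:
    """Illustrative re-creation of the documented behaviour (an assumption)."""

    def __init__(self, env, absorb_reward=0.0, absorb_obs=None):
        self.env = env
        self.absorb_reward = absorb_reward
        self.absorb_obs_default = absorb_obs
        self.absorb_obs_this_episode = None
        self.at_absorb_state = False

    def reset(self):
        self.at_absorb_state = False
        self.absorb_obs_this_episode = None
        return self.env.reset()

    def step(self, action):
        if not self.at_absorb_state:
            # Normal operation: forward the action to the wrapped env.
            obs, rew, done, info = self.env.step(action)
            if done:
                # Enter the absorb state; remember which observation to repeat.
                self.at_absorb_state = True
                if self.absorb_obs_default is None:
                    self.absorb_obs_this_episode = obs
                else:
                    self.absorb_obs_this_episode = self.absorb_obs_default
        else:
            # Absorb state: the action is ignored and fixed values are returned.
            obs = self.absorb_obs_this_episode
            rew = self.absorb_reward
            info = {}
        return obs, rew, False, info


env = AbsorbAfterDoneSketch(CountdownEnv(), absorb_reward=-1.0)
env.reset()
trajectory = [env.step(0) for _ in range(4)]
```

In this sketch the third and fourth steps never reach `CountdownEnv.step()`; they return the repeated final observation, the absorb reward, and an empty info dictionary.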
- class seals.util.AutoResetWrapper(env)[source]
Bases: Wrapper
Hides done=True and auto-resets at the end of each episode.
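A minimal sketch of the auto-reset idea, written without gym so it runs on its own. `CountdownEnv` and `AutoResetSketch` are hypothetical names, and returning the observation from `reset()` in place of the terminal observation is an assumption about how the reset is surfaced, not a statement about the seals source.

```python
class CountdownEnv:
    """Hypothetical stand-in for a gym-style Env; the episode ends after 2 steps."""

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= 2
        return self.t, 1.0, done, {}


class AutoResetSketch:
    """Illustrative wrapper: never reports done and resets the env itself."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, rew, done, info = self.env.step(action)
        if done:
            # Assumption: the caller sees the first observation of the new episode.
            obs = self.env.reset()
        return obs, rew, False, info


env = AutoResetSketch(CountdownEnv())
env.reset()
observations = [env.step(0)[0] for _ in range(5)]
```

Because the wrapped episode lasts two steps, the observations alternate between the mid-episode value and the freshly reset value, and the caller never sees done=True.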
- class seals.util.ObsCastWrapper(env, dtype)[source]
Bases: Wrapper
Cast observations to specified dtype.
Some external environments return observations of a different type than the declared observation space. Where possible, this should be fixed upstream, but casting can be a viable workaround – especially when the returned observations are higher resolution than the observation space.
- seals.util.get_gym_max_episode_steps(env_name)[source]
Get the max_episode_steps attribute associated with a gym Spec.
- Return type
Optional[int]
- seals.util.grid_transition_fn(state, action, x_bounds=(-inf, inf), y_bounds=(-inf, inf))[source]
Returns transition of a deterministic gridworld.
Agent is bounded in the region limited by x_bounds and y_bounds, ends inclusive.
(0, 0) is interpreted to be top-left corner.
Actions: 0 = Right, 1 = Down, 2 = Left, 3 = Up, 4 = Stay put.
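The transition rule can be sketched in plain Python as follows. `grid_step` and `DIRECTIONS` are hypothetical names, states are plain `(x, y)` tuples rather than whatever array type the library uses, and the sketch assumes that, with (0, 0) at the top-left, moving Down increases y and moving Up decreases it.

```python
import math

# Assumed displacement (dx, dy) for each action index.
DIRECTIONS = {
    0: (1, 0),   # Right
    1: (0, 1),   # Down (y grows downward from the top-left origin)
    2: (-1, 0),  # Left
    3: (0, -1),  # Up
    4: (0, 0),   # Stay put
}


def grid_step(state, action,
              x_bounds=(-math.inf, math.inf),
              y_bounds=(-math.inf, math.inf)):
    """Return the next (x, y), clamped to the inclusive bounds."""
    x, y = state
    dx, dy = DIRECTIONS[action]
    next_x = min(max(x + dx, x_bounds[0]), x_bounds[1])
    next_y = min(max(y + dy, y_bounds[0]), y_bounds[1])
    return (next_x, next_y)


moved_right = grid_step((0, 0), 0)
blocked_up = grid_step((0, 0), 3, y_bounds=(0, 4))
blocked_down = grid_step((2, 4), 1, y_bounds=(0, 4))
```

Clamping after the move is what makes the bounds inclusive: an agent on the boundary that tries to step outside simply stays where it is.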
- seals.util.make_env_no_wrappers(env_name, **kwargs)[source]
Gym sometimes wraps envs in TimeLimit before returning from gym.make().
This helper function builds the environment directly from its spec, so the TimeLimit wrapper is not applied.
- Return type
Env