Envs
- class textworld.gym.envs.textworld.TextworldGymEnv(gamefiles, request_infos=None, max_episode_steps=None, action_space=None, observation_space=None, **kwargs)
Bases: TextworldBatchGymEnv
Environment for playing text-based games.
- Parameters
gamefiles (List[str]) – Paths of every game composing the pool (*.ulx|*.z[1-8]).
request_infos (Optional[EnvInfos]) – For customizing the information returned by this environment (see textworld.EnvInfos for the list of available information).
Warning
Only supported for TextWorld games (i.e., games that have a corresponding *.json file).
max_episode_steps (Optional[int]) – Number of steps allocated to play each game. Once exhausted, the game is done.
action_space (Optional[Space]) – The action space to be used with OpenAI baselines (see textworld.gym.spaces.Word).
observation_space (Optional[Space]) – The observation space to be used with OpenAI baselines (see textworld.gym.spaces.Word).
- reset()
Resets the text-based environment.
Resetting this environment means starting the next game in the pool.
- Return type
Tuple[str, Dict[str, Any]]
- Returns
A tuple (observation, infos) where
observation: text observed in the initial state;
infos: additional information as requested.
- step(command)
Runs a command in the text-based environment.
- Parameters
command – Text command to send to the game interpreter.
- Return type
Tuple[str, Dict[str, Any]]
- Returns
A tuple (observation, score, done, infos) where
observation: text observed in the new state;
score: total number of points accumulated so far;
done: whether the game is finished or not;
infos: additional information as requested.
- action_space: Space[ActType]
- metadata: Dict[str, Any] = {'render.modes': ['human', 'ansi', 'text']}
- observation_space: Space[ObsType]
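The reset/step contract described above can be illustrated with a small play loop. This is only a sketch: `ToyTextEnv` and `play_episode` are hypothetical names, and the toy class is a stand-in that mimics the shapes of the returned values (it does not load game files and is not TextworldGymEnv). The `admissible_commands` key is used purely as an illustrative entry of `infos`.

```python
from typing import Any, Callable, Dict, Tuple


class ToyTextEnv:
    """Hypothetical stand-in mimicking the single-game reset/step return shapes.

    NOT the real TextworldGymEnv: no game file is loaded here.
    """

    def __init__(self, max_episode_steps: int = 3) -> None:
        self.max_episode_steps = max_episode_steps
        self._steps = 0
        self._score = 0

    def _infos(self) -> Dict[str, Any]:
        # Illustrative content only; a real environment returns what was requested.
        return {"admissible_commands": ["look", "take apple"]}

    def reset(self) -> Tuple[str, Dict[str, Any]]:
        self._steps = 0
        self._score = 0
        return "You are in a kitchen.", self._infos()

    def step(self, command: str) -> Tuple[str, int, bool, Dict[str, Any]]:
        self._steps += 1
        if command == "take apple":
            self._score += 1
        # Once the step budget is exhausted, the game is done.
        done = self._steps >= self.max_episode_steps
        return f"You typed: {command}", self._score, done, self._infos()


def play_episode(env, policy: Callable[[str, Dict[str, Any]], str]) -> Tuple[int, int]:
    """Play one game; return (final score, number of steps taken)."""
    obs, infos = env.reset()
    score, done, nb_steps = 0, False, 0
    while not done:
        command = policy(obs, infos)
        obs, score, done, infos = env.step(command)
        nb_steps += 1
    return score, nb_steps


final_score, nb_steps = play_episode(
    ToyTextEnv(max_episode_steps=3),
    policy=lambda obs, infos: "take apple",
)
```

Because `play_episode` only relies on the documented tuple shapes, the same loop should in principle work with any environment that follows this contract.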
- class textworld.gym.envs.textworld_batch.TextworldBatchGymEnv(gamefiles, request_infos=None, batch_size=1, asynchronous=True, auto_reset=False, max_episode_steps=None, action_space=None, observation_space=None)
Bases: Env
Environment for playing text-based games in batch.
- Parameters
gamefiles (List[str]) – Paths of every game composing the pool (*.ulx|*.z[1-8]|*.json).
request_infos (Optional[EnvInfos]) – For customizing the information returned by this environment (see textworld.EnvInfos for the list of available information).
Warning
Only supported for TextWorld games (i.e., games that have a corresponding *.json file).
batch_size (int) – If provided, it indicates the number of games to play at the same time. By default, a single game is played at once.
Warning
When batch_size is provided (even for batch_size=1), env.step expects a list of commands as input and outputs a list of states. env.reset also outputs a list of states.
asynchronous (bool) – If True, wraps the environments in an AsyncBatchEnv (which uses multiprocessing to run the environments in parallel). If False, wraps the environments in a SyncBatchEnv. Default: True.
auto_reset (bool) – If True, each game independently resets once it is done (i.e., reset happens on the next env.step call). Otherwise, once a game is done, subsequent calls to env.step won’t have any effect.
max_episode_steps (Optional[int]) – Number of steps allocated to play each game. Once exhausted, the game is done.
action_space (Optional[Space]) – The action space to be used with OpenAI baselines (see textworld.gym.spaces.Word).
observation_space (Optional[Space]) – The observation space to be used with OpenAI baselines (see textworld.gym.spaces.Word).
- render(*args: Tuple[Any], **kwargs: Dict[str, Any]) → Optional[Union[RenderFrame, List[RenderFrame]]]
Compute the render frames as specified by render_mode attribute during initialization of the environment.
The set of supported modes varies per environment. (And some third-party environments may not support rendering at all.) By convention, if render_mode is:
None (default): no render is computed.
human: render returns None. The environment is continuously rendered in the current display or terminal. Usually for human consumption.
single_rgb_array: return a single frame representing the current state of the environment. A frame is a numpy.ndarray with shape (x, y, 3) representing RGB values for an x-by-y pixel image.
rgb_array: return a list of frames representing the states of the environment since the last reset. Each frame is a numpy.ndarray with shape (x, y, 3), as with single_rgb_array.
ansi: Return a list of strings (str) or StringIO.StringIO containing a terminal-style text representation for each time step. The text can include newlines and ANSI escape sequences (e.g. for colors).
Note
Rendering computations are performed internally even if you don’t call render(). To avoid this, you can set render_mode = None and, if the environment supports it, call render() specifying the argument ‘mode’.
Note
Make sure that your class’s metadata ‘render_modes’ key includes the list of supported modes. It’s recommended to call super() in implementations to use the functionality of this method.
- Return type
Union[TypeVar(RenderFrame), List[TypeVar(RenderFrame)], None]
- reset()
Resets the text-based environment.
Resetting this environment means starting the next game in the pool.
- Return type
Tuple[List[str], Dict[str, List[Any]]]
- Returns
A tuple (observations, infos) where
observations: text observed in the initial state for each game in the batch;
infos: additional information as requested for each game in the batch.
- seed(seed=None)
Set the seed for this environment’s random generator(s).
This environment uses a random generator to shuffle the order in which the games are played.
- Parameters
seed (Optional[int]) – Number that will be used to seed the random generators.
- Return type
List[int]
- Returns
All the seeds used to set this environment’s random generator(s).
- skip(nb_games=1)
Skip games.
- Parameters
nb_games (int) – Number of games to skip.
- Return type
None
- step(commands)
Runs a command in each text-based environment of the batch.
- Parameters
commands – Text commands to send to the game interpreters, one per game in the batch.
- Return type
Tuple[List[str], List[float], List[bool], Dict[str, List[Any]]]
- Returns
A tuple (observations, scores, dones, infos) where
observations: text observed in the new state for each game in the batch;
scores: total number of points accumulated so far for each game in the batch;
dones: whether each game in the batch is finished or not;
infos: additional information as requested for each game in the batch.
- action_space: Space[ActType]
- metadata: Dict[str, Any] = {'render.modes': ['human', 'ansi', 'text']}
- observation_space: Space[ObsType]
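The batched contract (a list of commands in, lists of observations, scores and dones out, and a dictionary of per-game lists for infos) can be sketched as follows. `ToyBatchEnv` and `play_batch` are hypothetical names; the toy class only imitates the return shapes and the documented behaviour when auto_reset is False (a finished game ignores further commands). It is not TextworldBatchGymEnv, and the `moves` key is just an illustrative entry of `infos`.

```python
from typing import Any, Dict, List, Tuple


class ToyBatchEnv:
    """Hypothetical stand-in mimicking batched reset/step return shapes.

    NOT the real TextworldBatchGymEnv; each toy game ends after its own step budget.
    """

    def __init__(self, step_budgets: List[int]) -> None:
        self.step_budgets = step_budgets
        self.batch_size = len(step_budgets)
        self._steps = [0] * self.batch_size
        self._scores = [0.0] * self.batch_size
        self._dones = [False] * self.batch_size

    def reset(self) -> Tuple[List[str], Dict[str, List[Any]]]:
        self._steps = [0] * self.batch_size
        self._scores = [0.0] * self.batch_size
        self._dones = [False] * self.batch_size
        return ["start"] * self.batch_size, {"moves": list(self._steps)}

    def step(
        self, commands: List[str]
    ) -> Tuple[List[str], List[float], List[bool], Dict[str, List[Any]]]:
        assert len(commands) == self.batch_size, "one command per game is expected"
        observations = []
        for i, command in enumerate(commands):
            if self._dones[i]:
                # Mimics auto_reset=False: a finished game is left untouched.
                observations.append("")
                continue
            self._steps[i] += 1
            self._scores[i] += 1.0
            self._dones[i] = self._steps[i] >= self.step_budgets[i]
            observations.append(f"game {i}: {command}")
        infos = {"moves": list(self._steps)}
        return observations, list(self._scores), list(self._dones), infos


def play_batch(env, command: str = "look") -> Tuple[List[float], Dict[str, List[Any]]]:
    """Step every game with the same command until all games are done."""
    observations, infos = env.reset()
    scores = [0.0] * len(observations)
    dones = [False] * len(observations)
    while not all(dones):
        observations, scores, dones, infos = env.step([command] * len(observations))
    return scores, infos


scores, infos = play_batch(ToyBatchEnv(step_budgets=[1, 3]))
```

Note that `infos` is indexed by key first and by game second (`infos["moves"][i]`), which matches the Dict[str, List[Any]] return type above.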
- textworld.gym.envs.utils.shuffled_cycle(iterable, rng, nb_loops=-1)
Yield each element of iterable one by one, then shuffle the elements and start yielding from the start. Stop after nb_loops loops.
- Parameters
iterable (Iterable[Any]) – Iterable containing the elements to yield.
rng (RandomState) – Random generator used to shuffle the elements after each loop.
nb_loops (int) – Number of times to go through all the elements. If set to -1, loop an infinite number of times.
- Return type
Iterable[Any]
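One possible reading of this description, written as a standalone generator, is sketched below. It is not the library's implementation: `shuffled_cycle_sketch` is a hypothetical name, the standard library's `random.Random` stands in for the RandomState argument (only a `shuffle` method is assumed), and counting the first, unshuffled pass as one loop is an assumption.

```python
import random
from typing import Any, Iterable, Iterator


def shuffled_cycle_sketch(iterable: Iterable[Any], rng, nb_loops: int = -1) -> Iterator[Any]:
    """Yield all elements, shuffle them, and repeat.

    Assumptions: `rng` only needs a `shuffle(list)` method, and `nb_loops=-1`
    means looping forever, as stated in the description above.
    """
    elements = list(iterable)  # Work on a copy so the caller's list is not reordered.
    if not elements:
        return  # Nothing to yield; avoids spinning forever on an empty pool.

    loop = 0
    while nb_loops == -1 or loop < nb_loops:
        for element in elements:
            yield element
        # Shuffle only after a full pass, so the first pass keeps the original order.
        rng.shuffle(elements)
        loop += 1


rng = random.Random(1234)
games = ["a.ulx", "b.ulx", "c.ulx"]
two_loops = list(shuffled_cycle_sketch(games, rng, nb_loops=2))
```

With nb_loops=2, the first three items follow the original order and the next three are a permutation of the same games. For the infinite case (nb_loops=-1), consume the generator lazily, for example with itertools.islice.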