Spaces

exception textworld.gym.spaces.text_spaces.VocabularyHasDuplicateTokens

Bases: ValueError

class textworld.gym.spaces.text_spaces.Char(max_length, vocab=None, extra_vocab=[])

Bases: gym.spaces.multi_discrete.MultiDiscrete

Character observation/action space

This space consists of a series of gym.spaces.Discrete objects all with the same parameters. Each gym.spaces.Discrete can take integer values between 0 and len(self.vocab).

Notes

The following special token will be prepended (if needed) to the vocabulary:
  • # : Padding token

Parameters:
  • max_length (int) – Maximum number of characters in a text.
  • vocab (list of char, optional) – Vocabulary defining this space. It shouldn’t contain any duplicate characters. If not provided, the vocabulary will consist of the characters [a-z0-9], the punctuation characters [" ", "-", "'"], and the padding token '#'.
  • extra_vocab (list of char, optional) – Additional tokens to add to the vocabulary.
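
The vocabulary rules described above can be sketched in plain Python. This is a minimal illustration and not TextWorld's implementation: `build_char_vocab` and `DuplicateTokens` are hypothetical names, and the sketch assumes that `extra_vocab` is simply appended to whichever vocabulary is in use.

```python
import string

PAD = "#"  # padding token named in the notes above


class DuplicateTokens(ValueError):
    """Hypothetical stand-in for VocabularyHasDuplicateTokens."""


def build_char_vocab(vocab=None, extra_vocab=()):
    """Return a character vocabulary with the padding token first."""
    if vocab is None:
        # Default described in the docs: [a-z0-9] plus " ", "-", "'".
        vocab = list(string.ascii_lowercase + string.digits) + [" ", "-", "'"]
    # Assumption: extra tokens are appended to the chosen vocabulary.
    vocab = list(vocab) + list(extra_vocab)
    if len(vocab) != len(set(vocab)):
        raise DuplicateTokens("vocabulary contains duplicate characters")
    # Prepend the padding token only if it is not already present.
    if PAD not in vocab:
        vocab = [PAD] + vocab
    return vocab


default_vocab = build_char_vocab()
print(len(default_vocab))  # 40 = 26 letters + 10 digits + 3 punctuation + '#'
```

Passing a vocabulary such as `["a", "a"]` to this sketch raises the `ValueError` subclass, mirroring the duplicate check the documentation describes.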

filter_unknown(text)

Strip out all characters not in the vocabulary.

tokenize(text, padding=False)

Tokenize characters found in the vocabulary.

Note: if padding is True, text will be padded up to self.max_length.
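
To make the two methods above concrete, here is a small self-contained sketch of character filtering and tokenization over a toy vocabulary. It does not call TextWorld; the helper names, the toy vocabulary, and the choice to raise an error when the text is longer than `max_length` are all assumptions made for illustration.

```python
PAD = "#"
# Toy vocabulary with the padding token first (id 0).
vocab = [PAD] + list("abc ")
char_to_id = {c: i for i, c in enumerate(vocab)}


def filter_unknown(text):
    """Drop every character that is not in the vocabulary."""
    return "".join(c for c in text if c in char_to_id)


def tokenize(text, max_length, padding=False):
    """Map known characters to ids, optionally padding to max_length."""
    ids = [char_to_id[c] for c in filter_unknown(text)]
    if padding:
        if len(ids) > max_length:
            # Assumption: overly long text is treated as an error here.
            raise ValueError("text is longer than max_length")
        ids += [char_to_id[PAD]] * (max_length - len(ids))
    return ids


print(filter_unknown("a cab!"))                        # a cab
print(tokenize("a cab!", max_length=8, padding=True))  # [1, 4, 3, 1, 2, 0, 0, 0]
```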

class textworld.gym.spaces.text_spaces.Word(max_length, vocab)

Bases: gym.spaces.multi_discrete.MultiDiscrete

Word observation/action space

This space consists of a series of gym.spaces.Discrete objects all with the same parameters. Each gym.spaces.Discrete can take integer values between 0 and len(self.vocab).

Notes

The following special tokens will be prepended (if needed) to the vocabulary:
  • <PAD> : Padding
  • <UNK> : Unknown word
  • <S> : Beginning of sentence
  • </S> : End of sentence

Parameters:
  • max_length (int) – Maximum number of words in a text.
  • vocab (list of strings) – Vocabulary defining this space. It shouldn’t contain any duplicate words.
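
The "prepended (if needed)" rule for the word vocabulary can be sketched as follows. `build_word_vocab` is a hypothetical helper written for this example, not part of TextWorld, and it raises a plain `ValueError` where the library would presumably raise `VocabularyHasDuplicateTokens`.

```python
SPECIAL_TOKENS = ["<PAD>", "<UNK>", "<S>", "</S>"]


def build_word_vocab(vocab):
    """Return the vocabulary with any missing special tokens placed first."""
    vocab = list(vocab)
    if len(vocab) != len(set(vocab)):
        raise ValueError("vocabulary contains duplicate words")
    # Only the special tokens that are absent get prepended.
    missing = [t for t in SPECIAL_TOKENS if t not in vocab]
    return missing + vocab


print(build_word_vocab(["go", "north"]))
# ['<PAD>', '<UNK>', '<S>', '</S>', 'go', 'north']
```

A special token already supplied by the caller keeps the position it was given; only the absent ones move to the front.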

tokenize(text, padding=False)

Tokenize words found in the vocabulary.

Note: if padding is True, text will be padded up to self.max_length.
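
As a rough illustration of word-level tokenization, the sketch below lowercases the text, splits it on whitespace, maps out-of-vocabulary words to `<UNK>`, and optionally pads with `<PAD>`. These steps are assumptions chosen for the example; the documentation does not say how the library normalizes text or whether it inserts `<S>` and `</S>` markers, so the sketch leaves them out. `tokenize_words` and the toy vocabulary are hypothetical.

```python
# Toy vocabulary: special tokens first, then a few game words.
vocab = ["<PAD>", "<UNK>", "<S>", "</S>", "go", "north", "take", "key"]
word_to_id = {w: i for i, w in enumerate(vocab)}


def tokenize_words(text, max_length, padding=False):
    """Map words to ids; unknown words become <UNK>."""
    unk_id = word_to_id["<UNK>"]
    ids = [word_to_id.get(w, unk_id) for w in text.lower().split()]
    if padding:
        if len(ids) > max_length:
            # Assumption: overly long text is treated as an error here.
            raise ValueError("text is longer than max_length")
        ids += [word_to_id["<PAD>"]] * (max_length - len(ids))
    return ids


print(tokenize_words("Take the key", max_length=5, padding=True))
# [6, 1, 7, 0, 0]
```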