Finite Armed Bandit
- class or_suite.envs.finite_armed_bandit.finite_bandit.FiniteBanditEnvironment(config={'arm_means': array([0.1, 0.7, 0.2, 1.]), 'epLen': 5})
Custom environment that follows the gym interface.
This is a simple environment for a finite-armed bandit problem.
- __init__(config={'arm_means': array([0.1, 0.7, 0.2, 1.]), 'epLen': 5})
For a more detailed description of each parameter, see the README file.
- Parameters
epLen – The number of time steps in an episode.
arm_means – The mean reward of each arm.
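To illustrate how the two config entries interact, here is a minimal sketch of a finite-armed bandit. It is not the or_suite implementation: the class name SketchBandit is hypothetical, the Bernoulli reward model is an assumption, and it uses only the standard library rather than gym or numpy.

```python
import random


class SketchBandit:
    """Hypothetical stand-in for a finite-armed bandit; not the or_suite class."""

    def __init__(self, config):
        self.arm_means = list(config["arm_means"])  # mean reward of each arm
        self.epLen = config["epLen"]                # time steps per episode
        self.timestep = 0

    def reset(self):
        """Start a new episode."""
        self.timestep = 0

    def step(self, arm):
        """Pull one arm and return (reward, done)."""
        # Assumption: rewards are Bernoulli draws with the arm's mean.
        reward = 1 if random.random() < self.arm_means[arm] else 0
        self.timestep += 1
        done = self.timestep >= self.epLen
        return reward, done


env = SketchBandit({"arm_means": [0.1, 0.7, 0.2, 1.0], "epLen": 5})
env.reset()
rewards = []
done = False
while not done:
    reward, done = env.step(3)  # always pull the last arm (mean 1.0)
    rewards.append(reward)
print(len(rewards), sum(rewards))  # prints "5 5"
```

Under these assumptions the episode ends after epLen pulls, and an arm with mean 1.0 pays out on every pull.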
- close()
Override close in your subclass to perform any necessary cleanup.
Environments will automatically close() themselves when garbage collected or when the program exits.
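As a rough illustration of overriding close, the sketch below releases a resource and tolerates repeated calls. The class ResourceEnv and its log attribute are hypothetical and stand in for whatever a real subclass would need to clean up; nothing here is taken from or_suite or gym.

```python
class ResourceEnv:
    """Hypothetical environment that owns a resource needing cleanup."""

    def __init__(self):
        self.log = []        # stands in for a file, window, or socket
        self.closed = False

    def close(self):
        # Safe to call more than once: later calls do nothing.
        if self.closed:
            return
        self.log.clear()
        self.closed = True


env = ResourceEnv()
env.log.append("step 0")
env.close()
env.close()  # second call is a no-op
```

Making close idempotent is a design choice that fits the note above, since the method may run both explicitly and again at garbage collection or program exit.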
- render(mode='console')
Renders the environment.
The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:
human: render to the current display or terminal and return nothing. Usually for human consumption.
rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).
Note
- Make sure that your class’s metadata ‘render.modes’ key includes the list of supported modes. It’s recommended to call super() in implementations to use the functionality of this method.
- Parameters
mode (str) – the mode to render with
Example:
    import numpy as np
    from gym import Env

    class MyEnv(Env):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception
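The example above covers human and rgb_array. For the ansi convention, a text-only variant might look like the following sketch. TextEnv and its timestep attribute are hypothetical, the class deliberately does not inherit from gym's Env so that it runs with the standard library alone, and raising NotImplementedError for unsupported modes is an assumption of this sketch.

```python
class TextEnv:
    """Hypothetical environment with text-only rendering; not part of gym."""

    metadata = {"render.modes": ["human", "ansi"]}

    def __init__(self):
        self.timestep = 0

    def render(self, mode="human"):
        # Reject any mode that the metadata does not advertise.
        if mode not in self.metadata["render.modes"]:
            raise NotImplementedError(f"unsupported render mode: {mode}")
        text = f"timestep={self.timestep}"
        if mode == "ansi":
            return text  # the caller receives the string
        print(text)      # 'human': write to the terminal, return nothing
        return None


env = TextEnv()
frame = env.render(mode="ansi")
```

Checking the requested mode against metadata keeps the advertised list and the actual behavior in one place.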