eNET MB Agent

class or_suite.agents.rl.enet_mb.eNetMB(action_net, state_net, epLen, scaling, state_action_dim, alpha, flag)

Uniform discretization model-based algorithm implemented for environments with continuous states and actions, using the metric induced by the l_inf norm.

epLen: (int) number of steps per episode

scaling: (float) scaling parameter for confidence intervals

action_net: (list) of a discretization of action space

state_net: (list) of a discretization of the state space

state_action_dim: d_1 + d_2 dimensions of state and action space respectively

alpha: (float) parameter for prior on transition kernel

flag: (bool) for whether to do full step updates or not

__init__(action_net, state_net, epLen, scaling, state_action_dim, alpha, flag): Initialize self. See help(type(self)) for accurate signature.

get_num_arms(): Returns the number of arms

pEst: Resets the agent by overwriting all of the estimates back to zero

pick_action(state, step)

Select action according to a greedy policy

Parameters

state – int - current state
step – int - timestep within episode

Returns

action

Return type

int
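
A greedy policy of this kind can be sketched as an argmax over a table of value estimates. The table layout q_vals[step][state_index][action_index] and the function name greedy_action are assumptions made for illustration; the agent's internal storage may differ.

```python
def greedy_action(q_vals, step, state_index):
    """Return the action index with the largest estimate at (step, state).

    Ties are broken in favor of the lowest action index, because max()
    returns the first maximal element it encounters.
    """
    row = q_vals[step][state_index]
    return max(range(len(row)), key=lambda a: row[a])


# Hypothetical estimates: 2 steps, 2 discretized states, 3 discretized actions.
q_vals = [
    [[1.0, 3.0, 2.0], [0.5, 0.2, 0.9]],  # step 0
    [[2.0, 2.0, 1.0], [0.0, 4.0, 1.0]],  # step 1
]
action = greedy_action(q_vals, step=0, state_index=1)
print(action)
```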

update_obs(obs, action, reward, newObs, timestep, info): Add observation to records

update_policy(k): Update internal policy based upon records
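
To illustrate how the alpha and scaling parameters could enter a model-based update, here is a rough sketch of one backup for a single discretized (state, action) cell. Everything in it is an assumption rather than the documented behavior: the symmetric Dirichlet-style smoothing, the bonus of the form scaling / sqrt(visits), rewards in [0, 1], and the names smoothed_transitions and optimistic_backup.

```python
import math


def smoothed_transitions(counts, alpha):
    """Transition estimate from visit counts with a symmetric prior weight alpha."""
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]


def optimistic_backup(mean_reward, counts, next_values, alpha, scaling, ep_len):
    """One optimistic Bellman backup for a single (state, action) cell.

    Assumed bonus form: scaling / sqrt(visits). The result is clipped at
    ep_len, which bounds the return when rewards lie in [0, 1].
    """
    visits = sum(counts)
    if visits == 0:
        return float(ep_len)  # unvisited cells keep the optimistic default
    probs = smoothed_transitions(counts, alpha)
    expected_next = sum(p * v for p, v in zip(probs, next_values))
    bonus = scaling / math.sqrt(visits)
    return min(float(ep_len), mean_reward + expected_next + bonus)


# Hypothetical records: 4 visits, 3 of them leading to next state 0.
q = optimistic_backup(
    mean_reward=0.5, counts=[3, 1], next_values=[1.0, 0.0],
    alpha=1.0, scaling=0.5, ep_len=5,
)
print(round(q, 4))
```

In this sketch a larger alpha pulls the transition estimate toward uniform, and a larger scaling widens the confidence bonus, so rarely visited cells look more attractive to the greedy policy.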