Discrete QL Agent

class or_suite.agents.rl.discrete_ql.DiscreteQl(action_space, observation_space, epLen, scaling)

Q-Learning algorithm implemented for environments with discrete states and actions, using the metric induced by the l_inf norm.

TODO: Documentation

matrix_dim: (tuple) a concatenation of epLen, state_size, and action_size used to create the estimate arrays of the appropriate size

num_visits: (list) The number of times that each episode, state, action tuple has been visited

__init__(action_space, observation_space, epLen, scaling): Initialize self. See help(type(self)) for accurate signature.

pick_action(state, step)

Select action according to a greedy policy

Parameters

Returns

action

Return type

list
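As a rough sketch of what greedy selection means here: given an array of Q-value estimates shaped like matrix_dim, the greedy action at a given step and state is the one with the largest estimate. The function name pick_greedy_action and the array name q_vals are hypothetical, and breaking ties by lowest index is an assumption; the class itself may break ties differently.

```python
import numpy as np

def pick_greedy_action(q_vals, state, step):
    """Return the action (as a list of indices) with the largest estimate."""
    # Slice out the estimates for every action at this step and state.
    q_slice = q_vals[(step,) + tuple(state)]
    # np.argmax returns the first maximiser of the flattened slice.
    flat_best = np.argmax(q_slice)
    # Convert the flat index back into one index per action dimension.
    return [int(i) for i in np.unravel_index(flat_best, q_slice.shape)]

# Usage with hypothetical sizes: 5 steps, a 4 x 3 state space, 2 actions.
q_vals = np.zeros((5, 4, 3, 2))
q_vals[3, 1, 2, 1] = 1.0
print(pick_greedy_action(q_vals, state=(1, 2), step=3))  # [1]
```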

update_config(env, config): Update agent information based on the config file

update_obs(obs, action, reward, newObs, timestep, info)

Add observation to records

Parameters
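The page does not document what is recorded, so the following is only a sketch of one common form of optimistic tabular Q-learning, not a description of this class. The function name q_update, the array name q_vals, the learning rate (epLen + 1) / (epLen + t), the bonus scaling / sqrt(t), and the cap at epLen (which presumes rewards in [0, 1]) are all assumptions made for illustration.

```python
import numpy as np

def q_update(q_vals, num_visits, obs, action, reward, new_obs,
             timestep, ep_len, scaling):
    """Apply one tabular Q-learning update in place (illustrative only)."""
    index = (timestep,) + tuple(obs) + tuple(action)
    num_visits[index] += 1
    t = num_visits[index]

    lr = (ep_len + 1) / (ep_len + t)   # assumed learning-rate schedule
    bonus = scaling / np.sqrt(t)       # assumed exploration bonus

    if timestep == ep_len - 1:
        next_value = 0.0               # no future value after the last step
    else:
        # Best estimate at the next step, capped at ep_len (assumes rewards in [0, 1]).
        next_value = min(ep_len, np.max(q_vals[(timestep + 1,) + tuple(new_obs)]))

    q_vals[index] = (1 - lr) * q_vals[index] + lr * (reward + next_value + bonus)

# Usage with hypothetical sizes: 2 steps, 3 states, 2 actions.
ep_len = 2
q_vals = ep_len * np.ones((ep_len, 3, 2))   # optimistic initial estimates
num_visits = np.zeros((ep_len, 3, 2))
q_update(q_vals, num_visits, obs=(0,), action=(1,), reward=1.0,
         new_obs=(2,), timestep=0, ep_len=ep_len, scaling=0.0)
```

In this sketch a larger scaling value inflates the estimates of rarely visited tuples, which encourages a greedy policy to try them.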

update_parameters(param)

Update the scaling parameter.

Parameters

param: (float) The new scaling value to use