Adaptive Discretization MB Agent

class or_suite.agents.rl.ada_mb.AdaptiveDiscretizationMB(epLen, scaling, alpha, split_threshold, inherit_flag, flag, state_dim, action_dim)

Adaptive model-based Q-learning algorithm implemented for environments with continuous states and actions, using the metric induced by the l_inf norm.
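As an illustration of what discretizing under the l_inf metric can look like, the sketch below measures distance with the max norm and splits a cell (an l_inf ball, i.e. a hypercube) into 2^d children of half the radius. The helper names `linf_distance` and `split_cell` are hypothetical and are not part of or_suite; the agent's actual tree structure may differ.

```python
import itertools


def linf_distance(x, y):
    """Distance between two points under the l_inf (max) norm."""
    return max(abs(a - b) for a, b in zip(x, y))


def split_cell(center, radius):
    """Split an l_inf ball (a hypercube) into 2^d children of half the radius.

    Hypothetical helper for illustration only; not the or_suite implementation.
    """
    child_radius = radius / 2.0
    children = []
    # Each child center is offset by +/- child_radius along every coordinate.
    for signs in itertools.product((-1.0, 1.0), repeat=len(center)):
        child_center = tuple(c + s * child_radius for c, s in zip(center, signs))
        children.append((child_center, child_radius))
    return children


# Split the unit square [0, 1]^2, viewed as an l_inf ball of radius 0.5.
children = split_cell((0.5, 0.5), 0.5)
```

In this sketch a two-dimensional cell yields four children, so the partition is refined only where a split is triggered rather than uniformly across the space.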

epLen

(int) number of steps per episode

scaling

(float) scaling parameter for confidence intervals

inherit_flag

(bool) whether to inherit estimates

dim

(int) dimension d of the space R^d in which the state-action space is represented

__init__(epLen, scaling, alpha, split_threshold, inherit_flag, flag, state_dim, action_dim)
Parameters
  • epLen – number of steps per episode

  • numIters – total number of iterations

  • scaling – scaling parameter for UCB term

  • alpha – parameter to add a prior to the transition kernels

  • inherit_flag – boolean on whether child nodes inherit estimates when they are created

  • flag – boolean selecting full updates (True) or one-step updates (False)
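To make the roles of scaling and alpha more concrete, the following is a minimal sketch under stated assumptions: it assumes the UCB term shrinks as scaling / sqrt(number of visits) and that alpha acts as a uniform pseudo-count added to observed transition counts. Both formulas and both function names are illustrative assumptions, not the formulas used inside or_suite.

```python
import math


def ucb_bonus(scaling, num_visits):
    """Assumed bonus form: scaling / sqrt(n), with n floored at 1."""
    return scaling / math.sqrt(max(num_visits, 1))


def smoothed_transition_probs(counts, alpha):
    """Assumed prior: add alpha pseudo-counts to every next-state cell."""
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]


# With no data, the prior alone yields a uniform transition estimate.
uniform = smoothed_transition_probs([0, 0], 1.0)

# With data, the estimate moves toward the empirical frequencies.
probs = smoothed_transition_probs([3, 1, 0], 0.5)
```

Under these assumptions, a larger scaling gives more optimistic estimates for rarely visited regions, and a larger alpha makes the transition estimate rely on the prior for longer.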

pick_action(state, timestep)

Select action according to a greedy policy.

Parameters
  • state – int - current state

  • timestep – int - timestep within episode

Returns

action

Return type

int
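A greedy policy over an adaptive partition can be sketched as follows: among the cells whose state part contains the current state, pick the one with the largest value estimate and play its action. The dictionary layout (`state_center`, `action_center`, `radius`, `q_value`) and the function names are hypothetical and chosen only for illustration; they do not reflect the agent's internal data structures.

```python
def contains(cell, state):
    """True if `state` lies within the cell's l_inf radius of its state center."""
    return all(
        abs(s - c) <= cell["radius"]
        for s, c in zip(state, cell["state_center"])
    )


def greedy_action(cells, state):
    """Return the action of the highest-valued cell that contains `state`."""
    candidates = [cell for cell in cells if contains(cell, state)]
    best = max(candidates, key=lambda cell: cell["q_value"])
    return best["action_center"]


# Hypothetical one-dimensional state and action spaces on [0, 1].
cells = [
    {"state_center": (0.25,), "action_center": (0.25,), "radius": 0.25, "q_value": 1.0},
    {"state_center": (0.25,), "action_center": (0.75,), "radius": 0.25, "q_value": 2.5},
    {"state_center": (0.75,), "action_center": (0.25,), "radius": 0.25, "q_value": 9.0},
]
chosen = greedy_action(cells, (0.1,))
```

The third cell has the largest estimate but does not contain the state 0.1, so it is ignored; the choice is made only among cells relevant to the current state.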

update_config(env, config)

Update agent information based on the config file.

update_obs(obs, action, reward, newObs, timestep, info)

Add observation to records.

update_policy(k)

Update internal policy based upon records.
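The methods above suggest an interaction loop of the following shape. This is a sketch, not or_suite code: `StubAgent` is a hypothetical stand-in that only mirrors the documented method names and argument order, `toy_step` is a made-up environment, and calling update_policy once per episode is an assumption about how the agent is driven.

```python
import random


class StubAgent:
    """Hypothetical stand-in exposing the same method names as the agent."""

    def __init__(self, epLen):
        self.epLen = epLen
        self.records = []
        self.policy_updates = 0

    def pick_action(self, state, timestep):
        # Placeholder: a uniformly random action, not a greedy policy.
        return random.random()

    def update_obs(self, obs, action, reward, newObs, timestep, info):
        # Store the full transition, matching the documented argument order.
        self.records.append((obs, action, reward, newObs, timestep, info))

    def update_policy(self, k):
        # Placeholder: count how often the policy update is requested.
        self.policy_updates += 1


def toy_step(state, action):
    """Made-up dynamics on [0, 1]; reward is higher when action is near state."""
    reward = 1.0 - abs(state - action)
    new_state = min(1.0, max(0.0, (state + action) / 2.0))
    return new_state, reward, {}


random.seed(0)
agent = StubAgent(epLen=5)
num_episodes = 3
for k in range(num_episodes):
    state = 0.5
    for timestep in range(agent.epLen):
        action = agent.pick_action(state, timestep)
        new_state, reward, info = toy_step(state, action)
        agent.update_obs(state, action, reward, new_state, timestep, info)
        state = new_state
    agent.update_policy(k)  # assumed: one policy update per episode
```

With three episodes of five steps each, the stub records fifteen transitions and receives three policy update calls.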