Grid Search Agent
- class or_suite.agents.oil_discovery.grid_search.grid_searchAgent(epLen, dim=1)[source]
Agent that uses a bisection-method heuristic algorithm to find the location with the highest probability of discovering oil.
- update_config()
(UNIMPLEMENTED)
- update_obs(obs, action, reward, newObs, timestep, info)[source]
Record the reward at the current midpoint, or move the bounds in the direction of higher reward.
- epLen
(int) number of time steps to run the experiment for
- dim
(int) dimension of metric space for agent and environment
- upper
(float list list) matrix containing upper bounds of agent at each step in dimension
- lower
(float list list) matrix containing lower bounds of agent at each step in dimension
- perturb_estimates
(float list list) matrix containing estimated rewards from perturbation in each dimension
- midpoint_value
(float list) list containing midpoint of agent at each step
- dim_index
(int list) list looping through various dimensions during perturbation
- select_midpoint
(bool list) list recording whether to take midpoint or perturb at given step
- __init__(epLen, dim=1)[source]
- Parameters
epLen – (int) number of time steps to run the experiment for
dim – (int) dimension of metric space for agent and environment
- pick_action(state, step)[source]
If the upper and lower bounds have been updated from the perturbed values, move the agent to the midpoint. Otherwise, perturb the current dimension by half the distance from each bound to the midpoint.
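The choice between the midpoint and a perturbed point can be illustrated with a minimal sketch. This is not the library's implementation: the function name `sketch_pick_action`, the `toward_upper` flag, and the use of plain Python lists are assumptions made for illustration, and the real agent tracks this state in its own attributes.

```python
def sketch_pick_action(lower, upper, select_midpoint, dim_index=0, toward_upper=True):
    """Hypothetical helper: return the next point to sample.

    lower, upper    -- per-dimension bounds of the current search box
    select_midpoint -- True to sample the midpoint, False to perturb
    dim_index       -- dimension being perturbed (assumed to be tracked by the caller)
    toward_upper    -- which side of the midpoint to probe (an assumption of this sketch)
    """
    # Midpoint of the current box in every dimension.
    midpoint = [(lo + hi) / 2 for lo, hi in zip(lower, upper)]
    if select_midpoint:
        return midpoint

    # Perturb a single dimension by half the distance from the midpoint
    # to the chosen bound; all other dimensions stay at the midpoint.
    action = list(midpoint)
    bound = upper[dim_index] if toward_upper else lower[dim_index]
    action[dim_index] = (midpoint[dim_index] + bound) / 2
    return action


center = sketch_pick_action([0.0, 0.0], [1.0, 1.0], select_midpoint=True)
probe_up = sketch_pick_action([0.0, 0.0], [1.0, 1.0], False, dim_index=1, toward_upper=True)
probe_down = sketch_pick_action([0.0, 0.0], [1.0, 1.0], False, dim_index=1, toward_upper=False)
```

With the unit square as the search box, the perturbed points lie a quarter of the box width on either side of the midpoint in the chosen dimension.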
- update_obs(obs, action, reward, newObs, timestep, info)[source]
If no perturbation is needed, record the reward as the value at the midpoint. Otherwise, adjust the upper or lower bound in the direction of higher reward, as determined by the perturbation step. The agent loops over each dimension separately and updates the estimated midpoint after each loop.
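The bound update described above amounts to a coordinate-wise bisection, sketched below under stated assumptions. The names `sketch_update_bounds` and `sketch_search`, the unit-box starting bounds, and the noiseless `reward_fn` are hypothetical and serve only to show the idea; the actual agent receives rewards one step at a time through update_obs rather than calling a reward function directly.

```python
def sketch_update_bounds(lower, upper, dim_index, reward_low, reward_high):
    """Hypothetical helper: shrink the box in one dimension toward higher reward."""
    lower, upper = list(lower), list(upper)
    mid = (lower[dim_index] + upper[dim_index]) / 2
    if reward_high > reward_low:
        # The upper-side probe did better, so discard the lower half.
        lower[dim_index] = mid
    else:
        # Otherwise discard the upper half.
        upper[dim_index] = mid
    return lower, upper


def sketch_search(reward_fn, dim, rounds=20):
    """Loop over dimensions, halving the box once per dimension per round."""
    lower, upper = [0.0] * dim, [1.0] * dim  # assumed unit-box search space
    for _ in range(rounds):
        for d in range(dim):
            mid = [(lo + hi) / 2 for lo, hi in zip(lower, upper)]
            # Probe halfway between the midpoint and each bound in dimension d.
            low_pt, high_pt = list(mid), list(mid)
            low_pt[d] = (lower[d] + mid[d]) / 2
            high_pt[d] = (mid[d] + upper[d]) / 2
            lower, upper = sketch_update_bounds(
                lower, upper, d, reward_fn(low_pt), reward_fn(high_pt)
            )
    # The final midpoint is the estimate of the best location.
    return [(lo + hi) / 2 for lo, hi in zip(lower, upper)]


# Illustrative reward with a single peak at (0.7, 0.2); not taken from the library.
def example_reward(point):
    return -((point[0] - 0.7) ** 2 + (point[1] - 0.2) ** 2)


estimate = sketch_search(example_reward, dim=2)
```

On a reward surface with a single symmetric peak, each comparison keeps the half of the interval that contains the peak, so the estimate approaches the peak as the number of rounds grows. With noisy or multimodal rewards, this heuristic carries no such guarantee.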