Hope Guardrail Agent

class or_suite.agents.resource_allocation.hope_guardrail.hopeguardrailAgent(epLen, env_config, scale)[source]

Hope Guardrail provides upper and lower thresholds on the budget distribution, calculated by solving the primal-dual paradigm of the Eisenberg-Gale convex program.

generate_cvxpy_solver()[source]: Creates a generic solver to solve the offline resource allocation problem.

get_lower_upper_sol(init_sizes)[source]: Uses solver to get the lower and upper “guardrails” on budget distribution.

get_expected_endowments(N=1000)[source]: MCM for estimating Expectation of type distribution using N realizations.

reset()[source]: resets bounds of agent to reflect upper and lower bounds of metric space.

update_config(env, config)[source]: Updates environment configuration dictionary.

update_obs(obs, action, reward, newObs, timestep, info)[source]: Add observation to records.

update_policy(k)[source]: Update internal policy based upon records.

pick_action(state, step)[source]: move agent to midpoint or perturb current dimension

num_types

Number of types

Type: int

num_resources

Number of commodities

Type: int

budget_remaining

Amount of each commodity the principal begins with.

Type: int

scale

Hyperparameter to be used in calculating threshold

Type: int

epLen

Number of locations (also the length of an episode).

Type: int

data

All data observed so far

Type: list

first_allocation_done

Flag indicating whether the first allocation has been made; if False, the upper and lower thresholds are computed

Type: bool

conf_const

Hyperparameter for confidence bound

Type: int

exp_endowments

Matrix containing expected proportion of endowments for location t

Type: list

stdev_endowments

Matrix describing variance of exp_endowments

Type: list

prob

CVXPY problem object

Type: cvxpy object

solver

Function that solves the problem given data

Type: lambda function

lower_sol

Matrix of lower threshold

Type: np.array

upper_sol

Matrix of upper threshold

Type: np.array

__init__(epLen, env_config, scale)[source]

Initialize hope_guardrail agent

Parameters

epLen – number of steps
env_config – parameters used in initialization of environment
scale – hyperparameter to be used in calculating threshold

generate_cvxpy_solver()[source]

Creates a generic solver to solve the offline resource allocation problem

Returns

prob – CVXPY problem object
solver – function that solves the problem given data
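For orientation, the offline problem can be written in the standard Eisenberg-Gale form. This is a sketch under assumptions not stated in this reference: utilities are taken to be linear with a weight matrix w, N_θ is the number of individuals of type θ, B_k is the budget of commodity k, and x_{θk} is the per-person allocation. The exact objective used by the agent may differ.

```latex
\begin{aligned}
\max_{x \ge 0} \quad & \sum_{\theta=1}^{n} N_\theta \,
    \log\!\left( \sum_{k=1}^{K} w_{\theta k}\, x_{\theta k} \right) \\
\text{s.t.} \quad & \sum_{\theta=1}^{n} N_\theta \, x_{\theta k} \le B_k,
    \qquad k = 1, \dots, K .
\end{aligned}
```

Here n is the number of types and K the number of commodities, matching the notation used for pick_action.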

get_expected_endowments(N=1000)[source]

Monte Carlo method for estimating the expectation of the type distribution using N realizations. This only needs to be run once to get expectations for all locations.

Returns: rel_exp_endowments - matrix containing expected proportion of endowments for location t
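The Monte Carlo estimate described above can be sketched as follows. This is an illustration only, not the agent's code: the function name estimate_expected_endowments, the sampler argument sample_sizes, and the (ep_len, num_types) output layout are all assumptions made for the example.

```python
import numpy as np

def estimate_expected_endowments(sample_sizes, ep_len, n_samples=1000):
    """Estimate the mean and standard deviation of type counts per location.

    sample_sizes: hypothetical callable; sample_sizes(t) returns one random
                  draw of the (num_types,) vector of type counts at location t.
    ep_len:       number of locations (episode length).
    n_samples:    number of Monte Carlo realizations per location.
    """
    # draws has shape (n_samples, ep_len, num_types)
    draws = np.array(
        [[sample_sizes(t) for t in range(ep_len)] for _ in range(n_samples)],
        dtype=float,
    )
    # Average over realizations; layout (ep_len, num_types) is an assumption
    return draws.mean(axis=0), draws.std(axis=0)

# Example with a made-up Poisson type distribution (two types, three locations)
rng = np.random.default_rng(0)
exp_sizes, std_sizes = estimate_expected_endowments(
    lambda t: rng.poisson([2.0, 5.0]), ep_len=3, n_samples=1000
)
print(exp_sizes.shape, std_sizes.shape)
```

Because the estimate depends only on the type distribution and not on observed data, one pass at construction time covers every location.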

get_lower_upper_sol(init_sizes)[source]

Uses solver to get the lower and upper “guardrails” on budget distribution

Parameters: init_sizes (list) – vector containing the number of each type at each location

pick_action(state, step)[source]

Returns allocation of resources based on calculated upper and lower solutions

Parameters

state – vector with first K entries denoting remaining budget, and remaining n entries denoting the number of people of each type that appear
step – timestep

Returns: matrix where each row is a K-dimensional vector denoting how much of each commodity is given to each type
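One plausible reading of a guardrail rule is sketched below: give the upper threshold when the remaining budget also covers the lower threshold for expected future arrivals plus a confidence slack, fall back to the lower threshold when only the current location can be covered, and otherwise split what remains. The function guardrail_allocation and its arguments exp_future_sizes and conf_bound are hypothetical, and the actual decision rule in the agent may differ.

```python
import numpy as np

def guardrail_allocation(budget, sizes, lower_sol, upper_sol,
                         exp_future_sizes, conf_bound):
    """Choose a per-person allocation for each type and commodity.

    budget:           (K,) remaining budget of each commodity
    sizes:            (n,) number of people of each type at this location
    lower_sol:        (n, K) lower per-person threshold
    upper_sol:        (n, K) upper per-person threshold
    exp_future_sizes: (n,) expected number of each type still to arrive
    conf_bound:       (n,) slack added to the expected future arrivals
    """
    n, K = lower_sol.shape
    action = np.zeros((n, K))
    for k in range(K):
        need_upper_now = sizes @ upper_sol[:, k]
        need_lower_now = sizes @ lower_sol[:, k]
        need_lower_later = (exp_future_sizes + conf_bound) @ lower_sol[:, k]
        if budget[k] >= need_upper_now + need_lower_later:
            # Enough for the upper threshold now and the lower one later
            action[:, k] = upper_sol[:, k]
        elif budget[k] >= need_lower_now:
            # Only the lower threshold is affordable at this location
            action[:, k] = lower_sol[:, k]
        else:
            # Budget nearly exhausted: split what is left equally per person
            total = sizes.sum()
            action[:, k] = budget[k] / total if total > 0 else 0.0
    return action

# Two types, three commodities, with made-up thresholds and budgets
action = guardrail_allocation(
    budget=np.array([10.0, 3.0, 1.0]),
    sizes=np.array([1.0, 1.0]),
    lower_sol=np.ones((2, 3)),
    upper_sol=2 * np.ones((2, 3)),
    exp_future_sizes=np.array([2.0, 2.0]),
    conf_bound=np.zeros(2),
)
print(action)
```

In this example the first commodity is plentiful and receives the upper threshold, the second receives the lower threshold, and the third is split evenly because not even the lower threshold is affordable.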

reset()[source]: Resets data matrix to be empty

update_config(env, config)[source]: Updates environment configuration dictionary

update_obs(obs, action, reward, newObs, timestep, info)[source]: Add observation to records

update_policy(k)[source]: Update internal policy based upon records