Fixed Threshold Agent

class or_suite.agents.resource_allocation.fixed_threshold.fixedThresholdAgent(epLen, env_config)[source]

Fixed Guardrail provides lower thresholds on the budget distribution, calculated by solving the primal-dual formulation of the Eisenberg-Gale convex program.

generate_cvxpy_solver()[source]: Creates a generic solver to solve the offline resource allocation problem.

get_lower_upper_sol(init_size)[source]: Uses solver to get the lower threshold on budget distribution

get_expected_endowments(N=1000)[source]: MCM for estimating Expectation of type distribution using N realizations.

reset()[source]: resets bounds of agent to reflect upper and lower bounds of metric space.

update_config(env, config)[source]: Updates environment configuration dictionary.

update_obs(obs, action, reward, newObs, timestep, info)[source]: Add observation to records.

update_policy(k)[source]: Update internal policy based upon records.

pick_action(state, step)[source]: move agent to midpoint or perturb current dimension

num_types

Number of types

Type: int

num_resources

Number of commodities

Type: int

budget_remaining

Amount of each commodity the principal begins with.

Type: int

scale

Hyperparameter to be used in calculating threshold

Type: int

epLen

Number of locations (also the length of an episode).

Type: int

data

All data observed so far

Type: list

first_allocation_done

Flag that, when False, triggers computation of the upper and lower thresholds

Type: bool

conf_const

Hyperparameter for confidence bound

Type: int

exp_endowments

Matrix containing expected proportion of endowments for location t

Type: list

stdev_endowments

Matrix describing variance of exp_endowments

Type: list

prob

CVXPY problem object

Type: cvxpy object

solver

Function that solves the problem given data

Type: lambda function

lower_sol

Matrix of lower threshold

Type: np.array

upper_sol

Matrix of upper threshold

Type: np.array

__init__(epLen, env_config)[source]

Initialize fixed_threshold agent

Parameters

epLen – number of steps
env_config – parameters used in initialization of environment
scale – hyperparameter to be used in calculating threshold

generate_cvxpy_solver()[source]

Creates a generic solver to solve the offline resource allocation problem

Returns

prob – CVXPY problem object
solver – function that solves the problem given data
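For orientation, the offline problem that such a solver targets can be written as an Eisenberg-Gale program. The form below is a sketch under stated assumptions, not a statement of what this class builds: it assumes linear utilities with hypothetical weight vectors w_θ, and it uses N_θ for the number of people of type θ and B_k for the budget of commodity k, with n types and K commodities.

```latex
\begin{aligned}
\max_{x \ge 0} \quad & \sum_{\theta=1}^{n} N_\theta \log\bigl(\langle w_\theta, x_\theta \rangle\bigr) \\
\text{s.t.} \quad & \sum_{\theta=1}^{n} N_\theta \, x_{\theta,k} \le B_k, \qquad k = 1, \dots, K
\end{aligned}
```

Here x_θ is the K-dimensional allocation given to each individual of type θ. The dual variables of the budget constraints act as per-commodity prices, which is where the primal-dual view comes from.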

get_expected_endowments(N=1000)[source]

Monte Carlo method for estimating the expectation of the type distribution using N realizations. This only needs to be run once to get expectations for all locations.

Returns: rel_exp_endowments - matrix containing expected proportion of endowments for location t
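The Monte Carlo estimate described above can be sketched in plain Python as follows. This is an illustration only, not the library's implementation: `estimate_expected_endowments`, `example_type_dist`, and the `type_dist(t, rng)` sampler signature are hypothetical names introduced here, and the sketch returns raw means and standard deviations without any normalization into proportions.

```python
import random
import statistics


def estimate_expected_endowments(type_dist, ep_len, n_samples=1000, seed=0):
    """Estimate per-location mean and standard deviation of type sizes.

    type_dist(t, rng) is a hypothetical sampler returning one list of type
    sizes for location t; it stands in for the environment's distribution.
    """
    rng = random.Random(seed)  # seeded so the estimate is reproducible
    means, stdevs = [], []
    for t in range(ep_len):
        # Draw n_samples realizations of the type sizes at location t.
        draws = [type_dist(t, rng) for _ in range(n_samples)]
        per_type = list(zip(*draws))  # regroup the samples by type
        means.append([statistics.fmean(col) for col in per_type])
        stdevs.append([statistics.pstdev(col) for col in per_type])
    return means, stdevs


def example_type_dist(t, rng):
    # Hypothetical distribution: two types with uniform sizes at every location.
    return [rng.uniform(0.0, 2.0), rng.uniform(1.0, 3.0)]


means, stdevs = estimate_expected_endowments(example_type_dist, ep_len=3)
```

Because the sampling cost grows with both the episode length and N, running the estimate once and caching the result matches the note that it only needs to be run once.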

get_lower_upper_sol(init_sizes)[source]: Uses solver to get the lower threshold

pick_action(state, step)[source]

Returns allocation of resources based on calculated upper and lower solutions

Parameters

state – vector with first K entries denoting remaining budget, and remaining n entries denoting the number of people of each type that appear
step – timestep

Returns

matrix where each row is a K-dimensional vector denoting how much of each commodity is given to each type
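The state layout and return shape described above can be illustrated with a small sketch. It is not the agent's actual decision rule: `split_state` and `threshold_allocation` are hypothetical helpers, rows of `lower_sol` are assumed to be per-person allocations for each type, and the proportional scale-down when the budget cannot cover the lower threshold is an assumption made only so the example stays feasible.

```python
def split_state(state, num_resources):
    """Split a state vector into (remaining budget, type sizes)."""
    budget = list(state[:num_resources])  # first K entries
    sizes = list(state[num_resources:])   # remaining n entries
    return budget, sizes


def threshold_allocation(state, num_resources, lower_sol):
    """Return an n x K allocation matrix based on a lower threshold.

    Assumption: if giving every person the lower threshold would exceed the
    remaining budget of a commodity, that column is scaled down proportionally.
    """
    budget, sizes = split_state(state, num_resources)
    num_types = len(sizes)
    allocation = [row[:] for row in lower_sol]  # copy so the input is not mutated
    for k in range(num_resources):
        # Total amount of commodity k consumed at the lower threshold.
        needed = sum(sizes[i] * lower_sol[i][k] for i in range(num_types))
        if needed > budget[k]:
            factor = budget[k] / needed
            for i in range(num_types):
                allocation[i][k] = lower_sol[i][k] * factor
    return allocation


lower = [[1.0, 1.0], [2.0, 0.5]]           # hypothetical thresholds, n = 2, K = 2
state = [10.0, 4.0, 2.0, 3.0]              # budget (10, 4), type sizes (2, 3)
action = threshold_allocation(state, 2, lower)
```

With the values above the budget covers the lower threshold for both commodities, so `action` equals `lower`; with a budget of (4, 4) the first column would be halved.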

reset()[source]: Resets data matrix to be empty

update_config(env, config)[source]: Updates environment configuration dictionary

update_obs(obs, action, reward, newObs, timestep, info)[source]: Add observation to records

update_policy(k)[source]: Update internal policy based upon records
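To show how these methods fit together, the sketch below drives a stand-in agent through one episode. Everything in it is hypothetical: `RecordingAgentSketch` only mirrors the documented method names, the choice to store `newObs` in `data` is an assumption, and the call order in `run_episode` is one plausible convention rather than the one or_suite necessarily uses.

```python
class RecordingAgentSketch:
    """Hypothetical stand-in that mirrors the documented method names."""

    def __init__(self, ep_len):
        self.epLen = ep_len
        self.data = []

    def reset(self):
        # Resets the data record to be empty.
        self.data = []

    def update_obs(self, obs, action, reward, newObs, timestep, info):
        # Assumption: the new observation is what gets recorded.
        self.data.append(newObs)

    def update_policy(self, k):
        # Thresholds are fixed in this sketch, so there is nothing to update.
        pass

    def pick_action(self, state, step):
        # Placeholder action: allocate nothing.
        return [0.0 for _ in state]


def run_episode(agent, episode_index=0):
    """Run one episode against a placeholder environment that never changes state."""
    agent.reset()
    agent.update_policy(episode_index)
    state = [1.0, 1.0]
    for step in range(agent.epLen):
        action = agent.pick_action(state, step)
        new_state = list(state)  # placeholder transition
        agent.update_obs(state, action, 0.0, new_state, step, {})
        state = new_state
    return agent.data


agent = RecordingAgentSketch(ep_len=5)
records = run_episode(agent)
```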