Resource Allocation

Sequential Resource Allocation Problem for n locations with K commodities.

A ResourceAllocationEnvironment in which the agent iterates through locations and receives a Nash Social Welfare reward based on the resources it allocates, provided that the allocation is within budget.
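As a rough illustration of a Nash Social Welfare reward, the sketch below computes a population-weighted mean of log utilities. It is not taken from the environment's source. The linear utility u(x, theta) = x · theta, the helper names, the per-type allocation layout (one row per type, one column per commodity), and the sample numbers are all assumptions made for this example. Only the weight matrix is copied from the default config in the class signature.

```python
import numpy as np

# Default weight matrix from the class signature:
# one row per type, one column per commodity.
weight_matrix = np.array([[1.0, 2.0], [0.3, 9.0], [1.0, 1.0]])


def linear_utility(x, theta):
    """Assumed utility: inner product of allocation x and type vector theta."""
    return float(np.dot(x, theta))


def nash_social_welfare(allocation, type_counts, weights, utility=linear_utility):
    """Population-weighted mean of log utilities (assumed form of the reward)."""
    type_counts = np.asarray(type_counts, dtype=float)
    log_utils = np.array(
        [np.log(utility(allocation[t], weights[t])) for t in range(len(weights))]
    )
    return float(np.dot(type_counts, log_utils) / type_counts.sum())


# Hypothetical per-person allocation for each of the 3 types (rows) and 2 commodities.
allocation = np.array([[1.0, 0.5], [0.0, 1.0], [2.0, 2.0]])
# Hypothetical number of people of each type at the current location.
type_counts = np.array([2, 1, 1])

reward = nash_social_welfare(allocation, type_counts, weight_matrix)
print(reward)
```

Taking logs makes the reward sensitive to any type that receives very little utility, which is what gives this objective its fairness flavor.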

class or_suite.envs.resource_allocation.resource_allocation.ResourceAllocationEnvironment(config={'K': 2, 'MAX_VAL': 1000, 'from_data': False, 'init_budget': <function <lambda>>, 'num_rounds': 10, 'type_dist': <function <lambda>>, 'utility_function': <function <lambda>>, 'weight_matrix': array([[1. , 2. ],        [0.3, 9. ],        [1. , 1. ]])})[source]

Custom Environment that follows gym interface.

This is a simple resource allocation environment modeling a fair online allocation problem.

Methods:

  • get_config() : Returns the config dictionary used to initialize the environment.

  • reset() : Resets the environment to its original starting state and the timestep to 0.

  • step(action) : Takes in an allocation as the action, subtracts it from the budget, calculates the reward, and updates the action space.

  • render(mode) : (UNIMPLEMENTED) Renders the environment in the mode passed in; ‘human’ is the only mode currently supported.

  • close() : (UNIMPLEMENTED) Closes the window where the rendering is being drawn.

weight_matrix

Weights predefining the commodity needs for each type; each row is a type vector.

Type

list

num_types

Number of types

Type

int

num_commodities

Number of commodities

Type

int

epLen

Number of locations (also the length of an episode).

Type

int

budget

Amount of each commodity the principal begins with.

Type

int

type_dist

Function determining the number of people of each type at a location.

Type

lambda function

utility_function

Utility function: given an allocation x and a type theta, u(x, theta) measures how good the fit is.

Type

lambda function

starting_state

Tuple (represented as list concat) of initial budget and type distribution.

Type

np.array

timestep

Index of the step currently being executed within an episode.

Type

int

action_space

(Gym.spaces Box) Action space represents the K x n allocation matrix.

observation_space

(Gym.spaces Box) The first K entries of the observation are the remaining budget; the remaining entries hold the number of each type at each location.
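The observation layout described above can be sketched with plain NumPy. The budget and type-count values here are hypothetical, and the sketch assumes the type counts for a single location follow directly after the K budget entries.

```python
import numpy as np

K = 2  # number of commodities, matching 'K': 2 in the default config

# Hypothetical values chosen for illustration only.
budget = np.array([10.0, 10.0])          # remaining amount of each commodity
type_counts = np.array([2.0, 1.0, 1.0])  # number of people of each of 3 types

# Flat observation: budget first, then the type counts.
observation = np.concatenate([budget, type_counts])

# Recover the two parts by slicing at index K.
remaining_budget = observation[:K]
counts = observation[K:]
print(remaining_budget, counts)
```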

__init__(config={'K': 2, 'MAX_VAL': 1000, 'from_data': False, 'init_budget': <function <lambda>>, 'num_rounds': 10, 'type_dist': <function <lambda>>, 'utility_function': <function <lambda>>, 'weight_matrix': array([[1. , 2. ],        [0.3, 9. ],        [1. , 1. ]])})[source]

Initializes ResourceAllocationEnvironment with the given configuration.

Parameters

config – A dictionary containing the initial configuration of the resource allocation environment.

close()[source]

Override close in your subclass to perform any necessary cleanup.

Environments will automatically close() themselves when garbage collected or when the program exits.

get_config()[source]

Returns: the environment config (dict).

render(mode='console')[source]

Renders the environment.

The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:

  • human: render to the current display or terminal and return nothing. Usually for human consumption.

  • rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.

  • ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).

Note

Make sure that your class’s metadata ‘render.modes’ key includes the list of supported modes. It’s recommended to call super() in implementations to use the functionality of this method.

Parameters

mode (str) – the mode to render with

Example:

    class MyEnv(Env):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception

reset()[source]

Requires: the observation must be a numpy array.

Returns: np.array

step(action)[source]

Move one step in the environment.

Parameters

action – A matrix; the chosen action (each row gives how much to allocate to the previous location).

Returns

  • reward (double) : the reward.

  • newState (int) : the new state.

  • done (bool) : the flag for the end of the episode.

  • info (dict) : any additional information.

Return type

double, int, bool, dict
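The budget bookkeeping that step(action) performs can be approximated as follows. This is a minimal sketch under stated assumptions, not the environment's implementation. The helper apply_allocation is hypothetical, the allocation is assumed to hold per-person amounts with one row per type and one column per commodity, and an over-budget allocation is assumed to leave the budget untouched. How the real environment handles infeasible actions is not described in this page.

```python
import numpy as np


def apply_allocation(budget, type_counts, allocation, tol=1e-9):
    """Hypothetical helper: subtract an allocation from the budget if feasible.

    budget      : shape (K,), remaining amount of each commodity
    type_counts : shape (num_types,), number of people of each type
    allocation  : shape (num_types, K), assumed per-person amounts
    """
    used = type_counts @ allocation  # total consumption of each commodity
    feasible = bool(np.all(budget - used >= -tol))
    new_budget = budget - used if feasible else budget.copy()
    return new_budget, feasible


# Hypothetical values for illustration.
budget = np.array([10.0, 10.0])
type_counts = np.array([2.0, 1.0, 1.0])
allocation = np.array([[1.0, 0.5], [0.0, 1.0], [2.0, 2.0]])

new_budget, ok = apply_allocation(budget, type_counts, allocation)
# Tripling the allocation exceeds the budget, so it is rejected in this sketch.
same_budget, ok_big = apply_allocation(budget, type_counts, 3 * allocation)
print(new_budget, ok, same_budget, ok_big)
```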