Resource Allocation Discrete

Discrete Sequential Resource Allocation Problem for n locations with K commodities.

The reward is currently the Nash Social Welfare; more options for determining a fair allocation will be integrated in the future.
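To make the reward concrete, here is a minimal sketch of a Nash Social Welfare score in its log form: a size-weighted sum of log utilities. The function name, the argument layout, and the linear utility u(x, theta) = <x, theta> are assumptions made for illustration, not the environment's confirmed implementation. The weights are the defaults shown in the class signature.

```python
import numpy as np

def nash_social_welfare(allocation, weight_matrix, sizes):
    """Size-weighted sum of log utilities (log form of Nash Social Welfare).

    allocation:    (num_types, K) array, row i is the bundle given to type i
    weight_matrix: (num_types, K) array, row i is the type vector theta_i
    sizes:         (num_types,) array, number of people of each type
    """
    allocation = np.asarray(allocation, dtype=float)
    weight_matrix = np.asarray(weight_matrix, dtype=float)
    sizes = np.asarray(sizes, dtype=float)
    # Assumed linear utility u(x, theta) = <x, theta>, computed row by row.
    utilities = np.sum(allocation * weight_matrix, axis=1)
    # A type with zero utility yields log(0) = -inf; silence the warning only.
    with np.errstate(divide="ignore"):
        return float(np.sum(sizes * np.log(utilities)))

# Default weight matrix from the signature: three types, two commodities.
weights = np.array([[1.0, 2.0], [0.3, 9.0], [1.0, 1.0]])
even = np.ones((3, 2))  # one unit of each commodity to every type
print(nash_social_welfare(even, weights, sizes=np.ones(3)))
```

Because the score sums logarithms, giving any type a near-zero utility drags the total down sharply, which is what makes this objective favor balanced allocations.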

class or_suite.envs.resource_allocation.resource_allocation_discrete.DiscreteResourceAllocationEnvironment(config={'K': 2, 'MAX_VAL': 1000, 'from_data': False, 'init_budget': <function <lambda>>, 'num_rounds': 10, 'type_dist': <function <lambda>>, 'utility_function': <function <lambda>>, 'weight_matrix': array([[1. , 2. ], [0.3, 9. ], [1. , 1. ]])})

Custom Environment that follows gym interface.

__init__(config={'K': 2, 'MAX_VAL': 1000, 'from_data': False, 'init_budget': <function <lambda>>, 'num_rounds': 10, 'type_dist': <function <lambda>>, 'utility_function': <function <lambda>>, 'weight_matrix': array([[1. , 2. ], [0.3, 9. ], [1. , 1. ]])})

Initializes the Discrete Sequential Resource Allocation Environment.

Parameters

weight_matrix – Weights predefining the commodity needs for each type, every row is a type vector.
K – Number of commodities.
num_rounds – Number of agents (also the length of an episode).
init_budget – Amount of each commodity the principal begins with.
type_dist – Function determining the number of people of each type at a location.
utility_function – Utility function u; given an allocation x and a type theta, u(x, theta) measures how good the fit is.
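The parameters above can be assembled into a config dict along these lines. This is a sketch under stated assumptions: the keys mirror the default config in the signature, but the callables' argument lists and return shapes (no-argument init_budget, location-indexed type_dist, two-argument utility_function) are guesses, since the signature only shows them as lambdas.

```python
import numpy as np

# Hypothetical values; the keys mirror the default config in the signature.
config = {
    "K": 2,
    "num_rounds": 10,
    "weight_matrix": np.array([[1.0, 2.0], [0.3, 9.0], [1.0, 1.0]]),
    # Assumed signature: no arguments, returns a length-K budget vector.
    "init_budget": lambda: 9.0 * np.ones(2),
    # Assumed signature: takes a location index, returns one count per type.
    "type_dist": lambda i: 1 + np.random.poisson(lam=(1, 2, 3), size=3),
    # Assumed signature: u(x, theta), here a linear fit score.
    "utility_function": lambda x, theta: float(np.dot(x, theta)),
    "from_data": False,
    "MAX_VAL": 1000,
}

# Sanity checks: the pieces must agree on the number of types and commodities.
num_types, K = config["weight_matrix"].shape
assert K == config["K"]
assert config["init_budget"]().shape == (K,)
assert config["type_dist"](0).shape == (num_types,)
```

With the or_suite package installed, a dict like this would be passed as the config argument of DiscreteResourceAllocationEnvironment.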

close()

Override close in your subclass to perform any necessary cleanup.

Environments will automatically close() themselves when garbage collected or when the program exits.

render(mode='console')

Renders the environment.

The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:

human: render to the current display or terminal and return nothing. Usually for human consumption.
rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).

Note

Make sure that your class’s metadata ‘render.modes’ key includes the list of supported modes. It’s recommended to call super() in implementations to use the functionality of this method.

Parameters: mode (str) – the mode to render with

Example:

    class MyEnv(Env):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception

reset()

Important: the observation must be a numpy array.

Returns: np.array

step(action)

Move one step in the environment.

Parameters

action – A matrix; the chosen action (each row gives how much to allocate to the previous location).

Returns

reward: double; the reward.

newState: int; the new state.

done: 0/1; the flag for end of the episode.

info: dict; any additional information.

Return type

double, int, 0/1, dict
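As an illustration of how reset() and step() fit together over an episode, the sketch below drives a toy stand-in environment. ToyAllocationEnv, its budget bookkeeping, and its placeholder reward are all hypothetical and are not the or_suite implementation. The stub returns its tuple in the classic Gym order (observation, reward, done, info), which is also an assumption; the listing above names the reward first, so check the order the real environment uses before unpacking.

```python
import numpy as np

class ToyAllocationEnv:
    """Hypothetical stand-in with reset()/step(); not the or_suite class."""

    def __init__(self, budget, num_rounds):
        self.init_budget = np.asarray(budget, dtype=float)
        self.num_rounds = num_rounds

    def reset(self):
        self.budget = self.init_budget.copy()
        self.round = 0
        return self.budget.copy()  # observation is a numpy array

    def step(self, action):
        action = np.asarray(action, dtype=float)
        spent = action.sum(axis=0)  # total of each commodity handed out
        feasible = bool(np.all(spent <= self.budget + 1e-9))
        if feasible:
            self.budget = self.budget - spent
        # Placeholder reward: amount allocated, or a penalty for overspending.
        reward = float(spent.sum()) if feasible else -1.0
        self.round += 1
        done = self.round >= self.num_rounds
        return self.budget.copy(), reward, done, {"feasible": feasible}

env = ToyAllocationEnv(budget=[9.0, 9.0], num_rounds=3)
state = env.reset()
done, total = False, 0.0
while not done:
    # Spread the remaining budget evenly over remaining rounds and 3 types.
    rounds_left = env.num_rounds - env.round
    action = np.tile(state / (3 * rounds_left), (3, 1))
    state, reward, done, info = env.step(action)
    total += reward
print(total, state)
```

The loop shape (reset once, then step until done) carries over to the real environment; only the action construction and the tuple order need to be adapted.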