Inventory Control with Lead Times and Multiple Suppliers

class or_suite.envs.inventory_control_multiple_suppliers.multiple_suppliers_env.DualSourcingEnvironment(config)

An environment with a variable number of suppliers, each with their own lead time and cost.

lead_times: The array of ints representing the lead times of each supplier.

starting_state: An int list containing enough indices for the sum of all the lead times, plus an additional index for the initial on-hand inventory.

action_space: (Gym.spaces MultiDiscrete) Actions must be the length of the number of suppliers. Each entry is an int corresponding to the order size.

observation_space: (Gym.spaces MultiDiscrete) The environment state must be the length of the of the sum of all lead times plus one. Each entry corresponds to the order that will soon be placed to a supplier. The last index is the current on-hand inventory.

neg_inventory: A bool that says whether the on-hand inventory can be negative or not.

__init__(config)

Parameters: config – A dictionary containing the following parameters required to set up the environment:

lead_times: Array of ints representing the lead times of each supplier.
supplier_costs: Array of ints representing the costs of each supplier.
demand_dist: The random number sampled from the given distribution to be used to calculate the demand.
hold_cost: The int holding cost.
backorder_cost: The int backorder cost.
epLen: The episode length.
max_order: The maximum value (int) that can be ordered from each supplier.
max_inventory: The maximum value (int) that can be held in inventory.
starting_state: An int list containing enough indices for the sum of all the lead times, plus an additional index for the initial on-hand inventory.
neg_inventory: A bool that says whether the on-hand inventory can be negative or not.
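As a rough illustration, a config dictionary with the keys listed above might look like the following for two suppliers. All values are hypothetical, and treating demand_dist as a zero-argument callable is an assumption that this page does not confirm. The helper expected_state_length is introduced here only to check the starting_state length rule and is not part of the package.

```python
import random


def expected_state_length(lead_times):
    """Hypothetical helper: one slot per lead-time period, plus one for on-hand inventory."""
    return sum(lead_times) + 1


lead_times = [5, 1]  # hypothetical: a slow supplier and a fast supplier

config = {
    "lead_times": lead_times,
    "supplier_costs": [100, 105],                   # hypothetical per-unit costs
    "demand_dist": lambda: random.randint(0, 20),   # assumed callable form
    "hold_cost": 1,
    "backorder_cost": 19,
    "epLen": 500,
    "max_order": 20,
    "max_inventory": 1000,
    "starting_state": [0] * expected_state_length(lead_times),
    "neg_inventory": True,
}

# The starting state needs sum(lead_times) + 1 entries.
assert len(config["starting_state"]) == expected_state_length(config["lead_times"])
```

With lead times of 5 and 1, the starting state has 7 entries: six for orders in the pipeline and one for the on-hand inventory.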

render(mode='human')

Renders the environment.

The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:

human: render to the current display or terminal and return nothing. Usually for human consumption.
rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).

Note

Make sure that your class's metadata 'render.modes' key includes the list of supported modes. It's recommended to call super() in implementations to use the functionality of this method.

Example:

class MyEnv(Env):
    metadata = {'render.modes': ['human', 'rgb_array']}

    def render(self, mode='human'):
        if mode == 'rgb_array':
            return np.array(...)  # return RGB frame suitable for video
        elif mode == 'human':
            ...  # pop up a window and render
        else:
            super(MyEnv, self).render(mode=mode)  # just raise an exception

reward(state)

Reward is calculated in three components:

First component corresponds to the cost for ordering amounts from each supplier
Second component corresponds to paying a holding cost for extra inventory after demand arrives
Third component corresponds to a back order cost for unmet demand
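The three components above can be sketched as a single function. This is a minimal sketch, not the package's implementation: the name sketch_reward and its argument list are hypothetical (the documented method takes only state), and returning the negated total cost as the reward is an assumption.

```python
def sketch_reward(action, supplier_costs, inventory_after_demand,
                  hold_cost, backorder_cost):
    """Hypothetical sketch: reward as the negative of the three cost components."""
    # 1. Ordering cost: per-unit cost times order size, summed over suppliers.
    ordering = sum(cost * qty for cost, qty in zip(supplier_costs, action))
    # 2. Holding cost: charged on inventory left over after demand arrives.
    holding = hold_cost * max(inventory_after_demand, 0)
    # 3. Backorder cost: charged on unmet demand (negative inventory).
    backorder = backorder_cost * max(-inventory_after_demand, 0)
    return -float(ordering + holding + backorder)


# Four units left over: ordering 305 + holding 4.
print(sketch_reward([2, 1], [100, 105], 4, hold_cost=1, backorder_cost=19))   # -309.0
# Three units short: ordering 305 + backorder 57.
print(sketch_reward([2, 1], [100, 105], -3, hold_cost=1, backorder_cost=19))  # -362.0
```

At most one of the holding and backorder terms is nonzero in any step, since inventory after demand is either nonnegative or negative.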

seed(seed=None)

Sets the NumPy random seed to the given value.

step(action)

Move one step in the environment.

Parameters

action – An int list of the amount to order from each supplier.

Returns

reward: A float representing the reward based on the action chosen.

newState: An int list representing the new state of the environment after the action.

done: A bool flag indicating the end of the episode.

info: A dictionary containing extra information about the step. This dictionary contains the int value of the demand during the previous step.

Return type

float, list of int, bool, dict
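To make the state layout concrete, here is one way a single transition could work, assuming the state stores each supplier's pending orders back to back (oldest first) followed by the on-hand inventory. The function sketch_step, its arguments, and the clamping rules are hypothetical illustrations of that assumption and may differ from what the environment actually does.

```python
def sketch_step(state, action, lead_times, demand, max_inventory, neg_inventory):
    """Hypothetical sketch of one pipeline shift; not the package's step()."""
    on_hand = state[-1]
    new_state = []
    arrived = 0
    idx = 0
    for lead_time, order in zip(lead_times, action):
        # This supplier's pending orders, with the new order joining the back.
        queue = list(state[idx:idx + lead_time]) + [order]
        idx += lead_time
        arrived += queue[0]          # the oldest order is delivered now
        new_state.extend(queue[1:])  # the rest move one period closer
    inventory = on_hand + arrived - demand
    if not neg_inventory:
        inventory = max(inventory, 0)  # assumed: unmet demand is lost, not backordered
    inventory = min(inventory, max_inventory)
    new_state.append(inventory)
    return new_state


# Lead times [2, 1]: state = [s0 oldest, s0 newest, s1 pending, on-hand].
print(sketch_step([4, 0, 3, 10], [1, 2], [2, 1], demand=5,
                  max_inventory=1000, neg_inventory=True))  # [0, 1, 2, 12]
```

In the usage line, 4 units arrive from the first supplier and 3 from the second, so on-hand inventory goes from 10 to 10 + 7 - 5 = 12, and the new orders of 1 and 2 enter the back of each pipeline.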