Ridesharing
Implementation of an RL environment in a discrete graph space.
A ridesharing environment over a simple graph. An agent interacts with the environment by choosing a node with a non-zero number of cars to service a given rideshare request.
- class or_suite.envs.ridesharing.rideshare_graph.RideshareGraphEnvironment(config={'cost': 1, 'd_threshold': 20, 'edges': [(0, 1, {'travel_time': 1}), (0, 2, {'travel_time': 100}), (0, 3, {'travel_time': 10}), (1, 2, {'travel_time': 20}), (1, 3, {'travel_time': 1}), (2, 3, {'travel_time': 70})], 'epLen': 5, 'fare': 3, 'gamma': 1, 'num_cars': 3, 'request_dist': <function <lambda>>, 'reward': <function <lambda>>, 'reward_denied': <function <lambda>>, 'reward_fail': <function <lambda>>, 'starting_state': [1, 1, 1, 0], 'travel_time': True, 'velocity': 3})
Custom Rideshare Graph Environment that follows the gym interface.
This is a simple environment where the requests are uniformly distributed across nodes.
- config
A dictionary containing the initial configuration of the rideshare graph environment.
- epLen
An integer representing the total number of time steps.
- graph
An object containing nodes and edges; each edge has a travel time.
- num_nodes
An integer count of the number of nodes in the graph.
- starting_state
A vector representing the initial state of the environment; the first K elements, where K is the number of nodes, give the number of cars at each node, and the final 2 elements give the current request that needs to be satisfied, i.e. a ride from node i to node j.
- state
A vector representing the state of the environment; the first K elements, where K is the number of nodes, give the number of cars at each node, and the final 2 elements give the current request that needs to be satisfied, i.e. a ride from node i to node j.
- timestep
An integer representing the current timestep of the model.
- num_cars
An integer representing the number of cars in the model.
- lengths
A 2-dimensional symmetric array containing the distances between each pair of nodes.
- request_dist
The distribution used for selecting nodes when generating requests; the default configuration supplies it as a function.
- reward
A lambda function to generate the reward.
- reward_fail
A lambda function to generate the reward when the RL agent fails; i.e. when a request is not satisfied.
- action_space
A discrete set of values the action can take; in this case the action is an integer in {0, ..., K-1}, where K is the number of nodes.
- observation_space
A MultiDiscrete space that represents all possible values of the state, i.e. all possible numbers of cars at each node and all possible nodes for any request.
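The state layout described above (K car counts followed by the 2 request entries) can be unpacked as in the following minimal sketch. The helper name `split_state` and the example request are hypothetical and not part of or_suite; only the layout itself comes from the attribute descriptions.

```python
from typing import List, Tuple


def split_state(state: List[int], num_nodes: int) -> Tuple[List[int], Tuple[int, int]]:
    """Split a state vector into per-node car counts and the pending request.

    Hypothetical helper: assumes the layout [cars at node 0..K-1, source, sink].
    """
    if len(state) != num_nodes + 2:
        raise ValueError("state must have num_nodes + 2 entries")
    cars = list(state[:num_nodes])
    source, sink = state[num_nodes], state[num_nodes + 1]
    return cars, (source, sink)


# Example: 4 nodes holding [1, 1, 1, 0] cars and a request from node 3 to node 2.
cars, request = split_state([1, 1, 1, 0, 3, 2], num_nodes=4)

# Nodes that currently hold at least one car.
nodes_with_cars = [node for node, count in enumerate(cars) if count > 0]
```

Here `cars` holds the first K entries and `request` is the (source, sink) pair taken from the last two entries.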
- __init__(config={'cost': 1, 'd_threshold': 20, 'edges': [(0, 1, {'travel_time': 1}), (0, 2, {'travel_time': 100}), (0, 3, {'travel_time': 10}), (1, 2, {'travel_time': 20}), (1, 3, {'travel_time': 1}), (2, 3, {'travel_time': 70})], 'epLen': 5, 'fare': 3, 'gamma': 1, 'num_cars': 3, 'request_dist': <function <lambda>>, 'reward': <function <lambda>>, 'reward_denied': <function <lambda>>, 'reward_fail': <function <lambda>>, 'starting_state': [1, 1, 1, 0], 'travel_time': True, 'velocity': 3})
Initializes RideshareGraphEnvironment with the given configuration.
- Parameters
config – A dictionary containing the initial configuration of the rideshare graph environment.
- find_lengths(graph, num_nodes)
Find the shortest distance between each pair of nodes in graph.
Given a graph, find_lengths calculates the pairwise shortest distance between all the nodes and stores the result in a symmetric matrix.
- Parameters
graph – An object containing nodes and edges; each edge has a travel time.
num_nodes – An integer representing the number of nodes in the graph.
- Returns
A 2-dimensional symmetric array containing the distances between each pair of nodes.
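As an illustration of what such a distance matrix looks like, the sketch below computes all-pairs shortest distances for the default `edges` from the configuration. It uses a plain Floyd-Warshall loop and treats `travel_time` as the edge weight; both choices are assumptions made for this example, and the function name `all_pairs_shortest` is hypothetical. It is not necessarily how find_lengths is implemented.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int, Dict[str, float]]


def all_pairs_shortest(edges: List[Edge], num_nodes: int) -> List[List[float]]:
    """Floyd-Warshall over an undirected graph, weighting edges by 'travel_time'."""
    inf = float("inf")
    dist = [[0.0 if i == j else inf for j in range(num_nodes)]
            for i in range(num_nodes)]
    # Seed the matrix with direct edges in both directions (undirected graph).
    for u, v, attrs in edges:
        w = attrs["travel_time"]
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    # Allow each node k in turn to serve as an intermediate stop.
    for k in range(num_nodes):
        for i in range(num_nodes):
            for j in range(num_nodes):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Default edges from the configuration shown in the class signature.
default_edges = [
    (0, 1, {"travel_time": 1}), (0, 2, {"travel_time": 100}),
    (0, 3, {"travel_time": 10}), (1, 2, {"travel_time": 20}),
    (1, 3, {"travel_time": 1}), (2, 3, {"travel_time": 70}),
]
lengths = all_pairs_shortest(default_edges, num_nodes=4)
# lengths[0][2] is 21 (path 0 -> 1 -> 2), not the direct edge weight of 100.
```

The result is symmetric because every edge is entered in both directions.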
- fulfill_req(state, dispatch, sink)
Update the state to represent a car moving from the dispatch node to the sink node.
- Parameters
state – A vector representing the state of the environment.
dispatch – An integer representing the dispatched node for the rideshare request.
sink – An integer representing the destination node of the rideshare request.
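The update described for fulfill_req amounts to removing one car from the dispatch node and adding one at the sink. A minimal sketch under that reading follows; `move_car` is a hypothetical stand-in that returns a copy, whereas the actual method may modify the state in place or handle an empty dispatch node differently.

```python
from typing import List


def move_car(state: List[int], dispatch: int, sink: int) -> List[int]:
    """Return a copy of state with one car moved from dispatch to sink.

    Hypothetical helper: assumes the first entries of state are per-node
    car counts, so dispatch and sink index directly into the vector.
    """
    new_state = list(state)
    if new_state[dispatch] <= 0:
        raise ValueError("no car available at the dispatch node")
    new_state[dispatch] -= 1
    new_state[sink] += 1
    return new_state


# 4 nodes with cars [1, 1, 1, 0]; request from node 3 to node 2, served from node 1.
before = [1, 1, 1, 0, 3, 2]
after = move_car(before, dispatch=1, sink=2)
```

The total number of cars is unchanged by the move; only their location differs.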
- step(action)
Move one step in the environment.
- Parameters
action – An integer representing the node selected by the agent to service the request.
- Returns
A 3-tuple consisting of the following elements:
An updated representation of the state, including updated car locations resulting from the previous dispatch and a new ride request.
An integer reward value based on the action.
A boolean indicating whether the model has reached the timestep limit.
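One simple way to pick the action passed to step is to dispatch from the node that has a car and is closest to the request's source. The sketch below shows such a baseline; the function `nearest_car_action` is hypothetical and not part of or_suite, and it assumes the state layout and a precomputed distance matrix as described in the attributes.

```python
from typing import List, Sequence


def nearest_car_action(state: Sequence[int],
                       lengths: Sequence[Sequence[float]],
                       num_nodes: int) -> int:
    """Pick the node with a car that is closest to the request's source node.

    Hypothetical baseline policy: assumes state = [car counts..., source, sink]
    and lengths[i][j] = shortest distance between nodes i and j.
    """
    cars = state[:num_nodes]
    source = state[num_nodes]
    candidates = [node for node in range(num_nodes) if cars[node] > 0]
    if not candidates:
        raise ValueError("no node has a car to dispatch")
    # Ties are broken in favour of the lowest node index.
    return min(candidates, key=lambda node: lengths[node][source])


# Shortest distances for the default edges, written out by hand for this example.
lengths: List[List[float]] = [
    [0, 1, 21, 2],
    [1, 0, 20, 1],
    [21, 20, 0, 21],
    [2, 1, 21, 0],
]
state = [1, 1, 1, 0, 3, 2]  # request from node 3 to node 2
action = nearest_car_action(state, lengths, num_nodes=4)
```

In this example node 1 is selected, since it holds a car and is at distance 1 from the source node 3.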