Files
Henrotte_24521800_2023.pdf
Open access - Adobe PDF
- 57.62 MB
Details
- Supervisors
- Faculty
- Degree label
- Abstract
- The present work aims to integrate two extensions of the randomized shortest-paths (RSP) framework. The first one investigates a new algorithm for computing the multiple inputs and outputs optimal transport on a graph problem through the use of an equivalent extended single input and output graph. The second one considers a different point of view on Markov decision processes (MDP) by reinterpreting them as a RSP problem on a bipartite graph. This work combines therefore both graph structures which results in two sets of action and state nodes containing additionnal nodes that allows for an initial multiple inputs and outputs bipartite graph to be turned into an equivalent single input and output bipartite graph. It gathers the properties of both approaches meaning it can accommodate edge flow capacity constraints and random behaviors provided by the environment. The MDP reformulation also allows using Tsallis r-divergence instead of Kullback-Leibler divergence which provides sparse policies. The main advantage of this method is that it is quite general and it can be used in a large range of problems. The resulting optimal randomized policy, in this context, corresponds to the state-to-action probabilities while the action-to-state transition probabilities remain fixed by the environment. This policy is ranging from a purely rational to a random behavior depending on a temperature parameter which brings a trade-off between the cost and the relative entropy regularization term. The extension is illustrated on a local road network of Louvain-la-Neuve.