MDP Toolbox for MATLAB |
Examples of MDP |
|
mdp_example_rand | Generates a random MDP problem |
mdp_example_forest | Generates an MDP for a simple forest management problem |
|
|
mdp_finite_horizon | Solves a finite-horizon MDP using the backward induction algorithm |
|
|
mdp_LP | Solves a discounted MDP using the linear programming algorithm |
mdp_policy_iteration | Solves a discounted MDP using the policy iteration algorithm |
mdp_policy_iteration_modified | Solves a discounted MDP using the modified policy iteration algorithm |
mdp_value_iteration | Solves a discounted MDP using the value iteration algorithm |
mdp_value_iterationGS | Solves a discounted MDP using the Gauss-Seidel value iteration algorithm |
mdp_Q_learning | Solves a discounted MDP using the Q-learning algorithm (reinforcement learning) |
|
|
mdp_relative_value_iteration | Solves an average-reward MDP using the relative value iteration algorithm |
|
|
mdp_bellman_operator | Applies the Bellman operator |
mdp_check | Checks the validity of an MDP |
mdp_check_square_stochastic | Checks if a matrix is square and stochastic |
mdp_computePR | Computes a reward matrix for any form of transition and reward functions |
mdp_computePpolicyPRpolicy | Computes the transition matrix and the reward matrix for a fixed policy |
mdp_eval_policy_iterative | Evaluates a policy using an iterative method |
mdp_eval_policy_matrix | Evaluates a policy using matrix inversion and product |
mdp_eval_policy_TD_0 | Evaluates a policy using the TD(0) algorithm (Reinforcement Learning) |
mdp_eval_policy_optimality | Computes sets of 'near optimal' actions for each state |
mdp_span | Evaluates the span of a vector |
mdp_value_iteration_bound_iter | Computes a bound on the number of iterations for the value iteration algorithm |
mdp_silent, mdp_verbose | Switches to silent or verbose running mode |
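
The discounted solvers in the index all search for a policy that maximizes expected discounted reward. As a rough illustration of what value iteration computes, here is a minimal plain-MATLAB sketch that does not call any toolbox function and is not the toolbox's own implementation. The two-state, two-action problem, the array layouts `P(s, s2, a)` and `R(s, a)`, and the variable names are all assumptions made for this example.

```matlab
% Hypothetical 2-state, 2-action MDP (numbers invented for illustration).
% Assumed layout: P(s, s2, a) = probability of moving from s to s2 under action a.
P = zeros(2, 2, 2);
P(:, :, 1) = [0.5 0.5; 0.8 0.2];
P(:, :, 2) = [0.0 1.0; 0.1 0.9];

% Assumed layout: R(s, a) = expected immediate reward for action a in state s.
R = [5 10; -1 2];

discount = 0.9;    % discount factor, must lie in (0, 1)
epsilon  = 1e-8;   % stop when successive value functions differ by less than this
max_iter = 10000;  % safety cap on the number of sweeps

V = zeros(2, 1);   % initial value function
for iter = 1:max_iter
    % Bellman backup: Q(s, a) = R(s, a) + discount * sum_s2 P(s, s2, a) * V(s2)
    Q = zeros(2, 2);
    for a = 1:2
        Q(:, a) = R(:, a) + discount * P(:, :, a) * V;
    end
    % Greedy improvement: best value and best action for each state
    [Vnew, policy] = max(Q, [], 2);
    delta = max(abs(Vnew - V));
    V = Vnew;
    if delta < epsilon
        break;
    end
end

disp(policy');  % greedy action index per state
disp(V');       % approximate optimal value per state
```

Because the backup is a contraction with factor `discount`, the loop stops well before `max_iter` for this small problem. A discount closer to 1 needs more sweeps.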