MDP Toolbox for MATLAB

mdp_computePR

Computes the expected reward associated with each state/action pair.

Syntax

PR = mdp_computePR(P, R)

Description

mdp_computePR computes the expected reward of each state/action pair, given a transition probability array P and a reward array R that may depend on the arrival state.
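Written as a formula, the quantity described above is presumably the probability-weighted sum of rewards over arrival states. The index convention here (s for the start state, s' for the arrival state, a for the action) is an assumption; it is consistent with the numbers in the example on this page but is not spelled out there.

```latex
PR(s, a) = \sum_{s'=1}^{S} P(s, s', a) \, R(s, s', a),
\qquad s = 1, \dots, S, \quad a = 1, \dots, A
```

When R is already an (SxA) array there is no arrival state to sum over, so one would expect PR to equal R. This is an inference, not something the page states.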

Arguments

P can be a 3-dimensional array (SxSxA) or a cell array (1xA), each cell containing a sparse matrix (SxS).
R can be a 3-dimensional array (SxSxA), a cell array (1xA) with each cell containing a sparse matrix (SxS), or a 2D array (SxA), possibly sparse.

Evaluation

PR is an (SxA) matrix; PR(s, a) is the reward for state s and action a.

Example

>> P(:, :, 1) = [0.6116 0.3884;  0 1.0000];
>> P(:, :, 2) = [0.6674 0.3326;  0 1.0000];
>> R(:, :, 1) = [-0.2433 0.7073;  0 0.1871];
>> R(:, :, 2) = [-0.0069 0.6433;  0 0.2898];

>> PR = mdp_computePR(P, R)
PR =
   0.1259    0.2094
   0.1871    0.2898
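
The values above can be checked by hand. The following is a minimal sketch, not the toolbox's own code: it assumes the result is the elementwise product of P and R summed over arrival states (dimension 2). The name PR_check is hypothetical.

```matlab
% Arrays from the example above: S = 2 states, A = 2 actions.
S = 2;
A = 2;
P = zeros(S, S, A);
R = zeros(S, S, A);
P(:, :, 1) = [0.6116 0.3884;  0 1.0000];
P(:, :, 2) = [0.6674 0.3326;  0 1.0000];
R(:, :, 1) = [-0.2433 0.7073;  0 0.1871];
R(:, :, 2) = [-0.0069 0.6433;  0 0.2898];

% Weight each reward by its transition probability, sum over arrival
% states (dimension 2), then reshape the S x 1 x A result to S x A.
PR_check = reshape(sum(P .* R, 2), S, A);
disp(PR_check)
```

For instance, the first entry is 0.6116 * (-0.2433) + 0.3884 * 0.7073, which rounds to 0.1259.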

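In the above example, P can instead be a cell array containing sparse matrices. If P already exists as a numeric array, reinitialize it as a cell array first:
>> P = cell(1, 2);
>> P{1} = sparse([0.6116 0.3884;  0 1.0000]);
>> P{2} = sparse([0.6674 0.3326;  0 1.0000]);
The function call is unchanged.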
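
For the cell array form, the same computation can be sketched one action at a time. This is an illustration under the same assumption as before (probability-weighted sum over arrival states), not the toolbox's implementation. The name PR_cell is hypothetical.

```matlab
% P as a 1 x A cell array of sparse S x S matrices; R as an S x S x A array.
P = cell(1, 2);
P{1} = sparse([0.6116 0.3884;  0 1.0000]);
P{2} = sparse([0.6674 0.3326;  0 1.0000]);
R = zeros(2, 2, 2);
R(:, :, 1) = [-0.2433 0.7073;  0 0.1871];
R(:, :, 2) = [-0.0069 0.6433;  0 0.2898];

A = numel(P);          % number of actions
S = size(P{1}, 1);     % number of states
PR_cell = zeros(S, A);
for a = 1:A
    % Elementwise product for action a, summed over arrival states;
    % full() converts the sparse column back to a dense one.
    PR_cell(:, a) = full(sum(P{a} .* R(:, :, a), 2));
end
disp(PR_cell)
```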



File : MDPtoolbox/documentation/mdp_computePR.html
Page created on July 31, 2001. Last update on August 31, 2009.