jmarkov.jmdp.solvers
Class ValueIterationSolver<S extends State,A extends Action>

java.lang.Object
  extended by jmarkov.jmdp.solvers.Solver<S,A>
      extended by jmarkov.jmdp.solvers.AbstractInfiniteSolver<S,A>
          extended by jmarkov.jmdp.solvers.AbstractDiscountedSolver<S,A>
              extended by jmarkov.jmdp.solvers.ValueIterationSolver<S,A>
Type Parameters:
S - States Class.
A - Actions class.
All Implemented Interfaces:
JMarkovElement

public class ValueIterationSolver<S extends State,A extends Action>
extends AbstractDiscountedSolver<S,A>

This class is one of the default solvers included in the jmdp package. It extends Solver and should only be used on infinite-horizon problems. Its purpose is to return an optimal policy for a given problem structure.

Author:
Andres Sarmiento, Germán Riaño - Universidad de Los Andes

Field Summary
protected  long iterations
          Used to store the number of iterations
protected  long processTime
          Stores the process time.
 
Fields inherited from class jmarkov.jmdp.solvers.AbstractDiscountedSolver
discountFactor
 
Fields inherited from class jmarkov.jmdp.solvers.Solver
policy, printProcessTime, printValueFunction, problem, solved, valueFunction
 
Constructor Summary
ValueIterationSolver(CTMDP<S,A> problem, double interestRate)
          Default Constructor for continuous time problems.
ValueIterationSolver(DTMDP<S,A> problem, double interestRate)
          Default Constructor for discrete time problems.
 
Method Summary
protected  double bestAction(S i)
          Finds the minimal value function for this state and stores the best action to take in state i in the variable bestAction.
protected  double computeNoErrorBounds()
          Computes an iteration of the Value Iteration Algorithm without the use of error bounds.
protected  double computeWithErrorBounds()
          Computes an iteration of the Value Iteration Algorithm with the use of error bounds.
 java.lang.String description()
          This method returns a complete verbal description of this element.
 double getEpsilon()
           
 long getIterations()
           
 long getProcessTime()
           
protected  void init()
          Initializes the valueFunction for all the states.
 boolean isAverage()
           
 java.lang.String label()
          The sub classes must return the Solver name.
 void setEpsilon(double epsilon)
          Value Iteration is a solver method that is theoretically convergent only after infinitely many iterations.
 Solution<S,A> solve()
          Solves the problem.
 void useErrorBounds(boolean val)
          The ErrorBounds modification to the ValueIteration method is a change that is guaranteed to perform at least as well as the method without the modification.
 void useGaussSeidel(boolean val)
          The GaussSeidel modification of the ValueIteration method is a change that is guaranteed to perform at least as well as the method without the modification.
 boolean usesErrorBounds()
           
 boolean usesGaussSeidel()
           
 
Methods inherited from class jmarkov.jmdp.solvers.AbstractDiscountedSolver
future, future, getInterestRate, setDiscountFactor, setInterestRate
 
Methods inherited from class jmarkov.jmdp.solvers.AbstractInfiniteSolver
getDiscreteProblem, getProblem, printSolution
 
Methods inherited from class jmarkov.jmdp.solvers.Solver
getOptimalPolicy, getOptimalValueFunction, getValueFunction, isSolved, printSolution, setPrintProcessTime, setPrintValueFunction, toString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface jmarkov.basic.JMarkovElement
equals
 

Field Detail

processTime

protected long processTime
Stores the process time.


iterations

protected long iterations
Used to store the number of iterations

Constructor Detail

ValueIterationSolver

public ValueIterationSolver(DTMDP<S,A> problem,
                            double interestRate)
Default Constructor for discrete time problems.

Parameters:
problem - the structure of the problem of type DTMDP
interestRate - the interest rate per period; it determines how much less a reward received in the next period is worth than the same reward received in the present period.
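
The link between an interest rate and the per-period discount can be sketched as follows. This assumes the common convention discountFactor = 1 / (1 + interestRate); the page does not state the formula the solver uses, and the class and method names below are hypothetical.

```java
final class DiscountSketch {
    /** One-period discount factor under the ASSUMED convention 1 / (1 + interestRate). */
    static double discountFactor(double interestRate) {
        if (interestRate < 0.0) {
            throw new IllegalArgumentException("interestRate must be non-negative");
        }
        return 1.0 / (1.0 + interestRate);
    }

    public static void main(String[] args) {
        double factor = discountFactor(0.10);
        // Present value of 100 units received one period from now.
        double presentValue = 100.0 * factor;
        System.out.println("discount factor = " + factor);
        System.out.println("present value   = " + presentValue);
    }
}
```

Under this convention a higher interest rate gives a smaller discount factor, so future rewards weigh less in the value function.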

ValueIterationSolver

public ValueIterationSolver(CTMDP<S,A> problem,
                            double interestRate)
Default Constructor for continuous time problems.

Parameters:
problem - the structure of the problem of type CTMDP
interestRate - the interest rate per period; it determines how much less a reward received in the next period is worth than the same reward received in the present period.
Method Detail

setEpsilon

public void setEpsilon(double epsilon)
Value Iteration is a solver method that is theoretically convergent only after infinitely many iterations. Because this is impossible in practice, the solver is designed to stop when the difference between consecutive iterations is at most epsilon. The smaller epsilon is, the closer the result will be to the actual optimum, but the longer the problem will take to solve. The default value of epsilon is 0.0001.

Parameters:
epsilon - maximum difference between iterations.
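
The trade-off described above can be illustrated with a minimal, self-contained sketch of plain value iteration on a hypothetical two-state, two-action cost-minimization problem. It does not use the jMDP classes; the arrays COST and PROB, the class name, and the exact stopping test (largest absolute change at most epsilon) are assumptions made for illustration.

```java
import java.util.Arrays;

final class EpsilonStopSketch {
    // Hypothetical MDP: COST[s][a] is the one-step cost, PROB[s][a][t] the transition probability.
    static final double[][] COST = {{1.0, 2.0}, {3.0, 0.5}};
    static final double[][][] PROB = {
        {{0.9, 0.1}, {0.2, 0.8}},
        {{0.5, 0.5}, {0.1, 0.9}}
    };

    /** Iterates in v until the largest change is at most epsilon; returns the iteration count. */
    static int iterate(double[] v, double discount, double epsilon) {
        int iterations = 0;
        double maxChange;
        do {
            double[] next = new double[v.length];
            maxChange = 0.0;
            for (int s = 0; s < v.length; s++) {
                double best = Double.POSITIVE_INFINITY;
                for (int a = 0; a < COST[s].length; a++) {
                    double expected = 0.0;
                    for (int t = 0; t < v.length; t++) {
                        expected += PROB[s][a][t] * v[t];
                    }
                    best = Math.min(best, COST[s][a] + discount * expected);
                }
                next[s] = best;
                maxChange = Math.max(maxChange, Math.abs(best - v[s]));
            }
            System.arraycopy(next, 0, v, 0, v.length);
            iterations++;
        } while (maxChange > epsilon);
        return iterations;
    }

    public static void main(String[] args) {
        double[] coarse = new double[2];
        double[] fine = new double[2];
        int coarseIterations = iterate(coarse, 0.9, 1e-2);
        int fineIterations = iterate(fine, 0.9, 1e-4);
        System.out.println(coarseIterations + " iterations -> " + Arrays.toString(coarse));
        System.out.println(fineIterations + " iterations -> " + Arrays.toString(fine));
    }
}
```

Running main with the two tolerances shows the smaller epsilon needing at least as many sweeps as the larger one.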

useGaussSeidel

public void useGaussSeidel(boolean val)
The GaussSeidel modification of the ValueIteration method is a change that is guaranteed to perform at least as well as the method without the modification. In many problems, especially those with many states, the modification can yield a significant improvement. By default it is set to true. It provides no significant improvement if used jointly with the ErrorBounds modification.

Parameters:
val - sets whether or not the GaussSeidel modification will be used.
See Also:
useErrorBounds(boolean)
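
As a rough illustration of the idea, a Gauss-Seidel sweep updates the value function in place, so states processed later in a sweep already see the new values of earlier states. The sketch below is not the jMDP implementation; the two-state problem, the class name, and the method names are hypothetical.

```java
final class GaussSeidelSketch {
    // Hypothetical MDP: COST[s][a] is the one-step cost, PROB[s][a][t] the transition probability.
    static final double[][] COST = {{1.0, 2.0}, {3.0, 0.5}};
    static final double[][][] PROB = {
        {{0.9, 0.1}, {0.2, 0.8}},
        {{0.5, 0.5}, {0.1, 0.9}}
    };

    /** Minimum over actions of the one-step cost plus the discounted expected value. */
    static double backup(int s, double[] v, double discount) {
        double best = Double.POSITIVE_INFINITY;
        for (int a = 0; a < COST[s].length; a++) {
            double expected = 0.0;
            for (int t = 0; t < v.length; t++) {
                expected += PROB[s][a][t] * v[t];
            }
            best = Math.min(best, COST[s][a] + discount * expected);
        }
        return best;
    }

    /** Iterates in v until the largest change is at most epsilon; returns the sweep count. */
    static int solve(double[] v, double discount, double epsilon, boolean gaussSeidel) {
        int sweeps = 0;
        double maxChange;
        do {
            // Without the modification each sweep reads a frozen copy of v;
            // with it, the sweep reads the live array that is being updated.
            double[] source = gaussSeidel ? v : v.clone();
            maxChange = 0.0;
            for (int s = 0; s < v.length; s++) {
                double updated = backup(s, source, discount);
                maxChange = Math.max(maxChange, Math.abs(updated - v[s]));
                v[s] = updated;
            }
            sweeps++;
        } while (maxChange > epsilon);
        return sweeps;
    }

    public static void main(String[] args) {
        double[] plain = new double[2];
        double[] inPlace = new double[2];
        System.out.println("plain sweeps:        " + solve(plain, 0.9, 1e-8, false));
        System.out.println("Gauss-Seidel sweeps: " + solve(inPlace, 0.9, 1e-8, true));
    }
}
```

Both variants converge to the same value function; only the number of sweeps needed may differ.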

getEpsilon

public final double getEpsilon()
Returns:
Returns the epsilon.

isAverage

public final boolean isAverage()
Returns:
Returns the isAverage.

usesErrorBounds

public final boolean usesErrorBounds()
Returns:
Returns true if uses Error Bounds.

usesGaussSeidel

public final boolean usesGaussSeidel()
Returns:
Returns true if Gauss Seidel is active.

useErrorBounds

public void useErrorBounds(boolean val)
The ErrorBounds modification to the ValueIteration method is a change that is guaranteed to perform at least as well as the method without the modification. In many problems, especially those with many states, the modification can yield a significant improvement. This method modifies the iterations and the stopping criterion. It builds upper and lower bounds for the optimal value function in each iteration and stops when the bounds are at most delta apart, regardless of where the actual valueFunction is. The bounds converge faster than the actual valueFunction. By default it is set to false.

Parameters:
val - sets whether or not to use the ErrorBounds modification.
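
One standard way to build such bounds is the MacQueen-Porteus construction, sketched below for a hypothetical two-state problem. Whether jMDP uses exactly these formulas is an assumption; the class name, method names, and data are illustrative only. With discount factor b and d the per-state change in one sweep, the optimal value lies between v + b/(1-b) min(d) and v + b/(1-b) max(d).

```java
final class ErrorBoundsSketch {
    // Hypothetical MDP: COST[s][a] is the one-step cost, PROB[s][a][t] the transition probability.
    static final double[][] COST = {{1.0, 2.0}, {3.0, 0.5}};
    static final double[][][] PROB = {
        {{0.9, 0.1}, {0.2, 0.8}},
        {{0.5, 0.5}, {0.1, 0.9}}
    };

    /** One full sweep of plain value iteration; returns a new array. */
    static double[] sweep(double[] v, double discount) {
        double[] next = new double[v.length];
        for (int s = 0; s < v.length; s++) {
            double best = Double.POSITIVE_INFINITY;
            for (int a = 0; a < COST[s].length; a++) {
                double expected = 0.0;
                for (int t = 0; t < v.length; t++) {
                    expected += PROB[s][a][t] * v[t];
                }
                best = Math.min(best, COST[s][a] + discount * expected);
            }
            next[s] = best;
        }
        return next;
    }

    /** Returns {lower, upper} bounds on the optimal value, at most epsilon apart. */
    static double[][] solveWithBounds(double discount, double epsilon) {
        double[] v = new double[COST.length];
        double scale = discount / (1.0 - discount);
        while (true) {
            double[] next = sweep(v, discount);
            double minDiff = Double.POSITIVE_INFINITY;
            double maxDiff = Double.NEGATIVE_INFINITY;
            for (int s = 0; s < v.length; s++) {
                double diff = next[s] - v[s];
                minDiff = Math.min(minDiff, diff);
                maxDiff = Math.max(maxDiff, diff);
            }
            v = next;
            // Stop on the gap between the bounds, not on the change in v itself.
            if (scale * (maxDiff - minDiff) <= epsilon) {
                double[] lower = new double[v.length];
                double[] upper = new double[v.length];
                for (int s = 0; s < v.length; s++) {
                    lower[s] = v[s] + scale * minDiff;
                    upper[s] = v[s] + scale * maxDiff;
                }
                return new double[][] {lower, upper};
            }
        }
    }

    public static void main(String[] args) {
        double[][] bounds = solveWithBounds(0.9, 1e-6);
        for (int s = 0; s < bounds[0].length; s++) {
            System.out.println("state " + s + ": [" + bounds[0][s] + ", " + bounds[1][s] + "]");
        }
    }
}
```

Because the test looks only at the gap between the bounds, the loop can stop while the iterate v itself is still changing by more than epsilon per sweep.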

solve

public Solution<S,A> solve()
Solves the problem.

Specified by:
solve in class Solver<S extends State,A extends Action>
Returns:
Returns a Solution with the optimal policy and value function.

init

protected void init()
Initializes the valueFunction for all the states.


computeNoErrorBounds

protected double computeNoErrorBounds()
Computes an iteration of the Value Iteration Algorithm without the use of error bounds.

Returns:
maximum change in value function due to this iteration.

computeWithErrorBounds

protected double computeWithErrorBounds()
Computes an iteration of the Value Iteration Algorithm with the use of error bounds.

Returns:
maximum change in value function due to this iteration.

bestAction

protected double bestAction(S i)
Finds the minimal value function for this state and stores the best action to take in state i in the variable bestAction.

Parameters:
i - state for which the best action is being determined
Returns:
the new ValueFunction for this state.
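
A single-state update of this kind can be sketched as a minimization over actions that also records the minimizing action. The code below is a hypothetical stand-in, not the jMDP method: states and actions are plain integers, and the array BEST plays the role of the bestAction variable.

```java
final class BestActionSketch {
    // Hypothetical MDP: COST[s][a] is the one-step cost, PROB[s][a][t] the transition probability.
    static final double[][] COST = {{1.0, 2.0}, {3.0, 0.5}};
    static final double[][][] PROB = {
        {{0.9, 0.1}, {0.2, 0.8}},
        {{0.5, 0.5}, {0.1, 0.9}}
    };
    // Stand-in for the solver's record of the best action found for each state.
    static final int[] BEST = new int[COST.length];

    /** Returns the minimal value for state i given v, and stores the minimizing action in BEST[i]. */
    static double backup(int i, double[] v, double discount) {
        double bestValue = Double.POSITIVE_INFINITY;
        for (int a = 0; a < COST[i].length; a++) {
            double expected = 0.0;
            for (int t = 0; t < v.length; t++) {
                expected += PROB[i][a][t] * v[t];
            }
            double value = COST[i][a] + discount * expected;
            if (value < bestValue) {
                bestValue = value;
                BEST[i] = a;
            }
        }
        return bestValue;
    }

    public static void main(String[] args) {
        double[] v = new double[2]; // value function initialized to zero
        for (int i = 0; i < v.length; i++) {
            double value = backup(i, v, 0.9);
            System.out.println("state " + i + ": value " + value + ", best action " + BEST[i]);
        }
    }
}
```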

getProcessTime

public final long getProcessTime()
Specified by:
getProcessTime in class Solver<S extends State,A extends Action>
Returns:
Returns the processTime.

getIterations

public final long getIterations()
Specified by:
getIterations in class AbstractInfiniteSolver<S extends State,A extends Action>
Returns:
Returns the iterations.

label

public java.lang.String label()
Description copied from class: Solver
The sub classes must return the Solver name.

Specified by:
label in interface JMarkovElement
Specified by:
label in class Solver<S extends State,A extends Action>
Returns:
A String label.
See Also:
Solver.toString()

description

public java.lang.String description()
Description copied from interface: JMarkovElement
This method returns a complete verbal description of this element. This description may contain multiple text rows.

Specified by:
description in interface JMarkovElement
Overrides:
description in class Solver<S extends State,A extends Action>
Returns:
A String describing this element.
See Also:
JMarkovElement.label()