java.lang.Object
  jmarkov.jmdp.solvers.Solver<S,A>
    jmarkov.jmdp.solvers.AbstractInfiniteSolver<S,A>
      jmarkov.jmdp.solvers.AbstractDiscountedSolver<S,A>
        jmarkov.jmdp.solvers.PolicyIterationSolver<S,A>

Type Parameters:
S - States class.
A - Actions class.

public class PolicyIterationSolver<S extends State,A extends Action>
extends AbstractDiscountedSolver<S,A>
This class solves infinite horizon discounted problems using the policy iteration algorithm. It extends Solver (through AbstractDiscountedSolver) and should only be used on infinite horizon problems. The objective function the solver uses is the discounted cost. The result is a deterministic optimal policy for the given structure.

Policy iteration always converges in a finite number of iterations. Each iteration has to solve a linear system of equations with as many unknowns as there are states. When there are too many states, it is advisable to use another solver or the modified policy iteration (by using the second constructor). The advantage of policy iteration is that the result is the true optimal solution and not an approximation, as in other common methods.

The method starts with a policy and solves the system of linear equations for the value function of that policy. With these values it looks for a better policy. It then solves for the value function again and looks for a better policy. If the new policy is equal to the last policy tried, the method stops; otherwise it keeps improving the policy and updating the value function.
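The evaluate/improve loop described above can be sketched in plain Java. This is a minimal, self-contained illustration and not the jMarkov implementation: the class name, the two-state cost and transition data, and the helper methods are all hypothetical, and the linear system is solved with a small Gauss-Jordan elimination chosen only for brevity.

```java
import java.util.Arrays;

public class PolicyIterationSketch {
    // Hypothetical 2-state, 2-action MDP (illustrative data, not from jMarkov).
    // COST[s][a] is the one-step cost, PROB[s][a][t] the probability of moving s -> t.
    static final double[][] COST = {{1.0, 2.0}, {3.0, 0.5}};
    static final double[][][] PROB = {
        {{0.9, 0.1}, {0.2, 0.8}},
        {{0.5, 0.5}, {0.1, 0.9}}
    };

    // Policy evaluation: solve (I - beta * P_pi) v = c_pi by Gauss-Jordan elimination.
    static double[] evaluate(int[] policy, double beta) {
        int n = policy.length;
        double[][] m = new double[n][n + 1]; // augmented matrix [A | b]
        for (int s = 0; s < n; s++) {
            for (int t = 0; t < n; t++) {
                m[s][t] = (s == t ? 1.0 : 0.0) - beta * PROB[s][policy[s]][t];
            }
            m[s][n] = COST[s][policy[s]];
        }
        for (int col = 0; col < n; col++) {
            int pivot = col; // partial pivoting
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) pivot = r;
            }
            double[] tmp = m[col]; m[col] = m[pivot]; m[pivot] = tmp;
            for (int r = 0; r < n; r++) {
                if (r == col) continue;
                double f = m[r][col] / m[col][col];
                for (int c = col; c <= n; c++) m[r][c] -= f * m[col][c];
            }
        }
        double[] v = new double[n];
        for (int s = 0; s < n; s++) v[s] = m[s][n] / m[s][s];
        return v;
    }

    // Discounted cost of taking action a in state s and following v afterwards.
    static double q(int s, int a, double[] v, double beta) {
        double sum = COST[s][a];
        for (int t = 0; t < v.length; t++) sum += beta * PROB[s][a][t] * v[t];
        return sum;
    }

    // Policy improvement: switch to a strictly cheaper action; keep the current one on ties.
    static int[] improve(int[] policy, double[] v, double beta) {
        int[] next = policy.clone();
        for (int s = 0; s < policy.length; s++) {
            double best = q(s, policy[s], v, beta);
            for (int a = 0; a < COST[s].length; a++) {
                double qa = q(s, a, v, beta);
                if (qa < best - 1e-12) { best = qa; next[s] = a; }
            }
        }
        return next;
    }

    // Alternate evaluation and improvement until the policy stops changing.
    static int[] solve(double beta) {
        int[] policy = new int[COST.length]; // start with action 0 in every state
        while (true) {
            double[] v = evaluate(policy, beta);
            int[] next = improve(policy, v, beta);
            if (Arrays.equals(next, policy)) return policy;
            policy = next;
        }
    }
}
```

With the illustrative data and a discount factor of 0.9, `PolicyIterationSketch.solve(0.9)` settles on action 1 in both states after three evaluations. The tie-breaking rule in `improve` matters: without it, two equally good policies could alternate and the stopping test would never fire.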
Field Summary | | |
---|---|---|
protected long | iterations | Used to store the number of iterations |
protected long | processTime | Used to store process time |
Fields inherited from class jmarkov.jmdp.solvers.AbstractDiscountedSolver |
---|
discountFactor |
Fields inherited from class jmarkov.jmdp.solvers.Solver |
---|
policy, printProcessTime, printValueFunction, problem, solved, valueFunction |
Constructor Summary | |
---|---|
PolicyIterationSolver(DTMDP<S,A> problem, double discountFactor) | The constructor accepts only a problem of the type InfiniteMDP, because this solver is designed to work only on infinite horizon problems. |
PolicyIterationSolver(DTMDP<S,A> problem, double discountFactor, boolean setModifiedPolicy) | Same as the two-argument constructor, with an additional flag that selects the modified policy iteration algorithm. |
Method Summary | | |
---|---|---|
java.lang.String | description() | This method returns a complete verbal description of this element. |
double | getIncreasingFactor() | |
double | getInitialIterations() | |
long | getIterations() | |
long | getProcessTime() | |
java.lang.String | label() | Subclasses must return the solver name. |
void | setIncreasingFactor(double increasingFactor) | Sets the factor by which the maximum number of iterations grows in the modified policy iteration method. |
void | setInitialIterations(int initialIterations) | Sets the maximum number of iterations for the first run of the modified policy iteration. |
void | setModifiedPolicy(boolean val) | Activates the modified policy iteration algorithm. |
Solution<S,A> | solve() | Called to solve the problem. |
protected ValueFunction<S> | solveMatrix() | This method is used by the PolicyIterationSolver to solve the linear system of equations to determine the value function of each state for a given policy. |
protected ValueFunction<S> | solveMatrixModified(DecisionRule<S,A> localDecisionRule) | This method is used by the PolicyIterationSolver to solve the linear system of equations to determine the value function of each state for a given policy. |
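
The linear system mentioned for solveMatrix() can be written out explicitly. What follows is the standard policy-evaluation system for a discounted-cost problem; the symbols ($\pi$ for the policy, $c$ for the one-step cost, $p$ for the transition probabilities, $\beta$ for the discount factor) are notation assumed here, not names from the jMarkov API, and this page does not show how the library assembles the system internally.

$$
v_\pi(s) = c\bigl(s, \pi(s)\bigr) + \beta \sum_{s'} p\bigl(s' \mid s, \pi(s)\bigr)\, v_\pi(s') \quad \text{for every state } s,
$$

or, in matrix form with $P_\pi$ the transition matrix and $c_\pi$ the cost vector under $\pi$,

$$
(I - \beta P_\pi)\, v_\pi = c_\pi .
$$

For $0 \le \beta < 1$ the matrix $I - \beta P_\pi$ is invertible, so the system has exactly one solution, with one unknown per state.
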
Methods inherited from class jmarkov.jmdp.solvers.AbstractDiscountedSolver |
---|
future, future, getInterestRate, setDiscountFactor, setInterestRate |
Methods inherited from class jmarkov.jmdp.solvers.AbstractInfiniteSolver |
---|
getDiscreteProblem, getProblem, printSolution |
Methods inherited from class jmarkov.jmdp.solvers.Solver |
---|
getOptimalPolicy, getOptimalValueFunction, getValueFunction, isSolved, printSolution, setPrintProcessTime, setPrintValueFunction, toString |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Methods inherited from interface jmarkov.basic.JMarkovElement |
---|
equals |
Field Detail |
---|
protected long iterations
protected long processTime
Constructor Detail |
---|
public PolicyIterationSolver(DTMDP<S,A> problem, double discountFactor)
Parameters:
problem - the structure of the problem of type InfiniteMDP
discountFactor - represents how much less a reward is worth when it is received in the next period instead of in the present period.

public PolicyIterationSolver(DTMDP<S,A> problem, double discountFactor, boolean setModifiedPolicy)
Parameters:
problem - the structure of the problem of type InfiniteMDP
discountFactor - represents how much less a reward is worth when it is received in the next period instead of in the present period.
setModifiedPolicy -

Method Detail |
---|
public double getIncreasingFactor()

public void setIncreasingFactor(double increasingFactor)
Parameters:
increasingFactor - greater than 1. Determines max iterations growth.

public double getInitialIterations()

public void setInitialIterations(int initialIterations)
Parameters:
initialIterations -

public Solution<S,A> solve() throws SolverException
Description copied from class: Solver
solve in class Solver<S extends State,A extends Action>
Throws:
SolverException - This exception is thrown if the solver cannot find a solution for some reason.

protected ValueFunction<S> solveMatrix() throws SolverException
Throws:
SolverException

protected ValueFunction<S> solveMatrixModified(DecisionRule<S,A> localDecisionRule)

public void setModifiedPolicy(boolean val)
Parameters:
val - True if the modified policy iteration is to be used.

public java.lang.String description()
Description copied from interface: JMarkovElement
description in interface JMarkovElement
description in class Solver<S extends State,A extends Action>
See Also:
JMarkovElement.label()

public java.lang.String label()
Description copied from class: Solver
label in interface JMarkovElement
label in class Solver<S extends State,A extends Action>
See Also:
Solver.toString()

public final long getProcessTime()
getProcessTime in class Solver<S extends State,A extends Action>

public final long getIterations()
getIterations in class AbstractInfiniteSolver<S extends State,A extends Action>
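
The settings exposed by setModifiedPolicy, setInitialIterations and setIncreasingFactor suggest a modified policy iteration in which the exact linear solve is replaced by a bounded number of successive-approximation sweeps, with the bound growing from round to round. The sketch below shows one common way such a scheme is written. It is an assumption about the general technique, not a description of jMarkov's internals: the class, the data, the `tol` stopping parameter and the way the two settings are combined are all hypothetical.

```java
public class ModifiedPolicyIterationSketch {
    // Hypothetical 2-state, 2-action MDP (illustrative data, not from jMarkov).
    static final double[][] COST = {{1.0, 2.0}, {3.0, 0.5}};
    static final double[][][] PROB = {
        {{0.9, 0.1}, {0.2, 0.8}},
        {{0.5, 0.5}, {0.1, 0.9}}
    };

    // Discounted cost of taking action a in state s and following v afterwards.
    static double q(int s, int a, double[] v, double beta) {
        double sum = COST[s][a];
        for (int t = 0; t < v.length; t++) sum += beta * PROB[s][a][t] * v[t];
        return sum;
    }

    // initialIterations and increasingFactor mirror the setter names on this page;
    // how the library actually uses them is an assumption made for this sketch.
    static int[] solve(double beta, int initialIterations, double increasingFactor, double tol) {
        int n = COST.length;
        double[] v = new double[n];   // start from the zero value function
        int[] policy = new int[n];
        double sweeps = initialIterations;
        while (true) {
            // Improvement step: greedy policy plus one Bellman backup of v.
            double[] u = new double[n];
            double change = 0.0;
            for (int s = 0; s < n; s++) {
                int bestAction = 0;
                double best = q(s, 0, v, beta);
                for (int a = 1; a < COST[s].length; a++) {
                    double qa = q(s, a, v, beta);
                    if (qa < best) { best = qa; bestAction = a; }
                }
                policy[s] = bestAction;
                u[s] = best;
                change = Math.max(change, Math.abs(u[s] - v[s]));
            }
            v = u;
            if (change < tol) return policy; // v is (nearly) a fixed point of the backup
            // Partial evaluation: a bounded number of sweeps under the fixed policy
            // instead of solving the linear system exactly.
            for (int k = 0; k < (int) sweeps; k++) {
                double[] w = new double[n];
                for (int s = 0; s < n; s++) w[s] = q(s, policy[s], v, beta);
                v = w;
            }
            sweeps *= increasingFactor; // allow more sweeps in the next round
        }
    }
}
```

Unlike the exact method, this variant returns a policy based on an approximate value function, so the stopping rule tests how much a full backup changes the values rather than whether the policy repeats. Each sweep costs one pass over the transition data, which is the reason to prefer it when the state space makes the linear solve expensive.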