java.lang.Object
  jmdp.solvers.Solver<S,A>
    jmdp.solvers.AbstractInfiniteSolver<S,A>
      jmdp.solvers.AbstractDiscountedSolver<S,A>
        jmdp.solvers.PolicyIterationSolver<S,A>
public class PolicyIterationSolver<S extends State,A extends Action>
This class is one of the default solvers included in the jmdp package. It extends AbstractDiscountedSolver (and, through it, Solver) and should be used only on infinite-horizon problems. Its objective function is the discounted cost. The result is a deterministic optimal policy for the given problem structure.
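To make the idea concrete, here is a minimal, self-contained sketch of policy iteration for a discounted-cost problem on a hypothetical two-state, two-action example. It does not use the jmdp API: the class `PolicyIterationSketch`, the arrays `cost[state][action]` and `prob[state][action][nextState]`, and the dense Gaussian elimination are all assumptions chosen for illustration, not the library's implementation.

```java
import java.util.Arrays;

/** Illustrative only: policy iteration on a tiny discounted-cost MDP (not the jmdp code). */
final class PolicyIterationSketch {

    /** One-step cost of action a in state i plus the discounted expected value of v. */
    static double q(int i, int a, double[][] cost, double[][][] prob, double beta, double[] v) {
        double s = 0.0;
        for (int j = 0; j < v.length; j++) s += prob[i][a][j] * v[j];
        return cost[i][a] + beta * s;
    }

    /** Solves A x = b by Gaussian elimination with partial pivoting. */
    static double[] solveLinear(double[][] a, double[] b) {
        int n = b.length;
        double[][] m = new double[n][n + 1];
        for (int i = 0; i < n; i++) {
            System.arraycopy(a[i], 0, m[i], 0, n);
            m[i][n] = b[i];
        }
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) pivot = r;
            }
            double[] tmp = m[col]; m[col] = m[pivot]; m[pivot] = tmp;
            for (int r = col + 1; r < n; r++) {
                double f = m[r][col] / m[col][col];
                for (int c = col; c <= n; c++) m[r][c] -= f * m[col][c];
            }
        }
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double s = m[i][n];
            for (int c = i + 1; c < n; c++) s -= m[i][c] * x[c];
            x[i] = s / m[i][i];
        }
        return x;
    }

    /** Policy evaluation: solves (I - beta * P_policy) v = c_policy. */
    static double[] evaluate(int[] policy, double[][] cost, double[][][] prob, double beta) {
        int n = policy.length;
        double[][] a = new double[n][n];
        double[] b = new double[n];
        for (int i = 0; i < n; i++) {
            b[i] = cost[i][policy[i]];
            for (int j = 0; j < n; j++) {
                a[i][j] = (i == j ? 1.0 : 0.0) - beta * prob[i][policy[i]][j];
            }
        }
        return solveLinear(a, b);
    }

    /** Returns a cost-minimizing deterministic policy: one action index per state. */
    static int[] solve(double[][] cost, double[][][] prob, double beta) {
        int n = cost.length;
        int[] policy = new int[n]; // start with action 0 in every state
        while (true) {
            double[] v = evaluate(policy, cost, prob, beta);
            boolean changed = false;
            for (int i = 0; i < n; i++) {
                double best = q(i, policy[i], cost, prob, beta, v);
                for (int a = 0; a < cost[i].length; a++) {
                    double qa = q(i, a, cost, prob, beta, v);
                    // Switch only on a strict improvement, which avoids cycling on ties.
                    if (qa < best - 1e-12) { best = qa; policy[i] = a; changed = true; }
                }
            }
            if (!changed) return policy; // policy is greedy for its own value function
        }
    }

    public static void main(String[] args) {
        double[][] cost = {{2.0, 1.0}, {3.0, 0.0}};
        // In both states, action 0 moves to state 0 and action 1 moves to state 1.
        double[][][] prob = {{{1, 0}, {0, 1}}, {{1, 0}, {0, 1}}};
        System.out.println(Arrays.toString(solve(cost, prob, 0.9))); // prints [1, 1]
    }
}
```

In this example the loop stops after two evaluations, because the second policy is already greedy with respect to its own value function.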
Field Summary | |
---|---|
protected double | epsilon |
protected boolean | errorBounds |
protected boolean | gaussSeidel |
protected int | iterations |
protected java.util.List<S> | localStates |
protected long | processTime |
Fields inherited from class jmdp.solvers.AbstractDiscountedSolver |
---|
discountFactor, interestRate |
Fields inherited from class jmdp.solvers.Solver |
---|
policy, printProcessTime, printValueFunction, problem, solved, valueFunction |
Constructor Summary | |
---|---|
PolicyIterationSolver(DTMDP<S,A> problem, double discountFactor) | Accepts only an infinite-horizon problem (the declared type is DTMDP<S,A>), because this solver is designed only for infinite-horizon problems. |
PolicyIterationSolver(DTMDP<S,A> problem, double discountFactor, boolean setModifiedPolicy) | Accepts only an infinite-horizon problem (the declared type is DTMDP<S,A>), because this solver is designed only for infinite-horizon problems. |
Method Summary | |
---|---|
jmp.SparseRowColumnMatrix | buildMatrix(DecisionRule<S,A> localPolicy): Currently used only by PolicyIterationSolver. |
protected double | future(S i, A a, double discountF) |
protected double | future(S i, A a, double discountF, ValueFunction<S> vf): Expected value of valueFunction for the current state and a specified action. |
double | getIncreasingFactor() |
double | getInitialIterations() |
int | getIterations() |
long | getProcessTime() |
void | setIncreasingFactor(double increasingFactor): Sets the factor by which the maximum number of iterations grows in the modified policy iteration method. |
void | setInitialIterations(double initialIterations): Sets the maximum number of iterations for the first run of the modified policy iteration. |
void | setModifiedPolicy(boolean val): Activates the modified policy iteration algorithm. |
Solution<S,A> | solve(): Policy iteration is a solution method that always converges in a finite number of iterations. |
protected ValueFunction<S> | solveMatrix(DecisionRule<S,A> localDecisionRule): Used by PolicyIterationSolver to solve the linear system of equations that determines the value function of each state for a given policy. |
protected ValueFunction<S> | solveMatrixModified(DecisionRule<S,A> localDecisionRule): Used by PolicyIterationSolver to solve the linear system of equations that determines the value function of each state for a given policy. |
java.lang.String | toString(): Subclasses must return the solver name. |
Methods inherited from class jmdp.solvers.AbstractDiscountedSolver |
---|
getInterestRate, setDiscountFactor, setInterestRate |
Methods inherited from class jmdp.solvers.AbstractInfiniteSolver |
---|
getDiscreteProblem, getProblem, printSolution |
Methods inherited from class jmdp.solvers.Solver |
---|
getOptimalPolicy, getOptimalValueFunction, getValueFunction, isSolved, printSolution, setPrintProcessTime, setPrintValueFunction |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
protected java.util.List<S> localStates
protected int iterations
protected long processTime
protected double epsilon
protected boolean gaussSeidel
protected boolean errorBounds
Constructor Detail |
---|
public PolicyIterationSolver(DTMDP<S,A> problem, double discountFactor)
problem - the problem structure; must be an infinite-horizon problem (the declared type is DTMDP<S,A>).
discountFactor - how much less a reward is worth when it is received in the next period rather than in the present period.
public PolicyIterationSolver(DTMDP<S,A> problem, double discountFactor, boolean setModifiedPolicy)
problem - the problem structure; must be an infinite-horizon problem (the declared type is DTMDP<S,A>).
discountFactor - how much less a reward is worth when it is received in the next period rather than in the present period.
Method Detail |
---|
public double getIncreasingFactor()
public void setIncreasingFactor(double increasingFactor)
increasingFactor - greater than 1. Determines how fast the maximum number of iterations grows.
public double getInitialIterations()
public void setInitialIterations(double initialIterations)
initialIterations - maximum number of iterations for the first run of the modified policy iteration.
public Solution<S,A> solve()
solve in class Solver<S extends State,A extends Action>
public jmp.SparseRowColumnMatrix buildMatrix(DecisionRule<S,A> localPolicy)
localPolicy
- the policy under which the probability matrix is to be built.
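As a rough illustration of building a probability matrix under a fixed policy, the sketch below picks, for each state, the transition row of the action the decision rule prescribes and stores only the nonzero entries. The class `PolicyMatrixSketch`, the `int[] rule` encoding of a decision rule, and the list-of-maps sparse layout are assumptions for this example; they stand in for `DecisionRule<S,A>` and `jmp.SparseRowColumnMatrix`, whose actual interfaces are not shown in this page.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: sparse transition matrix induced by a deterministic decision rule. */
final class PolicyMatrixSketch {

    /**
     * rule[i] is the action chosen in state i; prob[i][a][j] is the probability of
     * moving from state i to state j under action a. Row i of the result maps each
     * reachable next state j to its probability; zero entries are skipped.
     */
    static List<Map<Integer, Double>> buildMatrix(int[] rule, double[][][] prob) {
        List<Map<Integer, Double>> rows = new ArrayList<>();
        for (int i = 0; i < rule.length; i++) {
            Map<Integer, Double> row = new LinkedHashMap<>();
            double[] dist = prob[i][rule[i]];
            for (int j = 0; j < dist.length; j++) {
                if (dist[j] != 0.0) row.put(j, dist[j]);
            }
            rows.add(row);
        }
        return rows;
    }
}
```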
protected ValueFunction<S> solveMatrix(DecisionRule<S,A> localDecisionRule)
protected ValueFunction<S> solveMatrixModified(DecisionRule<S,A> localDecisionRule)
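For orientation, the linear system behind policy evaluation can be written in standard textbook notation. The symbols are assumptions, not names from the library: \pi is the fixed decision rule, \beta the discount factor, c_\pi the one-period cost vector and P_\pi the transition matrix under \pi. The second line shows the truncated fixed-point iteration commonly used in modified policy iteration; whether solveMatrixModified works exactly this way is not stated here.

```latex
\[
v_\pi = c_\pi + \beta P_\pi v_\pi
\quad\Longleftrightarrow\quad
(I - \beta P_\pi)\, v_\pi = c_\pi
\]
\[
v^{(k+1)} = c_\pi + \beta P_\pi v^{(k)}, \qquad k = 0, 1, \dots, m - 1
\]
```

Because P_\pi is a stochastic matrix and 0 \le \beta < 1, the matrix I - \beta P_\pi is invertible, so the first system has a unique solution.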
protected final double future(S i, A a, double discountF, ValueFunction<S> vf) throws java.lang.NullPointerException
discountF - the rate for discounting from one period to the next, that is, how much less one unit of reward is worth when it is received in the next period rather than in the present period.
Throws: java.lang.NullPointerException
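A minimal sketch of such an expected-value computation follows, assuming the result is the discounted sum of next-state values weighted by their transition probabilities. The class `FutureValueSketch` and the array-based signature are hypothetical; the real method takes a state, an action and a `ValueFunction<S>`, and whether it applies the discount internally in exactly this way is an assumption.

```java
import java.util.Objects;

/** Illustrative only: discounted expected value of the next state. */
final class FutureValueSketch {

    /**
     * transition[j] is the probability of reaching state j from the current state
     * under the chosen action; vf[j] is the current value estimate of state j.
     * Throws NullPointerException if either array is null.
     */
    static double future(double[] transition, double discountF, double[] vf) {
        Objects.requireNonNull(transition, "transition");
        Objects.requireNonNull(vf, "vf");
        double expected = 0.0;
        for (int j = 0; j < transition.length; j++) {
            expected += transition[j] * vf[j];
        }
        return discountF * expected;
    }
}
```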
protected double future(S i, A a, double discountF)
public void setModifiedPolicy(boolean val)
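The following sketch shows one textbook form of modified policy iteration, in which each policy is evaluated by only a limited number of sweeps and that limit grows from round to round. It is not the jmdp implementation: the class `ModifiedPolicyIterationSketch`, the array encoding of the problem, the starting values and the stopping threshold are assumptions; only the roles of `initialIterations`, `increasingFactor` and `epsilon` are borrowed from the names on this page.

```java
import java.util.Arrays;

/** Illustrative only: modified policy iteration with a growing sweep limit. */
final class ModifiedPolicyIterationSketch {

    /** One-step cost of action a in state i plus the discounted expected value of v. */
    static double q(int i, int a, double[][] cost, double[][][] prob, double beta, double[] v) {
        double s = 0.0;
        for (int j = 0; j < v.length; j++) s += prob[i][a][j] * v[j];
        return cost[i][a] + beta * s;
    }

    /** Returns an epsilon-optimal cost-minimizing policy; assumes 0 <= beta < 1. */
    static int[] solve(double[][] cost, double[][][] prob, double beta, double epsilon,
                       double initialIterations, double increasingFactor) {
        int n = cost.length;
        // Start from an upper bound on the optimal cost so the iterates decrease monotonically.
        double maxCost = 0.0;
        for (double[] row : cost) for (double c : row) maxCost = Math.max(maxCost, c);
        double[] v = new double[n];
        Arrays.fill(v, maxCost / (1.0 - beta));
        int[] policy = new int[n];
        double maxSweeps = initialIterations;
        while (true) {
            // Improvement: choose the greedy action for v; u holds the improved values.
            double[] u = new double[n];
            double gap = 0.0;
            for (int i = 0; i < n; i++) {
                u[i] = Double.POSITIVE_INFINITY;
                for (int a = 0; a < cost[i].length; a++) {
                    double qa = q(i, a, cost, prob, beta, v);
                    if (qa < u[i]) { u[i] = qa; policy[i] = a; }
                }
                gap = Math.max(gap, Math.abs(u[i] - v[i]));
            }
            // Standard stopping rule for an epsilon-optimal policy.
            if (gap < epsilon * (1.0 - beta) / (2.0 * beta)) return policy;
            // Partial evaluation: a limited number of sweeps under the fixed policy.
            for (int k = 0; k < (int) maxSweeps; k++) {
                double[] next = new double[n];
                for (int i = 0; i < n; i++) next[i] = q(i, policy[i], cost, prob, beta, u);
                u = next;
            }
            v = u;
            maxSweeps *= increasingFactor; // allow more sweeps in the next round
        }
    }
}
```

With a sweep limit of zero this scheme reduces to value iteration, and as the limit grows it approaches policy iteration with exact evaluation, which is why an increasing limit is a natural compromise.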
public java.lang.String toString()
Description copied from class Solver: subclasses must return the solver name.
toString in class Solver<S extends State,A extends Action>
See also: Object.toString()
public final long getProcessTime()
getProcessTime in class Solver<S extends State,A extends Action>
public final int getIterations()
getIterations in class AbstractInfiniteSolver<S extends State,A extends Action>