java.lang.Object
jmdp.solvers.Solver<S,A>
jmdp.solvers.AbstractInfiniteSolver<S,A>
jmdp.solvers.AbstractDiscountedSolver<S,A>
jmdp.solvers.ValueIterationSolver<S,A>
public class ValueIterationSolver<S extends State,A extends Action>
This class is one of the default solvers included in the jmdp package. It extends Solver and should only be used on infinite-horizon problems. Given a problem structure, its purpose is to return an optimal policy.
Field Summary | |
---|---|
protected boolean | average |
protected A | bestAction |
protected double | epsilon |
protected boolean | errorBounds |
protected boolean | gaussSeidel |
protected double | initVal |
protected int | iterations |
protected boolean | modifiedAverage |
protected long | processTime |
Fields inherited from class jmdp.solvers.AbstractDiscountedSolver |
---|
discountFactor, interestRate |
Fields inherited from class jmdp.solvers.Solver |
---|
policy, printProcessTime, printValueFunction, problem, solved, valueFunction |
Constructor Summary |
---|
ValueIterationSolver(CTMDP<S,A> problem, double discountFactor): Accepts only a problem of type CTMDP, because this solver is designed to work only on infinite-horizon problems. |
ValueIterationSolver(DTMDP<S,A> problem, double interestRate): Accepts only a problem of type DTMDP, because this solver is designed to work only on infinite-horizon problems. |
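The two constructors take either an interest rate or a discount factor. As a minimal sketch of how these two quantities are usually related in discrete-time discounting (discountFactor = 1 / (1 + interestRate)), the hypothetical helper below converts between them. The class DiscountConversionSketch and its methods are illustrative names that are not part of jmdp, and whether the solver applies exactly this conversion internally is an assumption.

```java
/** Hypothetical helper (not part of jmdp): interest rate <-> discount factor. */
final class DiscountConversionSketch {

    /** Per-period discount factor for a per-period interest rate r: 1 / (1 + r). */
    static double discountFromInterest(double interestRate) {
        if (interestRate <= -1.0) {
            throw new IllegalArgumentException("interestRate must be greater than -1");
        }
        return 1.0 / (1.0 + interestRate);
    }

    /** Inverse of discountFromInterest: r = 1 / beta - 1. */
    static double interestFromDiscount(double discountFactor) {
        if (discountFactor <= 0.0) {
            throw new IllegalArgumentException("discountFactor must be positive");
        }
        return 1.0 / discountFactor - 1.0;
    }

    public static void main(String[] args) {
        double r = 0.25;
        double beta = discountFromInterest(r);
        System.out.println("interestRate=" + r + " -> discountFactor=" + beta);
        System.out.println("round trip -> interestRate=" + interestFromDiscount(beta));
    }
}
```

A larger interest rate gives a smaller discount factor, so future rewards count for less.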
Method Summary | |
---|---|
protected double | bestAction(S i): Sets the best action to take in state i, in the field bestAction. |
protected double | computeNoErrorBounds() |
protected double | computeWithErrorBounds() |
protected double | future(S i, A a, double discountF) |
protected double | future(S i, A a, double discountF, ValueFunction<S> vf): Expected value of valueFunction for the current state and a specified action. |
int | getIterations() |
long | getProcessTime() |
protected void | init(): Initializes the valueFunction for all the states. |
void | setEpsilon(double epsilon): Value iteration is a solution method that theoretically converges only after infinitely many iterations. |
void | setGaussSeidel(boolean val): The Gauss-Seidel modification of the value iteration method is guaranteed to perform at least as well as the method without the modification. |
void | setInitVal(double val): All states have an initial valueFunction, which is 1 by default. |
Solution<S,A> | solve(): Called to solve the problem. |
java.lang.String | toString(): Subclasses must return the solver name. |
void | useErrorBounds(boolean val): The ErrorBounds modification of the value iteration method is guaranteed to perform at least as well as the method without the modification. |
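The roles of setEpsilon and setInitVal can be illustrated with a standalone sketch of value iteration on a two-state, two-action problem. Everything in it is an assumption made for illustration: the class name, the arrays P and C, cost minimization as the objective, and stopping when the largest change between two sweeps is at most epsilon. It does not use or reproduce the jmdp classes.

```java
import java.util.Arrays;

/** Hypothetical standalone sketch of value iteration (not the jmdp implementation). */
final class ValueIterationSketch {
    // P[a][i][j]: probability of moving from state i to state j under action a.
    static final double[][][] P = {
        {{1.0, 0.0}, {0.0, 1.0}},   // action 0: stay in the current state
        {{0.0, 1.0}, {1.0, 0.0}}    // action 1: switch to the other state
    };
    // C[a][i]: immediate cost of taking action a in state i.
    static final double[][] C = {{1.0, 0.5}, {2.0, 3.0}};

    static final class Result {
        final double[] values;
        final int iterations;
        Result(double[] values, int iterations) {
            this.values = values;
            this.iterations = iterations;
        }
    }

    /** Repeats Bellman sweeps until the largest change is at most epsilon. */
    static Result solve(double beta, double epsilon, double initVal) {
        int n = C[0].length;
        double[] v = new double[n];
        Arrays.fill(v, initVal);            // same starting value for every state
        int iterations = 0;
        double maxDiff;
        do {
            double[] next = new double[n];
            maxDiff = 0.0;
            for (int i = 0; i < n; i++) {
                double best = Double.POSITIVE_INFINITY;
                for (int a = 0; a < C.length; a++) {
                    double q = C[a][i];     // immediate cost
                    for (int j = 0; j < n; j++) {
                        q += beta * P[a][i][j] * v[j];  // discounted expected future cost
                    }
                    best = Math.min(best, q);
                }
                next[i] = best;
                maxDiff = Math.max(maxDiff, Math.abs(best - v[i]));
            }
            v = next;
            iterations++;
        } while (maxDiff > epsilon);
        return new Result(v, iterations);
    }

    public static void main(String[] args) {
        Result r = solve(0.9, 1e-9, 0.0);
        System.out.println(Arrays.toString(r.values) + " after " + r.iterations + " sweeps");
    }
}
```

For this toy problem with a discount factor of 0.9, the exact values are 6.5 and 5. A smaller epsilon brings the result closer to them at the cost of more sweeps.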
Methods inherited from class jmdp.solvers.AbstractDiscountedSolver |
---|
getInterestRate, setDiscountFactor, setInterestRate |
Methods inherited from class jmdp.solvers.AbstractInfiniteSolver |
---|
getDiscreteProblem, getProblem, printSolution |
Methods inherited from class jmdp.solvers.Solver |
---|
getOptimalPolicy, getOptimalValueFunction, getValueFunction, isSolved, printSolution, setPrintProcessTime, setPrintValueFunction |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
protected double epsilon
protected double initVal
protected boolean gaussSeidel
protected boolean errorBounds
protected boolean average
protected long processTime
protected int iterations
protected A bestAction
protected boolean modifiedAverage
Constructor Detail |
---|
public ValueIterationSolver(DTMDP<S,A> problem, double interestRate)
problem
- the structure of the problem, of type DTMDP
interestRate
- represents how much less a reward is worth when it is received in the next period instead of the present period.
public ValueIterationSolver(CTMDP<S,A> problem, double discountFactor)
problem
- the structure of the problem, of type CTMDP
discountFactor
- represents how much less a reward is worth when it is received in the next period instead of the present period.
Method Detail |
---|
public void setEpsilon(double epsilon)
epsilon
- maximum difference between iterations.
public void setInitVal(double val)
val
- initial valueFunction for all states.
public void setGaussSeidel(boolean val)
val
- sets whether or not the Gauss-Seidel modification will be used.
See also: useErrorBounds(boolean)
public void useErrorBounds(boolean val)
val
- sets whether or not to use the ErrorBounds modification.
public Solution<S,A> solve()
Called to solve the problem.
solve in class Solver<S extends State,A extends Action>
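The Gauss-Seidel modification toggled by setGaussSeidel(boolean) is commonly understood as updating the value function in place, so that states updated earlier in a sweep are already used by states updated later in the same sweep. The sketch below compares both sweep styles on a hypothetical two-state problem in which the cheap action always moves to the other state. All names, the cost-minimization objective, and the stopping rule are assumptions for illustration, not the jmdp implementation.

```java
/** Hypothetical comparison of standard and Gauss-Seidel sweeps (not the jmdp code). */
final class GaussSeidelSketch {
    // P[a][i][j]: probability of moving from state i to state j under action a.
    static final double[][][] P = {
        {{0.0, 1.0}, {1.0, 0.0}},   // action 0: move to the other state
        {{1.0, 0.0}, {0.0, 1.0}}    // action 1: stay
    };
    // C[a][i]: immediate cost of taking action a in state i.
    static final double[][] C = {{1.0, 2.0}, {10.0, 10.0}};

    static final class Result {
        final double[] values;
        final int sweeps;
        Result(double[] values, int sweeps) {
            this.values = values;
            this.sweeps = sweeps;
        }
    }

    static Result solve(double beta, double epsilon, boolean gaussSeidel) {
        int n = C[0].length;
        double[] v = new double[n];          // all values start at 0
        int sweeps = 0;
        double maxDiff;
        do {
            // Gauss-Seidel reads the array it is writing to; the standard
            // sweep reads a frozen copy of the previous sweep's values.
            double[] read = gaussSeidel ? v : v.clone();
            maxDiff = 0.0;
            for (int i = 0; i < n; i++) {
                double best = Double.POSITIVE_INFINITY;
                for (int a = 0; a < C.length; a++) {
                    double q = C[a][i];
                    for (int j = 0; j < n; j++) {
                        q += beta * P[a][i][j] * read[j];
                    }
                    best = Math.min(best, q);
                }
                maxDiff = Math.max(maxDiff, Math.abs(best - v[i]));
                v[i] = best;                 // visible immediately only if read == v
            }
            sweeps++;
        } while (maxDiff > epsilon);
        return new Result(v, sweeps);
    }

    public static void main(String[] args) {
        Result standard = solve(0.9, 1e-6, false);
        Result inPlace = solve(0.9, 1e-6, true);
        System.out.println("standard sweeps:     " + standard.sweeps);
        System.out.println("Gauss-Seidel sweeps: " + inPlace.sweeps);
    }
}
```

On this toy problem both variants reach the same values and the in-place variant needs fewer sweeps. How large the gain is on other problems depends on the transition structure and on the order in which states are visited.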
protected void init()
protected double computeNoErrorBounds()
protected double computeWithErrorBounds()
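The source does not describe what computeWithErrorBounds() computes. One common form of error bounds for discounted value iteration is the pair of MacQueen-Porteus bounds: if m and M are the smallest and largest change of the value function in the last sweep, the optimal value of state i lies between v(i) + beta/(1-beta) * m and v(i) + beta/(1-beta) * M. The sketch below stops when that interval is narrower than epsilon. It is an illustration under these assumptions, with hypothetical names and a cost-minimization objective, and it is not claimed to match the jmdp implementation.

```java
/** Hypothetical sketch of value iteration with MacQueen-Porteus bounds (not the jmdp code). */
final class ErrorBoundsSketch {
    // P[a][i][j]: probability of moving from state i to state j under action a.
    static final double[][][] P = {
        {{1.0, 0.0}, {0.0, 1.0}},   // action 0: stay in the current state
        {{0.0, 1.0}, {1.0, 0.0}}    // action 1: switch to the other state
    };
    // C[a][i]: immediate cost of taking action a in state i.
    static final double[][] C = {{1.0, 0.5}, {2.0, 3.0}};

    static final class Result {
        final double[] lower;
        final double[] upper;
        final int sweeps;
        Result(double[] lower, double[] upper, int sweeps) {
            this.lower = lower;
            this.upper = upper;
            this.sweeps = sweeps;
        }
    }

    static Result solve(double beta, double epsilon) {
        int n = C[0].length;
        double[] v = new double[n];          // all values start at 0
        double scale = beta / (1.0 - beta);
        double lo = 0.0;
        double hi = 0.0;
        int sweeps = 0;
        do {
            double[] next = new double[n];
            double minDiff = Double.POSITIVE_INFINITY;
            double maxDiff = Double.NEGATIVE_INFINITY;
            for (int i = 0; i < n; i++) {
                double best = Double.POSITIVE_INFINITY;
                for (int a = 0; a < C.length; a++) {
                    double q = C[a][i];
                    for (int j = 0; j < n; j++) {
                        q += beta * P[a][i][j] * v[j];
                    }
                    best = Math.min(best, q);
                }
                next[i] = best;
                minDiff = Math.min(minDiff, best - v[i]);
                maxDiff = Math.max(maxDiff, best - v[i]);
            }
            v = next;
            sweeps++;
            lo = scale * minDiff;            // offset of the lower bound
            hi = scale * maxDiff;            // offset of the upper bound
        } while (hi - lo > epsilon);         // stop when the interval is narrow enough
        double[] lower = new double[n];
        double[] upper = new double[n];
        for (int i = 0; i < n; i++) {
            lower[i] = v[i] + lo;
            upper[i] = v[i] + hi;
        }
        return new Result(lower, upper, sweeps);
    }

    public static void main(String[] args) {
        Result r = solve(0.9, 1e-9);
        for (int i = 0; i < r.lower.length; i++) {
            System.out.println("state " + i + ": [" + r.lower[i] + ", " + r.upper[i] + "]");
        }
        System.out.println("sweeps: " + r.sweeps);
    }
}
```

On this toy problem the changes become the same in every state after a few sweeps, so the interval closes long before the value function itself has stopped changing.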
protected double future(S i, A a, double discountF, ValueFunction<S> vf)
discountF
- the rate for discounting from one period to another, that is, how much less one unit of reward is worth when it is received in the next period instead of the present period.
protected double future(S i, A a, double discountF)
protected double bestAction(S i)
i
- state for which the best action is being determined
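The relationship between future(...) and bestAction(...) can be sketched as a one-step lookahead: future gives the discounted expected value of the next state, and bestAction picks the action with the lowest immediate cost plus future value, stores it in a field, and returns that value. The array-based types, the cost-minimization objective, the extra parameters on bestAction, and the choice to apply the discount inside future are all assumptions for illustration. The real methods work on the generic types S, A and ValueFunction<S>.

```java
/** Hypothetical one-step lookahead sketch (not the jmdp implementation). */
final class LookaheadSketch {
    final double[][][] p;   // p[a][i][j]: probability of i -> j under action a
    final double[][] c;     // c[a][i]: immediate cost of action a in state i
    int bestAction = -1;    // set by bestAction(...), as the field is in the documented class

    LookaheadSketch(double[][][] p, double[][] c) {
        this.p = p;
        this.c = c;
    }

    /** Discounted expected value of vf after taking action a in state i. */
    double future(int i, int a, double discountF, double[] vf) {
        double sum = 0.0;
        for (int j = 0; j < vf.length; j++) {
            sum += p[a][i][j] * vf[j];
        }
        return discountF * sum;
    }

    /** Stores the minimizing action for state i and returns its value. */
    double bestAction(int i, double discountF, double[] vf) {
        double best = Double.POSITIVE_INFINITY;
        for (int a = 0; a < c.length; a++) {
            double q = c[a][i] + future(i, a, discountF, vf);
            if (q < best) {
                best = q;
                bestAction = a;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][][] p = {
            {{1.0, 0.0}, {0.0, 1.0}},   // action 0: stay
            {{0.0, 1.0}, {1.0, 0.0}}    // action 1: switch
        };
        double[][] c = {{1.0, 0.5}, {2.0, 3.0}};
        LookaheadSketch s = new LookaheadSketch(p, c);
        double[] vf = {6.5, 5.0};
        double value = s.bestAction(0, 0.9, vf);
        System.out.println("state 0: action " + s.bestAction + ", value " + value);
    }
}
```

Applying bestAction to every state once corresponds to one sweep of value iteration, and applying it to a converged value function yields a policy.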
public final long getProcessTime()
getProcessTime in class Solver<S extends State,A extends Action>
public final int getIterations()
getIterations in class AbstractInfiniteSolver<S extends State,A extends Action>
public java.lang.String toString()
Subclasses must return the solver name.
toString in class Solver<S extends State,A extends Action>
See also: Object.toString()