Environment rewards across endophenotypes controlled by the RL model
Transition Description | P (f1) | P (f2) | P (f3) | P (f4) | State group |
---|---|---|---|---|---|
T(s=i|s=i,a=as=i), i neutral state | 0 | 0 | 0 | 0 | From States |
T(s=i+j|s=i,a=as=i+j), j=+1/-1, i neutral state, i+j neutral state | 0 | 0 | 0 | 0 | |
T(s=i|s=i,a=as=i+j), j=+1/-1, i neutral state, i+j neutral state | 0 | 0 | 0 | 0 | |
T(s=i+k|s=i,a=as=i+k),k!=+1/-1, i neutral state, i+k neutral state | -0.3 | -0.3 | -0.3 | -0.3 | |
T(s=i|s=i,a=as=i+k),k!=+1/-1, i neutral state, i+k neutral state | 0 | 0 | 0 | 0 | |
T(s=i|s=i,a=aw), i neutral state | 0 | 0 | 0 | 0 | |
T(s=1|s=2,a=ag) | 0 | 0 | 0 | 0 | |
T(s=i|s=i,a=ag), i!=2 neutral state | 0 | 0 | 0 | 0 | |
T(s=8|s=7,a=ad) | 0 | 10 | -1 | 10 | |
T(s=i|s=i,a=ad), i!=7 neutral state | 0 | 0 | 0 | 0 | |
T(s=i|s=i,a=ag), i drug/aft state | -0.3 | -1.2 | -1.2 | -1.2 | From Drug/aft States |
T(s=4|s=i,a=ag), i drug/aft state | -4 | -4 | -4 | -4 | |
T(s=i|s=i,a=as=*), i drug/aft state | -0.3 | -1.2 | -1.2 | -1.2 | |
T(s=4|s=i,a=as=*), i drug/aft state | -4 | -4 | -4 | -4 | |
T(s=j|s=i,a=aw), i!=15 drug/aft state, j next or previous drug/aft state | -0.3 | -1.2 | -1.2 | -1.2 | |
T(s=4|s=i,a=aw), i!=15 drug/aft state | -4 | -4 | -4 | -4 | |
T(s=14/16|s=15,a=aw) | -0.3 | -1.2 | -1.2 | -1.2 | |
T(s=4|s=15,a=aw) | -4 | -4 | -4 | -4 | |
T(s=j|s=i,a=ad), i drug/aft state, j next drug/aft state | -0.3 | -1.2 | -1.2 | -1.2 | |
T(s=j|s=i,a=ad), i drug/aft state, j previous drug/aft state | -0.3 | -1.2 | -1.2 | -1.2 | |
T(s=4|s=i,a=ad), i drug/aft state | -4 | -4 | -4 | -4 | |
T(s=4|s=1,a=ag) | 1 | 1 | 1 | 1 | Goal |
T(s=1|s=1,a=as=*) | 0 | 0 | 0 | 0 | |
T(s=1|s=1,a=aw) | 0 | 0 | 0 | 0 | |
T(s=1|s=1,a=ad) | 0 | 0 | 0 | 0 | |
Values that change across phases are shown in italics.
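
As a rough illustration of how a phase-dependent reward table like this could be encoded and queried, the sketch below copies five rows of the table into a Python dictionary. The key names (`s7_ad_to_s8` and so on), the `REWARDS` layout, and the helper functions are hypothetical and chosen only for this example; they are not taken from the authors' implementation.

```python
# Phases in the column order used by the table: P (f1) ... P (f4).
PHASES = ("f1", "f2", "f3", "f4")

# A few rows copied from the table. Keys are illustrative labels
# (hypothetical names), values are the rewards for phases f1..f4.
REWARDS = {
    "s7_ad_to_s8": (0, 10, -1, 10),              # T(s=8|s=7,a=ad)
    "drug_aft_ag_stay": (-0.3, -1.2, -1.2, -1.2),  # T(s=i|s=i,a=ag), i drug/aft state
    "drug_aft_ag_to_s4": (-4, -4, -4, -4),       # T(s=4|s=i,a=ag), i drug/aft state
    "s1_ag_to_s4": (1, 1, 1, 1),                 # T(s=4|s=1,a=ag), goal row
    "neutral_far_move": (-0.3, -0.3, -0.3, -0.3),  # T(s=i+k|s=i,a=as=i+k), k != +1/-1
}


def reward(transition: str, phase: str) -> float:
    """Return the tabulated reward for a transition label in a given phase."""
    return REWARDS[transition][PHASES.index(phase)]


def changes_across_phases(transition: str) -> bool:
    """True if the reward for this transition differs between phases."""
    return len(set(REWARDS[transition])) > 1


if __name__ == "__main__":
    for name in REWARDS:
        print(name, [reward(name, p) for p in PHASES], changes_across_phases(name))
```

For instance, `reward("s7_ad_to_s8", "f3")` looks up the third phase of the `T(s=8|s=7,a=ad)` row, and `changes_across_phases` flags the rows whose values differ between phases.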