Table 3.

Optimal policy across endophenotypes controlled by the RL model (2nd drug phase)

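| State number | State type | Action | Q value |
|---|---|---|---|
| 1 | Goal | a_g | 2.8967 |
| 2 | Neutral | a_g | 2.607 |
| 3 | Neutral | a_{s=2} | 2.3439 |
| 4 | Neutral | a_{s=3} | 2.1074 |
| 5 | Neutral | a_{s=4} | 1.8948 |
| 6 | Neutral | a_{s=5} | 1.7036 |
| 7 | Neutral | a_{s=6} | 1.5317 |
| 8 | Drug | a_d | -10.1134 |
| 9 | Drug-aftereffect | a_d | -10.3781 |
| 10 | Drug-aftereffect | a_w | -10.4882 |
| 11 | Drug-aftereffect | a_w | -10.2809 |
| 12 | Drug-aftereffect | a_w | -9.7099 |
| 13 | Drug-aftereffect | a_w | -8.6469 |
| 14 | Drug-aftereffect | a_w | -6.8532 |
| 15 | Drug-aftereffect | a_w | -3.9265 |
| 16 | Drug-aftereffect | a_d | -5.2928 |
| 17 | Drug-aftereffect | a_d | -6.4251 |
| 18 | Drug-aftereffect | a_d | -7.3633 |
| 19 | Drug-aftereffect | a_d | -8.1408 |
| 20 | Drug-aftereffect | a_d | -8.7849 |
| 21 | Drug-aftereffect | a_d | -9.318 |
| 22 | Drug-aftereffect | a_d | -9.7575 |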