Abstract: In many application areas there is a need to determine a control variable that optimizes a pre-specified objective. This problem is particularly challenging when knowledge of the underlying dynamics is subject to various sources of uncertainty. Such a scenario arises, for instance, in therapy individualization, which aims to improve the efficacy and safety of medical treatment. Mathematical models describing the pharmacokinetics and pharmacodynamics of a drug, together with data on associated biomarkers, can be leveraged to support decision-making by predicting therapy outcomes. We present a continuous learning strategy that follows a novel sequential Monte Carlo tree search approach, and we explore how the underlying uncertainties are reflected in the approximated control variable.