Document Type

Thesis - Open Access

Award Date


Degree Name

Master of Science (MS)


Electrical Engineering and Computer Science

First Advisor

Qiquan Qiao


Personalized medicine, Q-functions, Q-learning, Reinforcement Learning, SMART Design, Treatment regimen


Current pharmacological practice focuses on a single best treatment for a disease, which is impractical because the same treatment may not work the same way for every patient. There is therefore a need to shift from a disease-centric to a patient-centric approach, in which a patient's personal characteristics or biomarkers are used to determine a tailored optimal treatment. Personalized medicine as a research area rejects the “one size fits all” concept. The Sequential Multiple Assignment Randomized Trial (SMART) is a multi-stage trial design used to inform the development of dynamic treatment regimens (DTRs). In a SMART, a subject is randomized at each of several stages of treatment, where each stage corresponds to a treatment decision. These adaptive interventions are individualized and are repeatedly adjusted over time based on the patient's individual clinical characteristics and ongoing performance. Q-learning, a reinforcement learning algorithm, is used to optimize the sequence of treatments so as to maximize the desired clinical outcome. The statistical model uses regression analysis for function approximation on data from clinical trials. Given the biomarkers of a new participant, the model predicts a series of regimens over time, optimizing the weight management decision rules. Because reinforcement learning is a machine learning approach, implementing it requires training data from which the model can be trained, that is, from which the Q-functions can be approximated. The approximated functions are then evaluated and, after evaluation, tested further before the treatment rule is applied to future patients. Accordingly, in this thesis the dataset obtained from Sanford Health is first restructured to make it suitable for the model.
The restructured training data is used in regression analysis to approximate the Q-functions. The regression yields estimates of the coefficients associated with each covariate in the regression function. Model goodness-of-fit and the underlying assumptions of linear regression are assessed using the regression summary table and residual diagnostic plots. Because a two-stage SMART design is used, the Q-functions for the two stages are estimated through multiple regression with a linear model. Finally, after the adequacy of the fit is analyzed, the model is applied to prescribe treatment rules for future patients. The prognostic and predictive covariates of a new patient are acquired, and the optimal treatment rule at each decision stage is the treatment that yields the maximum estimated value of the Q-function. The estimated value of each regime was also computed using the value estimator function, and the regime with the maximum estimated value was chosen as the optimal treatment decision rule.
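The two-stage procedure described above can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes one scalar covariate per stage, treatments coded as -1 and +1, a linear Q-function with a covariate-by-treatment interaction, and a stage-2 Q-function that depends only on the stage-2 covariate rather than the full patient history. All function and variable names (`design`, `fit_two_stage_q`, `recommend`, `x1`, `a1`, and so on) are hypothetical.

```python
import numpy as np

def design(x, a):
    """Features of a linear Q-function: intercept, covariate, treatment, interaction."""
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    return np.column_stack([np.ones_like(x), x, a, x * a])

def fit_linear(features, target):
    """Ordinary least squares coefficient estimates."""
    coef, *_ = np.linalg.lstsq(features, target, rcond=None)
    return coef

def q_values(coef, x, treatments=(-1, 1)):
    """Estimated Q-value of every candidate treatment; shape (n, n_treatments)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.column_stack(
        [design(x, np.full_like(x, a)) @ coef for a in treatments]
    )

def fit_two_stage_q(x1, a1, x2, a2, y, treatments=(-1, 1)):
    """Backward induction: fit stage 2 on the outcome, then stage 1 on the pseudo-outcome."""
    # Stage 2: regress the final outcome on stage-2 covariate and treatment.
    coef2 = fit_linear(design(x2, a2), y)
    # Pseudo-outcome: best achievable estimated stage-2 Q-value for each subject.
    pseudo = q_values(coef2, x2, treatments).max(axis=1)
    # Stage 1: regress the pseudo-outcome on stage-1 covariate and treatment.
    coef1 = fit_linear(design(x1, a1), pseudo)
    return coef1, coef2

def recommend(coef, x, treatments=(-1, 1)):
    """Treatment with the maximum estimated Q-value for each patient."""
    best = q_values(coef, x, treatments).argmax(axis=1)
    return np.asarray(treatments)[best]
```

For a new patient, `recommend(coef1, x1_new)` would give the stage-1 treatment and, once the stage-2 covariate is observed, `recommend(coef2, x2_new)` the stage-2 treatment. Goodness-of-fit checks such as the residual diagnostics mentioned above are omitted from this sketch.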

Library of Congress Subject Headings

Personalized medicine.
Reinforcement learning.
Clinical trials.


Includes bibliographical references (pages 89-96)



Number of Pages



South Dakota State University


In Copyright - Educational Use Permitted

Included in

Biomedical Commons