The mDOT Center

CP 8: Center for Methodologies for Adapting and Personalizing Prevention, Treatment and Recovery Services for SUD and HIV (MAPS Center)

Collaborating Investigator:

Dr. Inbal Nahum-Shani, University of Michigan

 

Funding Status: 

P50DA054039-01

NIH/NIDA

9/1/21 – 6/30/26

 

Associated with:

TR&D2

Designing Reinforcement Learning Algorithms for Digital Interventions: Pre-implementation Guidelines
Authors:
Publication Venue:

Algorithms in Decision Support Systems

Publication Date:

July 22, 2022

Keywords:

reinforcement learning, online learning, mobile health, algorithm design, algorithm evaluation

Related Project:
Abstract:
Online reinforcement learning (RL) algorithms are increasingly used to personalize digital interventions in the fields of mobile health and online education. Common challenges in designing and testing an RL algorithm in these settings include ensuring the RL algorithm can learn and run stably under real-time constraints, and accounting for the complexity of the environment, e.g., a lack of accurate mechanistic models for the user dynamics. To guide how one can tackle these challenges, we extend the PCS (predictability, computability, stability) framework, a data science framework that incorporates best practices from machine learning and statistics in supervised learning, to the design of RL algorithms for the digital intervention setting. Furthermore, we provide guidelines on how to design simulation environments, a crucial tool for evaluating candidate RL algorithms using the PCS framework. We show how we used the PCS framework to design an RL algorithm for Oralytics, a mobile health study aiming to improve users’ tooth-brushing behaviors through the personalized delivery of intervention messages. Oralytics will go into the field in late 2022.
TL;DR:

Online RL algorithms for digital interventions must run stably in real time and cope with complex, hard-to-model environments. To address these challenges, the PCS framework, originally developed for supervised learning, is extended to guide the design of RL algorithms in such settings, including guidelines for building simulation environments. The approach is illustrated by the design of an RL algorithm for Oralytics, a mobile health study that aims to improve tooth-brushing behaviors through personalized intervention messages.
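
To make the simulation-environment guideline concrete, here is a minimal Python sketch of a test-bed loop for stress-testing a candidate algorithm before deployment. The toy user model, the epsilon-greedy stand-in candidate, and the `select_action`/`update` interface are illustrative assumptions for exposition, not the Oralytics environment or algorithm.

```python
# Minimal simulation test bed in the spirit of the PCS guidelines.
# Everything below is an illustrative assumption, not the Oralytics setup.
import numpy as np

rng = np.random.default_rng(0)

def simulated_user_response(context, action, noise_sd=1.0):
    """Toy outcome model: a prompt (action=1) helps more when recent
    engagement (context in [0, 1]) is low. Purely illustrative."""
    return 0.5 + context + action * 0.8 * (1.0 - context) + rng.normal(0.0, noise_sd)

class EpsilonGreedyCandidate:
    """Stand-in candidate exposing the interface the test bed assumes."""
    def __init__(self, epsilon=0.1):
        self.epsilon = epsilon
        self.sums = [0.0, 0.0]   # running reward totals per action
        self.counts = [1, 1]     # start at 1 to avoid division by zero

    def select_action(self, context):
        if rng.random() < self.epsilon:
            return int(rng.integers(2))
        return int(self.sums[1] / self.counts[1] > self.sums[0] / self.counts[0])

    def update(self, context, action, reward):
        self.sums[action] += reward
        self.counts[action] += 1

def evaluate(algorithm, n_users=50, horizon=70):
    """Average reward of a candidate over simulated user trajectories;
    repeated runs of this loop support predictability/stability checks."""
    total = 0.0
    for _ in range(n_users):
        for _ in range(horizon):
            context = rng.uniform()
            action = algorithm.select_action(context)
            reward = simulated_user_response(context, action)
            algorithm.update(context, action, reward)
            total += reward
    return total / (n_users * horizon)

print(evaluate(EpsilonGreedyCandidate()))
```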

Reward Design for an Online Reinforcement Learning Algorithm Supporting Oral Self-Care
Authors:
Publication Venue:

Conference on Innovative Applications of Artificial Intelligence (IAAI 2023)

Publication Date:

February 7, 2023

Keywords:

reinforcement learning, online learning, mobile health, algorithm design, algorithm evaluation

Related Project:
Abstract:

Dental disease is one of the most common chronic diseases despite being largely preventable. However, professional advice on optimal oral hygiene practices is often forgotten or abandoned by patients. Therefore, patients may benefit from timely and personalized encouragement to engage in oral self-care behaviors. In this paper, we develop an online reinforcement learning (RL) algorithm for use in optimizing the delivery of mobile-based prompts to encourage oral hygiene behaviors. One of the main challenges in developing such an algorithm is ensuring that the algorithm considers the impact of the current action on the effectiveness of future actions (i.e., delayed effects), especially when the algorithm has been made simple in order to run stably and autonomously in a constrained, real-world setting characterized by highly noisy, sparse data. We address this challenge by designing a quality reward that maximizes the desired health outcome (i.e., high-quality brushing) while minimizing user burden. We also highlight a procedure for optimizing the hyperparameters of the reward by building a simulation environment test bed and evaluating candidates using the test bed. The RL algorithm discussed in this paper will be deployed in Oralytics, an oral self-care app that provides behavioral strategies to boost patient engagement in oral hygiene practices.

TL;DR:

We develop an online reinforcement learning (RL) algorithm to optimize the delivery of mobile-based prompts encouraging oral hygiene behaviors, designing a quality reward that balances the desired health outcome (high-quality brushing) against user burden, with reward hyperparameters tuned in a simulation test bed.
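
As a rough illustration of the reward-design idea: reward high-quality brushing while subtracting a burden penalty that grows with recently delivered prompts. The functional form, the 120-second target, and the `burden_weight` hyperparameter below are assumptions for exposition, not the reward deployed in Oralytics.

```python
# Illustrative quality reward: health outcome minus a burden penalty.
# The exact functional form and hyperparameters are assumptions,
# not the reward actually deployed in Oralytics.
def quality_reward(brushing_seconds, recent_prompt_count, action,
                   target_seconds=120, burden_weight=0.5):
    quality = min(brushing_seconds, target_seconds)  # cap so over-brushing isn't rewarded
    # Penalize sending a prompt (action=1) more heavily when the user has
    # already received many prompts recently, to limit burden and habituation.
    burden = burden_weight * recent_prompt_count * action
    return quality - burden
```

Hyperparameters such as `burden_weight` would then be selected by simulating user trajectories in a test bed and comparing candidate values, mirroring the procedure the paper describes.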

Statistical Inference after Adaptive Sampling for Longitudinal Data
Authors:
Publication Venue:
arXiv:2202.07098
Publication Date:

April 19, 2023

Keywords:
adaptive sampling algorithms, statistical inference, machine learning, longitudinal data
Related Projects:
Abstract:

Online reinforcement learning and other adaptive sampling algorithms are increasingly used in digital intervention experiments to optimize treatment delivery for users over time. In this work, we focus on longitudinal user data collected by a large class of adaptive sampling algorithms that are designed to optimize treatment decisions online using accruing data from multiple users. Combining or “pooling” data across users allows adaptive sampling algorithms to potentially learn faster. However, by pooling, these algorithms induce dependence between the sampled user data trajectories; we show that this can cause standard variance estimators for i.i.d. data to underestimate the true variance of common estimators on this data type. We develop novel methods to perform a variety of statistical analyses on such adaptively sampled data via Z-estimation. Specifically, we introduce the adaptive sandwich variance estimator, a corrected sandwich estimator that leads to consistent variance estimates under adaptive sampling. Additionally, to prove our results we develop novel theoretical tools for empirical processes on non-i.i.d., adaptively sampled longitudinal data which may be of independent interest. This work is motivated by our efforts in designing experiments in which online reinforcement learning algorithms optimize treatment decisions, yet statistical inference is essential for conducting analyses after experiments conclude.

TL;DR:
Pooling data across users lets adaptive sampling algorithms learn faster, but it induces dependence between user trajectories, so standard i.i.d. variance estimators can underestimate the true variance. We introduce the adaptive sandwich variance estimator, a corrected sandwich estimator that yields consistent variance estimates for Z-estimation on adaptively sampled longitudinal data.
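
For background, here is the classical sandwich construction that the adaptive estimator corrects, written in our own notation rather than the paper's. For a Z-estimator \(\hat{\theta}_n\) solving \(\sum_{i=1}^{n} \psi(D_i; \theta) = 0\) on independent data \(D_1, \dots, D_n\), the standard sandwich variance estimator is

\[
\widehat{\operatorname{Var}}(\hat{\theta}_n) = \frac{1}{n}\, \hat{B}_n^{-1} \hat{M}_n \hat{B}_n^{-\top},
\qquad
\hat{B}_n = \frac{1}{n} \sum_{i=1}^{n} \frac{\partial}{\partial \theta} \psi(D_i; \hat{\theta}_n),
\qquad
\hat{M}_n = \frac{1}{n} \sum_{i=1}^{n} \psi(D_i; \hat{\theta}_n)\, \psi(D_i; \hat{\theta}_n)^{\top}.
\]

When pooling makes the user trajectories \(D_i\) dependent, the "meat" term \(\hat{M}_n\) no longer consistently estimates the variability of the estimating function; the adaptive sandwich estimator replaces it with a term that accounts for this dependence (see the paper for the exact form).
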
Intervention Optimization: A Paradigm Shift & Its Potential Implications for Clinical Psychology
Authors:
Publication Venue:

To appear in Volume 20 of the Annual Review of Clinical Psychology, 2023

Publication Date:

2023

Keywords:

engagement, oral health, mobile health intervention, racial and ethnic minority group, message development

Related Projects:
Overview:
This work was part of our collaboration on the MiWaves Trial. More information on our work on the MiWaves trial can be found below:

We have been developing the MiWaves RL algorithm for cannabis reduction (CP8); this algorithm uses a more flexible approach to pooling data across participants. In particular, the algorithm pools data across participants only to the extent that they respond similarly: if the accruing data indicate high heterogeneity between participants, it pools their data only minimally when learning which intervention option to provide. To accomplish this, the MiWaves RL algorithm uses a classical tool from statistics, mixed effects models. The MiWaves trial is scheduled to be piloted in October 2023.
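
A minimal sketch of the partial-pooling intuition behind a mixed effects approach follows; the empirical-Bayes shrinkage form and variable names are illustrative assumptions, not the actual MiWaves algorithm.

```python
# Partial pooling via empirical-Bayes style shrinkage: each participant's
# estimate is pulled toward the population mean, with less pooling when
# between-participant heterogeneity is high. Illustrative only.
import numpy as np

def partial_pool(user_means, user_vars, between_var):
    """Shrink per-user effect estimates toward the population mean."""
    user_means = np.asarray(user_means, dtype=float)
    user_vars = np.asarray(user_vars, dtype=float)
    pop_mean = user_means.mean()
    # With high between-user variance, weights -> 1 (little pooling);
    # with low between-user variance, weights -> 0 (heavy pooling).
    weights = between_var / (between_var + user_vars)
    return weights * user_means + (1 - weights) * pop_mean

# Example: three participants with equally noisy per-person estimates.
print(partial_pool(user_means=[1.2, 0.3, 2.0],
                   user_vars=[0.5, 0.5, 0.5],
                   between_var=0.1))   # low heterogeneity: heavy pooling
```
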
Dyadic Reinforcement Learning
Authors:
Publication Venue:

arXiv:2308.07843

Publication Date:

November 3, 2023

Keywords:

dyadic reinforcement learning, online learning, mobile health, algorithm design

Related Project:
Abstract:

Mobile health aims to enhance health outcomes by delivering interventions to individuals as they go about their daily life. The involvement of care partners and social support networks often proves crucial in helping individuals manage burdensome medical conditions. This presents opportunities in mobile health to design interventions that target the dyadic relationship — the relationship between a target person and their care partner — with the aim of enhancing social support. In this paper, we develop dyadic RL, an online reinforcement learning algorithm designed to personalize intervention delivery based on contextual factors and past responses of a target person and their care partner. Here, multiple sets of interventions impact the dyad across multiple time intervals. The developed dyadic RL is Bayesian and hierarchical. We formally introduce the problem setup, develop dyadic RL and establish a regret bound. We demonstrate dyadic RL’s empirical performance through simulation studies on both toy scenarios and on a realistic test bed constructed from data collected in a mobile health study.

TL;DR:

In this paper, we develop dyadic RL, an online reinforcement learning algorithm designed to personalize intervention delivery based on contextual factors and past responses of a target person and their care partner.
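
The following is a hedged Python sketch of the Bayesian flavor of such an algorithm: Thompson sampling over a linear reward model whose features combine the target person's and care partner's contexts. The feature map, prior, and two-action setup are illustrative assumptions; the paper's algorithm is hierarchical and handles multiple intervention sets over multiple time intervals.

```python
# Thompson sampling with a Gaussian posterior over a linear reward model
# on dyadic features. Illustrative assumptions throughout, not the
# paper's exact hierarchical model.
import numpy as np

rng = np.random.default_rng(0)

class LinearTS:
    def __init__(self, dim, prior_var=1.0, noise_var=1.0):
        self.precision = np.eye(dim) / prior_var   # posterior precision
        self.b = np.zeros(dim)                     # precision-weighted mean
        self.noise_var = noise_var

    def features(self, target_ctx, partner_ctx, action):
        # Dyadic feature map: action interacts with both members' contexts.
        return action * np.array([1.0, target_ctx, partner_ctx])

    def select_action(self, target_ctx, partner_ctx):
        cov = np.linalg.inv(self.precision)
        theta = rng.multivariate_normal(cov @ self.b, cov)  # posterior draw
        gain = self.features(target_ctx, partner_ctx, 1) @ theta
        return int(gain > 0)   # intervene if the sampled gain is positive

    def update(self, target_ctx, partner_ctx, action, reward):
        x = self.features(target_ctx, partner_ctx, action)
        self.precision += np.outer(x, x) / self.noise_var
        self.b += x * reward / self.noise_var

agent = LinearTS(dim=3)
action = agent.select_action(target_ctx=0.2, partner_ctx=0.7)
agent.update(0.2, 0.7, action, reward=1.0)
```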

The focus of the MAPS Center is the development, evaluation, and dissemination of novel research methodologies that are essential for optimizing adaptive interventions to combat SUD/HIV. Project 2 focuses on developing innovative methods that will enable scientists, for the first time, to optimize the integration of human-delivered services with relatively low-intensity adaptation (i.e., adaptive interventions) and digital services with high-intensity adaptation (i.e., JITAIs). This project will develop a new trial design in which individuals can be randomized simultaneously to human-delivered and digital interventions at different time scales. This includes developing guidelines for trial design, sample size calculators, and statistical analysis methods that will enable scientists to use data from the new experimental design to address novel questions about synergies between human-delivered adaptive interventions and digital JITAIs. Project 3 focuses on developing innovative methods to optimize JITAIs in which the decision rules are continually updated to ensure effective adaptation as individual needs and societal trends change. Integrating approaches from artificial intelligence and statistics, this project will develop algorithms that continually update “population-based” decision rules (designed to work well for all individuals on average) to improve intervention effectiveness. This project will also generalize these algorithms to continually optimize “person-specific” decision rules for JITAIs. The algorithms will be designed specifically to (a) assign each individual the intervention that is right for them at a particular moment; (b) maintain acceptable levels of burden; and (c) maintain engagement.
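
To illustrate the kind of design Project 2 targets, here is a small sketch (our own toy construction, not the Project 2 protocol) in which each participant is randomized weekly to a human-delivered coaching option and several times per day to a digital prompt, in the style of a micro-randomized trial:

```python
# Toy two-time-scale randomization schedule: weekly randomization to a
# human-delivered coaching option, plus several-times-daily randomization
# to a digital prompt. Illustrative assumptions, not the MAPS design.
import numpy as np

rng = np.random.default_rng(0)

def simulate_participant(weeks=4, decisions_per_day=3, prompt_prob=0.5):
    schedule = []
    for week in range(weeks):
        coaching = int(rng.integers(2))     # weekly-scale randomization (0/1)
        for day in range(7):
            for slot in range(decisions_per_day):
                prompt = int(rng.random() < prompt_prob)  # daily-scale randomization
                schedule.append((week, day, slot, coaching, prompt))
    return schedule

rows = simulate_participant()
print(rows[:3])   # (week, day, slot, coaching, prompt)
```
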

Project 3 of MAPS aims to collaborate with TR&D2 (Murphy) by developing methods for appropriately pooling data from multiple users to speed up learning of both population-based and personalized decision rules. These collaborations will be used to enhance the impact of TR&D2’s Aims 2 and 3 and thus lay the foundation for successful future research projects. Project 3 of MAPS aims to collaborate with TR&D1 (Marlin) by utilizing TR&D1’s advances in representing and propagating uncertainty in Project 3’s development of methods for adapting the timing and location of delivery of different intervention prompts. These collaborations will increase the impact of TR&D1’s Aims 1 and 2. Project 2 of MAPS plans to collaborate with TR&D1 (Marlin) to develop a composite substance use risk indicator, derived from sensor data, that can be assessed at different time scales and hence can inform the adaptation of both human-delivered and digital interventions; and to collaborate with TR&D2 (Murphy) to develop optimization methods for learning what types of digital interventions are best delivered, and under what conditions, in a setting in which non-digital (human-delivered) interventions are also provided. This is an extreme case of TR&D2’s Aim 3, which focuses on multiple intervention components delivered at different time scales and with different short-term objectives. As such, this collaboration has the potential to synergistically enhance both TR&Ds’ aims as well as MAPS Project 2’s aims.

Category

CP, Drug Use, HIV, TR&D2
