CP 2: Personalized Digital Behavior Change Interventions to Promote Oral Health

Collaborating Investigator:

Dr. Vivek Shetty; University of California, Los Angeles

Funding Status:

UG3DE028723

NIH/NIDCR/HHS

4/1/19 – 3/31/27

Associated with:

TR&D1, TR&D2, TR&D3


TR&D2: Optimization

Designing Reinforcement Learning Algorithms for Digital Interventions: Pre-implementation Guidelines

Authors:

Nassal, M., Sugavanam, N., Aramendi, E., Jaureguibeitia, X., Elola, A., Panchal, A., Ulintz, A., Wang, H., Ertin, E.

Publication Venue:

Proceedings of the 57th Annual Hawaii International Conference on System Sciences, HICSS 2024

Publication Date:

January 3, 2024

Keywords:

Artificial Intelligence (AI), cardiac arrest, resuscitation, end tidal capnography, reinforcement learning

View Full Paper

Related Project:

CP10

Abstract:

Artificial Intelligence (AI) and machine learning have advanced healthcare by defining relationships in complex conditions. Out-of-hospital cardiac arrest (OHCA) is a medically complex condition with several etiologies. Survival for OHCA has remained static at 10% for decades in the United States. Treatment of OHCA requires the coordination of numerous interventions, including the delivery of multiple medications. Current resuscitation algorithms follow a single strict pathway, regardless of fluctuating cardiac physiology. OHCA resuscitation requires a real-time biomarker that can guide interventions to improve outcomes. End tidal capnography (ETCO2) is commonly implemented by emergency medical services professionals in resuscitation and can serve as an ideal biomarker for resuscitation. However, there are no effective conceptual frameworks utilizing the continuous ETCO2 data. In this manuscript, we detail a conceptual framework using AI and machine learning techniques to leverage ETCO2 in guided resuscitation.

TL;DR:

This publication proposes a conceptual framework for utilizing Artificial Intelligence (AI) and machine learning to create End Tidal Capnography (ETCO2) guided resuscitation for Out-of-Hospital Cardiac Arrest (OHCA). The aim is to move beyond rigid, fixed-interval resuscitation algorithms by leveraging continuous ETCO2 data as a real-time biomarker, alongside other physiological measurements, to develop personalized, dynamic interventions that are responsive to a patient’s evolving cardiac physiology. This approach seeks to improve the currently static survival rates for OHCA by enabling a deeper analysis of ETCO2 trends in relation to patient characteristics and interventions, potentially revealing “hidden” patterns and allowing for reward-based algorithms to guide optimal treatment strategies.

Reward Design for an Online Reinforcement Learning Algorithm Supporting Oral Self-Care

Authors:

Anna L. Trella, Kelly W. Zhang, Inbal Nahum-Shani, Vivek Shetty, Finale Doshi-Velez, Susan A. Murphy

Publication Venue:

Conference on Innovative Applications of Artificial Intelligence (IAAI 2023)

Publication Date:

February 7, 2023

Keywords:

reinforcement learning, online learning, mobile health, algorithm design, algorithm evaluation

View Full Paper

Related Project:

CP 2, CP 8

Abstract:

Dental disease is one of the most common chronic diseases despite being largely preventable. However, professional advice on optimal oral hygiene practices is often forgotten or abandoned by patients. Therefore patients may benefit from timely and personalized encouragement to engage in oral self-care behaviors. In this paper, we develop an online reinforcement learning (RL) algorithm for use in optimizing the delivery of mobile-based prompts to encourage oral hygiene behaviors. One of the main challenges in developing such an algorithm is ensuring that the algorithm considers the impact of the current action on the effectiveness of future actions (i.e., delayed effects), especially when the algorithm has been made simple in order to run stably and autonomously in a constrained, real-world setting (i.e., highly noisy, sparse data). We address this challenge by designing a quality reward which maximizes the desired health outcome (i.e., high-quality brushing) while minimizing user burden. We also highlight a procedure for optimizing the hyperparameters of the reward by building a simulation environment test bed and evaluating candidates using the test bed. The RL algorithm discussed in this paper will be deployed in Oralytics, an oral self-care app that provides behavioral strategies to boost patient engagement in oral hygiene practices.

TL;DR:

In this paper, we develop an online reinforcement learning (RL) algorithm for use in optimizing the delivery of mobile-based prompts to encourage oral hygiene behaviors.
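The reward idea described in the abstract above (favor high-quality brushing while discouraging over-prompting) can be illustrated with a minimal sketch. The function names, the exponentially weighted "dose" of recent prompts, and every numeric default below are hypothetical choices made for illustration; they are not the reward used in Oralytics.

```python
def update_dose(recent_dose, prompt_sent, decay=0.8):
    """Exponentially weighted average of past prompts (an assumed burden proxy)."""
    return decay * recent_dose + (1.0 - decay) * prompt_sent


def burden_adjusted_reward(brushing_quality, prompt_sent, recent_dose,
                           burden_cost=30.0, dose_threshold=0.6):
    """Brushing quality minus a fixed cost when a prompt adds to a high recent dose.

    brushing_quality : observed outcome after the decision point (e.g. seconds)
    prompt_sent      : 1 if a prompt was sent at this decision point, else 0
    recent_dose      : running average of recent prompts, in [0, 1]
    """
    # The cost applies only when a prompt is sent while the user has
    # already been prompted frequently in the recent past.
    over_prompted = prompt_sent == 1 and recent_dose > dose_threshold
    return brushing_quality - (burden_cost if over_prompted else 0.0)
```

In a sketch like this, `burden_cost` and `dose_threshold` play the role of reward hyperparameters, which is the kind of quantity the abstract describes tuning in a simulation test bed.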

Statistical Inference after Adaptive Sampling for Longitudinal Data

Authors:

Kelly Zhang, Lucas Janson, Susan Murphy

Publication Venue:

arXiv:2202.07098

Publication Date:

April 19, 2023

Keywords:

adaptive sampling algorithms, statistical inference, machine learning, longitudinal data

View Full Paper

Related Projects:

CP 2, CP 8

Abstract:

Online reinforcement learning and other adaptive sampling algorithms are increasingly used in digital intervention experiments to optimize treatment delivery for users over time. In this work, we focus on longitudinal user data collected by a large class of adaptive sampling algorithms that are designed to optimize treatment decisions online using accruing data from multiple users. Combining or “pooling” data across users allows adaptive sampling algorithms to potentially learn faster. However, by pooling, these algorithms induce dependence between the sampled user data trajectories; we show that this can cause standard variance estimators for i.i.d. data to underestimate the true variance of common estimators on this data type. We develop novel methods to perform a variety of statistical analyses on such adaptively sampled data via Z-estimation. Specifically, we introduce the adaptive sandwich variance estimator, a corrected sandwich estimator that leads to consistent variance estimates under adaptive sampling. Additionally, to prove our results we develop novel theoretical tools for empirical processes on non-i.i.d., adaptively sampled longitudinal data which may be of independent interest. This work is motivated by our efforts in designing experiments in which online reinforcement learning algorithms optimize treatment decisions, yet statistical inference is essential for conducting analyses after experiments conclude.

TL;DR:

In this work, we focus on longitudinal user data collected by a large class of adaptive sampling algorithms that are designed to optimize treatment decisions online using accruing data from multiple users. Combining or “pooling” data across users allows adaptive sampling algorithms to potentially learn faster.

Online Learning in Bandits with Predicted Context

Authors:

Yongyi Guo, Ziping Xu, Susan Murphy

Publication Venue:

arXiv:2307.13916

Publication Date:

October 31, 2023

Keywords:

contextual bandits, predicted context, online learning, machine learning

View Full Paper

Related Projects:

CP 2

Abstract:

We consider the contextual bandit problem where at each time, the agent only has access to a noisy version of the context and the error variance (or an estimator of this variance). This setting is motivated by a wide range of applications where the true context for decision-making is unobserved, and only a prediction of the context by a potentially complex machine learning algorithm is available. When the context error is non-vanishing, classical bandit algorithms fail to achieve sublinear regret. We propose the first online algorithm in this setting with sublinear regret guarantees under mild conditions. The key idea is to extend the measurement error model in classical statistics to the online decision-making setting, which is nontrivial due to the policy being dependent on the noisy context observations. We further demonstrate the benefits of the proposed approach in simulation environments based on synthetic and real digital intervention datasets.

TL;DR:

We propose the first online algorithm in this setting with sublinear regret guarantees under mild conditions.

Contextual Bandits with Budgeted Information Reveal

Authors:

Kyra Gan, Esmaeil Keyvanshokooh, Xueqing Liu, Susan Murphy

Publication Venue:

arXiv:2305.18511

Publication Date:

May 29, 2023

Keywords:

machine learning, optimization and control, contextual bandits, information reveal

View Full Paper

Related Projects:

CP 2

Abstract:

Contextual bandit algorithms are commonly used in digital health to recommend personalized treatments. However, to ensure the effectiveness of the treatments, patients are often requested to take actions that have no immediate benefit to them, which we refer to as pro-treatment actions. In practice, clinicians have a limited budget to encourage patients to take these actions and collect additional information. We introduce a novel optimization and learning algorithm to address this problem. This algorithm effectively combines the strengths of two algorithmic approaches in a seamless manner, including 1) an online primal-dual algorithm for deciding the optimal timing to reach out to patients, and 2) a contextual bandit learning algorithm to deliver personalized treatment to the patient. We prove that this algorithm admits a sub-linear regret bound. We illustrate the usefulness of this algorithm on both synthetic and real-world data.

TL;DR:

We present an innovative optimization and learning algorithm to tackle the challenge clinicians face with constrained budgets, aiming to incentivize patients to take actions and gather additional information.

Developing Message Strategies to Engage Racial and Ethnic Minority Groups in Digital Oral Self-Care Interventions: Participatory Co-Design Approach

Authors:

Stephanie M Carpenter; Zara M Greer; Rebecca Newman; Susan A Murphy; Vivek Shetty; Inbal Nahum-Shani

Publication Venue:

JMIR Formative Research

Publication Date:

December 11, 2023

Keywords:

engagement, oral health, mobile health intervention, racial and ethnic minority group, message development

View Full Paper

Related Projects:

CP 2

Abstract:

This publication describes research focused on creating effective digital messages to promote oral self-care, particularly daily toothbrushing, among diverse racial and ethnic minority groups. The goal was to develop engaging and appealing content for a smartphone application that would encourage better oral health habits. The development process involved collaborative design with dental experts and members of the target population, using web-based sessions to gather feedback while prioritizing participant anonymity and confidentiality. The findings indicate a preference for clear, enthusiastic, and relatable messages, suggesting valuable insights for future digital health interventions aimed at improving public health behaviors in vulnerable communities.

TL;DR:

This publication details formative research aimed at developing and refining digital oral health messages for a smartphone app called Oralytics, targeting diverse racial and ethnic minority populations to promote daily toothbrushing. The study utilized theoretically grounded strategies such as reciprocity, reciprocity-by-proxy, and curiosity to enhance engagement with oral self-care behaviors, the app, and the messages themselves. Messages were developed using a web-based participatory co-design approach involving dental experts, Amazon Mechanical Turk workers, and dental patients. This approach notably focused on mitigating anonymity and confidentiality concerns during participant feedback sessions, which were conducted via facilitator-mediated Zoom webinars. Participants rated the messages highly, with qualitative feedback emphasizing a preference for messages that were straightforward, enthusiastic, conversational, relatable, and authentic. The research provides insights into designing engaging digital health interventions for underserved populations, stressing the importance of identifying key stimuli, gathering multiple perspectives, and employing innovative, secure data collection methods.

Non-Stationary Latent Auto-Regressive Bandits

Authors:

Anna L. Trella, Walter Dempsey, Asim H. Gazi, Ziping Xu, Finale Doshi-Velez, Susan A. Murphy

Publication Venue:

arXiv:2402.03110

Publication Date:

February 3, 2024

Keywords:

contextual bandits, bandit algorithms, non-stationarity

View Full Paper

Related Projects:

CP 2

Abstract:

For the non-stationary multi-armed bandit (MAB) problem, many existing methods allow a general mechanism for the non-stationarity, but rely on a budget for the non-stationarity that is sub-linear to the total number of time steps T. In many real-world settings, however, the mechanism for the non-stationarity can be modeled, but there is no budget for the non-stationarity. We instead consider the non-stationary bandit problem where the reward means change due to a latent, auto-regressive (AR) state. We develop Latent AR LinUCB (LARL), an online linear contextual bandit algorithm that does not rely on the non-stationary budget, but instead forms good predictions of reward means by implicitly predicting the latent state. The key idea is to reduce the problem to a linear dynamical system which can be solved as a linear contextual bandit. In fact, LARL approximates a steady-state Kalman filter and efficiently learns system parameters online. We provide an interpretable regret bound for LARL with respect to the level of non-stationarity in the environment. LARL achieves sub-linear regret in this setting if the noise variance of the latent state process is sufficiently small with respect to T. Empirically, LARL outperforms various baseline methods in this non-stationary bandit problem.

TL;DR:

This paper introduces Latent AR LinUCB (LARL), an efficient online algorithm designed for non-stationary multi-armed bandit (MAB) problems. Unlike many existing methods that rely on a budget for non-stationarity, LARL addresses settings where reward means change due to an unbudgeted, latent, auto-regressive (AR) state. It effectively forms good predictions of reward means by implicitly predicting the latent state. This is achieved by reducing the problem to a linear dynamical system solvable as a linear contextual bandit, which approximates a steady-state Kalman filter and learns parameters online. LARL achieves sub-linear regret under specific conditions related to the latent state noise variance and outperforms various stationary and non-stationary baseline methods in simulations.
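To make the problem setting concrete, the following minimal sketch simulates reward means that drift with a latent first-order auto-regressive state. It illustrates only the environment described above, not the LARL algorithm itself. The function name, the linear "baseline plus loading times state" form, and all default values are assumptions made for illustration.

```python
import random


def simulate_latent_ar_means(num_steps, baselines, loadings,
                             ar_coef=0.9, state_noise_sd=0.1, seed=0):
    """Simulate per-arm reward means driven by a latent AR(1) state.

    The latent state follows z_t = ar_coef * z_{t-1} + Gaussian noise and is
    never revealed to the learner. Arm k has mean baselines[k] + loadings[k] * z_t.
    Returns a list of (z_t, [mean of each arm]) tuples, one per time step.
    """
    rng = random.Random(seed)
    z = 0.0
    history = []
    for _ in range(num_steps):
        # Advance the latent state by one AR(1) step.
        z = ar_coef * z + rng.gauss(0.0, state_noise_sd)
        means = [b + w * z for b, w in zip(baselines, loadings)]
        history.append((z, means))
    return history
```

Setting `state_noise_sd` to zero recovers a stationary bandit, which matches the abstract's remark that the regret guarantee depends on how small the latent noise variance is.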

A Deployed Online Reinforcement Learning Algorithm in an Oral Health Clinical Trial

Authors:

Anna L. Trella, Kelly W. Zhang, Hinal Jajal, Inbal Nahum-Shani, Vivek Shetty, Finale Doshi-Velez, Susan Murphy

Publication Venue:

arXiv:2409.02069

Publication Date:

September 3, 2024

Keywords:

Reinforcement Learning, mHealth, Clinical Trials, Oral Health, Oral Self-Care Behaviors, Thompson Sampling, Contextual Bandits, Deployment, Replicability, Adaptive Interventions

View Full Paper

Related Projects:

CP 2

Abstract:

Dental disease is a prevalent chronic condition associated with substantial financial burden, personal suffering, and increased risk of systemic diseases. Despite widespread recommendations for twice-daily tooth brushing, adherence to recommended oral self-care behaviors remains sub-optimal due to factors such as forgetfulness and disengagement. To address this, we developed Oralytics, an mHealth intervention system designed to complement clinician-delivered preventative care for marginalized individuals at risk for dental disease. Oralytics incorporates an online reinforcement learning algorithm to determine optimal times to deliver intervention prompts that encourage oral self-care behaviors. We have deployed Oralytics in a registered clinical trial. The deployment required careful design to manage challenges specific to the clinical trials setting in the U.S. In this paper, we (1) highlight key design decisions of the RL algorithm that address these challenges and (2) conduct a re-sampling analysis to evaluate algorithm design decisions. A second phase (randomized controlled trial) of Oralytics is planned to start in spring 2025.

TL;DR:

This paper introduces Oralytics, an mHealth intervention system that leverages an online reinforcement learning (RL) algorithm to optimize the delivery of engagement prompts aimed at improving oral self-care behaviors (OSCB) in individuals at risk for dental disease. The authors detail the key design decisions for deploying this RL algorithm within a registered clinical trial setting, addressing challenges such as ensuring autonomy and replicability through methods like pre-scheduling actions and implementing fallback procedures, and dealing with limited per-individual data by using a full-pooling approach. Through re-sampling analysis, the study provides evidence that the algorithm successfully learned to identify states where sending prompts was effective or ineffective.
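The keywords above mention Thompson sampling, and the summary mentions full pooling. A deliberately simplified sketch of how the two fit together is shown below: a single Gaussian posterior per action is shared by all participants, and each decision samples from those posteriors. The class name, the non-contextual two-action model, and the known-noise-variance assumption are illustrative simplifications and do not describe the deployed Oralytics algorithm.

```python
import random


class PooledGaussianThompson:
    """Thompson sampling with one shared (fully pooled) posterior per action.

    Rewards for each action are modeled as Normal with unknown mean and
    known noise variance, with a Normal prior on the mean (assumed model).
    """

    def __init__(self, num_actions=2, prior_mean=0.0, prior_var=1.0,
                 noise_var=1.0, seed=0):
        self.means = [prior_mean] * num_actions
        self.vars = [prior_var] * num_actions
        self.noise_var = noise_var
        self.rng = random.Random(seed)

    def choose_action(self):
        # Draw one plausible mean reward per action, then act greedily on the draw.
        draws = [self.rng.gauss(m, v ** 0.5) for m, v in zip(self.means, self.vars)]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, action, reward):
        # Conjugate Normal-Normal update; data from every participant
        # updates the same posterior, which is what full pooling means here.
        precision = 1.0 / self.vars[action] + 1.0 / self.noise_var
        self.means[action] = (self.means[action] / self.vars[action]
                              + reward / self.noise_var) / precision
        self.vars[action] = 1.0 / precision
```

With action 0 read as "no prompt" and action 1 as "send prompt", every participant's observed reward would be passed to the same `update` call, so the policy can learn from limited per-individual data.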

Optimizing an Adaptive Digital Oral Health Intervention for Promoting Oral Self-Care Behaviors: Micro-Randomized Trial Protocol

Authors:

Anna L. Trella, Kelly W. Zhang, Hinal Jajal, Inbal Nahum-Shani, Vivek Shetty, Finale Doshi-Velez, Susan Murphy

Publication Venue:

arXiv:2409.02069

Publication Date:

September 3, 2024

Keywords:

Reinforcement Learning, mHealth, Clinical Trials, Oral Health, Oral Self-Care Behaviors, Thompson Sampling, Contextual Bandits, Deployment, Replicability, Adaptive Interventions

View Full Paper

Related Projects:

CP 2

Abstract:

TL;DR:

Engaging Racial & Ethnic Minorities in Digital Oral Self-Care Interventions: A Formative Research into Messaging Strategies

Authors:

Stephanie Carpenter, Zara Greer, Rebecca Newman, Susan Murphy, Vivek Shetty, Inbal Nahum-Shani

Publication Venue:

JMIR Formative Research

Publication Date:

December 11, 2023

Keywords:

engagement, oral health, mobile health intervention, racial and ethnic minority group, message development

View Full Paper

Related Projects:

CP 2

Abstract:

Background: The prevention of oral health diseases is a key public health issue and a major challenge for racial and ethnic minority groups, who often face barriers in accessing dental care. Daily toothbrushing is an important self-care behavior necessary for sustaining good oral health, yet engagement in regular brushing remains a challenge. Identifying strategies to promote engagement in regular oral self-care behaviors among populations at risk of poor oral health is critical.

Objective: The formative research described here focused on creating messages for a digital oral self-care intervention targeting a racially and ethnically diverse population. Theoretically grounded strategies (reciprocity, reciprocity-by-proxy, and curiosity) were used to promote engagement in 3 aspects: oral self-care behaviors, an oral care smartphone app, and digital messages. A web-based participatory co-design approach was used to develop messages that are resource efficient, appealing, and novel; this approach involved dental experts, individuals from the general population, and individuals from the target population—dental patients from predominantly low-income racial and ethnic minority groups. Given that many individuals from racially and ethnically diverse populations face anonymity and confidentiality concerns when participating in research, we used an approach to message development that aimed to mitigate these concerns.

Methods: Messages were initially developed with feedback from dental experts and Amazon Mechanical Turk workers. Dental patients were then recruited for 2 facilitator-mediated group webinar sessions held over Zoom (Zoom Video Communications; session 1: n=13; session 2: n=7), in which they provided both quantitative ratings and qualitative feedback on the messages. Participants interacted with the facilitator through Zoom polls and a chat window that was anonymous to other participants. Participants did not directly interact with each other, and the facilitator mediated sessions by verbally asking for message feedback and sharing key suggestions with the group for additional feedback. This approach plausibly enhanced participant anonymity and confidentiality during the sessions.

Results: Participants rated messages highly in terms of liking (overall rating: mean 2.63, SD 0.58; reciprocity: mean 2.65, SD 0.52; reciprocity-by-proxy: mean 2.58, SD 0.53; curiosity involving interactive oral health questions and answers: mean 2.45, SD 0.69; curiosity involving tailored brushing feedback: mean 2.77, SD 0.48) on a scale ranging from 1 (do not like it) to 3 (like it). Qualitative feedback indicated that the participants preferred messages that were straightforward, enthusiastic, conversational, relatable, and authentic.

Conclusions: This formative research has the potential to guide the design of messages for future digital health behavioral interventions targeting individuals from diverse racial and ethnic populations. Insights emphasize the importance of identifying key stimuli and tasks that require engagement, gathering multiple perspectives during message development, and using new approaches for collecting both quantitative and qualitative data while mitigating anonymity and confidentiality concerns.

TL;DR:

The formative research described here focused on creating messages for a digital oral self-care intervention targeting a racially and ethnically diverse population. Theoretically grounded strategies (reciprocity, reciprocity-by-proxy, and curiosity) were used to promote engagement in 3 aspects: oral self-care behaviors, an oral care smartphone app, and digital messages.

TR&D3: Translation

mTeeth: Identifying Brushing Teeth Surfaces Using Wrist-Worn Inertial Sensors

Authors:

Sayma Akther, Nazir Saleheen, Mithun Saha, Vivek Shetty, Santosh Kumar

Publication Venue:

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies

Keywords:

mHealth, brushing detection, flossing detection, hand-to-mouth gestures

Publication Date:

June 2021

View Full Paper

Related Project:

CP 2

Abstract:

Ensuring that all the teeth surfaces are adequately covered during daily brushing can reduce the risk of several oral diseases. In this paper, we propose the mTeeth model to detect teeth surfaces being brushed with a manual toothbrush in the natural free-living environment using wrist-worn inertial sensors. To unambiguously label sensor data corresponding to different surfaces and capture all transitions that last only milliseconds, we present a lightweight method to detect the micro-event of brushing strokes that cleanly demarcates transitions among brushing surfaces. Using features extracted from brushing strokes, we propose a Bayesian Ensemble method that leverages the natural hierarchy among teeth surfaces and patterns of transition among them. For training and testing, we enrich a publicly-available wrist-worn inertial sensor dataset collected from the natural environment with time-synchronized precise labels of brushing surface timings and moments of transition. We annotate 10,230 instances of brushing on different surfaces from 114 episodes and evaluate the impact of wide between-person and within-person between-episode variability on machine learning model’s performance for brushing surface detection.

TL;DR:

In this paper, we propose the mTeeth model to detect teeth surfaces being brushed with a manual toothbrush in the natural free-living environment using wrist-worn inertial sensors.

CP2, a collaborative project involving industry (P&G, Delta Dental), will remotely monitor Oral Health Behaviors (OHBs) in the home setting and use the insights to develop a computationally driven, personalized Digital Oral Health Intervention (DOHI) and test its real-world efficacy in engaging at-risk individuals in ideal OHBs and improving their oral health. CP2 builds on the Remote Oral Behavior Assessment System project (ROBAS; 1R01DE025244; NIH/NIDCR; 7/1/15–5/31/20; PI: Shetty), which provides objective, individual-level, and ecologically valid data on oral hygiene behaviors.

In the preparatory UG3 phase, CP2 will engage end-users in the co-design of an oral self-care app and establish the usability and feasibility of the system. In the UH3 phase, CP2 will build and validate computational models for inferring the quality of OHBs and for tailoring the DOHI. Using a cohort of 130 subjects, CP2 will conduct a 10-week Micro-Randomized Trial (MRT) to optimize the adaptive tailoring of the engagement strategies for the DOHI. Finally, CP2 will test its hypothesis that the dynamic and personalized DOHI will be more effective than traditional, static, clinician-delivered OHI in improving oral health and adherence to 2x2x4 OHBs (brush 2 times a day, for 2 minutes each time, all 4 dental quadrants). The test will be a 6-month, pragmatic, randomized, controlled, parallel-group clinical trial of 260 subjects.
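The 2x2x4 target lends itself to a simple daily adherence check. The sketch below is one possible reading, in which a day is adherent if it contains at least two sessions that each last at least 2 minutes and cover all 4 quadrants. The `BrushingSession` type, its field names, and the "at least" interpretation are assumptions for illustration, not the project's scoring rule.

```python
from dataclasses import dataclass


@dataclass
class BrushingSession:
    duration_seconds: float  # total brushing time in this session
    quadrants_covered: int   # how many of the 4 dental quadrants were brushed


def meets_2x2x4(sessions_today):
    """Return True if the day's sessions satisfy the assumed 2x2x4 rule."""
    qualifying = [
        s for s in sessions_today
        if s.duration_seconds >= 120 and s.quadrants_covered == 4
    ]
    # Two or more full-length, full-coverage sessions count as adherent.
    return len(qualifying) >= 2
```

For example, a day with one 130-second, 4-quadrant session and one 90-second session would not count as adherent under this reading.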

CP2 can fundamentally alter oral healthcare, emphasizing prevention and oral health maintenance. Beyond advancing behavior change theories, linking dental disease to actual OHBs would enhance our understanding of dental disease determinants and establish which digital behavioral intervention may be most effective, when and for whom, providing a springboard to the practice of 21st century temporally-precise dentistry.

CP2 is developing digital biomarkers of brushing from consumer toothbrushes and wrist-worn wearables. To analyze the time series of OHB biomarkers from its studies, CP2 will leverage tools developed by TR&D1 Aim 1 to characterize the uncertainty in its biomarkers. In collaboration with TR&D1 Aim 2, CP2 will develop a dynamic risk indicator for dental disease, which will support future intervention development. Successful data collection over six months requires enough engagement to keep participants interested in the study. TR&D1 will work with CP2 to develop an engagement score, and TR&D2 will work with CP2 to identify strategies for improving engagement from the data collected, so that future studies of this kind can achieve greater compliance from their participants over the long term. Next, CP2 will utilize modeling tools from TR&D1 Aim 3 to model the dynamic relationships among the sociobehavioral risk factors captured by digital biomarkers of OHBs, eating, stress, activity, location, and mobility. Multiple iterations of model development, evaluation on study data, and refinement with domain experts will ensure that TR&D1 methods advance CP2 research on temporally-precise dentistry.

In push/pull collaboration with TR&D2, CP2 will develop DOHI decision rules based on data from the 10-week Micro-Randomized Trial (MRT). In the RL framework, the actions are engagement strategies and the near-term reward is daily engagement in the 2x2x4 OHBs. The MRT will also provide data for TR&D2 to develop useful simulations for early evaluations of the methods under Aims 1 and 2. TR&D2 will provide an RL personalization algorithm (based on work under Aims 1 and 2) that uses the above decision rules as a warm start and personalizes them as an individual experiences the DOHI. CP2 is interested in deploying the developed RL algorithm in real time to personalize decision rules for a small additional number of subjects in the 6-month trial; this deployment will provide data on real-time feasibility and acceptability from subjects as they experience the RL algorithm. The 6-month trial will also provide data to build a testbed simulation for use by TR&D2 to conduct early evaluation of the methods under Aim 3.
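The micro-randomization underlying the MRT can be sketched as follows: at every decision point, a participant is randomized to receive either no prompt or one of the engagement strategies. The number of decision points per day, the randomization probability, and the uniform choice among strategies are hypothetical values chosen for illustration; the trial protocol defines the actual design.

```python
import random


def micro_randomize(num_days, decision_points_per_day, strategies,
                    prompt_probability=0.5, seed=0):
    """Return (day, decision_point, strategy_or_None) for one participant.

    At each decision point a prompt is delivered with probability
    prompt_probability; if delivered, the strategy is drawn uniformly.
    """
    rng = random.Random(seed)
    schedule = []
    for day in range(1, num_days + 1):
        for point in range(1, decision_points_per_day + 1):
            if rng.random() < prompt_probability:
                strategy = rng.choice(strategies)
            else:
                strategy = None  # no prompt at this decision point
            schedule.append((day, point, strategy))
    return schedule
```

A 10-week trial with an assumed two decision points per day would be generated with `micro_randomize(70, 2, ["reciprocity", "reciprocity-by-proxy", "curiosity"])`, giving 140 randomizations per participant from which action effects on the near-term reward can be estimated.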

Currently, OHB biomarkers are implemented on a cloud platform, after data collection. TR&D3 Aim 1 will work with CP2 to design, develop, and validate micromarkers for real-time detection of OHBs on smartphones from wrist and brush sensors. These new biomarkers will then become easily usable by researchers deploying wrist-worn sensors and common consumer devices (e.g., Oral-B). In addition, integrating information across the sensors embedded in the toothbrush and wristband will provide unique insight into how the subjects hold and use their toothbrush. For this purpose, CP2 will test and deploy distributed fusion of information from two high-rate inertial sensors at high time resolution from Aim 2. The UG3 will provide rapid feedback on the usability and utility of TR&D1 sensors and methods, which will be evaluated in the UH3 phase of CP2.
