CP 1: Novel Use of mHealth Data to Identify States of Vulnerability and Receptivity to JITAIs
CP / Smoking Cessation / TR&D1 / TR&D2 / TR&D3
International Conference on Learning Representations (ICLR)
January 28, 2022
irregular sampling, uncertainty, imputation, interpolation, multivariate time series, missing data, variational autoencoder
Our goal is to model and represent uncertainty in mHealth biomarkers so that the multifaceted uncertainty arising during momentary decision making can be accounted for when selecting, adapting, and delivering temporally-precise mHealth interventions. In this period, we extended our previous deep learning approach, Multi-Time Attention Networks, to enable improved representation of output uncertainty. Our new approach preserves the idea of learned temporal similarity functions and adds heteroscedastic output uncertainty. The new framework is referred to as the Heteroscedastic Temporal Variational Autoencoder (HeTVAE) and models real-valued multivariate data.
Irregularly sampled time series commonly occur in several domains where they present a significant challenge to standard deep learning models. In this paper, we propose a new deep learning framework for probabilistic interpolation of irregularly sampled time series that we call the Heteroscedastic Temporal Variational Autoencoder (HeTVAE). HeTVAE includes a novel input layer to encode information about input observation sparsity, a temporal VAE architecture to propagate uncertainty due to input sparsity, and a heteroscedastic output layer to enable variable uncertainty in output interpolations. Our results show that the proposed architecture is better able to reflect variable uncertainty through time due to sparse and irregular sampling than a range of baseline and traditional models, as well as recently proposed deep latent variable models that use homoscedastic output layers.
We present a new deep learning architecture for probabilistic interpolation of irregularly sampled time series.
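The contrast between heteroscedastic and homoscedastic output layers described above can be illustrated with a minimal sketch of the Gaussian negative log-likelihood. This is not the HeTVAE implementation; the function names, the example values, and the use of a per-point log-variance are assumptions made only for illustration.

```python
import math

def gaussian_nll(y, mean, log_var):
    """Negative log-likelihood of y under N(mean, exp(log_var))."""
    var = math.exp(log_var)
    return 0.5 * (math.log(2.0 * math.pi) + log_var + (y - mean) ** 2 / var)

def heteroscedastic_nll(targets, means, log_vars):
    """Average NLL when every time point has its own predicted variance."""
    terms = [gaussian_nll(y, m, lv) for y, m, lv in zip(targets, means, log_vars)]
    return sum(terms) / len(terms)

def homoscedastic_nll(targets, means, log_var):
    """Average NLL when a single shared variance is used for all time points."""
    return heteroscedastic_nll(targets, means, [log_var] * len(targets))

# Hypothetical interpolation outputs: the last two points sit in a sparsely
# observed region, so the prediction errors there are larger.
targets = [1.0, 1.1, 0.9, 2.0, -0.5]
means = [1.0, 1.0, 1.0, 1.0, 1.0]
per_point_log_vars = [-3.0, -3.0, -3.0, 0.5, 0.5]  # wider where data are sparse
shared_log_var = -3.0                              # one variance for everything

het = heteroscedastic_nll(targets, means, per_point_log_vars)
hom = homoscedastic_nll(targets, means, shared_log_var)
print(f"heteroscedastic NLL: {het:.3f}, homoscedastic NLL: {hom:.3f}")
```

In this toy setting the per-point variances can widen where the errors are large, which lowers the average negative log-likelihood relative to a single shared variance.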
IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE)
September 12, 2022
Bayesian inference, probabilistic programming, time series, missing data, Bayesian imputation, mobile health
We have developed a toolbox for the specification and estimation of mechanistic models in the dynamic Bayesian network family. This toolbox focuses on making it easier to specify probabilistic dynamical models for time series data and to perform Bayesian inference and imputation in the specified model given incomplete data as input. The toolbox is referred to as BayesLDM. We have been working with members of CP3, CP4, and TR&D2 to develop offline data analysis and simulation models using this toolbox. We are also currently in discussions with members of CP4 to deploy the toolbox’s Bayesian imputation methods within a live controller optimization trial in the context of an adaptive walking intervention.
In this paper we present BayesLDM, a system for Bayesian longitudinal data modeling consisting of a high-level modeling language with specific features for modeling complex multivariate time series data coupled with a compiler that can produce optimized probabilistic program code for performing inference in the specified model. BayesLDM supports modeling of Bayesian network models with a specific focus on the efficient, declarative specification of dynamic Bayesian Networks (DBNs). The BayesLDM compiler combines a model specification with inspection of available data and outputs code for performing Bayesian inference for unknown model parameters while simultaneously handling missing data. These capabilities have the potential to significantly accelerate iterative modeling workflows in domains that involve the analysis of complex longitudinal data by abstracting away the process of producing computationally efficient probabilistic inference code. We describe the BayesLDM system components, evaluate the efficiency of representation and inference optimizations and provide an illustrative example of the application of the system to analyzing heterogeneous and partially observed mobile health data.
We present a toolbox for the specification and estimation of mechanistic models in the dynamic Bayesian network family.
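To make the idea of model-based imputation in a dynamic Bayesian network concrete, here is a minimal sketch for the simplest case, a Gaussian AR(1) chain with one missing value between two observed neighbours. It does not use the BayesLDM language or compiler, and unlike full Bayesian inference it treats the model parameters as known. The function names and parameter values are assumptions for illustration only.

```python
def impute_ar1_gap(prev_value, next_value, coeff, noise_var):
    """Posterior mean and variance of a single missing x_t in the model
    x_t = coeff * x_{t-1} + noise, noise ~ N(0, noise_var),
    conditioned on the observed neighbours x_{t-1} and x_{t+1}."""
    # Multiplying the two Gaussian factors that involve x_t gives
    # precision (1 + coeff^2) / noise_var.
    precision = (1.0 + coeff ** 2) / noise_var
    mean = coeff * (prev_value + next_value) / (1.0 + coeff ** 2)
    return mean, 1.0 / precision

def impute_series(series, coeff, noise_var):
    """Fill isolated None entries that have observed values on both sides."""
    filled = list(series)
    for t in range(1, len(series) - 1):
        if series[t] is None and series[t - 1] is not None and series[t + 1] is not None:
            filled[t], _ = impute_ar1_gap(series[t - 1], series[t + 1], coeff, noise_var)
    return filled

# Hypothetical partially observed series with assumed, fixed parameters.
series = [1.0, None, 1.0, 0.8, None, 0.4]
print(impute_series(series, coeff=0.9, noise_var=0.25))
```

The posterior variance returned by `impute_ar1_gap` is what distinguishes this from a plain point-estimate fill: it quantifies how uncertain the imputed value remains given its neighbours.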
Neural Information Processing Systems (NeurIPS), Track on Datasets and Benchmarks
September 16, 2022
missingness, imputation, mHealth, sensors, time-series, self-attention, pulsative, physiological, dataset
We developed a state-of-the-art attention-based deep learning transformer architecture that can learn to leverage the quasi-periodic signal structure to perform accurate imputation in the face of substantial amounts of missingness, such as the absence of multiple beats. We have validated that this novel transformer-based imputation method outperforms existing standard imputation baselines.
The promise of Mobile Health (mHealth) is the ability to use wearable sensors to monitor participant physiology at high frequencies during daily life to enable temporally-precise health interventions. However, a major challenge is frequent missing data. Despite a rich imputation literature, existing techniques are ineffective for the pulsative signals which comprise many mHealth applications, and a lack of available datasets has stymied progress. We address this gap with PulseImpute, the first large-scale pulsative signal imputation challenge which includes realistic mHealth missingness models, an extensive set of baselines, and clinically-relevant downstream tasks. Our baseline models include a novel transformer-based architecture designed to exploit the structure of pulsative signals. We hope that PulseImpute will enable the ML community to tackle this important and challenging task.
PulseImpute is the first mHealth pulsative signal imputation challenge which includes realistic missingness models, clinical downstream tasks, and an extensive set of baselines, including an augmented transformer that achieves SOTA performance.
Computer Vision – ECCV 2022 Workshops: Tel Aviv, Israel, Proceedings, Part VIII. Pages 78-93
February 12, 2023
transformer, uncertainty quantification, mixer, query-based object detection, deep ensembles
Recently, a new paradigm of query-based object detection has gained popularity. In this paper, we study the problem of quantifying the uncertainty in the predictions of these models that derives from model uncertainty. Such uncertainty quantification is vital for many high-stakes applications that need to avoid making overconfident errors. We focus on quantifying multiple aspects of detection uncertainty based on a deep ensembles representation. We perform extensive experiments on two representative models in this space: DETR and AdaMixer. We show that deep ensembles of these query-based detectors result in improved performance with respect to three types of uncertainty: location uncertainty, class uncertainty, and objectness uncertainty.
This paper explores uncertainty in query-based object detection models, crucial for high-stakes applications to prevent overconfident errors. The authors concentrate on quantifying uncertainty in detection using deep ensembles, conducting experiments on DETR and AdaMixer models. They show that deep ensembles enhance performance in location, class, and objectness uncertainties.
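As a rough illustration of how a deep ensemble can expose class uncertainty, the sketch below averages the softmax outputs of several ensemble members for one detection and splits the predictive entropy into an average per-member part and a disagreement part. It covers class uncertainty only, not location or objectness uncertainty, and it is not the paper's code. The function names and the example logits are assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats, skipping zero-probability classes."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def ensemble_class_uncertainty(member_logits):
    """Return (mean class probabilities, total entropy, member disagreement)."""
    member_probs = [softmax(logits) for logits in member_logits]
    n = len(member_probs)
    mean_probs = [sum(column) / n for column in zip(*member_probs)]
    total = entropy(mean_probs)
    expected = sum(entropy(p) for p in member_probs) / n
    # The gap between the two entropies grows when members disagree.
    return mean_probs, total, total - expected

# Hypothetical class logits from three ensemble members for one detection.
logits = [[4.0, 0.5, 0.1], [0.3, 3.8, 0.2], [3.5, 0.4, 0.3]]
mean_probs, total, disagreement = ensemble_class_uncertainty(logits)
print(mean_probs, total, disagreement)
```

A single model can only report the entropy of its own prediction; the disagreement term is available only because several independently trained members are compared.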
Conference on Uncertainty in Artificial Intelligence (UAI 2023)
May 17, 2023
reinforcement learning, partial observability, context inference, adaptive interventions, empirical evaluation, mobile health
Just-in-Time Adaptive Interventions (JITAIs) are a class of personalized health interventions developed within the behavioral science community. JITAIs aim to provide the right type and amount of support by iteratively selecting a sequence of intervention options from a pre-defined set of components in response to each individual’s time varying state. In this work, we explore the application of reinforcement learning methods to the problem of learning intervention option selection policies. We study the effect of context inference error and partial observability on the ability to learn effective policies. Our results show that the propagation of uncertainty from context inferences is critical to improving intervention efficacy as context uncertainty increases, while policy gradient algorithms can provide remarkable robustness to partially observed behavioral state information.
This work focuses on JITAIs, personalized health interventions that dynamically select support components based on an individual’s changing state. The study applies reinforcement learning methods to learn policies for selecting intervention options, revealing that uncertainty from context inferences is crucial for enhancing intervention efficacy as context uncertainty increases.
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT)
September 7, 2022
behavioral intervention, human-centered computing, risk prediction, smoking cessation, ubiquitous and mobile computing design and evaluation methods, wearable sensors
Passive detection of risk factors (that may influence unhealthy or adverse behaviors) via wearable and mobile sensors has created new opportunities to improve the effectiveness of behavioral interventions. A key goal is to find opportune moments for intervention by passively detecting rising risk of an imminent adverse behavior. This has been difficult, however, because of substantial noise in the data collected by sensors in the natural environment and a lack of reliable label assignment of low- and high-risk states to the continuous stream of sensor data. In this paper, we propose an event-based encoding of sensor data to reduce the effect of noise and then present an approach to efficiently model the historical influence of recent and past sensor-derived contexts on the likelihood of an adverse behavior. Next, to circumvent the lack of any confirmed negative labels (i.e., time periods with no high-risk moment), and only a few positive labels (i.e., detected adverse behavior), we propose a new loss function. We use 1,012 days of sensor and self-report data collected from 92 participants in a smoking cessation field study to train deep learning models to produce a continuous risk estimate for the likelihood of an impending smoking lapse. The risk dynamics produced by the model show that risk peaks an average of 44 minutes before a lapse. Simulations on field study data show that using our model can create intervention opportunities for 85% of lapses with 5.5 interventions per day.
We present a model for identifying ideal moments for intervention by passively detecting risk of an imminent adverse behavior.
Neural Information Processing Systems (NeurIPS)
October 31, 2022
attention, continuous attention, kernel methods
One technical challenge in modeling missingness in biomarker streams is the need to develop flexible attention mechanisms that can learn to focus on the relevant aspects of an input signal. We have completed the development of a novel continuous-time attention model which is capable of learning multimodal densities, meaning that the attention density can be focused on multiple signal regions simultaneously. Classical solutions like Gaussian mixtures have dense support, with the result that all regions of a signal have some probability mass, making it difficult to focus the attention on key regions and ignore irrelevant ones. Our work introduces kernel deformed exponential families, a sparse class of multimodal attention densities.
We theoretically analyzed the normalization, approximation, and numerical integration properties of this density class. We applied these densities in analyzing real-world time series data and showed that the densities often capture the most salient aspects of an input signal, and outperform baseline density models on a diverse set of tasks.
Attention mechanisms take an expectation of a data representation with respect to probability weights. Recently, Martins et al. (2020, 2021) proposed continuous attention mechanisms, focusing on unimodal attention densities from the exponential and deformed exponential families: the latter has sparse support. Farinhas et al. (2021) extended this to multimodality via Gaussian mixture attention densities. In this paper, we extend this to kernel exponential families (Canu and Smola 2006) and our new sparse counterpart, kernel deformed exponential families. Theoretically, we show new existence results for both kernel exponential and deformed exponential families, and that the deformed case has similar approximation capabilities to kernel exponential families. Lacking closed form expressions for the context vector, we use numerical integration: we show exponential convergence for both kernel exponential and deformed exponential families. Experiments show that kernel continuous attention often outperforms unimodal continuous attention, and the sparse variant tends to highlight peaks of time series.
We extend continuous attention from unimodal (deformed) exponential families and Gaussian mixture models to kernel exponential families and a new kernel deformed sparse counterpart.
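The idea of a sparse, multimodal attention density whose context value is obtained by numerical integration can be sketched as follows. This is an illustrative stand-in, not the paper's parameterization of kernel deformed exponential families: the density here is a thresholded sum of Gaussian kernels normalized by the trapezoid rule, and all function names, centres, bandwidths, and the threshold are assumptions.

```python
import math

def trapezoid(values, grid):
    """Trapezoid-rule integral of sampled values over a grid."""
    return sum(0.5 * (values[i] + values[i + 1]) * (grid[i + 1] - grid[i])
               for i in range(len(grid) - 1))

def rbf(t, centre, bandwidth):
    """Gaussian (RBF) kernel between a time point and a centre."""
    return math.exp(-0.5 * ((t - centre) / bandwidth) ** 2)

def sparse_multimodal_density(grid, centres, weights, bandwidth, threshold):
    """Density proportional to max(0, kernel score - threshold).
    Thresholding gives exactly zero mass outside the selected regions.
    Assumes the threshold leaves some positive mass on the grid."""
    scores = [sum(w * rbf(t, c, bandwidth) for c, w in zip(centres, weights))
              for t in grid]
    unnorm = [max(0.0, s - threshold) for s in scores]
    z = trapezoid(unnorm, grid)
    return [u / z for u in unnorm]

def context(density, signal, grid):
    """Attention context: the expectation of the signal under the density."""
    return trapezoid([p * v for p, v in zip(density, signal)], grid)

grid = [i / 1000 for i in range(1001)]
signal = [math.sin(2.0 * math.pi * t) for t in grid]  # hypothetical value function
density = sparse_multimodal_density(grid, centres=[0.25, 0.75], weights=[1.0, 1.0],
                                    bandwidth=0.05, threshold=0.5)
print(context(density, signal, grid))
```

Unlike a Gaussian mixture, which places some probability mass everywhere, the thresholded density is exactly zero between the two modes, so regions of the signal outside the modes contribute nothing to the context value.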
Sy-Miin Chow, Inbal Nahum-Shani, Justin Baker, Donna Spruijt-Metz, Nicholas Allen, Randy Auerbach, Genevieve Dunton, Naomi Friedman, Stephen Intille, Predrag Klasnja, Benjamin Marlin, Matthew Nock, Scott Rauch, Misha Pavel, Scott Vrieze, David Wetter, Evan Kleiman, Timothy Brick, Heather Perry, Dana Wolff-Hughes
Translational Behavioral Medicine, Volume 13, Issue 1, January 2023, Pages 7–16
November 23, 2022
EMA, health behavior changes, ILHBN, location, sensor
The ILHBN is funded by the National Institutes of Health to collaboratively study the interactive dynamics of behavior, health, and the environment using Intensive Longitudinal Data (ILD) to (a) understand and intervene on behavior and health and (b) develop new analytic methods to innovate behavioral theories and interventions. The heterogeneous study designs, populations, and measurement protocols adopted by the seven studies within the ILHBN created practical challenges, but also unprecedented opportunities to capitalize on data harmonization to provide comparable views of data from different studies, enhance the quality and utility of expensive and hard-won ILD, and amplify scientific yield. The purpose of this article is to provide a brief report of the challenges, opportunities, and solutions from some of the ILHBN’s cross-study data harmonization efforts. We review the process through which harmonization challenges and opportunities motivated the development of tools and collection of metadata within the ILHBN. A variety of strategies have been adopted within the ILHBN to facilitate harmonization of ecological momentary assessment, location, accelerometer, and participant engagement data while preserving theory-driven heterogeneity and data privacy considerations. Several tools have been developed by the ILHBN to resolve challenges in integrating ILD across multiple data streams and time scales both within and across studies. Harmonization of distinct longitudinal measures, measurement tools, and sampling rates across studies is challenging, but also opens up new opportunities to address cross-cutting scientific themes of interest.
The article shares insights, challenges, opportunities, and solutions from harmonizing intensive longitudinal data within the ILHBN, providing tools and recommendations for future data harmonization efforts.
Addictive Behaviors, Volume 136, p.107467
January 2023
just-in-time adaptive intervention, micro-randomized trial, mindfulness, smoking cessation, mHealth
Smoking cessation treatments that are easily accessible and deliver intervention content at vulnerable moments (e.g., high negative affect) have great potential to impact tobacco abstinence. The current study examined the feasibility and acceptability of a multi-component Just-In-Time Adaptive Intervention (JITAI) for smoking cessation. Daily smokers interested in quitting were consented to participate in a 6-week cessation study. Visit 1 occurred 4 days pre-quit, Visit 2 was on the quit day, Visit 3 occurred 3 days post-quit, Visit 4 was 10 days post-quit, and Visit 5 was 28 days post-quit. During the first 2 weeks (Visits 1-4), the JITAI delivered brief mindfulness/motivational strategies via smartphone in real-time based on negative affect or smoking behavior detected by wearable sensors. Participants also attended 5 in-person visits, where brief cessation counseling (Visits 1-4) and nicotine replacement therapy (Visits 2-5) were provided. Outcomes were feasibility and acceptability; biochemically-confirmed abstinence was also measured. Participants (N = 43) were 58.1 % female (mean age = 49.1, mean cigarettes per day = 15.4). Retention through follow-up was high (83.7 %). For participants with available data (n = 38), 24 (63 %) met the benchmark for sensor wearing, among whom 16 (67 %) completed at least 60 % of strategies. Perceived ease of wearing sensors (Mean = 5.1 out of 6) and treatment satisfaction (Mean = 3.6 out of 4) were high. Biochemically-confirmed abstinence was 34 % at Visit 4 and 21 % at Visit 5. Overall, the feasibility of this novel multi-component intervention for smoking cessation was mixed but acceptability was high. Future studies with improved technology will decrease participant burden and better detect key intervention moments.
The study assessed the feasibility and acceptability of a multi-component Just-In-Time Adaptive Intervention (JITAI) for smoking cessation, utilizing smartphone-delivered mindfulness/motivational strategies based on real-time negative affect or smoking behavior detected by wearable sensors. Participants showed high retention (83.7%) and reported high satisfaction with the intervention, but the feasibility was mixed.
ACM CHI 2024 – Under Review
Under Review
momentary stress, stressors, reflective visualizations, stressor logging, stress management, wearables
Our goal in Aim 3 is to understand the dynamic relationships between personalized drivers of momentary risk and disease progression to identify targets of temporally precise interventions. This year, we completed the MOODS study with 122 participants who wore a study-provided Fossil Sport smartwatch with our MOODS app, installed our MOODS app on their personal smartphones, and used both apps for 100 days. They rated their stress 3-4 times daily and described the stressor for events they rated as stressful. They received new visualizations of their data each week. We analyzed the impact of the study on self-reported stress ratings and the diversity in stressors reported by the participants.
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
mobile health, context, smoking cessation, intervention, GPS traces
March 2020
Context plays a key role in impulsive adverse behaviors such as fights, suicide attempts, binge-drinking, and smoking lapse. Several contexts dissuade such behaviors, but some may trigger adverse impulsive behaviors. We define these latter contexts as ‘opportunity’ contexts, as their passive detection from sensors can be used to deliver context-sensitive interventions. In this paper, we define the general concept of ‘opportunity’ contexts and apply it to the case of smoking cessation. We operationalize the smoking ‘opportunity’ context, using self-reported smoking allowance and cigarette availability. We show its clinical utility by establishing its association with smoking occurrences using Granger causality. Next, we mine several informative features from GPS traces, including the novel location context of smoking spots, to develop the SmokingOpp model for automatically detecting the smoking ‘opportunity’ context. Finally, we train and evaluate the SmokingOpp model using 15 million GPS points and 3,432 self-reports from 90 newly abstinent smokers in a smoking cessation study.
In this paper, we define the general concept of ‘opportunity’ contexts and apply it to the case of smoking cessation. We mine several informative features from GPS traces, including the novel location context of smoking spots, to develop the SmokingOpp model for automatically detecting the smoking ‘opportunity’ context.
Advances in Neural Information Processing Systems
irregular sampling, multivariate time series, missing data, discretization, interpolation, recurrence, attention
January 5, 2021
In this survey, we first describe several axes along which approaches to learning from irregularly sampled time series differ including what data representations they are based on, what modeling primitives they leverage to deal with the fundamental problem of irregular sampling, and what inference tasks they are designed to perform. We then survey the recent literature organized primarily along the axis of modeling primitives.
Heteroscedastic Temporal Variational Autoencoder For Irregularly Sampled Time Series
https://github.com/reml-lab/hetvae
12 forks.
30 stars.
3 open issues.
International Conference on Learning Representations (ICLR)
January 28, 2022
Jupyter Notebook
Python
BayesLDM
https://github.com/reml-lab/BayesLDM
1 fork.
0 stars.
0 open issues.
IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE)
September 12, 2022
Code for PulseImpute Challenge
https://github.com/rehg-lab/pulseimpute
1 fork.
26 stars.
0 open issues.
Neural Information Processing Systems (NeurIPS)
September 16, 2022
Code for ECCV 2022 UnCV Workshop paper
https://github.com/colinski/uq-query-object-detectors
0 forks.
0 stars.
0 open issues.
Computer Vision – ECCV 2022 Workshops: Tel Aviv, Israel, Proceedings, Part VIII. Pages 78-93
February 12, 2023
Markdown Documentation
RL for JITAI optimization using simulated environments.
https://github.com/reml-lab/rl_jitai_simulation
0 forks.
0 stars.
0 open issues.
Conference on Uncertainty in Artificial Intelligence (UAI 2023)
May 17, 2023
Predicting Smoking Lapse Risk from Mobile Sensor Datastreams
https://github.com/aungkonazim/mrisk
0 forks.
0 stars.
0 open issues.
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT)
September 7, 2022
Jupyter Notebook
Python
Code for Kernel Multimodal Continuous Attention: to be released by Nov 20.
https://github.com/onenoc/kernel-continuous-attention
1 fork.
3 stars.
0 open issues.
Neural Information Processing Systems (NeurIPS)
October 31, 2022
Jupyter Notebook
Python
International Conference on Learning Representations (ICLR)
January 28, 2022
The novel PulseImpute dataset is the first large-scale dataset containing complex imputation tasks for pulsative biophysical signals. State-of-the-art imputation methods from the time series literature are shown to exhibit poor performance on PulseImpute, demonstrating that the missingness patterns emerging in mHealth applications represent a unique and important class of imputation problems. By releasing this dataset and a new state-of-the-art baseline algorithm, we hope to spur the ML community to begin addressing these challenging problems.
Neural Information Processing Systems (NeurIPS)
November 28, 2022
The past decade has seen tremendous advances in the ability to compute a diverse array of mobile sensor-based biomarkers in order to passively estimate health states, activities, and associated contexts (e.g. physical activity, sleep, smoking, mood, craving, stress, and geospatial context). Researchers are now engaged in the conduct of both observational and interventional field studies of increasing complexity and length that leverage mHealth sensor and biomarker technologies combined with the collection of measures of disease progression and other outcomes.
As a result of the expansion of the set of available mHealth biomarkers and the push toward long-term, real-world deployment of mHealth technologies, a new set of critical gaps has emerged that were previously obscured by the focus of the field on smaller-scale proof-of-concept studies and the investigation of single biomarkers in isolation.
First, the issue of missing sensor and biomarker data in mHealth field studies has quickly become a critical problem that directly and significantly impacts many of our CPs. Issues including intermittent wireless dropouts, wearables and smartphones running out of battery power, participants forgetting to carry or wear devices, and participants exercising privacy controls can all contribute to complex patterns of missing data that significantly complicate data analysis and limit the effectiveness of sensor-informed mHealth interventions.
Second, with increasing interest in the use of reinforcement learning methods to provide online adaptation of interventions for every individual, there is an urgent need for high-quality, compact and interpretable feature representations that can enable more effective learning under strict budgets on the number of interactions with patients.
Finally, as in other areas that are leveraging machine learning methods to drive scientific discovery and support decision making, mHealth needs methods that can be used to derive high-level knowledge and support causal hypothesis generation based on complex, non-linear models fit to biomarker time series data.
Assistant Professor
Applied Scientist
Principal Algorithms Engineer
Senior Research Scientist
Applied Scientist II
Applied Scientist
Machine Learning Scientist
Deputy Center Director, TR&D1 Lead
Lead PI, Center Director, TR&D1, TR&D2, TR&D3
Co-Investigator, TR&D1, TR&D2
Doctoral Student
Doctoral Student
Doctoral Student
TR&D1’s technologies are making a significant impact by advancing the fundamental understanding of health and behavior by supporting the analysis of complex, longitudinal, mHealth data.