The mDOT Center

Transforming health and wellness via temporally-precise mHealth interventions
mDOT@MD2K.org
901.678.1526
 

TR&D1: Discovery


Enabling the Discovery of Temporally-Precise Intervention Targets and Timing Triggers from mHealth Biomarkers via Uncertainty-Aware Modeling of Personalized Risk Dynamics

Heteroscedastic Temporal Variational Autoencoder for Irregularly Sampled Time Series
Authors:
Publication Venue:

International Conference on Learning Representations (ICLR)

Publication Date:

January 28, 2022

Keywords:

irregular sampling, uncertainty, imputation, interpolation, multivariate time series, missing data, variational autoencoder

Related Project:

Momentary decision making in selecting, adapting, and delivering temporally-precise mHealth interventions requires modeling and representing the multifaceted uncertainty in mHealth biomarkers. In this period, we extended our previous deep learning approach, Multi-Time Attention Networks, to enable improved representation of output uncertainty. Our new approach preserves the idea of learned temporal similarity functions and adds heteroscedastic output uncertainty. The new framework is referred to as the Heteroscedastic Temporal Variational Autoencoder (HeTVAE) and models real-valued multivariate data.

Abstract:

Irregularly sampled time series commonly occur in several domains where they present a significant challenge to standard deep learning models. In this paper, we propose a new deep learning framework for probabilistic interpolation of irregularly sampled time series that we call the Heteroscedastic Temporal Variational Autoencoder (HeTVAE). HeTVAE includes a novel input layer to encode information about input observation sparsity, a temporal VAE architecture to propagate uncertainty due to input sparsity, and a heteroscedastic output layer to enable variable uncertainty in output interpolations. Our results show that the proposed architecture is better able to reflect variable uncertainty through time due to sparse and irregular sampling than a range of baseline and traditional models, as well as recently proposed deep latent variable models that use homoscedastic output layers.

TL;DR:

We present a new deep learning architecture for probabilistic interpolation of irregularly sampled time series.
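The difference between a homoscedastic and a heteroscedastic output layer can be illustrated with a small numerical sketch. This is not HeTVAE code: the targets, predicted means, and variances below are hypothetical values chosen for illustration, and the only thing shown is how a Gaussian negative log-likelihood rewards a model that widens its predicted variance where its interpolation is less reliable.

```python
import math


def gaussian_nll(y, mean, var):
    """Negative log-likelihood of y under a Gaussian N(mean, var)."""
    return 0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)


# Hypothetical interpolation targets and predicted means at three query times.
# The third prediction is far off, as might happen in a sparsely observed region.
targets = [0.10, 0.50, 0.90]
means = [0.12, 0.45, 0.40]

# Homoscedastic output: one fixed variance shared by every query time.
homo_var = 0.05
homo_nll = sum(gaussian_nll(y, m, homo_var) for y, m in zip(targets, means))

# Heteroscedastic output: a separate variance per query time,
# small where the prediction is reliable and large where it is not.
hetero_vars = [0.01, 0.01, 0.30]
hetero_nll = sum(
    gaussian_nll(y, m, v) for y, m, v in zip(targets, means, hetero_vars)
)

print(f"homoscedastic NLL:   {homo_nll:.3f}")
print(f"heteroscedastic NLL: {hetero_nll:.3f}")
```

With these illustrative numbers the per-time variances give a lower total negative log-likelihood, because the model is confident only where its predictions are accurate.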

BayesLDM: A Domain-Specific Language for Probabilistic Modeling of Longitudinal Data
Authors:
Publication Venue:

IEEE/ACM international conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE)

Publication Date:

September 12, 2022

Keywords:

Bayesian inference, probabilistic programming, time series, missing data, Bayesian imputation, mobile health

Related Projects:

We have developed a toolbox for the specification and estimation of mechanistic models in the dynamic Bayesian network family. This toolbox focuses on making it easier to specify probabilistic dynamical models for time series data and to perform Bayesian inference and imputation in the specified model given incomplete data as input. The toolbox is referred to as BayesLDM. We have been working with members of CP3, CP4, and TR&D2 to develop offline data analysis and simulation models using this toolbox. We are also currently in discussions with members of CP4 to deploy the toolbox’s Bayesian imputation methods within a live controller optimization trial in the context of an adaptive walking intervention.

Abstract:

In this paper we present BayesLDM, a system for Bayesian longitudinal data modeling consisting of a high-level modeling language with specific features for modeling complex multivariate time series data coupled with a compiler that can produce optimized probabilistic program code for performing inference in the specified model. BayesLDM supports modeling of Bayesian network models with a specific focus on the efficient, declarative specification of dynamic Bayesian Networks (DBNs). The BayesLDM compiler combines a model specification with inspection of available data and outputs code for performing Bayesian inference for unknown model parameters while simultaneously handling missing data. These capabilities have the potential to significantly accelerate iterative modeling workflows in domains that involve the analysis of complex longitudinal data by abstracting away the process of producing computationally efficient probabilistic inference code. We describe the BayesLDM system components, evaluate the efficiency of representation and inference optimizations and provide an illustrative example of the application of the system to analyzing heterogeneous and partially observed mobile health data.

TL;DR:

We present a toolbox for the specification and estimation of mechanistic models in the dynamic Bayesian network family.
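To make the idea of Bayesian imputation in a dynamic Bayesian network concrete, the following sketch works through the simplest possible case by hand. It does not use BayesLDM syntax or its API: the first-order autoregressive model, the function names, and the parameter values are all assumptions made for illustration. For a single missing value with observed neighbours on both sides, the posterior is Gaussian and available in closed form.

```python
def impute_ar1_gap(prev_value, next_value, a, noise_var):
    """Posterior mean and variance of a missing x_t given both neighbours.

    Assumed model for this sketch: x_t = a * x_{t-1} + eps, eps ~ N(0, noise_var).
    Combining p(x_t | x_{t-1}) and p(x_{t+1} | x_t) gives a Gaussian with
    precision (1 + a^2) / noise_var and mean a * (x_{t-1} + x_{t+1}) / (1 + a^2).
    """
    precision = (1.0 + a * a) / noise_var
    mean = a * (prev_value + next_value) / (1.0 + a * a)
    return mean, 1.0 / precision


def impute_series(series, a, noise_var):
    """Fill isolated gaps (None) that have observed values on both sides."""
    filled = list(series)
    for t in range(1, len(series) - 1):
        has_neighbours = series[t - 1] is not None and series[t + 1] is not None
        if series[t] is None and has_neighbours:
            filled[t], _ = impute_ar1_gap(
                series[t - 1], series[t + 1], a, noise_var
            )
    return filled


# Hypothetical daily measurements with two isolated gaps.
observed = [1.0, None, 3.0, 2.0, None, 4.0]
print(impute_series(observed, a=1.0, noise_var=1.0))
```

Longer runs of missing values, unknown parameters, and multivariate models no longer have such a simple closed form, which is the situation where generated probabilistic inference code becomes useful.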

PulseImpute: A Novel Benchmark Task for Pulsative Physiological Signal Imputation
Authors:
Publication Venue:

Neural Information Processing Systems (NeurIPS), Track on Datasets and Benchmarks

Publication Date:

September 16, 2022

Keywords:

missingness, imputation, mHealth, sensors, time-series, self-attention, pulsative, physiological, dataset

We developed a state-of-the-art attention-based deep learning transformer architecture that can learn to leverage the quasi-periodic signal structure to perform accurate imputation in the face of substantial amounts of missingness, such as the absence of multiple beats.  We have validated that this novel transformer-based imputation method outperforms existing standard imputation baselines.

Abstract:

The promise of Mobile Health (mHealth) is the ability to use wearable sensors to monitor participant physiology at high frequencies during daily life to enable temporally-precise health interventions. However, a major challenge is frequent missing data. Despite a rich imputation literature, existing techniques are ineffective for the pulsative signals which comprise many mHealth applications, and a lack of available datasets has stymied progress. We address this gap with PulseImpute, the first large-scale pulsative signal imputation challenge which includes realistic mHealth missingness models, an extensive set of baselines, and clinically-relevant downstream tasks. Our baseline models include a novel transformer-based architecture designed to exploit the structure of pulsative signals. We hope that PulseImpute will enable the ML community to tackle this important and challenging task.

TL;DR:

PulseImpute is the first mHealth pulsative signal imputation challenge which includes realistic missingness models, clinical downstream tasks, and an extensive set of baselines, including an augmented transformer that achieves SOTA performance.

Uncertainty Quantification Using Query-Based Object Detectors
Authors:
Publication Venue:

Computer Vision – ECCV 2022 Workshops: Tel Aviv, Israel, Proceedings, Part VIII. Pages 78-93

Publication Date:

February 12, 2023

Keywords:

transformer, uncertainty quantification, mixer, query-based object detection, deep ensembles

Abstract:

Recently, a new paradigm of query-based object detection has gained popularity. In this paper, we study the problem of quantifying the uncertainty in the predictions of these models that derive from model uncertainty. Such uncertainty quantification is vital for many high-stakes applications that need to avoid making overconfident errors. We focus on quantifying multiple aspects of detection uncertainty based on a deep ensembles representation. We perform extensive experiments on two representative models in this space: DETR and AdaMixer. We show that deep ensembles of these query-based detectors result in improved performance with respect to three types of uncertainty: location uncertainty, class uncertainty, and objectness uncertainty.

TL;DR:

This paper explores uncertainty in query-based object detection models, crucial for high-stakes applications to prevent overconfident errors. The authors concentrate on quantifying uncertainty in detection using deep ensembles, conducting experiments on DETR and AdaMixer models. They show that deep ensembles enhance performance in location, class, and objectness uncertainties.
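As a rough illustration of how a deep ensemble yields a class uncertainty estimate, the sketch below averages the softmax outputs of several ensemble members for one detection and reports the entropy of the averaged distribution. It covers only class uncertainty, not location or objectness uncertainty, and it is not the paper's pipeline: the function names and the logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_class_uncertainty(member_logits):
    """Average member class probabilities and return (mean_probs, entropy)."""
    probs = [softmax(logits) for logits in member_logits]
    n_members = len(probs)
    n_classes = len(probs[0])
    mean_probs = [
        sum(p[c] for p in probs) / n_members for c in range(n_classes)
    ]
    entropy = -sum(p * math.log(p) for p in mean_probs if p > 0.0)
    return mean_probs, entropy


# Hypothetical logits from three ensemble members for a single detection.
agreeing = [[4.0, 0.0, 0.0], [4.0, 0.0, 0.0], [4.0, 0.0, 0.0]]
disagreeing = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]

_, low_entropy = ensemble_class_uncertainty(agreeing)
_, high_entropy = ensemble_class_uncertainty(disagreeing)
print(f"members agree:    entropy = {low_entropy:.3f}")
print(f"members disagree: entropy = {high_entropy:.3f}")
```

When the members disagree, the averaged distribution flattens and its entropy rises, which is the signal that guards against a single overconfident prediction.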

Assessing the Impact of Context Inference Error & Partial Observability on RL Methods for Just-In-Time Adaptive Interventions
Authors:
Publication Venue:

Conference on Uncertainty in Artificial Intelligence (UAI 2023)

Publication Date:

May 17, 2023

Keywords:

reinforcement learning, partial observability, context inference, adaptive interventions, empirical evaluation, mobile health

Related Project:
Abstract:

Just-in-Time Adaptive Interventions (JITAIs) are a class of personalized health interventions developed within the behavioral science community. JITAIs aim to provide the right type and amount of support by iteratively selecting a sequence of intervention options from a pre-defined set of components in response to each individual’s time varying state. In this work, we explore the application of reinforcement learning methods to the problem of learning intervention option selection policies. We study the effect of context inference error and partial observability on the ability to learn effective policies. Our results show that the propagation of uncertainty from context inferences is critical to improving intervention efficacy as context uncertainty increases, while policy gradient algorithms can provide remarkable robustness to partially observed behavioral state information.

TL;DR:

This work focuses on JITAIs, personalized health interventions that dynamically select support components based on an individual’s changing state. The study applies reinforcement learning methods to learn policies for selecting intervention options, revealing that uncertainty from context inferences is crucial for enhancing intervention efficacy as context uncertainty increases.

mRisk: Continuous Risk Estimation for Smoking Lapse from Noisy Sensor Data with Incomplete and Positive-Only Labels
Authors:
Publication Venue:

ACM on Interactive, Mobile, Wearable, and Ubiquitous Technologies (IMWUT)

Publication Date:

September 7, 2022

Keywords:

behavioral intervention, human-centered computing, risk prediction, smoking cessation, ubiquitous and mobile computing design and evaluation methods, wearable sensors

Related Projects:
Estimation of the continuous risk state may be critical for delivering temporally-precise interventions and treatment adaptations in cessation programs. Continuous sensor data collected from wearables and smartphones to capture risk factors of adverse behaviors in the natural environment are usually noisy and incomplete. For adverse behavioral events such as a smoking lapse, capturing the precise timing of each lapse may not be feasible, as sensors may not be worn at the time of a lapse or the lapse events may not be accurately detected due to the imperfection of the machine learning models that are used to detect smoking events via hand-to-mouth gestures. Therefore, only a few positive events (i.e., smoking lapses in a cessation attempt) are available. Confirmed negative labels can be assigned to a block of sensor data corresponding to a prediction window only if the entire time period is confirmed to have no high-risk moment. As not all high-risk moments may result in a lapse, labeling a block of sensor data to the negative class is difficult for such events. We addressed each of these challenges in developing the mRisk model. Specifically, we encoded sensor data as events to handle noise and missingness, modeled the historical influence of recent psychological, behavioral, and environmental events via a deep learning model, and addressed the lack of negative labels and the availability of only a small subset of positive labels by using a positive-unlabeled framework with a novel loss function.
Abstract:

Passive detection of risk factors (that may influence unhealthy or adverse behaviors) via wearable and mobile sensors has created new opportunities to improve the effectiveness of behavioral interventions. A key goal is to find opportune moments for intervention by passively detecting rising risk of an imminent adverse behavior. But, it has been difficult due to substantial noise in the data collected by sensors in the natural environment and a lack of reliable label assignment of low- and high-risk states to the continuous stream of sensor data. In this paper, we propose an event-based encoding of sensor data to reduce the effect of noises and then present an approach to efficiently model the historical influence of recent and past sensor-derived contexts on the likelihood of an adverse behavior. Next, to circumvent the lack of any confirmed negative labels (i.e., time periods with no high-risk moment), and only a few positive labels (i.e., detected adverse behavior), we propose a new loss function. We use 1,012 days of sensor and self-report data collected from 92 participants in a smoking cessation field study to train deep learning models to produce a continuous risk estimate for the likelihood of an impending smoking lapse. The risk dynamics produced by the model show that risk peaks an average of 44 minutes before a lapse. Simulations on field study data show that using our model can create intervention opportunities for 85% of lapses with 5.5 interventions per day.

TL;DR:

We present a model for identifying ideal moments for intervention by passively detecting risk of an imminent adverse behavior.
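The positive-unlabeled setting described above can be sketched with a generic loss. The example below is not the novel loss function proposed for mRisk, which is not reproduced here. It substitutes a widely used non-negative PU risk estimator with a sigmoid loss, and the class prior, scores, and function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk with sigmoid loss (a stand-in, not the mRisk loss).

    pos_scores: model scores for windows known to precede a lapse.
    unl_scores: model scores for unlabeled windows (mix of high and low risk).
    prior: assumed fraction of unlabeled windows that are truly positive.
    """
    pos_as_pos = mean(sigmoid(-s) for s in pos_scores)  # loss if labeled +1
    pos_as_neg = mean(sigmoid(s) for s in pos_scores)   # loss if labeled -1
    unl_as_neg = mean(sigmoid(s) for s in unl_scores)   # loss if labeled -1
    # Estimated risk on true negatives, clipped at zero so it cannot go negative.
    neg_risk = max(0.0, unl_as_neg - prior * pos_as_neg)
    return prior * pos_as_pos + neg_risk


# Hypothetical scores: a sensible scorer and one with the sign flipped.
good = nn_pu_risk([3.0, 2.5], [-3.0, -2.0, 2.5, -2.5], prior=0.25)
bad = nn_pu_risk([-3.0, -2.5], [3.0, 2.0, -2.5, 2.5], prior=0.25)
print(f"sensible scorer risk: {good:.3f}")
print(f"inverted scorer risk: {bad:.3f}")
```

The key point is that no window is ever treated as a confirmed negative: the negative-class risk is estimated from unlabeled windows after subtracting the share attributable to positives.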

Kernel Multimodal Continuous Attention
Authors:
Publication Venue:

Neural Information Processing Systems (NeurIPS)

Publication Date:

October 31, 2022

Keywords:

attention, continuous attention, kernel methods

One technical challenge in modeling missingness in biomarker streams is the need to develop flexible attention mechanisms that can learn to focus on the relevant aspects of an input signal.  We have completed the development of a novel continuous-time attention model which is capable of learning multimodal densities, meaning that the attention density can be focused on multiple signal regions simultaneously.  Classical solutions like Gaussian mixtures have dense support, with the result that all regions of a signal have some probability mass, making it difficult to focus the attention on key regions and ignore irrelevant ones.  Our work introduces kernel deformed exponential families, a sparse class of multimodal attention densities.

We theoretically analyzed the normalization, approximation, and numerical integration properties of this density class. We applied these densities in analyzing real-world time series data and showed that the densities often capture the most salient aspects of an input signal, and outperform baseline density models on a diverse set of tasks.

Abstract:

Attention mechanisms take an expectation of a data representation with respect to probability weights. Recently, (Martins et al. 2020, 2021) proposed continuous attention mechanisms, focusing on unimodal attention densities from the exponential and deformed exponential families: the latter has sparse support. (Farinhas et al. 2021) extended this to multimodality via Gaussian mixture attention densities. In this paper, we extend this to kernel exponential families (Canu and Smola 2006) and our new sparse counterpart, kernel deformed exponential families. Theoretically, we show new existence results for both kernel exponential and deformed exponential families, and that the deformed case has similar approximation capabilities to kernel exponential families. Lacking closed form expressions for the context vector, we use numerical integration: we show exponential convergence for both kernel exponential and deformed exponential families. Experiments show that kernel continuous attention often outperforms unimodal continuous attention, and the sparse variant tends to highlight peaks of time series.

TL;DR:

We extend continuous attention from unimodal (deformed) exponential families and Gaussian mixture models to kernel exponential families and a new kernel deformed sparse counterpart.
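The sketch below illustrates three ideas from the abstract in a simplified form: an attention density with several modes, sparse support (exactly zero mass away from the modes), and a context value obtained by numerical integration. The density used here is an illustrative stand-in, a thresholded kernel expansion, and is not the paper's parametrization of kernel deformed exponential families. All names, centers, weights, and thresholds are assumptions.

```python
import math


def trapezoid(ys, xs):
    """Trapezoidal rule for samples ys at locations xs."""
    return sum(
        0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    )


def sparse_multimodal_density(ts, centers, weights, bandwidth, threshold):
    """Normalized density proportional to max(0, kernel expansion - threshold)."""
    unnormalized = []
    for t in ts:
        score = sum(
            w * math.exp(-((t - c) ** 2) / (2.0 * bandwidth ** 2))
            for c, w in zip(centers, weights)
        )
        unnormalized.append(max(0.0, score - threshold))
    normalizer = trapezoid(unnormalized, ts)
    return [u / normalizer for u in unnormalized]


def context_value(ts, density, signal):
    """Expectation of the signal under the attention density."""
    return trapezoid([p * v for p, v in zip(density, signal)], ts)


# Grid on [0, 1] and a hypothetical two-mode attention density.
ts = [i / 200.0 for i in range(201)]
density = sparse_multimodal_density(
    ts, centers=[0.2, 0.8], weights=[1.0, 1.0], bandwidth=0.05, threshold=0.5
)
signal = ts  # value function v(t) = t, chosen so the result is easy to check
print(f"density at t=0.5: {density[100]}")
print(f"context value:    {context_value(ts, density, signal):.4f}")
```

Unlike a Gaussian mixture, this density assigns exactly zero weight to the region between the two modes, so signal values there cannot influence the context value.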

The ILHBN: Challenges, Opportunities, & Solutions from Harmonizing Data under Heterogeneous Study Designs, Target Populations, & Measurement Protocols
Authors:
Publication Venue:

Translational Behavioral Medicine, Volume 13, Issue 1, January 2023, Pages 7–16

Publication Date:

November 23, 2022

Keywords:

EMA, health behavior changes, ILHBN, location, sensor

Abstract:

The ILHBN is funded by the National Institutes of Health to collaboratively study the interactive dynamics of behavior, health, and the environment using Intensive Longitudinal Data (ILD) to (a) understand and intervene on behavior and health and (b) develop new analytic methods to innovate behavioral theories and interventions. The heterogeneous study designs, populations, and measurement protocols adopted by the seven studies within the ILHBN created practical challenges, but also unprecedented opportunities to capitalize on data harmonization to provide comparable views of data from different studies, enhance the quality and utility of expensive and hard-won ILD, and amplify scientific yield. The purpose of this article is to provide a brief report of the challenges, opportunities, and solutions from some of the ILHBN’s cross-study data harmonization efforts. We review the process through which harmonization challenges and opportunities motivated the development of tools and collection of metadata within the ILHBN. A variety of strategies have been adopted within the ILHBN to facilitate harmonization of ecological momentary assessment, location, accelerometer, and participant engagement data while preserving theory-driven heterogeneity and data privacy considerations. Several tools have been developed by the ILHBN to resolve challenges in integrating ILD across multiple data streams and time scales both within and across studies. Harmonization of distinct longitudinal measures, measurement tools, and sampling rates across studies is challenging, but also opens up new opportunities to address cross-cutting scientific themes of interest.

TL;DR:

The article shares insights, challenges, opportunities, and solutions from harmonizing intensive longitudinal data within the ILHBN, providing tools and recommendations for future data harmonization efforts.

A Just-In-Time Adaptive intervention (JITAI) for Smoking Cessation: Feasibility & Acceptability Findings
Authors:
Publication Venue:

Addictive Behaviors, Volume 136, p.107467

Publication Date:

January 2023

Keywords:

just-in-time adaptive intervention, micro-randomized trial, mindfulness, smoking cessation, mHealth

Abstract:

Smoking cessation treatments that are easily accessible and deliver intervention content at vulnerable moments (e.g., high negative affect) have great potential to impact tobacco abstinence. The current study examined the feasibility and acceptability of a multi-component Just-In-Time Adaptive Intervention (JITAI) for smoking cessation. Daily smokers interested in quitting were consented to participate in a 6-week cessation study. Visit 1 occurred 4 days pre-quit, Visit 2 was on the quit day, Visit 3 occurred 3 days post-quit, Visit 4 was 10 days post-quit, and Visit 5 was 28 days post-quit. During the first 2 weeks (Visits 1-4), the JITAI delivered brief mindfulness/motivational strategies via smartphone in real-time based on negative affect or smoking behavior detected by wearable sensors. Participants also attended 5 in-person visits, where brief cessation counseling (Visits 1-4) and nicotine replacement therapy (Visits 2-5) were provided. Outcomes were feasibility and acceptability; biochemically-confirmed abstinence was also measured. Participants (N = 43) were 58.1 % female (AgeMean = 49.1, mean cigarettes per day = 15.4). Retention through follow-up was high (83.7 %). For participants with available data (n = 38), 24 (63 %) met the benchmark for sensor wearing, among whom 16 (67 %) completed at least 60 % of strategies. Perceived ease of wearing sensors (Mean = 5.1 out of 6) and treatment satisfaction (Mean = 3.6 out of 4) were high. Biochemically-confirmed abstinence was 34 % at Visit 4 and 21 % at Visit 5. Overall, the feasibility of this novel multi-component intervention for smoking cessation was mixed but acceptability was high. Future studies with improved technology will decrease participant burden and better detect key intervention moments.

TL;DR:

The study assessed the feasibility and acceptability of a multi-component Just-In-Time Adaptive Intervention (JITAI) for smoking cessation, utilizing smartphone-delivered mindfulness/motivational strategies based on real-time negative affect or smoking behavior detected by wearable sensors. Participants showed high retention (83.7%) and reported high satisfaction with the intervention, but the feasibility was mixed. 

Momentary Stressor Logging and Reflective Visualizations: Implications for Stress Management with Wearables (Under Review)
Authors:
Publication Venue:

ACM CHI 2024 – Under Review

Publication Date:

Under Review

Keywords:

momentary stress, stressors, reflective visualizations, stressor logging, stress management, wearables

Our goal in Aim 3 is to understand the dynamic relationships between personalized drivers of momentary risk and disease progression to identify targets of temporally precise interventions. This year, we completed the MOODS study with 122 participants who wore a study-provided Fossil Sport smartwatch with our MOODS app, installed our MOODS app on their personal smartphones, and used both apps for 100 days. They rated their stress 3-4 times daily and described the stressor for events they rated as stressful. They received new visualizations of their data each week. We analyzed the impact of the study on self-reported stress ratings and the diversity in stressors reported by the participants.

SmokingOpp: Detecting the Smoking “Opportunity” Context Using Mobile Sensors
Authors:
Publication Venue:

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies

Keywords:

mobile health, context, smoking cessation, intervention, GPS traces

Publication Date:

March 2020

Related Projects:
Abstract:

Context plays a key role in impulsive adverse behaviors such as fights, suicide attempts, binge-drinking, and smoking lapse. Several contexts dissuade such behaviors, but some may trigger adverse impulsive behaviors. We define these latter contexts as ‘opportunity’ contexts, as their passive detection from sensors can be used to deliver context-sensitive interventions. In this paper, we define the general concept of ‘opportunity’ contexts and apply it to the case of smoking cessation. We operationalize the smoking ‘opportunity’ context, using self-reported smoking allowance and cigarette availability. We show its clinical utility by establishing its association with smoking occurrences using Granger causality. Next, we mine several informative features from GPS traces, including the novel location context of smoking spots, to develop the SmokingOpp model for automatically detecting the smoking ‘opportunity’ context. Finally, we train and evaluate the SmokingOpp model using 15 million GPS points and 3,432 self-reports from 90 newly abstinent smokers in a smoking cessation study.

TL;DR:

In this paper, we define the general concept of ‘opportunity’ contexts and apply it to the case of smoking cessation. We mine several informative features from GPS traces, including the novel location context of smoking spots, to develop the SmokingOpp model for automatically detecting the smoking ‘opportunity’ context.

A Survey on Principles, Models & Methods for Learning from Irregularly Sampled Time Series: From Discretization to Attention & Invariance
Authors:
Publication Venue:

Advances in Neural Information Processing Systems

Keywords:

irregular sampling, multivariate time series, missing data, discretization, interpolation, recurrence, attention

Publication Date:

January 5, 2021

Abstract:
Irregularly sampled time series data arise naturally in many application domains including biology, ecology, climate science, astronomy, and health. Such data represent fundamental challenges to many classical models from machine learning and statistics due to the presence of non-uniform intervals between observations. However, there has been significant progress within the machine learning community over the last decade on developing specialized models and architectures for learning from irregularly sampled univariate and multivariate time series data. In this survey, we first describe several axes along which approaches to learning from irregularly sampled time series differ including what data representations they are based on, what modeling primitives they leverage to deal with the fundamental problem of irregular sampling, and what inference tasks they are designed to perform. We then survey the recent literature organized primarily along the axis of modeling primitives. We describe approaches based on temporal discretization, interpolation, recurrence, attention and structural invariance. We discuss similarities and differences between approaches and highlight primary strengths and weaknesses.
TL;DR:

In this survey, we first describe several axes along which approaches to learning from irregularly sampled time series differ including what data representations they are based on, what modeling primitives they leverage to deal with the fundamental problem of irregular sampling, and what inference tasks they are designed to perform. We then survey the recent literature organized primarily along the axis of modeling primitives.
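Of the modeling primitives listed in the abstract, temporal discretization is the simplest to show in a few lines. The sketch below is a generic illustration rather than a method taken from the survey: it maps irregularly timed observations onto fixed-width bins, averages the values that fall in each bin, and returns a mask marking which bins were observed. The function name and example data are hypothetical.

```python
def discretize(times, values, start, stop, n_bins):
    """Average irregular observations into n_bins equal-width bins.

    Returns (bin_means, mask), where bin_means[i] is None and mask[i] is 0
    for bins that received no observation.
    """
    width = (stop - start) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, v in zip(times, values):
        if not (start <= t <= stop):
            continue  # ignore observations outside the window
        index = min(int((t - start) / width), n_bins - 1)
        sums[index] += v
        counts[index] += 1
    bin_means = [s / c if c else None for s, c in zip(sums, counts)]
    mask = [1 if c else 0 for c in counts]
    return bin_means, mask


# Three irregularly timed observations on the interval [0, 1].
bin_means, mask = discretize(
    times=[0.10, 0.15, 0.90], values=[1.0, 3.0, 5.0],
    start=0.0, stop=1.0, n_bins=4,
)
print(bin_means)
print(mask)
```

Discretization turns the data into a regular grid that standard models can consume, at the cost of introducing missing bins and losing the exact observation times within each bin.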

Heteroscedastic Temporal Variational Autoencoder for Irregularly Sampled Time Series

https://github.com/reml-lab/hetvae

Authors:
Publication Venue:

International Conference on Learning Representations (ICLR)

Publication Date:

January 28, 2022

License:
Languages:

Jupyter Notebook

Python

BayesLDM: A Domain-Specific Language for Probabilistic Modeling of Longitudinal Data
Authors:
Publication Venue:

IEEE/ACM international conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE)

Publication Date:

September 12, 2022

Language:
License:

Python

PulseImpute: A Novel Benchmark Task for Pulsative Physiological Signal Imputation
Authors:
Publication Venue:

Neural Information Processing Systems (NeurIPS)

Publication Date:

September 16, 2022

Language:
License:

Python

Uncertainty Quantification Using Query-Based Object Detectors
Authors:
Publication Venue:

Computer Vision – ECCV 2022 Workshops: Tel Aviv, Israel, Proceedings, Part VIII. Pages 78-93

Publication Date:

February 12, 2023

Language:

Markdown Documentation

Assessing the Impact of Context Inference Error & Partial Observability on RL Methods for Just-In-Time Adaptive Interventions
Authors:
Publication Venue:

Conference on Uncertainty in Artificial Intelligence (UAI 2023)

Publication Date:

May 17, 2023

Language:
License:

Python

mRisk: Continuous Risk Estimation for Smoking Lapse from Noisy Sensor Data with Incomplete and Positive-Only Labels
Authors:
Publication Venue:

ACM on Interactive, Mobile, Wearable, and Ubiquitous Technologies (IMWUT)

Publication Date:

September 7, 2022

Languages:

Jupyter Notebook

Python

Kernel Multimodal Continuous Attention
Heteroscedastic Temporal Variational Autoencoder for Irregularly Sampled Time Series
HeTVAE is a deep learning framework for probabilistic interpolation of irregularly sampled or sparse time series data. HeTVAE has three associated datasets:

Real World Datasets:
Synthetic Dataset:
  • Synthetic Data Generation: We generate a synthetic dataset consisting of 2000 trajectories, each consisting of 50 time points with values between 0 and 1. We fix 10 reference time points and draw values for each from a standard normal distribution. We then use an RBF kernel smoother with a fixed bandwidth of α = 120.0 to construct local interpolations over the 50 time points. Finally, we randomly sample 3 to 10 observations from each trajectory to simulate a sparse and irregularly sampled univariate time series.
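The generation procedure described above can be sketched as follows. This is an illustrative reading of the description, not the released HeTVAE code. It assumes that the reference points and the 50 time points are evenly spaced on (0, 1], and that the smoother takes a normalized weighted average with weights exp(-α (t - r)^2). The function names and the random seed are hypothetical.

```python
import math
import random


def make_trajectory(rng, n_ref=10, n_points=50, alpha=120.0):
    """One smooth trajectory built from random values at reference points."""
    # Assumption: reference points and time points are evenly spaced on (0, 1].
    ref_times = [(k + 1) / n_ref for k in range(n_ref)]
    ref_values = [rng.gauss(0.0, 1.0) for _ in range(n_ref)]
    times = [(i + 1) / n_points for i in range(n_points)]
    values = []
    for t in times:
        # RBF kernel smoother: normalized weighted average of reference values.
        weights = [math.exp(-alpha * (t - r) ** 2) for r in ref_times]
        smoothed = sum(w * z for w, z in zip(weights, ref_values)) / sum(weights)
        values.append(smoothed)
    return times, values


def sparsify(rng, times, values, min_obs=3, max_obs=10):
    """Keep a random subset of 3 to 10 observations, in time order."""
    n_obs = rng.randint(min_obs, max_obs)
    keep = sorted(rng.sample(range(len(times)), n_obs))
    return [times[i] for i in keep], [values[i] for i in keep]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
times, values = make_trajectory(rng)
obs_times, obs_values = sparsify(rng, times, values)
print(f"{len(times)} time points, {len(obs_times)} kept as observations")
```

Repeating `make_trajectory` and `sparsify` 2000 times would give a dataset of the size described.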
Authors:
Publication Venue:

International Conference on Learning Representations (ICLR)

Publication Date:

January 28, 2022

License:
PulseImpute: A Novel Benchmark Task for Pulsative Physiological Signal Imputation

The novel PulseImpute dataset is the first large-scale dataset containing complex imputation tasks for pulsative biophysical signals.  State-of-the-art imputation methods from the time series literature are shown to exhibit poor performance on PulseImpute, demonstrating that the missingness patterns emerging in mHealth applications represent a unique and important class of imputation problems.  By releasing this dataset and a new state-of-the-art baseline algorithm, we hope to spur the ML community to begin addressing these challenging problems.

Authors:
Publication Venue:

Neural Information Processing Systems (NeurIPS)

Publication Date:

November 28, 2022

License:

The past decade has seen tremendous advances in the ability to compute a diverse array of mobile sensor-based biomarkers in order to passively estimate health states, activities, and associated contexts (e.g. physical activity, sleep, smoking, mood, craving, stress, and geospatial context). Researchers are now engaged in the conduct of both observational and interventional field studies of increasing complexity and length that leverage mHealth sensor and biomarker technologies combined with the collection of measures of disease progression and other outcomes. 

 

As a result of the expansion of the set of available mHealth biomarkers and the push toward long-term, real-world deployment of mHealth technologies, a new set of critical gaps has emerged that were previously obscured by the focus of the field on smaller-scale proof-of-concept studies and the investigation of single biomarkers in isolation.

Solutions for Missing Sensor & Biomarker Data

First, the issue of missing sensor and biomarker data in mHealth field studies has quickly become a critical problem that directly and significantly impacts many of our CPs. Issues including intermittent wireless dropouts, wearables and smartphones running out of battery power, participants forgetting to carry or wear devices, and participants exercising privacy controls can all contribute to complex patterns of missing data that significantly complicate data analysis and limit the effectiveness of sensor-informed mHealth interventions.

High-Quality, Compact, & Interpretable Feature Representations

Second, with increasing interest in the use of reinforcement learning methods to provide online adaptation of interventions for every individual, there is an urgent need for high-quality, compact and interpretable feature representations that can enable more effective learning under strict budgets on the number of interactions with patients.

Methods for Deriving High-Level Knowledge & Supporting Causal Hypothesis Generation

Finally, as in other areas that are leveraging machine learning methods to drive scientific discovery and support decision making, mHealth needs methods that can be used to derive high-level knowledge and support causal hypothesis generation based on complex, non-linear models fit to biomarker time series data.

Sayma Akther, PhD

Assistant Professor


Supriya Nagesh, PhD

Applied Scientist


Varol Burak Aydemir, PhD

Principal Algorithms Engineer


Satya Shukla, PhD

Senior Research Scientist


Soujanya Chatterjee, PhD

Applied Scientist II


Md Azim Ullah, PhD

Applied Scientist


Alexander Moreno, PhD

Machine Learning Scientist


  1. S.N. Shukla, B.M. Marlin. Heteroscedastic Temporal Variational Autoencoder For Irregularly Sampled Time Series. In Proceedings of the International Conference on Learning Representations. 2022.
  2. Tung, K., Torre, S.D., Mistiri, M.E., Braganca, R.B., Hekler, E.B., Pavel, M., Rivera, D.E., Klasnja, P., Spruijt-Metz, D., & Marlin, B.M. (2022). BayesLDM: A Domain-Specific Language for Probabilistic Modeling of Longitudinal Data. Accepted at IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE) 2022. ArXiv, abs/2209.05581.
  3. M. A. Xu, A. Moreno, S. Nagesh, V. B. Aydemir, D. W. Wetter, S. Kumar, and J. M. Rehg. PulseImpute: A Novel Benchmark Task for Pulsative Physiological Signal Imputation. Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS), Track on Datasets and Benchmarks, 2022. Accepted for publication. NIHMS1839168.
  4. Md Azim Ullah, Soujanya Chatterjee, Christopher P. Fagundes, Cho Lam, Inbal Nahum-Shani, James M. Rehg, David W. Wetter, and Santosh Kumar. 2022. mRisk: Continuous Risk Estimation for Smoking Lapse from Noisy Sensor Data with Incomplete and Positive-Only Labels. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 6, 3, Article 143 (September 2022), 29 pages.
  5. A. Moreno, Z. Wu, S. Nagesh, W. Dempsey, and J. M. Rehg. Kernel Multimodal Continuous Attention. Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS), 2022. Accepted for publication.
  6. Chatterjee S, Moreno A, Lizotte SL, Akther S, Ertin E, Fagundes CP, Lam C, Rehg JM, Wan N, Wetter DW, Kumar S. SmokingOpp: Detecting the Smoking 'Opportunity' Context Using Mobile Sensors. Proc ACM Interact Mob Wearable Ubiquitous Technol. 2020 Mar;4(1):4. doi: 10.1145/3380987. Epub 2020 Mar 18. PMID: 34651096; PMCID: PMC8513752.
  7. Shukla SN, Marlin BM. A Survey on Principles, Models and Methods for Learning from Irregularly Sampled Time Series. arXiv preprint arXiv:2012.00168.
  8. Meet P. Vadera, Colin Samplawski, and Benjamin M. Marlin. 2023. Uncertainty Quantification Using Query-Based Object Detectors. In Computer Vision – ECCV 2022 Workshops: Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part VIII. Springer-Verlag, Berlin, Heidelberg, 78–93. DOI: 10.1007/978-3-031-25085-9_5.
  9. Karine K, Klasnja P, Murphy SA, Marlin BM. Assessing the Impact of Context Inference Error and Partial Observability on RL Methods for Just-In-Time Adaptive Interventions. Proc Mach Learn Res. 2023 Aug;216:1047-1057. PMID: 37724310; PMCID: PMC10506656.
  10. Chow SM, Nahum-Shani I, Baker JT, Spruijt-Metz D, Allen NB, Auerbach RP, Dunton GF, Friedman NP, Intille SS, Klasnja P, Marlin B, Nock MK, Rauch SL, Pavel M, Vrieze S, Wetter DW, Kleiman EM, Brick TR, Perry H, Wolff-Hughes DL; Intensive Longitudinal Health Behavior Network (ILHBN). The ILHBN: Challenges, Opportunities, and Solutions from Harmonizing Data under Heterogeneous Study Designs, Target Populations, and Measurement Protocols. Transl Behav Med. 2023 Jan 20;13(1):7-16. doi: 10.1093/tbm/ibac069. Erratum in: Transl Behav Med. 2023 Jun 9;13(6):419. PMID: 36416389; PMCID: PMC9853092.
  11. Yang MJ, Sutton SK, Hernandez LM, Jones SR, Wetter DW, Kumar S, Vinci C. A Just-In-Time Adaptive intervention (JITAI) for smoking cessation: Feasibility and acceptability findings. Addict Behav. 2023 Jan;136:107467. doi: 10.1016/j.addbeh.2022.107467. Epub 2022 Aug 23. PMID: 36037610; PMCID: PMC10246550.
  12. Sameer Neupane, Mithun Saha, Nasir Ali, Timothy Hnat, Shahin Alan Samiei, Anandatirtha Nandugudi, David M. Almeida, and Santosh Kumar. 2024. Momentary Stressor Logging and Reflective Visualizations: Implications for Stress Management with Wearables. In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI ’24), May 11–16, 2024, Honolulu, HI, USA. ACM, New York, NY, USA, 27 pages. DOI: 10.1145/3613904.3642662.
  1. S. Kumar, “Challenges and Opportunities in Trustworthy AI for Health and Wellness,” ACM SIGKDD Trustworthy AI Day, 08/15/2022.
  2. S. Kumar, “Detecting and Characterizing Stress in Daily Life,” Keynote Speech at IEEE EMBC Workshop on Detection of Stress and Mental Health Using Wearable Sensors, 07/11/2022.
  3. S. Kumar, “Can Sharing Anonymous Wrist-worn Accelerometry Data Re-identify You,” EECS Department, University of California, Irvine, 06/03/2022.
  4. S. Kumar, “Can Sharing Anonymous Wrist-worn Accelerometry Data Re-identify You,” CSE Department, The Ohio State University, 04/29/2022.
  5. S. Kumar, “Persuasive AI to Improve Health and Wellness,” Indo-US Roundtable, 03/24/2022.
  6. S. Kumar, “Wearable AI for Designing, Optimizing, and Delivering Temporally-Precise mHealth Interventions,” mHealth Special Session at International Conference on Network, Systems, and Security (NSySs’21), 12/23/2021.
  7. S.N. Shukla, “Heteroscedastic Temporal Variational Autoencoder For Irregularly Sampled Time Series,” International Conference on Learning Representations, 04/27/2022.

James Rehg, PhD

Deputy Center Director, TR&D1 Lead


Santosh Kumar, PhD

Lead PI, Center Director, TR&D1, TR&D2, TR&D3


Benjamin Marlin, PhD

Co-Investigator, TR&D1, TR&D2



Sameer Neupane

Doctoral Student


Maxwell Xu

Doctoral Student


Mithun Saha

Doctoral Student


Karine Karine

Doctoral Student


Hui Wei

Doctoral Student