- Research Article
- 10.1145/3749502
- Sep 3, 2025
- Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
- Alvise Dei Rossi + 5 more
Perceived sleep quality is a key aspect of sleep health and a crucial factor in mental health. However, predicting it accurately is difficult because of its deeply personal nature and the considerable variability in how individuals perceive their nightly sleep. This study presents a robust subject-wise nested cross-validation framework for passive daily monitoring of perceived sleep quality from wearable data using population-level machine learning models. A total of 294 participants (mean age 42 (SD = 10) years; 43% female) were monitored for 30 days with commercial wearable devices in free-living conditions, alongside daily self-reports of sleep quality. A novel adaptation of the person-mean centering approach was employed to split time-varying features into within-person and between-person components, preventing temporal leakage and enabling unbiased daily prediction. Various machine learning models were trained, and SHAP values were used to identify key predictors. Our results show that fully passive prediction of perceived sleep quality is feasible at the population level from the first day of monitoring (ROC AUC 0.715, F1 0.494, BA 0.666), with within-person deviations from individual baselines being the primary predictors. The most influential predictors were deviations in sleep duration and continuity, followed by cardiac and stress-related features and SF-12 health survey components.
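The abstract does not spell out the leakage-free person-mean centering adaptation; below is a minimal sketch of one way such a split could work, using each subject's expanding mean over strictly earlier days as the personal baseline. The DataFrame layout and column names (`subject_id`, `day`) are hypothetical.

```python
import pandas as pd

def split_within_between(df: pd.DataFrame, feature: str) -> pd.DataFrame:
    """Split a time-varying feature into between- and within-person parts,
    computing each person's baseline from past days only (no temporal leakage)."""
    df = df.sort_values(["subject_id", "day"]).copy()
    # Personal baseline: expanding mean over strictly earlier days (shift(1)
    # excludes today, so the current observation never leaks into its baseline).
    baseline = df.groupby("subject_id")[feature].transform(
        lambda s: s.shift(1).expanding().mean()
    )
    df[f"{feature}_between"] = baseline               # between-person component
    df[f"{feature}_within"] = df[feature] - baseline  # today's deviation
    return df
```

Day 1 has no prior days, so its baseline is NaN here; how the paper handles the first day of monitoring is not stated in the abstract.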
- Research Article
- 10.1145/3749496
- Sep 3, 2025
- Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
- Xiaofan Yu + 7 more
Natural language interaction with sensing systems is crucial for addressing users' personal concerns and providing health-related insights into their daily lives. When a user asks a question, the system automatically analyzes the full history of sensor data, extracts relevant information, and generates an appropriate response. However, existing systems are limited to short-duration (e.g., one minute) or low-frequency (e.g., daily step count) sensor data. In addition, they struggle with quantitative questions that require precise numerical answers. In this work, we introduce SensorChat, the first end-to-end QA system designed for daily life monitoring using long-duration, high-frequency time series data. Given raw sensor signals spanning multiple days and a user-defined natural language question, SensorChat generates semantically meaningful responses that directly address users' concerns. SensorChat effectively handles both quantitative questions that require numerical precision and qualitative questions that require high-level reasoning to infer subjective insights. To achieve this, SensorChat uses an innovative three-stage pipeline comprising question decomposition, sensor data query, and answer assembly. The first and third stages leverage Large Language Models (LLMs) to interpret human queries and generate responses. The intermediate querying stage extracts relevant information from the complete sensor data history, which is then combined with the original query in the final stage to produce accurate and meaningful answers. Real-world implementations demonstrate SensorChat's capability for real-time interaction on a cloud server, as well as its ability to run entirely on an edge platform after quantization. Comprehensive QA evaluations show that SensorChat achieves 93% higher answer accuracy than the best-performing state-of-the-art systems on quantitative questions. Furthermore, a user study with eight volunteers highlights SensorChat's effectiveness in answering qualitative and open-ended questions. The code is available at https://github.com/Orienfish/SensorChat.
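The three-stage pipeline suggests a skeleton along the following lines. This is a sketch, not SensorChat's actual interfaces: the stage functions are passed in as hypothetical callables, with the LLM confined to interpretation (stage 1) and phrasing (stage 3) while exact numbers come from the sensor-data query (stage 2).

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class SubQuery:
    signal: str     # e.g. "step_count"  (hypothetical field names)
    aggregate: str  # e.g. "sum"
    window: str     # e.g. "last_7_days"

def answer(question: str,
           decompose: Callable[[str], list[SubQuery]],   # stage 1: LLM call
           query_db: Callable[[SubQuery], float],        # stage 2: sensor query
           assemble: Callable[[str, list[float]], str],  # stage 3: LLM call
           ) -> str:
    """Decompose -> query -> assemble; numerical precision lives in stage 2."""
    subs = decompose(question)           # interpret the user's question
    facts = [query_db(q) for q in subs]  # exact values from the full history
    return assemble(question, facts)     # grounded natural-language response
```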
- Research Article
- 10.1145/3749472
- Sep 3, 2025
- Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
- Ruyi Li + 6 more
Remote assistance through robotic telepresence poses both control and memory challenges, particularly in one-expert-to-multiple-workers situations. In this work, we propose a novel language-driven interface to facilitate remote collaboration through telepresence robots. Through interviews with operations and maintenance experts and a scenario simulation study, we identified key pain points in delivering one-expert-multiple-workers remote guidance with a telepresence robot and proposed two design goals, comprising five sub-design goals with corresponding features. These features were integrated into a standard telepresence robot, resulting in the Collaborative LLM-based Embodied Assistant Robot (CLEAR Robot). A controlled experiment simulating a one-expert-two-workers remote assembly task demonstrated that, compared to a standard telepresence robot, CLEAR Robot significantly improved efficiency, reduced cognitive load, facilitated more balanced collaboration, and improved the user experience. We also discuss the impact of language-driven implicit interactions in multi-user collaboration and provide insights for designing future robot systems that support one-expert-multiple-workers remote guidance.
- Research Article
- 10.1145/3749540
- Sep 3, 2025
- Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
- Goshi Imamura + 1 more
This paper presents a method for evaluating the quality of presentations using information from the audience's facial and head behavior. Existing studies have focused on assessing presentation skills by analyzing presentation materials and the presenter's verbal and non-verbal cues. However, since a presentation is an interaction between the presenter and the audience, its quality is influenced not only by the presenter's behavior but also by audience factors such as background knowledge; the presenter's output alone is therefore not enough for a comprehensive evaluation. To address this, we propose a method that utilizes videos of the audience recorded during presentations with two cameras: one capturing the presenter's point of view and the other positioned near the presentation screen. The 3D positions and orientations of audience members' faces are detected with face detection techniques and used to predict gaze targets, such as the presenter and the presentation screen. The number of times each target is looked at during the presentation serves as a feature for evaluating presentation quality. Using a dataset of 18 presenters and 32 audience members, we show a significant correlation between subjective ratings and these behavioral features, and we demonstrate an algorithm that infers presentation quality from them.
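Gaze-target prediction from a detected 3D face pose can be sketched geometrically: assign each frame's gaze to the target whose direction from the face best matches the facial orientation. The angular threshold, target set, and exact assignment rule below are assumptions, not the paper's specification.

```python
import numpy as np

def gaze_target(face_pos: np.ndarray, face_dir: np.ndarray,
                targets: dict[str, np.ndarray], max_angle_deg: float = 15.0):
    """Return the target (e.g. 'presenter', 'screen') whose bearing from the
    face lies closest to the facial direction, within an angular threshold."""
    face_dir = face_dir / np.linalg.norm(face_dir)
    best, best_angle = None, max_angle_deg
    for name, pos in targets.items():
        bearing = pos - face_pos
        bearing = bearing / np.linalg.norm(bearing)
        angle = np.degrees(np.arccos(np.clip(face_dir @ bearing, -1.0, 1.0)))
        if angle < best_angle:
            best, best_angle = name, angle
    return best  # None if no target is within the threshold
```

Counting, per audience member, how many frames resolve to each target over the presentation yields the look counts used as features.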
- Research Article
- 10.1145/3749481
- Sep 3, 2025
- Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
- Phuc Duc Nguyen + 3 more
Domain shifts due to microphone hardware heterogeneity pose challenges to machine learning-based acoustic sensing. Existing methods enhance empirical performance but lack theoretical understanding. This paper proposes Certified Adaptive Physics-informed Transform (CertiAPT), an approach that provides formal certification of model accuracy and improves empirical performance against microphone-induced domain shifts. CertiAPT incorporates a novel Adaptive Physics-informed Transform (APT) to create transformations toward the target microphone without requiring application samples collected by the target microphone. It also establishes a theoretical upper bound on the accuracy degradation caused by microphone characteristic differences on unseen microphones. Furthermore, a robust training method with an APT gradient update scheme leverages APT and certification constraints to tighten the upper bound and improve empirical accuracy across sensor conditions. Extensive experiments on three acoustic sensing tasks, including keyword spotting, room recognition, and automated speech recognition, validate CertiAPT's certified robustness and show accuracy gains over the latest approaches. Our implementation of CertiAPT is available at: https://github.com/bibom108/CertiAPT.
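The physics behind microphone-induced shifts is that each microphone acts approximately as a linear filter on incoming sound. A minimal illustration of a physics-informed (but non-adaptive, uncertified) transform under that assumption: re-weight spectrogram bins by the ratio of two known magnitude frequency responses. CertiAPT's APT is adaptive and works without target-microphone application samples, so this sketch only conveys the underlying intuition.

```python
import numpy as np

def mic_response_transform(stft_src: np.ndarray,
                           h_src: np.ndarray,
                           h_tgt: np.ndarray,
                           eps: float = 1e-8) -> np.ndarray:
    """Map a magnitude spectrogram recorded through a source microphone toward
    a target microphone, assuming both behave as linear filters with known
    magnitude responses h_src, h_tgt of shape [n_freq_bins].
    stft_src has shape [n_freq_bins, n_frames]."""
    gain = h_tgt / (h_src + eps)     # per-frequency-bin response ratio
    return stft_src * gain[:, None]  # broadcast the gain over time frames
```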
- Research Article
- 10.1145/3749497
- Sep 3, 2025
- Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
- Di Liu + 6 more
Art therapy homework is essential for fostering clients' reflection on daily experiences between sessions. However, current practices present challenges: clients often lack guidance for completing tasks that combine art-making and verbal expression, while therapists find it difficult to track and tailor homework. How HCI systems might support art therapy homework remains underexplored. To address this, we present TherAIssist, comprising a client-facing application leveraging human-AI co-creative art-making and conversational agents to facilitate homework, and a therapist-facing application enabling customization of homework agents and AI-compiled homework history. A 30-day field study with 24 clients and 5 therapists showed how TherAIssist supported clients' homework and reflection in their everyday settings. Results also revealed how therapists infused their practice principles and personal touch into the agents to offer tailored homework, and how AI-compiled homework history became a meaningful resource for in-session interactions. Implications for designing human-AI systems to facilitate asynchronous client-practitioner collaboration are discussed.
- Research Article
- 10.1145/3749517
- Sep 3, 2025
- Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
- Nan Gao + 10 more
Parental involvement in homework is a crucial aspect of family education, but it often triggers emotional strain and conflicts. Despite growing concern over its impact on family well-being, prior research has lacked access to fine-grained, real-time dynamics of these interactions. To bridge this gap, we present a framework that leverages naturalistic parent-child interaction data and large language models (LLMs) to analyse homework conversations at scale. In a four-week in situ study with 78 Chinese families, we collected 475 hours of audio recordings and accompanying daily surveys, capturing 602 homework sessions in everyday home settings. Our LLM-based pipeline reliably extracted and categorised parental behaviours and conflict patterns from transcribed conversations, achieving high agreement with expert annotations. The analysis revealed significant emotional shifts in parents before and after homework, 18 recurring parental behaviours and seven common conflict types, with Knowledge Conflict being the most frequent. Notably, even well-intentioned behaviours were significantly positively correlated with specific conflicts. This work advances ubiquitous computing methods for studying complex family dynamics and offers empirical insights to enrich family education theory and inform more effective parenting strategies and interventions in the future.
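An LLM-based annotation step of this kind can be sketched as prompting a model with a fixed codebook and one session's transcript, then parsing structured output. The prompt wording, JSON schema, and `llm_complete` function below are hypothetical placeholders, not the authors' pipeline.

```python
import json
from typing import Callable

def code_session(transcript: str, codebook: str,
                 llm_complete: Callable[[str], str]) -> dict:
    """Ask an LLM to label one homework session's transcript with parental
    behaviours and a conflict type drawn from a fixed codebook; results
    would then be checked against expert annotations for agreement."""
    prompt = (
        "You are coding a transcript of a parent-child homework session.\n"
        'Return JSON with keys "behaviours" (a list) and "conflict_type"\n'
        "(a string), using only categories from the codebook below.\n\n"
        f"Codebook:\n{codebook}\n\nTranscript:\n{transcript}"
    )
    return json.loads(llm_complete(prompt))
```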
- Research Article
- 10.1145/3749550
- Sep 3, 2025
- Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
- Qijia Shao + 9 more
Sleep is a vital physiological state that significantly impacts overall health. Continuous monitoring of sleep posture, heart rate, respiratory rate, and body movement is crucial for diagnosing and managing sleep disorders. Current monitoring solutions often disrupt natural sleep due to discomfort or raise privacy and instrumentation concerns. We introduce PillowSense, a fabric-based sleep monitoring system seamlessly integrated into a pillowcase. PillowSense utilizes a dual-layer fabric design. The top layer comprises conductive fabrics for sensing electrocardiogram (ECG) and surface electromyogram (sEMG), while the bottom layer features pressure-sensitive fabrics to monitor sleep location and movement. The system processes ECG and sEMG signals sequentially to infer multiple sleep variables and incorporates an adversarial neural network to enhance posture classification accuracy. We fabricate prototypes using off-the-shelf hardware and conduct both lab-based and in-the-wild longitudinal user studies to evaluate the system's effectiveness. Across 151 nights and 912.2 hours of real-world sleep data, the system achieves an F1 score of 88% for classifying seven sleep postures, and clinically acceptable accuracy in vital sign monitoring. PillowSense's comfort, washability, and robustness in multi-user scenarios underscore its potential for unobtrusive, large-scale sleep monitoring.
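The abstract mentions an adversarial neural network for posture classification without detailing its form; one common realization is a gradient-reversal (DANN-style) network in which an adversary tries to predict a nuisance factor (e.g. the user) from shared features, pushing those features to become invariant. Whether PillowSense uses this exact scheme is an assumption; the sketch below is in PyTorch.

```python
import torch
from torch import nn
from torch.autograd import Function

class GradReverse(Function):
    """Identity on the forward pass; negates (and scales) gradients backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

class AdversarialPostureNet(nn.Module):
    """Shared encoder feeding a 7-way posture head and an adversary trained
    through gradient reversal. Layer sizes and the choice of nuisance label
    (here, user identity) are hypothetical."""
    def __init__(self, in_dim: int = 256, n_users: int = 10, lam: float = 0.1):
        super().__init__()
        self.lam = lam
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU())
        self.posture_head = nn.Linear(128, 7)   # seven sleep postures
        self.adversary = nn.Linear(128, n_users)

    def forward(self, x: torch.Tensor):
        z = self.encoder(x)
        return self.posture_head(z), self.adversary(GradReverse.apply(z, self.lam))
```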
- Research Article
- 10.1145/3749551
- Sep 3, 2025
- Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
- Qijun Ying + 6 more
Human motion reconstruction has wide applications in health monitoring, human-computer interaction, and virtual reality. While vision-based methods have made significant strides, they face challenges in daily scenarios due to occlusion, privacy concerns, and environmental constraints. Alternative approaches using wearable sensors often require complex device deployment or raise privacy issues. To address these challenges, we explore foot-based sensing as a non-invasive solution that maintains mobility and practicality. Supporting this approach, we construct a dual-modal human motion dataset with synchronized plantar pressure and inertial measurements, and demonstrate that full-body motion can be reconstructed from foot-based sensing alone using a dual-modal motion reconstruction network. To enhance global motion reconstruction accuracy, we develop a motion-aware trajectory estimation strategy and implement a two-stage reconstruction pipeline that separates orientation estimation from other motion parameters. Our experiments show a Mean Per Joint Position Error of 69.43 mm and a Root Trajectory Error of 0.267 m for 2-second predictions. This work presents a practical approach for non-invasive and privacy-preserving motion capture. Code and dataset are available for research purposes at this link.
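The two reported metrics are standard and easy to state precisely; a sketch of how they are conventionally computed (whether the paper applies any alignment before the root trajectory error is not stated in the abstract):

```python
import numpy as np

def mpjpe(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean Per Joint Position Error: mean Euclidean distance between
    predicted and ground-truth joints. Shapes: [frames, joints, 3], in mm."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def root_trajectory_error(pred_root: np.ndarray, gt_root: np.ndarray) -> float:
    """Mean Euclidean error of the root joint's global trajectory, in metres.
    Shapes: [frames, 3]; computed here without any trajectory alignment."""
    return float(np.linalg.norm(pred_root - gt_root, axis=-1).mean())
```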
- Research Article
- 10.1145/3749489
- Sep 3, 2025
- Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
- Long Fan + 7 more
Chronic respiratory conditions such as Chronic Obstructive Pulmonary Disease (COPD) and asthma often progress insidiously, making early detection vital for effective intervention. Current gold-standard Pulmonary Function Testing (PFT) methods, such as spirometry, evaluate lung function by measuring airflow rates to detect potential obstructions. However, their cost, often several hundred dollars or more, limits their accessibility for regular at-home monitoring. In this paper, we present SpiroSense, a novel system that transforms a smartphone into a portable, low-cost, and accurate PFT device for everyday use by integrating a custom 3D-printed attachment costing just a dozen dollars. However, a critical limitation arises from the smartphone's inherent audio sampling rate (typically 48 kHz), which constrains the airflow resolution to 11.9 L/s when using conventional cross-correlation-based time delay estimation. This coarse resolution is insufficient to capture key pulmonary metrics, such as a Peak Expiratory Flow (PEF) of 10 L/s, with high fidelity. To address this, we propose SonicFlow, which establishes a foundational airflow rate sensing model based on ultrasonic phase features and improves the airflow rate resolution to 0.148 L/s. Furthermore, airflow-induced high-frequency harmonic noise within the 3D-printed attachment, combined with ambient environmental noise, further complicates accurate sensing. To mitigate this, we introduce NoiseClear, an end-to-end ultrasonic signal enhancement model designed to effectively suppress noise while preserving critical airflow velocity information. We prototype SpiroSense and evaluate its performance on a cohort of 59 participants, including 29 healthy individuals and 30 patients. Experimental results show that SpiroSense achieves an average estimation error of 6.44% for Forced Vital Capacity (FVC), 7.42% for Forced Expiratory Volume in one second (FEV1), and 3.01% for the FEV1/FVC ratio.
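The sampling-rate limitation has a simple back-of-the-envelope form: cross-correlation time-delay estimation resolves delays only to one sample period, and in a transit-time geometry the upstream/downstream delay difference scales as dt ≈ 2·L·v/c². With plausible (hypothetical) geometry this lands near the quoted ~11.9 L/s:

```python
# All geometry values are hypothetical, chosen only to illustrate the scale;
# the actual SpiroSense attachment dimensions are not given in the abstract.
c = 343.0    # speed of sound, m/s
fs = 48_000  # smartphone audio sampling rate, Hz
L = 0.05     # acoustic path length, m (hypothetical)
A = 5e-4     # tube cross-sectional area, m^2 (hypothetical)

dt_res = 1 / fs                  # one-sample delay resolution, ~20.8 us
v_res = c**2 * dt_res / (2 * L)  # velocity resolution via dt = 2*L*v/c^2
q_res = v_res * A * 1000         # flow-rate resolution in L/s
print(f"flow resolution ~ {q_res:.1f} L/s")  # ~12 L/s: far too coarse for PEF
# Phase-based estimation (SonicFlow) resolves a fraction of the ultrasonic
# carrier period instead of a whole sample, enabling ~0.148 L/s resolution.
```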