Visual assessment of changes in activity levels while eating
Authors
Raza, Muhammad Ahmed
Abstract
This doctoral thesis investigates non-intrusive ways to monitor changes in activity levels while eating. Elderly individuals encounter a variety of health challenges, including sarcopenia, a decrease in the number and size of muscle fibres that results from normal aging. Monitoring activity levels in the elderly is particularly important for healthy aging in place, as it helps detect such health-related issues early and supports independent living.
Past research provides valuable insights into the activities of daily living and various behavioral aspects of the elderly, mainly through full-body or gait-analysis-based approaches. However, to the best of our knowledge, there is a gap in research on vision-based monitoring systems that strictly analyze upper-body motion and capture insights into subjects' performance while they eat in a dining room environment. This thesis aims to address that gap. The primary objectives of this research are to establish reliable methods to (1) monitor the eating behavior of the elderly, (2) develop a model that generalizes across subjects to minimize subjective bias, and (3) build a complete upper-body-focused pipeline that generates health statistics and gathers insights into both eating behavior and musculoskeletal deterioration over time.
For this purpose, our first major contribution presents EatSense, a dataset recorded with an Intel RealSense RGB-D camera to monitor eating activity in a dining room environment. It comprises 135 video sequences of 27 subjects from 13 nationalities, captured in an uncontrolled setting, with an average duration of 11.5 minutes per video. The dataset features dense frame-wise labels for 16 atomic eating-related actions, with an average of 114.1 actions per video sequence, and provides three levels of label abstraction. EatSense uniquely focuses on upper-body posture and movements, including scenarios with and without wrist weights to simulate changes in motor function.
Two minor contributions follow from the EatSense dataset: (1) an evaluation and behavioral assessment using EatSense with several action recognition and temporal action localization approaches, which concludes that EatSense is a challenging dataset for both tasks owing to the widely varying lengths of its action instances; and (2) an exploration of the impact of face obfuscation methods on pose-based action recognition in healthcare monitoring. Although limited to evaluation on the EatSense dataset, this study concluded that face obfuscation strategies that pseudonymize facial features can preserve privacy without significantly degrading the performance of subsequent tasks.
The second major contribution presents a vision-based approach to assess performance levels while eating, with the aim of monitoring potential performance decline in elderly individuals. Weights attached to the subjects' wrists were used to simulate changes in mobility or motor function. The study compares hand-crafted feature-based regression methods (Gaussian Mixture Regression, Multilayer Perceptron, and LightGBM) against deep feature-based regression using ST-GCN. Results show that Gaussian Mixture Regression performs slightly better at predicting the degree of performance decline (i.e., the weight level) across subjects.
Lastly, our third major contribution presents a comprehensive, fully automatic vision-based pipeline for monitoring eating activities and assessing musculoskeletal health. Its key contributions include a multi-purpose video-to-report framework for long-term monitoring, improved action localization in continuous video through relaxed data augmentation and output-merging techniques, and the ability to capture trends and generate insights on changes in eating behavior and upper-body movements over time.