Edinburgh Research Archive

Visual assessment of changes in activity levels while eating

dc.contributor.advisor
Fisher, Bob
dc.contributor.advisor
Sevilla-Lara, Laura
dc.contributor.author
Raza, Muhammad Ahmed
dc.contributor.sponsor
Higher Education Commission, Islamabad
en
dc.contributor.sponsor
School of Informatics, University of Edinburgh
en
dc.date.accessioned
2024-10-14T11:23:30Z
dc.date.available
2024-10-14T11:23:30Z
dc.date.issued
2024-10-14
dc.description.abstract
This doctoral thesis investigates non-intrusive ways to monitor changes in activity levels while eating. Elderly individuals encounter a variety of health challenges, including sarcopenia, a decrease in the number and size of muscle fibres that results from normal aging. Monitoring activity levels in the elderly is particularly crucial for healthy aging in place, as it helps detect such health issues early and supports independent living. Past research provides valuable insights into the activities of daily living and various behavioral aspects of the elderly, mainly through full-body or gait analysis-based approaches. However, to the best of our knowledge, there is a gap in research on vision-based monitoring systems that analyze strictly upper-body motion and capture insights into subjects' performance while they eat in a dining room environment. We aim to address this gap in this thesis. The primary objectives of this research are to establish reliable methods to (1) monitor the eating behavior of the elderly, (2) develop a model that generalizes across subjects to minimize subjective bias, and (3) develop a complete upper-body focused pipeline that generates health statistics and gathers insights into both eating behavior and musculoskeletal deterioration over time. For this purpose, our first major contribution presents EatSense, a dataset recorded with a RealSense RGB-D camera to monitor eating activity in a dining room environment. It comprises 135 video sequences of 27 subjects from 13 nationalities, recorded in an uncontrolled setting, with an average duration of 11.5 minutes per video. The dataset features dense frame-wise labels for 16 atomic eating-related actions, with an average of 114.1 actions per video sequence, and provides three levels of label abstraction.
EatSense uniquely focuses on upper-body posture and movements, including scenarios with and without wrist weights to simulate changes in motor function. Two minor contributions follow from the EatSense dataset: (1) An evaluation and behavioral assessment using several action recognition and temporal action localization approaches. This study concludes that EatSense is challenging for both action recognition and temporal action localization algorithms because its action instances vary in length. (2) An exploration of the impact of face obfuscation methods on pose-based action recognition in healthcare monitoring. This study, which was evaluated only on the EatSense dataset, concluded that face obfuscation strategies that pseudonymize facial features can preserve privacy without significantly degrading the performance of subsequent tasks. The second major contribution presents a vision-based approach to assess performance levels while eating, intended to monitor potential performance decline in elderly individuals. We attached weights to the subjects' wrists to simulate changes in mobility or motor function. The study compares hand-crafted feature-based regression methods (Gaussian Mixture Regression, Multilayer Perceptron, and LightGBM) against deep feature-based regression using ST-GCN. Results show that Gaussian Mixture Regression performs slightly better in predicting the degree of performance decline (i.e., weight level) across subjects. Lastly, our third major contribution presents a comprehensive, fully autonomous vision-based pipeline for monitoring eating activities and assessing musculoskeletal health.
The pipeline's key contributions include a multi-purpose video-to-report framework for long-term monitoring, improved action localization in continuous video through relaxed data augmentation and output merging techniques, and the ability to capture trends and generate insights on changes in eating behavior and upper-body movements.
en
dc.identifier.uri
https://hdl.handle.net/1842/42290
dc.identifier.uri
http://dx.doi.org/10.7488/era/5010
dc.language.iso
en
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
M. A. Raza, L. Chen, L. Nanbo, and R. B. Fisher, “EatSense: Human centric, action recognition and localization dataset for understanding eating behaviors and quality of motion assessment,” Image and Vision Computing, vol. 137, p. 104762, 2023
en
dc.relation.hasversion
M. A. Raza and R. B. Fisher, “Vision-based approach to assess performance levels while eating,” Machine Vision and Applications, vol. 34, no. 6, p. 124, 2023
en
dc.relation.hasversion
M. A. Raza, C. Lochhead, and R. B. Fisher, “Effect of face obfuscation methods on pose-based action recognition,” International Conference on AI in Healthcare, 2024
en
dc.relation.hasversion
M. A. Raza and R. B. Fisher, “V2R: A Fully Autonomous Vision-Based System for Analyzing Eating Behaviors and Musculoskeletal Deterioration,” [Submitted]
en
dc.rights.license
Creative Commons: Attribution (CC-BY)
en
dc.rights.uri
https://creativecommons.org/licenses/by/4.0/
en
dc.subject
EatSense
en
dc.subject
Action Recognition
en
dc.subject
Temporal Action Localization
en
dc.subject
Eating Behavioral Understanding
en
dc.title
Visual assessment of changes in activity levels while eating
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en

Files

Original bundle

Name:
RazaMA_2024.pdf
Size:
10.55 MB
Format:
Adobe Portable Document Format