Edinburgh Research Archive

Developing a video expert panel as a reference standard to evaluate respiratory rate counting in paediatric pneumonia diagnosis

Authors

Khan, Ahad Mahmud

Abstract

Pneumonia is a leading cause of death in children under five years of age, especially in low- and middle-income countries (LMICs). According to the World Health Organization (WHO) Integrated Management of Childhood Illness (IMCI) guidelines, the diagnosis of pneumonia is based primarily on fast breathing and chest indrawing. Non-physician health workers play a crucial role in identifying and treating pneumonia in LMICs. Identifying these two signs is challenging for health workers, which often leads to misdiagnosis of pneumonia and inappropriate treatment. Some improved pneumonia diagnostics (e.g., ChARM, Rad-G) can count respiratory rate (RR) automatically and identify fast breathing, and can thereby support health workers in diagnosing pneumonia. There is no recognised gold standard reference for evaluating the performance of health workers or assessing the accuracy of automated RR counters. Most existing studies have used manual RR counting by a clinical professional (e.g., physician, nurse) as the reference standard, which is not ideal. Experts suggest that adjudication by a video expert panel (VEP) may be a better reference standard for evaluating RR counting in research studies. If good-quality videos could be captured and a VEP could systematically interpret RR from them, the panel could serve as an ideal and unbiased reference standard.

Objectives

My PhD aims to develop a VEP as a reference standard for evaluating RR counting in the diagnosis of childhood pneumonia. The following table summarises the objectives and proposed study designs.

Methods and Results

1. Ability of non-physician health workers to measure RR to detect pneumonia in children in LMICs

I conducted a systematic review and meta-analysis by searching four electronic databases (MEDLINE, EMBASE, Web of Science, and Scopus) and the reference lists of the included studies.
I included studies that assessed the performance of health workers in measuring RR and/or identifying fast breathing against a reference standard. I reported pooled estimates of the sensitivity and specificity of fast breathing identification using bivariate random-effects models. Sixteen studies were included in the review, eight of which reported the agreement in RR count between health workers and a reference standard. The median agreements were 39% within ±2 breaths per minute (bpm), 47% within ±3 bpm, and 67% within ±5 bpm. Fifteen studies reported the accuracy of health workers in identifying fast breathing compared to a reference standard. The median sensitivity and specificity were 77% and 86%, respectively. Seven studies were selected for the meta-analysis. The pooled sensitivity was 78% (95% confidence interval (CI) = 72-82), and the pooled specificity was 86% (95% CI = 78-91).

2. Ability of non-physician health workers to identify chest indrawing to detect pneumonia in children in LMICs

I conducted another systematic review and meta-analysis by searching the same four databases and citation indices. I included studies assessing the performance of non-physician health workers in detecting chest indrawing compared to a reference standard. I reported pooled estimates of the sensitivity and specificity of chest indrawing identification using bivariate random-effects models. The median sensitivity and specificity were 44% and 97%, respectively. Five studies were included in the meta-analysis. The pooled sensitivity and specificity were 46% (95% CI = 37-56) and 95% (95% CI = 91-97), respectively.

3. Factors associated with the interpretability and agreement in RR counts between paediatricians

I recruited 30 hospitalised children younger than five years with suspected pneumonia from the Institute of Child and Mother Health (ICMH), Dhaka, Bangladesh.
I video-recorded a study physician counting RR manually twice and with a ChARM device twice for each child. Two paediatricians evaluated the videos and counted RR. Interpretability was defined as both paediatricians being able to interpret RR, and agreement was defined as the difference in RR counts being within two bpm. Associations of interpretability and agreement with child and video methodological factors were explored using multivariable logistic regression. Eighty-five videos of RR measurements were recorded. Both paediatricians were able to interpret RR in 73/85 (85.9%) videos, and their RR counts agreed in 54/73 (74.0%) videos. The adjusted odds of interpretability were higher if the child was calm or sleeping (adjusted odds ratio (aOR) = 73.6; 95% CI 9.1 to 594.0) rather than moving, and if the child was lying on the bed (aOR = 22.8; 95% CI 2.9 to 179.0) rather than on the parent's lap. The adjusted odds of agreement in RR counts were higher if the child was calm (aOR = 16.8; 95% CI 2.9 to 95.4) or sleeping (aOR = 27.1; 95% CI 4.2 to 176.4) rather than moving, and if the RR was < 40 bpm (aOR = 12.9; 95% CI 1.3 to 126.2) rather than ≥ 60 bpm.

4. Standardisation of physicians to interpret chest videos for RR counting

I recruited 56 hospitalised under-five children with suspected pneumonia from ICMH, Dhaka, Bangladesh. A physician counted the RR manually and also using ChARM, and video recording was performed concurrently. Each video was randomly assigned to two paediatricians. If the paediatricians differed in interpretability or in RR counts (a difference of more than two bpm), the video was assigned to a third paediatrician. If two paediatricians reached a consensus (a difference in RR counts within two bpm), the video was considered a reference video and the average RR was taken as the final RR.
Seventy-nine reference videos were generated and used for the training and standardisation of six physicians to interpret chest videos for RR counting. Twenty videos were purposively selected for the standardisation process. The agreement between physician RR and reference video RR ranged from 80% to 100% when RR was counted manually, and from 70% to 90% when RR was counted using ChARM.

5. Performance of VEP to evaluate RR count from recorded videos

I enrolled 339 children with suspected pneumonia from three selected community clinics (CCs) and a subdistrict hospital (Zakiganj Upazila Health Complex (UHC)) in Sylhet, and from ICMH in Dhaka. The RR was counted manually and also using ChARM, and the child's chest movements were video-recorded simultaneously. Each video was sent to two randomly selected panel members. The video was transferred to a third member if the two members disagreed in interpretability or in RR counts. If there were inconsistencies among all three members, the video was forwarded to a fourth member. The average of two closely interpreted RRs (within two bpm) was considered the final RR. Of 605 recorded videos (both with and without ChARM), the VEP classified 544 (89.9%) as interpretable with RR within two bpm, 12 (2.0%) as interpretable with RR differing by more than two bpm, and 49 (8.1%) as uninterpretable. For five of the six primary readers (VEP members), inter-reader agreement on the RR count ranged from 73% to 82%, and agreement with the final panel reading ranged from 92% to 99%. The same readers re-evaluated 20% of the videos, and the intra-reader agreement varied from 91% to 100%. The remaining primary reader had a relatively low performance, with an inter-reader agreement of 54% with other readers, an agreement of 66% with the final panel reading, and an intra-reader agreement of 75%. Paediatricians evaluated 10% of the videos interpreted by the VEP, with 89% agreement on the RR count.

6. Performance of healthcare staff in manual counting of RR

Using the VEP as the reference standard, I evaluated the performance of physicians and Community Health Care Providers (CHCPs) in counting RR manually and identifying fast breathing. At ICMH and the UHC, the physicians enrolled 216 children and were able to count the RR of 201 children. Among those, the VEP reached a consensus on the RR count for 184 children. The agreement in RR count within two bpm between the physicians and the VEP was 67.9%, with a mean difference of 0.6 bpm (95% limits of agreement: -4.9 to 6.0 bpm). The agreement in the classification of fast and normal breathing was almost perfect (kappa=0.94), with a sensitivity of 98.4% (95% CI: 91.2 to 99.9) and a specificity of 96.7% (95% CI: 91.9 to 99.1). At the three CCs, the CHCPs enrolled 123 children and counted the RR of 110 children. Among those, the VEP reached a consensus on the RR count for 103 children. The CHCPs' RR counts agreed with the VEP within two bpm in 77.7% of children, with a mean difference of 0.8 bpm (95% limits of agreement: -4.9 to 6.5 bpm). The CHCPs identified fast breathing with a sensitivity of 100.0% and a specificity of 95.6%, and the agreement was almost perfect (kappa=0.84).

7. Performance of ChARM in counting RR

Using the VEP as the reference standard, I evaluated the performance of ChARM in terms of the accuracy of counting RR and the time to count RR. From ICMH, the UHC and the CCs, 339 children were enrolled, and ChARM counted the RR of 294 children. Among them, the VEP reached a consensus on the RR count for 257 children. ChARM and the VEP agreed on RR counts within two bpm in 68.1% of children, with a mean difference of 1.7 bpm (95% limits of agreement: -6.7 to 10.2 bpm). ChARM classified fast and normal breathing with a sensitivity of 95.8% and a specificity of 93.5%, and the agreement was almost perfect (kappa=0.86).
The median time required by ChARM to count RR was 66 seconds (IQR: 61 to 73 seconds), which was slightly longer than the time required for a manual count.

8. Effect of ChARM on respiratory rate

I explored the effect of ChARM on the RR compared to the standard observation technique. I used the RR counts by the VEP to assess the difference in RR counts and the change in RR classification (fast and normal breathing) between manual and ChARM measurements. Of the 339 enrolled children, RR was counted both manually and with ChARM without altering the child's condition in 256 children. Among those, the VEP reached a consensus on RR counts for both measurements in 217 children. The mean difference in RR counts between ChARM and manual measurements was 0.5 bpm, with a standard error of the mean (SEM) of 0.4 bpm. The classification of RR changed in 9.7% of the children, with a kappa of 0.749.

Conclusions

This thesis provides evidence of the need to improve the performance of health workers in LMICs in identifying fast breathing and chest indrawing, in order to improve the diagnosis of childhood pneumonia. An automated RR counting device such as ChARM can support health workers in the accurate detection of fast breathing. The findings support the rationale for additional studies on the performance and implementation of ChARM and other automated RR counters to inform policy decisions in LMICs. Furthermore, automated devices need to be developed for assessing chest indrawing in children. The VEP can be an ideal reference standard for evaluating the performance of health workers and automated devices in assessing pneumonia signs in research studies. Further studies should be conducted to improve inter- and intra-reader agreement among the panel members and the overall performance of the VEP.
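
As an illustration only, the panel adjudication rule described under part 5 (two readers first, then a third, then a fourth; final RR taken as the average of two readings within two bpm) can be sketched in Python. This is a minimal sketch, not the thesis's implementation. The function name `adjudicate`, the status labels, and the use of `None` for an uninterpretable reading are hypothetical. The rule that two "uninterpretable" votes close a video as uninterpretable, and the choice of the closest pair when several pairs agree, are assumptions that the abstract does not state.

```python
from itertools import combinations
from typing import List, Optional, Tuple

TOLERANCE_BPM = 2.0  # readings within two bpm count as agreement (from the abstract)
MAX_READERS = 4      # two primary readers, then a third, then a fourth


def adjudicate(readings: List[Optional[float]]) -> Tuple[str, Optional[float]]:
    """Adjudicate one video from readings given in the order readers saw it.

    A reading of None marks a reader who judged the video uninterpretable
    (hypothetical encoding). Returns (status, final_rr).
    """
    seen: List[Optional[float]] = []
    for reading in readings[:MAX_READERS]:
        seen.append(reading)
        if len(seen) < 2:
            continue  # every video is read by at least two panel members
        counts = [r for r in seen if r is not None]
        close_pairs = [
            pair for pair in combinations(counts, 2)
            if abs(pair[0] - pair[1]) <= TOLERANCE_BPM
        ]
        if close_pairs:
            # Assumption: if several pairs agree, use the closest pair.
            a, b = min(close_pairs, key=lambda pair: abs(pair[0] - pair[1]))
            return "consensus", (a + b) / 2
        if seen.count(None) >= 2:
            # Assumption: two "uninterpretable" votes settle the video.
            return "uninterpretable", None
    return "no_consensus", None


if __name__ == "__main__":
    print(adjudicate([42, 43]))      # two primary readers agree
    print(adjudicate([42, 50, 49]))  # third reader resolves the disagreement
```

Under these assumptions, a video read as 42 and 43 bpm is closed after two readers, while a video read as 42 and 50 bpm goes to a third reader.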
