Developing a video expert panel as a reference standard to evaluate respiratory rate counting in paediatric pneumonia diagnosis
Authors
Khan, Ahad Mahmud
Abstract
Pneumonia is a leading cause of death in children aged below five years,
especially in low- and middle-income countries (LMICs). According to the
World Health Organization (WHO) Integrated Management of Childhood
Illness (IMCI) guidelines, the diagnosis of pneumonia is primarily based on fast
breathing and chest indrawing. Non-physician health workers play a crucial
role in identifying and treating pneumonia in LMICs. Identifying these two signs
is challenging for health workers, often leading to misdiagnosis of pneumonia
and inappropriate treatment. Some improved pneumonia diagnostics (e.g.,
ChARM, Rad-G) can count respiratory rate (RR) automatically and identify fast
breathing. These devices can support health workers in improving the
diagnosis of pneumonia. There is no recognised gold standard reference for
evaluating the performance of health workers or assessing the accuracy of
automated RR counters. Most existing studies have used manual RR counting
by a clinical professional (e.g., physician, nurse) as the reference standard,
which is not ideal. Experts suggest that adjudication by a video expert panel
(VEP) may be a better reference standard for evaluating RR counting in
research studies. If quality videos could be captured and a VEP could
systematically interpret RR from these videos, it could be an ideal and unbiased reference standard.
Objectives
My PhD aims to develop a VEP as a reference standard to evaluate RR
counting to diagnose childhood pneumonia. The following table summarises
the objectives and proposed study designs.
Methods and Results
1. Ability of non-physician health workers to measure RR to detect
pneumonia in children in LMICs
I conducted a systematic review and meta-analysis by searching four
electronic databases (MEDLINE, EMBASE, Web of Science, and Scopus) and
reference lists of the included studies. I included studies that assessed the
performance of health workers in measuring RR and/or identifying fast
breathing against a reference standard. I reported pooled
estimates of sensitivity and specificity for fast breathing identification using a
bivariate random-effects model. Sixteen studies were included in the review,
eight of which reported the agreement in RR count between health workers
and a reference standard. The median agreements were 39% within ±2
breaths per minute (bpm), 47% within ±3 bpm, and 67% within ±5 bpm. Fifteen
studies reported the accuracy of health workers in identifying
fast breathing compared to a reference standard. The median sensitivity and
specificity were 77% and 86%, respectively. Seven studies were selected for
the meta-analysis. The pooled sensitivity was 78% (95% confidence interval
(CI) = 72-82), and the pooled specificity was 86% (95% CI = 78-91).
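The agreement figures above (the share of paired counts within ±2, ±3, or ±5 bpm) can be illustrated with a minimal sketch. The function name and the paired counts are hypothetical and invented for illustration; they are not data from the reviewed studies.

```python
def agreement_within(counts_a, counts_b, tolerance_bpm=2):
    """Proportion of paired RR counts that differ by at most tolerance_bpm."""
    if len(counts_a) != len(counts_b) or not counts_a:
        raise ValueError("need two non-empty sequences of equal length")
    within = sum(abs(a - b) <= tolerance_bpm for a, b in zip(counts_a, counts_b))
    return within / len(counts_a)


# Hypothetical paired counts in bpm (illustration only, not study data).
health_worker = [42, 55, 38, 61, 47]
reference = [44, 50, 39, 66, 47]

for tolerance in (2, 3, 5):
    share = agreement_within(health_worker, reference, tolerance)
    print(f"within ±{tolerance} bpm: {share:.0%}")
```

Widening the tolerance can only keep or raise the agreement, which is why the reported medians increase from ±2 to ±5 bpm.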
2. Ability of non-physician health workers to identify chest
indrawing to detect pneumonia in children in LMICs
I conducted another systematic review and meta-analysis by searching the
same four databases and citation indices. I included studies assessing the
performance of non-physician health workers in detecting chest indrawing
compared to a reference standard. I reported pooled estimates of sensitivity
and specificity of chest indrawing identification using a bivariate
random-effects model. The median sensitivity and specificity were 44% and 97%,
respectively. Five studies were included in the meta-analysis. The pooled
sensitivity and specificity were 46% (95% CI = 37-56) and 95% (95% CI = 91-97),
respectively.
3. Factors associated with the interpretability and agreement in RR
counts between paediatricians
I recruited 30 hospitalized children younger than five years with suspected
pneumonia from the Institute of Child and Mother Health (ICMH), Dhaka,
Bangladesh. For each child, I video-recorded a study physician counting RR
twice manually and twice using a ChARM device. Two paediatricians
evaluated the videos and counted RR. Interpretability was defined as both
paediatricians being able to interpret RR, and agreement was defined as the
difference in RR counts being within two bpm. Associations of interpretability
and agreement with child and video methodological factors were explored
using multivariable logistic regression. Eighty-five videos of RR measurements
were recorded. There was agreement in the ability to interpret RR in 73/85
(85.9%) videos and agreement for RR counts in 54/73 (74.0%) videos. The
adjusted odds ratio (aOR) for interpretability was higher
if the child was calm or sleeping (aOR=73.6; 95% CI, 9.1, 594.0)
compared to if the child was moving, and if the child was lying on the bed
(aOR=22.8; 95% CI, 2.9, 179.0) compared to on the parent’s lap. The aOR for
agreement in RR counts was higher if the child was calm (aOR=16.8; 95% CI, 2.9, 95.4) or
sleeping (aOR=27.1; 95% CI, 4.2, 176.4) compared to if the child was moving,
and if the RR was <40 bpm (aOR=12.9; 95% CI, 1.3, 126.2) compared to ≥60 bpm.
4. Standardisation of physicians to interpret chest videos for RR
counting
I recruited 56 hospitalized under-five children with suspected pneumonia from
ICMH, Dhaka, Bangladesh. A physician counted the RR manually and also
using ChARM, and video recording was performed concurrently. Each video
was randomly assigned to two paediatricians. If the two paediatricians differed in
interpretability or in RR counts (difference in RR counts more than two bpm),
the video was assigned to a third paediatrician. If two paediatricians reached
a consensus (difference in RR counts within two bpm), the video was
considered a reference video and the average RR was considered the final
RR. Seventy-nine reference videos were generated, which were
used for the training and standardisation of six physicians to interpret chest
videos for RR counting. Twenty videos were purposively selected for the
standardisation process. The agreement between physician RR and reference
video RR ranged from 80% to 100% when RR was counted manually. On the
other hand, the agreement ranged from 70% to 90% when RR was counted
using ChARM.
5. Performance of VEP to evaluate RR count from recorded videos
I enrolled 339 children with suspected pneumonia from three selected
community clinics (CCs) and a subdistrict hospital (Zakiganj Upazila Health
Complex (UHC)) in Sylhet and ICMH in Dhaka. The RR was counted manually
and also using ChARM, and the child’s chest movements were videotaped
simultaneously. Each video was sent to two randomly selected panel
members. The video was transferred to a third member if the two members
disagreed in interpretability or in RR counts. If there were inconsistencies
among all three members, the video was forwarded to a fourth member. The
average of two closely interpreted RRs (within two bpm) was considered the
final RR. Of 605 recorded videos (both with and without ChARM), the VEP
classified 544 (89.9%) as interpretable and RR within two bpm, 12 (2.0%) as
interpretable and RR more than two bpm, and 49 (8.1%) as uninterpretable.
Five out of six primary readers (VEP members) had an inter-reader agreement
range of 73% to 82% on the RR count. Their agreement with the final panel
reading ranged from 92% to 99%. The same readers reevaluated 20% of the
videos, and the intra-reader agreement varied from 91% to 100%. One of the
primary readers had a relatively low performance, with an inter-reader
agreement of 54% with other readers, an agreement of 66% with the final panel
reading, and an intra-reader agreement of 75%. Paediatricians evaluated 10% of
the videos interpreted by the VEP, and they had an 89% agreement on the RR count.
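The escalation from two readers to a third and then a fourth can be sketched as below. This is a simplified illustration, not the panel's actual procedure: the function name, the use of None for a reader who judges a video uninterpretable, the choice of the closest pair when more than one pair is within tolerance, and the rule for the final "uninterpretable" versus "no consensus" label are all assumptions made for the sketch.

```python
from itertools import combinations


def adjudicate(readings, tolerance_bpm=2, max_readers=4):
    """Return (status, final_rr) for one video.

    readings: RR counts in the order readers received the video;
    None means that reader judged the video uninterpretable (assumed encoding).
    """
    seen = []
    for reading in readings[:max_readers]:
        seen.append(reading)
        if len(seen) < 2:
            continue  # every video starts with two readers
        counts = [r for r in seen if r is not None]
        # Closest pair of counts read so far (None if fewer than two counts).
        pair = min(combinations(counts, 2),
                   key=lambda p: abs(p[0] - p[1]), default=None)
        if pair is not None and abs(pair[0] - pair[1]) <= tolerance_bpm:
            return "consensus", sum(pair) / 2  # average of the two close counts
        if len(seen) == 2 and not counts:
            # Assumption: the first two readers agree the video cannot be read.
            return "uninterpretable", None
    counts = [r for r in seen if r is not None]
    if len(counts) >= 2:
        return "no consensus", None  # interpretable, but counts differ by > tolerance
    return "uninterpretable", None


# Hypothetical readings in bpm (illustration only).
print(adjudicate([48, 49]))          # two readers agree
print(adjudicate([48, 54, 53]))      # third reader resolves the disagreement
print(adjudicate([40, 50, 60, 70]))  # four readers, no pair within two bpm
```

The three status labels mirror the three categories reported for the 605 videos: interpretable with RR within two bpm, interpretable with RR more than two bpm apart, and uninterpretable.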
6. Performance of healthcare staff in manual counting of RR
Using the VEP as the reference standard, I evaluated the performance of
physicians and Community Health Care Providers (CHCPs) in counting RR
manually and identifying fast breathing. From ICMH and UHC, the physicians
enrolled 216 children and were able to count the RR of 201 children. Among
those, the VEP reached a consensus in RR count of 184 children. The
agreement in RR count within two bpm between the physician and VEP was
67.9%, with a mean difference of 0.6 bpm, 95% CI limits of agreement (-4.9 to
6.0 bpm). The agreement in the classification of fast and normal breathing was
almost perfect (kappa=0.94), with a sensitivity of 98.4% (95% CI: 91.2 – 99.9)
and a specificity of 96.7% (95% CI: 91.9 – 99.1). From three CCs, the CHCPs enrolled
123 children and counted the RR of 110 children. Among those, the VEP
agreed on RR counts of 103 children. The CHCPs and the VEP agreed on RR counts
within two bpm in 77.7% of children, with a mean difference of 0.8 bpm,
95% CI limits of agreement (-4.9 to 6.5 bpm). The CHCPs identified fast breathing
with a sensitivity of 100.0% and a specificity of 95.6%, and there was an almost
perfect agreement (kappa=0.84).
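The mean difference and limits of agreement reported above can be illustrated with a minimal sketch. It assumes the conventional Bland-Altman calculation (mean difference ± 1.96 standard deviations of the paired differences); the abstract does not state the exact computation used, and the function name and counts are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(test_counts, reference_counts):
    """Return (mean difference, lower limit, upper limit) for paired RR counts.

    Assumes conventional 95% limits of agreement: bias ± 1.96 * SD of differences.
    """
    diffs = [t - r for t, r in zip(test_counts, reference_counts)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired counts in bpm (illustration only, not study data).
manual_counts = [41, 49, 61, 35]
panel_counts = [40, 50, 60, 36]

bias, lower, upper = bland_altman(manual_counts, panel_counts)
print(f"mean difference {bias:.1f} bpm, limits {lower:.1f} to {upper:.1f} bpm")
```

A positive mean difference means the tested method counts higher than the reference on average; the width of the limits reflects how much individual counts scatter around that bias.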
7. Performance of ChARM in counting RR
Using the VEP as the reference standard, I evaluated the performance of
ChARM in terms of the accuracy of counting RR and the time to count RR.
From ICMH, UHC and CCs, 339 children were enrolled, and ChARM counted
the RR of 294 children. Among them, the VEP reached a consensus in RR
counts of 257 children. The ChARM and VEP agreed on RR counts within two
bpm in 68.1% of children, with a mean difference of 1.7 bpm, 95% CI limits of
agreement (-6.7 to 10.2 bpm). ChARM classified fast and normal breathing
with a sensitivity of 95.8% and a specificity of 93.5%, and the agreement was
almost perfect (kappa=0.86). The median time required by the ChARM to
count RR was 66 seconds (interquartile range (IQR): 61 – 73 seconds), which was
slightly longer than the time required for manual count.
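The classification metrics used in this and the preceding section (sensitivity, specificity, and kappa for fast versus normal breathing) can be sketched as follows. The age-specific cut-offs are the WHO IMCI fast breathing thresholds and are not stated in this abstract; the function names and the example children are hypothetical.

```python
def is_fast_breathing(rr_bpm, age_months):
    """WHO IMCI cut-offs: >=60 bpm under 2 months, >=50 at 2-11 months, >=40 at 12-59 months."""
    if age_months < 2:
        return rr_bpm >= 60
    if age_months < 12:
        return rr_bpm >= 50
    return rr_bpm >= 40


def diagnostic_accuracy(test_flags, reference_flags):
    """Return (sensitivity, specificity, Cohen's kappa) for paired binary classifications.

    Assumes both classes occur in the reference, otherwise a division by zero results.
    """
    pairs = list(zip(test_flags, reference_flags))
    tp = sum(t and r for t, r in pairs)
    tn = sum((not t) and (not r) for t, r in pairs)
    fp = sum(t and (not r) for t, r in pairs)
    fn = sum((not t) and r for t, r in pairs)
    n = len(pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n
    # Agreement expected by chance from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa


# Hypothetical children (illustration only): age in months, device RR, panel RR.
ages = [6, 6, 24, 24, 36, 8]
device_rr = [52, 45, 42, 35, 41, 48]
panel_rr = [51, 46, 44, 36, 38, 53]

device_fast = [is_fast_breathing(rr, age) for rr, age in zip(device_rr, ages)]
panel_fast = [is_fast_breathing(rr, age) for rr, age in zip(panel_rr, ages)]
sens, spec, kappa = diagnostic_accuracy(device_fast, panel_fast)
print(f"sensitivity {sens:.2f}, specificity {spec:.2f}, kappa {kappa:.2f}")
```

Because the classification depends only on which side of the cut-off a count falls, two counts several bpm apart can still agree on fast breathing, which is consistent with classification agreement being higher than agreement within two bpm.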
8. Effect of ChARM on respiratory rate
I attempted to explore the effect of the ChARM on the RR compared to the
standard observation technique. I considered RR counts by the VEP to assess
the difference in RR counts and the change in RR classification (fast and
normal breathing) from manual to ChARM measurements. Of the 339 enrolled
children, RR was counted both manually and with ChARM without altering the
child’s condition in 256 children. Among those, the VEP reached a consensus
in RR counts during both measurements in 217 children. The mean difference
in RR counts between ChARM and manual measurements was 0.5 bpm, with
a standard error of the mean (SEM) of 0.4 bpm. The classification of RR changed
in 9.7% of the children, with a kappa of 0.749.
Conclusions
This thesis provides evidence of the need to improve health workers'
performance in LMICs to identify fast breathing and chest indrawing to improve
the diagnosis of childhood pneumonia. The use of an automated RR counting
device like ChARM can support health workers in the accurate detection of fast
breathing. The findings from this thesis support the rationale for additional
studies on the performance and implementation of ChARM and other
automated RR counters to inform policy decisions in LMICs. Furthermore,
automated devices need to be developed for assessing chest indrawing in
children. The VEP can be an ideal reference standard for evaluating the
performance of health workers and automated devices in assessing
pneumonia signs for research studies. Further studies should be conducted to
improve inter- and intra-reader agreement among the panel members and the
overall performance of the VEP.