Applying missing data methods to routine data using the example of a population-based register of patients with diabetes
dc.contributor.advisor
Lewis, Steff
en
dc.contributor.advisor
Wild, Sarah
en
dc.contributor.author
Read, Stephanie Helen
en
dc.contributor.sponsor
Medical Research Council (MRC)
en
dc.date.accessioned
2017-04-10T13:12:36Z
dc.date.available
2017-04-10T13:12:36Z
dc.date.issued
2015-07-04
dc.description.abstract
BACKGROUND:
Routinely collected data offer great potential for epidemiological research and could be
used to make randomised controlled trials (RCTs) more efficient. The use of routine
data for research has been limited by concerns surrounding data quality, particularly data
completeness. To fully exploit these information-rich data sources it is necessary to
identify approaches capable of overcoming high proportions of missing data.
Using a 2008 extract of the Scottish Care Information – Diabetes Collaboration (SCIDC)
database, a population-based register of people with a diagnosis of diabetes in
Scotland, I compared the findings of several methods for handling missing data in a
retrospective cohort study investigating the association between body mass index (BMI)
and all-cause mortality in patients with type 2 diabetes.
METHODS:
Discussions with clinicians and logistic regression analyses were used to determine the
likely mechanisms of missingness and the relative appropriateness of a selection of
missing data methods, such as multiple imputation. Sequentially more complicated
imputation approaches were used to handle missing data. Cox proportional hazards
model coefficients for the association between BMI and all-cause mortality were
compared for each missing data method. Age-standardised mortality rates by categories
of BMI at around the time of diagnosis were also presented.
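
As an illustration of how coefficients from multiply imputed datasets are commonly combined before comparison, the following is a minimal sketch of Rubin's rules. The abstract does not state which pooling procedure was used, so this is an assumption about a standard approach rather than a description of the thesis method. The function name pool_rubin and the input numbers are hypothetical and are not results from the thesis.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates: the coefficient estimated in each imputed dataset
    variances: the squared standard error of that coefficient in each dataset
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    pooled = mean(estimates)        # average of the per-dataset estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log hazard ratios and variances from three imputed datasets
# (made-up values for illustration only).
pooled, total = pool_rubin([0.30, 0.34, 0.32], [0.010, 0.012, 0.011])
print(f"pooled estimate = {pooled:.3f}, total variance = {total:.5f}")
```

The total variance adds the between-imputation spread to the average within-imputation variance, so the extra uncertainty due to the missing values is carried into the pooled standard error.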
RESULTS:
There were 66,472 patients diagnosed with type 2 diabetes between 2004 and 2008. Of
these, 21% did not have a BMI recorded at the time of diagnosis.
Amongst patients with complete BMI data, there were 5,491 deaths during 296,584
person-years of follow-up. Amongst patients with incomplete data, there were 2,090
deaths during 79,067 person-years of follow-up. Analyses indicated that the primary
mechanism of missingness was missing at random, conditional on patient year of
diagnosis and vital status. In particular, patients with missing data had considerably
worse survival than patients without missing data. Regardless of the method for
handling the missing data, a U-shaped relationship between BMI and mortality was
observed. Compared to complete case analysis, the association between BMI and all-cause
mortality was weaker using multiple imputation approaches with estimates
moving towards the null. Closest observation imputation had the smallest effect on
estimates compared to complete case analysis.
Risk of mortality was consistently highest in the less than 25kg/m² BMI group. For
example, estimates obtained using multiple imputation using chained equations
indicated that patients with a BMI below 25kg/m² had a 38% higher risk of mortality
than patients in the 25 to less than 30kg/m² BMI category.
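
The difference in survival between patients with and without a recorded BMI can be seen from the death counts and person-years reported above. The sketch below computes crude (unadjusted) mortality rates per 1,000 person-years. These are not the age-standardised rates presented in the thesis, and the function name crude_rate_per_1000 is a hypothetical helper introduced for illustration.

```python
def crude_rate_per_1000(deaths: int, person_years: float) -> float:
    """Crude mortality rate per 1,000 person-years of follow-up."""
    return 1000.0 * deaths / person_years


# Counts taken from the RESULTS paragraph above.
complete = crude_rate_per_1000(5491, 296584)    # patients with BMI recorded
incomplete = crude_rate_per_1000(2090, 79067)   # patients with BMI missing

# Crude ratio of the two rates; no adjustment for age or other covariates.
rate_ratio = incomplete / complete

print(f"complete BMI data:   {complete:.1f} deaths per 1,000 person-years")
print(f"incomplete BMI data: {incomplete:.1f} deaths per 1,000 person-years")
print(f"crude rate ratio:    {rate_ratio:.2f}")
```

On these figures the crude rate is roughly 18.5 per 1,000 person-years with complete BMI data and roughly 26.4 per 1,000 with incomplete data, a crude ratio of about 1.4.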
CONCLUSIONS:
Alternative methods to complete case analysis can be computationally intensive with
many important practical considerations. However, it remains valuable to explore the
robustness of estimates to departures from the assumptions made by complete case
analysis. The use of these methods can preserve the sample size and therefore may be
useful in developing risk prediction scores.
Mortality was lower amongst overweight or obese patients than amongst patients of normal weight.
Further work is required to identify optimal approaches to weight management amongst
patients with diabetes.
en
dc.identifier.uri
http://hdl.handle.net/1842/21078
dc.language.iso
en
dc.publisher
The University of Edinburgh
en
dc.subject
missing data
en
dc.subject
diabetes
en
dc.subject
obesity
en
dc.subject
mortality
en
dc.title
Applying missing data methods to routine data using the example of a population-based register of patients with diabetes
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en
Files
Original bundle
- Name:
- Read2015.pdf
- Size:
- 7.89 MB
- Format:
- Adobe Portable Document Format

