The struggle is real: dealing with real-world data in clinical trials
Background
The recently published High-STEACS trial (ClinicalTrials.gov Identifier: NCT01852123) was a randomised controlled trial that enrolled 48,282 patients between June 2013 and June 2016. The trial's objective was to determine whether implementing a new high-sensitivity troponin assay would improve outcomes in patients presenting with suspected acute coronary syndrome to hospital emergency departments across Scotland. The trial was unusual in that it used routine electronic health care data from unconsented patients. This presented a number of data management and governance challenges in the reporting of the trial.

Methods
The High-STEACS trial accessed routine electronic health care data sources from ten hospitals in two NHS health boards, together with national datasets in Scotland. Participant data were linked across twelve distinct data sources using the participant's CHI (Community Health Index) number as a unique identifier. The study had ethical and local management approval, in addition to approval from the Patient Benefit and Privacy Panel for record linkage. Data extraction was supported by the NHS Safe Haven of each health board, and all eligible patients were assigned a unique study ID before identifiable participant data were removed. De-identified data from each health board were transferred to the NHS Lothian Safe Haven's secure analysis platform, hosted by the Farr Institute of Health Informatics Research. Data were combined into a single database for statistical analyses. The combined, linked study dataset was accessible only to approved individuals.

Results
The reporting of the High-STEACS trial presented a number of challenges not encountered in the reporting of a conventional randomised controlled trial. Using data from electronic health records without individual patient consent allows the research question to be addressed in the whole patient population, but it requires rigorous data governance processes.
The data linkage process was complex: collecting the correct data for the correct timeframe from multiple data sources within a single health board was challenging in itself. This was compounded by the need for the variables from both health boards to map directly onto the final combined dataset. In addition to mapping variables across health boards, a process had to be set up to ensure that patients who had been seen at different times in both health boards were not included in the combined dataset twice.

Conclusions
The High-STEACS trial is one of the first of its kind to include this number of patients via routinely collected healthcare data. The strength of this approach was the ability to identify all patients with suspected acute coronary syndrome, rather than limiting findings to a selected, possibly unrepresentative, group, as would be the case in a conventional randomised controlled trial. Access to routine healthcare data enabled the creation of a 'meta database' by drawing on a number of raw NHS data sources and linking them in a secure manner. It is important to recognise the challenges in processing these large datasets and ensuring their accuracy while meeting the governance and privacy standards required for unconsented data.
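
As an illustration only, the linkage steps described in the Methods and Results (harmonising variables across health boards, assigning a study ID keyed on the CHI number, removing the identifier, and keeping each patient once) might be sketched in Python as follows. This is not the trial's actual pipeline. The column names, the study ID format, the example records, and the rule of keeping the earliest presentation per patient are all assumptions made for the sketch, and it further assumes that the CHI-to-study-ID lookup is held in one place inside the secure environment.

```python
# Hypothetical per-board variable names mapped onto one common schema.
COLUMN_MAPS = {
    "board_a": {"chi": "chi", "trop_ngl": "troponin_ng_l", "attended": "presentation_date"},
    "board_b": {"CHI_NO": "chi", "hs_troponin": "troponin_ng_l", "adm_date": "presentation_date"},
}


def harmonise(record, board):
    """Rename one board-specific record to the common schema."""
    mapping = COLUMN_MAPS[board]
    return {common: record[local] for local, common in mapping.items()}


def link_and_deidentify(records_by_board):
    """Combine boards, swap CHI for a study ID, keep one record per patient."""
    study_ids = {}  # CHI -> study ID; assumed to stay inside the safe haven
    combined = {}   # study ID -> the record kept for that patient
    for board, records in records_by_board.items():
        for raw in records:
            rec = harmonise(raw, board)
            chi = rec.pop("chi")  # the identifier never reaches the output
            sid = study_ids.setdefault(chi, f"S{len(study_ids) + 1:06d}")
            rec["study_id"] = sid
            kept = combined.get(sid)
            # Assumed rule: keep the earliest presentation (ISO dates sort as text).
            if kept is None or rec["presentation_date"] < kept["presentation_date"]:
                combined[sid] = rec
    return list(combined.values())


# Made-up records: the first patient appears in both health boards.
example = {
    "board_a": [
        {"chi": "0101010001", "trop_ngl": 42.0, "attended": "2014-03-02"},
        {"chi": "0202020002", "trop_ngl": 7.5, "attended": "2015-01-10"},
    ],
    "board_b": [
        {"CHI_NO": "0101010001", "hs_troponin": 18.0, "adm_date": "2013-11-20"},
    ],
}
linked = link_and_deidentify(example)
```

Here `linked` holds one de-identified record per patient, so the patient seen in both boards is counted once. A real implementation would also need to handle missing or invalid CHI numbers and differing units or coding schemes between sources, which this sketch ignores.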