Systematic review and meta-analysis of experimental multiple sclerosis studies
Vesterinen, Hanna Mikaela
Background: Multiple sclerosis (MS) is the most common cause of disability in young people, and yet there are no interventions available which reliably alter disease progression. This is despite several decades of research using the most common animal model of multiple sclerosis, experimental autoimmune encephalomyelitis (EAE). There is now emerging evidence across the neurosciences to suggest that limited internal validity (the use of measures to reduce bias) and external validity (e.g. the use of a clinically relevant animal model) may influence translational success.
Aim and objectives: To provide an unbiased summary of the scope of the literature on candidate drugs for MS tested in EAE, in order to identify potential reasons for the failure to translate efficacy to clinical trials. My objectives were, across all of the identified publications, to: (1) describe the reporting of measures to reduce bias and assess their impact on measures of drug efficacy; (2) assess the relationship between treatment related effects measured using different outcome measures; (3) assess the prevalence and impact of any publication bias; (4) compare findings from the above with another disease with limited translational success (Parkinson’s disease; PD).
Methods: I used systematic searches of three online databases to identify relevant publications. Estimates of efficacy were extracted for neurobehavioural scores, inflammation, demyelination and axon loss. For PD experiments, I searched for dopamine agonists tested in animal models of PD with outcome assessed as change in neurobehavioural scores. I calculated normalised mean difference or standardised mean difference effect sizes and combined these in a meta-analysis using a random effects model. I used stratified meta-analysis or meta-regression to assess the extent to which different study design characteristics explained differences in reported efficacies. These characteristics included: measures to reduce bias (random allocation to group and blinded assessment of outcome), the animal species, sex, time of drug administration, route of drug administration and the number of animals per group. Publication bias was assessed using funnel plotting, Egger regression and “trim and fill”.
Results: I identified 1464 publications reporting drugs tested in EAE. Reported study quality was poor: 11% reported random allocation to group, 17% reported blinded assessment of neurobehavioural outcomes, 28% reported blinded assessment of histological outcomes, and <1% reported a sample size calculation. Estimates of efficacy measured as the reduction in inflammation were substantially higher in unblinded studies (47.1% reduction (95% CI 41.8-52.4)) than in blinded studies (33.1% (25.8-40.4)). Moreover, the same finding was identified for 121 publications on dopamine agonists tested in experimental PD models where efficacy was measured as change in neurobehavioural outcomes. For EAE studies I was unable to include data from 631 publications describing original research. Usually this was because the publication did not include basic details such as the number of animals in each group (115 publications), the observed variance (592) or suitable control data (49). For each category of outcome I found evidence of a substantial publication bias. Interventions were most commonly administered on or before the induction of EAE, with shorter times to treatment associated with higher estimates of efficacy for the reduction in mean severity scores (a neurobehavioural outcome). Treatment related effects were found to vary across different outcome measures, with the largest effect being for the reduction in axon loss. Where neurobehavioural scores and axon loss were measured in the same cohort of animals, the concordance between efficacies in these increased with later times to treatment.
Conclusions: In this, the largest systematic review and meta-analysis of animal studies in any domain, I have found that a large number of publications present incomplete data. In addition, measures to reduce bias are seldom reported, and their absence is associated with overstatements of efficacy, both for one measure of drug efficacy in EAE and in experimental PD studies. Translational success may also have been affected by the majority of studies administering drugs on or before EAE induction, which is of limited relevance in the clinical setting where patients do not present at that stage of disease. Moreover, my analysis of the relationship between outcome measures provides empirical evidence from systematically identified studies to suggest that targeting axon loss at later time points is most strongly associated with improvements in neurobehavioural scores. Therefore drugs which are successfully able to target axon loss at these time points may offer substantial hope for clinical success. Overall, improvements in the conduct and reporting of preclinical studies are likely to improve their utility, and the prospects for translational success. While my findings relate predominantly to the animal modelling of MS and PD, it is likely that they also hold for other animal research.
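Illustrative sketch: the Methods mention normalised mean difference effect sizes combined in a random effects model, but the abstract does not give the formulae or the estimator used. The Python below is a minimal sketch of one common formulation, assuming a normalised mean difference expressed as a percentage improvement relative to the control group and a DerSimonian-Laird estimate of between-study variance. The function names, the sham_mean parameter and the 1.96 multiplier for a 95% confidence interval are assumptions made for illustration and are not taken from the thesis.

```python
import math


def normalised_mean_difference(control_mean, treated_mean, control_sd,
                               treated_sd, n_control, n_treated,
                               sham_mean=0.0):
    """Return (effect size in %, standard error in %) for one comparison.

    Assumed formulation: improvement in the treated group as a percentage
    of the control group's outcome, measured relative to a sham (or
    unlesioned) baseline, which defaults to zero.
    """
    scale = control_mean - sham_mean
    if scale == 0:
        raise ValueError("control and sham means must differ")
    effect = 100.0 * (control_mean - treated_mean) / scale
    # Standard deviations are rescaled to the same percentage units.
    sd_c = 100.0 * control_sd / abs(scale)
    sd_t = 100.0 * treated_sd / abs(scale)
    std_error = math.sqrt(sd_c ** 2 / n_control + sd_t ** 2 / n_treated)
    return effect, std_error


def random_effects_pool(effects, std_errors):
    """Pool effect sizes with a DerSimonian-Laird random effects model.

    Returns (pooled effect, (ci_low, ci_high), tau squared).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    # Fixed effect estimate, needed to compute Cochran's Q.
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    se_pooled = math.sqrt(1.0 / sum_re)
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return pooled, ci, tau2


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from the review.
    comparisons = [
        normalised_mean_difference(4.0, 2.0, 1.0, 1.0, 10, 10),
        normalised_mean_difference(3.5, 2.8, 0.9, 1.1, 8, 8),
        normalised_mean_difference(3.0, 1.2, 1.2, 0.8, 12, 12),
    ]
    effects = [es for es, _ in comparisons]
    errors = [se for _, se in comparisons]
    pooled, ci, tau2 = random_effects_pool(effects, errors)
    print(f"pooled = {pooled:.1f}% (95% CI {ci[0]:.1f} to {ci[1]:.1f}), "
          f"tau^2 = {tau2:.1f}")
```

Under these assumptions, a stratified analysis such as the blinded versus unblinded comparison reported in the Results would amount to calling random_effects_pool separately on each subgroup of comparisons and comparing the pooled estimates.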