Preclinical risk of bias assessment and PICO extraction using natural language processing
Drug development starts with preclinical studies, which test the efficacy and toxicology of candidate drugs in living animals before proceeding to clinical trials in human subjects. Many drugs shown to be effective in preclinical animal studies fail in clinical trials, indicating potential reproducibility issues and translational failure. To obtain less biased research findings, systematic reviews are performed to collate all relevant evidence from publications. However, systematic reviews are time-consuming, and researchers have advocated the use of automation techniques to speed up the process and reduce human effort. Good progress has been made in integrating automation tools into reviews of clinical trials, while tools developed for preclinical systematic reviews remain scarce. Because preclinical experiments differ from clinical trials, tools for preclinical systematic reviews need to be designed specifically for them. In this thesis, I explore natural language processing models for facilitating two stages of preclinical systematic reviews: risk of bias assessment and PICO extraction. A range of measures is used to reduce bias in animal experiments, and many checklist criteria require those measures to be reported in publications. In the first part of the thesis, I implement several binary classification models to indicate whether preclinical publications report random allocation to groups, blinded assessment of outcome, conflicts of interest, compliance with animal welfare regulations, and a statement of animal exclusions. I compare traditional machine learning classifiers using several text representation methods with convolutional, recurrent and hierarchical neural networks, and I propose two strategies to adapt BERT models to long documents. My findings indicate that neural networks and BERT-based models achieve better performance than traditional classifiers and rule-based approaches.
The attention mechanism and hierarchical architecture in neural networks do not improve performance but are useful for extracting relevant words or sentences from publications to inform users' judgement. The advantages of the transformer architecture are limited when documents are long and computing resources are constrained. In literature retrieval and citation screening of published evidence, the key elements of interest are Population, Intervention, Comparator and Outcome, which together form the PICO framework. In the second part of the thesis, I first apply several question answering models based on attention flows and transformers to extract phrases describing the intervention or the method of disease model induction from clinical abstracts and preclinical full texts. For preclinical datasets whose full texts describe multiple interventions or induction methods, I additionally apply unsupervised information retrieval methods to extract relevant sentences. The question answering models achieve good performance when the text is at abstract level and contains only one intervention or induction method, whereas performance is less satisfactory for truncated documents with multiple PICO mentions. Given this limitation, I then collect preclinical abstracts with finer-grained PICO annotations and develop named entity recognition models to extract preclinical PICO elements including Species, Strain, Induction, Intervention, Comparator and Outcome. I decompose PICO extraction into two independent tasks: 1) PICO sentence classification, and 2) PICO element detection. For PICO extraction, BERT-based models pre-trained on biomedical corpora outperform recurrent networks, and the conditional probabilistic module shows an advantage only in recurrent networks. A self-training strategy, applied to enlarge the training set with unlabelled abstracts, yields better performance for PICO elements that have too few training instances.
Experimental results demonstrate the feasibility of using natural language processing to facilitate preclinical risk of bias assessment and PICO extraction.
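To make the self-training idea mentioned above more concrete, the following is a minimal sketch of a generic self-training loop and not the implementation used in the thesis. It assumes a toy naive Bayes sentence classifier in place of the named entity recognition models, and the function names (`train`, `predict`, `self_train`), the labels `"pico"` and `"other"`, the confidence threshold of 0.8, and the example sentences are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Fit a tiny multinomial naive Bayes model on (tokens, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for tokens, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def predict(model, tokens):
    """Return (most likely label, its posterior probability)."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, n in label_counts.items():
        # Add-one smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(n / total)
        for tok in tokens:
            if tok in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        log_scores[label] = score
    top = max(log_scores.values())
    exp_scores = {lab: math.exp(s - top) for lab, s in log_scores.items()}
    norm = sum(exp_scores.values())
    best = max(exp_scores, key=exp_scores.get)
    return best, exp_scores[best] / norm


def self_train(labelled, unlabelled, threshold=0.8, max_rounds=5):
    """Repeatedly add confidently pseudo-labelled examples to the training set.

    The threshold and round limit are illustrative assumptions.
    """
    train_set, pool = list(labelled), list(unlabelled)
    for _ in range(max_rounds):
        model = train(train_set)
        confident, remaining = [], []
        for tokens in pool:
            label, prob = predict(model, tokens)
            if prob >= threshold:
                confident.append((tokens, label))
            else:
                remaining.append(tokens)
        if not confident:  # nothing new to learn from, so stop early
            break
        train_set.extend(confident)
        pool = remaining
    return train(train_set), train_set, pool


if __name__ == "__main__":
    # Hypothetical toy data: tokenised sentences with made-up labels.
    labelled = [
        (["mice", "received", "drug"], "pico"),
        (["rats", "received", "vehicle"], "pico"),
        (["we", "thank", "funders"], "other"),
        (["authors", "declare", "nothing"], "other"),
    ]
    unlabelled = [["mice", "received", "vehicle"], ["we", "thank", "authors"]]
    _, enlarged, leftover = self_train(labelled, unlabelled)
    print(len(enlarged), "training examples;", len(leftover), "left unlabelled")
```

The confidence threshold controls the trade-off in such a loop: a higher value adds fewer but cleaner pseudo-labels, while a lower value enlarges the training set faster at the risk of reinforcing early mistakes.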