Edinburgh Research Archive

Machine learning for livestock disease surveillance

Authors

Stanski, Kajetan

Abstract

Machine learning (ML) methods are forecast to add over £200bn to the economy of the United Kingdom (UK) by 2030, and animal health and surveillance is one of the sectors expected to see the most growth. This PhD project explores how ML can better support decision making in livestock disease surveillance and assesses ML models as a way to accurately predict future disease breakdowns. Bovine tuberculosis (bTB) serves as the exemplar livestock disease; it was chosen because of its negative economic impact in the UK and globally, and because bTB surveillance data are available. Despite decades of control efforts, bTB has not been brought under control in the UK and currently costs ~£100m annually. A critical factor in the failure of control efforts has been the lack of a sufficiently sensitive diagnostic test. In the current bTB control programme, test results inform disease control measures both at herd level (between-farm animal movement restrictions) and at animal level (follow-up testing and culling of infected animals). Increasing test sensitivity can therefore improve the efficacy of mitigation activities and reduce the impact of bTB on the cattle industry.

In this project, ML was used to predict herd-level bTB breakdowns in Great Britain (GB) with the aim of improving herd-level diagnostic sensitivity. The results of routinely collected herd-level tests were correlated with risk factor data: bTB breakdown history, between-farm cattle movements and the geographical locations of the farms. Four ML methods were independently trained with data from 2012–2014, which included ~4,700 positive herd-level test results annually. The best model’s performance was compared with the observed sensitivity and specificity of the herd-level test calculated on the 2015 data. The best ML algorithm was highly predictive of bTB infection, with an area under the receiver operating characteristic curve (AUC) of 0.907.
Compared with the currently used interpretation of the test, the ML algorithm increased herd-level sensitivity from 61.3% to 67.6% and herd-level specificity from 90.5% to 92.3%. This approach can improve predictive capability for herd-level bTB and support disease control.

Individual animal-level bTB breakdown predictions are complementary to the herd-level ones, as they can inform animal-level disease control measures such as scheduling follow-up tests. In this project, data from the bTB surveillance programme implemented in the Republic of Ireland from 2016–2018 were used to develop ML models predicting bTB breakdowns at the individual animal level. The models were offered more risk factors (age, sex and breed of an animal) than the previous models operating at the herd level. The models were evaluated as a way to combine the bTB risk factors to provide a probability of the animal becoming infected after a negative diagnostic test. The AUC of 0.5 for a model relying solely on the diagnostic test was improved to AUCs of 0.83 and 0.72 for the ML models making predictions 183 days and 365 days into the future, respectively. Input variable importance analysis showed that the herd-level input variables were more important than the animal-level variables, which suggests that the currently used herd status measures are good predictors of bTB breakdowns even at the animal level.

All between-farm British cattle movements are recorded, but the structure of this network data makes it challenging to develop ML models that exploit animal movements. Between-farm movements have been linked to the spread of infectious diseases of livestock, and including them can therefore benefit a model’s accuracy. The aim of this project was to explore how movement networks can be integrated into an ML framework to improve predictive ability.
Two methods of representing farms within the movement network as vectors, for use as inputs to ML algorithms, were compared: (i) computing a list of centrality measures (e.g. betweenness and closeness) for every farm; and (ii) an unsupervised graph embedding method provided by the PyTorch-BigGraph library. As exemplars, these vectors were used to build ML models to predict future bTB breakdowns in GB and bovine viral diarrhoea (BVD) breakdowns in Scotland, based on cattle movement and diagnostic data from 2012–2015. The graph embedding-based ML models performed better than those based on centrality measures at predicting both bTB (balanced accuracy of 73.7% vs 66.8%) and BVD (balanced accuracy of 66.3% vs 62.3%) breakdowns. This study was a first attempt to use graph embedding to improve breakdown predictions based on an animal movement network and, consequently, to help improve disease control plans and support decision making.
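
As an illustration of approach (i) and of the balanced accuracy metric reported above, the following is a minimal sketch in plain Python. It is not the thesis code. The toy movement list, the function names and the choice of features (in-degree, out-degree and one simple variant of closeness) are assumptions made for illustration; betweenness is omitted for brevity.

```python
from collections import deque

# Hypothetical toy data: each pair is one cattle movement (source farm, destination farm).
MOVEMENTS = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]


def build_graph(movements):
    """Build a directed adjacency map: farm -> set of farms it sends cattle to."""
    graph = {}
    for src, dst in movements:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    return graph


def closeness(graph, start):
    """One simple closeness variant: number of farms reachable from `start`
    divided by the sum of shortest-path lengths to them (0.0 if none)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:  # breadth-first search along outgoing movements
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def farm_features(movements):
    """Represent every farm as a vector [in-degree, out-degree, closeness]."""
    graph = build_graph(movements)
    in_degree = {farm: 0 for farm in graph}
    for targets in graph.values():
        for dst in targets:
            in_degree[dst] += 1
    return {
        farm: [in_degree[farm], len(graph[farm]), closeness(graph, farm)]
        for farm in graph
    }


def balanced_accuracy(y_true, y_pred):
    """Return (sensitivity, specificity, balanced accuracy) for 0/1 labels.
    Balanced accuracy is the mean of sensitivity and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity, (sensitivity + specificity) / 2


if __name__ == "__main__":
    for farm, vector in sorted(farm_features(MOVEMENTS).items()):
        print(farm, vector)
    print(balanced_accuracy([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a setting like the one described, vectors of this kind would be passed to a classifier as per-farm inputs; a graph embedding method replaces these hand-picked measures with vectors learned from the network itself.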