Transmission networks inferred from HIV sequence data
Ragonnet-Cronin, Manon Lily
In the 1980s, HIV in the UK was concentrated among men who have sex with men (MSM) and people who inject drugs (PWID), but heterosexual sex is now the most frequently reported risk behaviour. Because these risk groups are associated with different virus populations, this shift is reflected in the subtype diversification of the UK epidemic, which was historically dominated by subtype B. I have made use of the UK HIV Drug Resistance Database, a national database of HIV sequences collected during routine clinical care, which also contains data on age, sex, route of exposure and ethnicity. Its 2014 release contained data from over 60,000 patients. In this thesis, I first describe the development of novel tools that rapidly and automatically identify HIV clusters, which represent transmission chains within the larger infected population, in phylogenetic trees containing tens of thousands of sequences. I use these tools to compare the HIV subtype B epidemics in the UK and Switzerland, which had each been described separately using different approaches. Working with Swiss colleagues, I was able to analyse the epidemics in exactly the same way without having to share sensitive data. At relaxed cluster thresholds, I found clustering to be much higher in the UK than in Switzerland (34% vs 16%), indicating that the UK database is more likely to capture transmission chains. Downsampling revealed that this pattern is driven by the larger size of the UK epidemic. At tighter cluster thresholds, the epidemics were very similar. I next use these tools to analyse the spread of emerging subtypes A1, C, D and G in the UK. I found both risk group and cluster size to be predictive of cluster growth, which I tested using simulations and a generalised linear model (GLM). Growth of MSM and crossover clusters was significantly higher than expected for subtypes A1 and C, indicating that crossover from heterosexuals to MSM has contributed to their expansion within the UK.
Numbers were small for subtypes D and G, but the proportion of new diagnoses linking to MSM and crossover clusters was similar to that for A1 and C, suggesting that the same pattern may be emerging for D and G. I conclude by evaluating the accuracy of a method previously described by our group for generating transmission networks from HIV sequences. Interpreting clustering patterns in phylogenetic trees is difficult because no standardised statistical framework exists. In contrast, a body of work exists that relates disease transmission to networks. Using large simulated datasets, I developed algorithms that eliminate improbable links. I then reconstructed improved UK transmission networks for subtypes A1, B and C and compared network metrics (such as the degree distribution) between risk groups. Together with other evidence, this thesis demonstrates that the UK HIV epidemic continues to be driven by transmission among MSM. The UK epidemic is no longer compartmentalised, and the crossing over of subtypes between risk groups has been facilitated by MSM also having sex with women.
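To make the ideas of a cluster threshold and a degree distribution concrete, the following is a minimal sketch, not the method developed or evaluated in the thesis. It assumes pairwise genetic distances between sequences have already been computed, links any two sequences whose distance is at or below a threshold, and reports the resulting clusters (connected components) and degree distribution. The sequence names, distance values, thresholds and function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def build_network(distances, threshold):
    """Link two sequences when their pairwise distance is <= threshold."""
    graph = defaultdict(set)
    for (a, b), d in distances.items():
        if d <= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def find_clusters(graph):
    """Return connected components of the network as sorted lists of names."""
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


def degree_distribution(graph):
    """Count how many sequences have each number of links."""
    return Counter(len(neighbours) for neighbours in graph.values())


# Hypothetical toy distances (substitutions per site) between five sequences.
distances = {
    ("s1", "s2"): 0.010,
    ("s1", "s3"): 0.040,
    ("s2", "s3"): 0.012,
    ("s4", "s5"): 0.014,
    ("s1", "s4"): 0.090,
}

tight = build_network(distances, threshold=0.015)    # hypothetical tight threshold
relaxed = build_network(distances, threshold=0.045)  # hypothetical relaxed threshold

print(find_clusters(tight))
print(degree_distribution(tight))
print(degree_distribution(relaxed))
```

In this toy example the relaxed threshold adds a link within an existing cluster, so cluster membership is unchanged while the degree distribution shifts. Real analyses would typically also use phylogenetic support and far larger datasets.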