Edinburgh Research Archive

Use of Oxford Nanopore Technologies to investigate outbreaks of Shiga toxin-producing Escherichia coli O157:H7 in humans

Abstract

BACKGROUND: Shiga toxin-producing Escherichia coli (STEC) O157:H7 causes severe gastrointestinal disease and haemolytic uraemic syndrome. As a result, it has a major impact on public health both within the UK and worldwide. Outbreaks occur regularly through consumption of contaminated food or water, direct animal contact or person-to-person transmission. Whole genome sequencing and bioinformatics have revolutionised microbial typing over the past twenty years. However, the true genomic diversity within an outbreak of STEC O157:H7 is less well understood. This thesis aims to quantify and characterise the genomic diversity within UK-based outbreaks of STEC O157:H7 using the Oxford Nanopore Technologies long-read sequencing platform, with the overall aim of better understanding the evolution of this zoonotic pathogen.
RESULTS: With optimisation of variant calling thresholds, reference-based variant calling methods can be used for both Illumina and Nanopore sequencing data and give comparable results for SNP typing of STEC O157:H7. When designing the assembly-based methodology, it was demonstrated that assembly correction was still required when using Illumina data. Nevertheless, this led to a standardised bioinformatics method for processing Nanopore-based sequences of STEC O157:H7 genomes throughout this thesis. Shiga toxin-encoding prophages harbouring the stx2a gene were more diverse in both gene content and integration site than stx2c-encoding prophages. Characterisation of several STEC O157:H7 outbreaks revealed that the intra-outbreak genetic diversity of the accessory genome is greater than previously expected. Multiple copies of the same stx-encoding prophage were confirmed in two of the outbreaks. Large chromosomal rearrangements were also detected within outbreak strains.
The large chromosomal rearrangements all appeared to occur between homologous prophages within the STEC O157:H7 chromosome and were more likely to be associated with compound prophages at the terminus of the STEC O157:H7 genome than with other prophages. However, whether the mechanism driving this process is random or selective still needs to be determined in vivo. Antimicrobial resistance (AMR) determinants were unevenly distributed throughout the population of STEC O157:H7, with sub-lineages Ib, I/II and IIc harbouring on average more AMR determinants than other sub-lineages. For genomes with complete assemblies in which the antimicrobial resistance genes had integrated into the chromosome, those integration sites were typically co-located with prophage or prophage-like regions and plasmids.
CONCLUSIONS: This thesis has highlighted the genome complexity and diversity of STEC O157:H7 that can exist at both a small (intra-outbreak) scale and a large (population) scale, and that has previously been difficult to detect using short-read sequencing technologies. In UK domestic sub-lineages associated with severe disease, the acquisition of stx2a-encoding prophages appears to be independent and likely from different sources, whereas stx2c-encoding prophages are maintained vertically. The ability to characterise the accessory genome of STEC O157:H7 in this way is the first step towards understanding the significance of these microevolutionary events. With regard to antimicrobial resistance, domestic STEC O157:H7 strains are typically susceptible, with sub-lineages associated with travel harbouring more antimicrobial resistance genes. This thesis has demonstrated that characterisation and localisation of individual STEC O157:H7 genetic components, from SNPs, genes, prophages and plasmids up to chromosomal rearrangements, is now possible through the use of long-read sequencing technologies and can be achieved on a high-throughput scale for application within public health microbiology.