Directionality of DNA mismatch repair in Escherichia coli
Hasan, A. M. Mahedi
Non-canonical base pairs that escape the proofreading activity of the DNA polymerase emerge from DNA replication as DNA mismatches. To promote genomic integrity, these mismatches are corrected by a secondary protection system called DNA mismatch repair (MMR). Understanding the details of MMR is important for human health, as defects in mismatch repair can result in cancer (e.g. hereditary nonpolyposis colorectal cancer, also known as Lynch syndrome). Because mismatches normally arise stochastically, they can emerge at random locations in a chromosome. A molecular tool that generates substrates for the MMR system at a defined locus has therefore been particularly useful in my study of DNA mismatch repair in vivo. In this study, I have used a CTG•CAG repeat array, also called the “TNR array”, to generate frequent substrates for the MMR system in Escherichia coli. In E. coli, the MMR system searches for hemimethylated GATC motifs around a mismatch to initiate removal of the faulty nascent (unmethylated) strand. Analysing the usage of GATC motifs around the TNR array, I have found that the MMR system preferentially utilizes the GATC motifs on the origin-distal side of the TNR array, demonstrating that the bidirectionality of MMR observed in vitro is constrained in live cells. My results suggest that in vivo MMR operates by searching for the nearest hemimethylated GATC site located between the mismatch and the replication fork, and that excision of the nascent strand occurs directionally, away from the fork and towards the mismatch. Previous in vitro studies have established that the excision reaction during MMR terminates at a discrete point about 100 bp beyond a mismatch. However, in vivo recombination at a 275 bp tandem repeat, which has been proposed to be mediated by single-stranded DNA generated during the excision reaction, has suggested that the end point of the excision reaction in live cells may extend much further from the mismatch than this.
I have used this assay for extended excision to determine the influence of GATC sites on excision tracts. In this study, modification of the GATC motifs on the origin-proximal side of the TNR array has shown that the excision reaction does not stop at a GATC motif on the origin-proximal side of the mismatch. In addition, sequential modifications of GATC motifs on the origin-distal side of the TNR array, which shift the start point of the excision reaction to a greater distance, have suggested that the length of an excision tract is a function of the distance it covers from the start point rather than from a mismatch. My observation of directionality with respect to DNA replication in the recognition of GATC sites suggested that MMR and DNA replication might be coupled in some way, and that active (or blocked) MMR might impede the progress of the replication fork. However, no replication intermediates were detected by two-dimensional agarose gel electrophoresis of a restriction-digested genomic DNA fragment containing the TNR array. I was therefore unable to support the hypothesis that active or blocked MMR leads to a slowing down of DNA replication. Given my observation that MMR decreases when the mismatch is separated from the closest origin-distal GATC site, I set out to test whether MMR exerts any selection pressure on the genomic distribution of GATC motifs. To do this, I generated artificial model genomes using a Markovian algorithm based on the nucleotide composition and codon usage in E. coli. Strikingly, the comparison of the distribution of GATC motifs in the E. coli genome with those from artificial sequences has shown that GATC motifs are distributed randomly in the E. coli genome, except for a small clustering effect detected for closely spaced (0-40 base pairs) GATC motifs. The observed distribution of slightly over-represented GATC motifs in the E. coli genome appears to be a function of the total number of GATC motifs, and it seems that the DNA mismatch repair system has evolved to utilize the natural distribution of GATC motifs to maintain genomic integrity.
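
The comparison between real and artificial genomes described above can be illustrated with a minimal sketch. It is not the algorithm used in this work: it assumes a plain first-order nucleotide Markov chain (no codon-usage model), measures spacing as the start-to-start distance between consecutive GATC motifs, and runs on a toy sequence rather than the E. coli genome. All function names are hypothetical and chosen for illustration only.

```python
import random
from collections import Counter, defaultdict


def train_markov(seq, order=1):
    """Count how often each base follows each context of `order` bases."""
    counts = defaultdict(Counter)
    for i in range(len(seq) - order):
        counts[seq[i:i + order]][seq[i + order]] += 1
    return counts


def generate(counts, length, order=1, seed=0):
    """Sample an artificial sequence from the transition counts.

    Assumption: a simple nucleotide-level chain stands in for the
    composition- and codon-usage-based model described in the text.
    """
    rng = random.Random(seed)
    contexts = sorted(counts)  # sorted for reproducible sampling
    out = list(rng.choice(contexts))
    while len(out) < length:
        followers = counts.get("".join(out[-order:]))
        if not followers:
            # Context never seen with a successor: restart from a random one.
            out.extend(rng.choice(contexts))
            continue
        bases, weights = zip(*followers.items())
        out.append(rng.choices(bases, weights=weights)[0])
    return "".join(out[:length])


def motif_positions(seq, motif="GATC"):
    """Start positions of every occurrence of the motif (overlaps allowed).

    GATC is its own reverse complement, so one strand is sufficient.
    """
    positions, i = [], seq.find(motif)
    while i != -1:
        positions.append(i)
        i = seq.find(motif, i + 1)
    return positions


def spacings(positions):
    """Start-to-start distances between consecutive motif occurrences."""
    return [b - a for a, b in zip(positions, positions[1:])]


def fraction_short(gaps, limit=40):
    """Fraction of spacings that are at most `limit` bp."""
    return sum(g <= limit for g in gaps) / len(gaps) if gaps else 0.0


# Toy input only; a real analysis would load the genome sequence instead.
toy = "GATCGGATCCAGATCTTGACGATCA" * 20
model = train_markov(toy)
artificial = generate(model, len(toy), seed=1)
real_short = fraction_short(spacings(motif_positions(toy)))
fake_short = fraction_short(spacings(motif_positions(artificial)))
print(f"short-spacing fraction: real={real_short:.2f} artificial={fake_short:.2f}")
```

In practice one would generate many artificial sequences with different seeds and compare the real spacing distribution with the spread across them, rather than with a single sample.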