Mutation and selective constraint in the murid genome
Date
2006
Author
Gaffney, Daniel John
Abstract
A large proportion of the genome of many higher eukaryotes consists of apparently functionless noncoding DNA, the significance of which is a longstanding puzzle in biology. The aim of this work was to quantify the extent to which both mutation and natural selection have influenced molecular evolution in murid noncoding sequence. In particular, the magnitude of, and variation in, selective constraint within murid noncoding DNA were investigated. Selective constraint is defined as the proportion of all mutations occurring at a locus or site which are strongly deleterious and therefore removed by natural selection. The approach adopted to estimate selective constraint relies on the assumption that we can quantify the past strength of purifying selection in a DNA sequence by comparison with nearby regions which are assumed to be evolving neutrally. To this end, the work in this thesis deals with both mutational variation and bias (Chapters 2 and 3) and selective constraint (Chapters 4 and 5) in noncoding DNA. Chapter 2 is concerned with the differential effects of context-dependent mutation (namely, CpG hypermutability) at fourfold synonymous and noncoding sites. Using simulations, it was shown that a common method of assigning ancestral CpG status often introduces a substantial level of bias into the estimation of nucleotide substitution rates. The effects of this bias can easily be misconstrued as the action of purifying selection at synonymous sites. Chapter 3 is concerned with mutational variation in the murid genome. Nucleotide substitution rates in murid transposable elements were estimated. It was assumed that the majority of murid transposable elements were evolving neutrally and, therefore, that their molecular evolutionary rate was dictated by mutation alone. Under this assumption, variation in estimated element substitution rates reflects sampling and mutational variation only. The results indicate that greater mutational variation occurs along the length of a chromosome than between individual chromosomes, although the latter has been the primary focus in the literature. This result illustrates the importance of accounting for mutational variation in studies of selective constraint and sequence conservation. In Chapter 4, the level of constraint in intergenic DNA adjacent to coding sequences and a moderate distance inside first introns was estimated in a sample of 300 mouse-rat gene orthologues. The results suggested that whilst selective constraint in intergenic sequence adjacent to the start and stop codons is moderately high, this becomes statistically indistinguishable from zero within 4kb upstream/downstream of the first/last exon. Selective constraint in the 5' end of the first intron was also found to be moderately high. Taking the contributions from noncoding sequence into account, it was estimated that the number of deleterious mutations occurring in murid noncoding DNA was approximately equal to that in protein-coding sequence. Chapter 5 expands on the work done in Chapter 4. The assumption of neutral evolution in non-first introns was addressed by comparing their evolutionary rates with those in transposable elements. In addition, the selective constraint in intergenic DNA immediately adjacent to genes was compared with that found at large distances from known genes. The results showed that, when repetitive sequence is removed, the selective constraints in intergenic DNA are significantly different from zero. Furthermore, this constraint does not become indistinguishable from zero, even at large distances ( 50kb) from genic regions. The data also showed that a weak correlation between intron length and nucleotide substitution rate exists in murid non-first introns.
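
The definition of selective constraint above can be illustrated with a minimal Python sketch. It assumes the commonly used estimator C = 1 - O/E, where O is the number of substitutions observed in the sequence of interest and E is the number expected if the sequence evolved at the rate of a nearby, putatively neutral region. The abstract does not state the exact estimator used in the thesis, so the function name, its parameters, and the example numbers are hypothetical.

```python
def selective_constraint(observed_subs, n_sites, neutral_rate):
    """Estimate selective constraint as 1 - observed/expected substitutions.

    observed_subs: substitutions counted in the region of interest
    n_sites:       number of aligned sites in that region
    neutral_rate:  substitutions per site in a nearby region assumed to be
                   neutrally evolving (an assumption of this sketch)
    """
    if n_sites <= 0 or neutral_rate <= 0:
        raise ValueError("n_sites and neutral_rate must be positive")
    # Expected substitutions if the region evolved at the neutral rate.
    expected_subs = neutral_rate * n_sites
    return 1.0 - observed_subs / expected_subs


# Hypothetical numbers: 120 substitutions over 1000 sites, neutral rate 0.15.
constraint = selective_constraint(120, 1000, 0.15)
print(round(constraint, 3))
```

Under this sketch, a value near zero indicates a region evolving at about the neutral rate, and a value near one indicates that most mutations have been removed by selection. A negative value can arise from sampling error or from a region evolving faster than the neutral reference.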