Influence of CpG islands on chromatin structure
CpG islands (CGIs) are short, G+C-rich sequences with a high frequency of CpGs and are associated with the active chromatin mark H3K4me3. Most occur at gene promoters and are often free of cytosine methylation. Recent work has begun to clarify the functional significance of CGIs with respect to chromatin structure and transcription. In particular, proteins associated with histone-modifying activities, such as Cfp1 and Kdm2a, bind specifically to non-methylated CGIs via their CxxC domains. For example, artificial promoterless CpG-rich sequences integrated at the 3′ UTR of genes recruit Cfp1 and generate novel peaks of H3K4me3 in mouse ES cells without apparent RNA polymerase recruitment. There is also evidence that G+C-rich DNA recruits H3K27me3, a gene silencing mark. In this thesis I explore the constraints on DNA sequence and genomic location that are required to impose both H3K4me3 and H3K27me3 at CGI sequences. Extending earlier findings, I show that a promoterless CpG-rich sequence placed in a gene desert region generates novel peaks of both H3K4me3 and H3K27me3, independent of its location in the genome. These findings suggest that shared features of the primary DNA sequence at CGIs directly influence chromatin modification. Thus CGIs are not passive footprints of other cellular mechanisms, but play an active role in setting up local chromatin structure. However, the relative contribution of CpG frequency versus G+C content remains unclear. Therefore a sequence was generated that contains low levels of CpGs, comparable to the bulk genome, but has a G+C content similar to that of CGIs (Low CpG / High G+C). When this sequence was inserted into a gene desert, neither H3K4me3 nor H3K27me3 was established, indicating the importance of CpGs. Surprisingly, the converse sequence, with a high CpG frequency similar to that of CGIs and a low G+C content similar to that of the bulk genome (High CpG / Low G+C), did not establish H3K4me3 or H3K27me3 either.
However, this sequence became heavily methylated, in contrast to CGI-like sequences, which remained unmethylated when introduced into a gene desert. This finding suggests that a high G+C content is important for keeping CGI-like sequences methylation-free. When the High CpG / Low G+C sequence was inserted into mouse ES cells lacking the de novo DNA methyltransferases Dnmt3a and Dnmt3b (Dnmt3a/3b -/-), both H3K4me3 and H3K27me3 marks were established at the inserted sequence. This discovery confirms the importance of CpGs for setting up local chromatin structure.
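The distinction between CpG frequency and G+C content drawn above can be made concrete with a small calculation. The sketch below is illustrative only: the abstract does not state how the two quantities were measured, so it assumes the widely used observed/expected CpG ratio (CpG count × length / (C count × G count)) alongside the plain G+C fraction. The function names and the toy sequences are hypothetical and are not the constructs used in this work.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G).

    Assumed metric; returns 0.0 when the sequence has no C or no G.
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    # "CG" cannot overlap with itself, so str.count finds every CpG.
    return seq.count("CG") * len(seq) / (c * g)


# Hypothetical toy sequences, far shorter than any real insert.
low_cpg_high_gc = "GGGGCCCC"    # all G/C, but no CpG dinucleotide
high_cpg_low_gc = "ATCGATCGAT"  # A/T-rich, yet every C is followed by G

for name, s in [("Low CpG / High G+C", low_cpg_high_gc),
                ("High CpG / Low G+C", high_cpg_low_gc)]:
    print(f"{name}: G+C = {gc_content(s):.2f}, CpG o/e = {cpg_obs_exp(s):.2f}")
```

The two toy sequences show that the measures can be varied independently, which is the premise behind testing Low CpG / High G+C and High CpG / Low G+C sequences separately.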