Genome-wide identification of non-canonical targets of messenger RNA synthesis and turnover factors in Saccharomyces cerevisiae
Tuck, Alex Charles
Pervasive transcription is a common feature of eukaryotic genomes, and produces long noncoding RNAs (lncRNAs) in addition to classically annotated transcripts such as messenger RNAs (mRNAs). LncRNAs are heterogeneous in length and map to intergenic regions or overlap with annotated genes. Like mRNAs, lncRNAs are transcribed by RNA polymerase II, regulated by common transcription factors, and possess 5’ caps and perhaps 3’ poly(A) tails. However, lncRNAs perform distinct functions, acting as scaffolds for ribonucleoprotein complexes or directing proteins to nucleic acid targets. The act of transcribing a lncRNA can also affect the local chromatin environment. Furthermore, whereas mRNAs are predominantly turned over in the cytoplasm, both nuclear and cytoplasmic pathways reportedly participate in lncRNA degradation. In this study, I address the question of when and how lncRNAs and mRNAs are distinguished in the cell. Messenger RNAs interact with a defined series of protein factors governing their production, processing and decay, and I hypothesised that lncRNAs might be similarly regulated. I therefore sought to determine which mRNA-binding proteins, if any, also bind lncRNAs. I reasoned that this would reveal the point at which lncRNAs and mRNAs diverge, and how differences in their biogenesis and turnover equip them for different roles. I selected factors from key stages of mRNA metabolism in Saccharomyces cerevisiae, and identified their transcriptome-wide targets using CRAC (crosslinking and analysis of cDNAs). CRAC can detect interactions with low-abundance transcripts under physiological conditions, and reveal where within each transcript a protein is bound. Analyses of binding sites in mature mRNAs and intron-containing pre-mRNAs revealed the order in which the tested factors interact with mRNAs, and which regions they bind.
The poly(A)-binding protein Nab2 bound throughout mRNAs, consistent with an architectural role, whereas the cytoplasmic decay factors Xrn1 and Ski2 bound to poly(A) tails, which might act as hubs to coordinate turnover. The RNA packaging factors Tho2 and Gbp2, and nuclear surveillance factors Mtr4 and Trf4 bound abundantly to intron-containing pre-mRNAs, indicating that they act during or shortly after transcription. The tested factors bound lncRNAs to various extents. LncRNA binding was most abundant for Mtr4 and Trf4, moderate for Tho2, Gbp2, the cap-binding complex component Sto1, and the 3’ end processing factors Nab2, Hrp1 and Pab1, and lowest for Xrn1, Ski2 and the export receptor Mex67. This suggests that early events in lncRNA and mRNA biogenesis are similar, but unlike mRNAs, most lncRNAs are retained and degraded in the nucleus. Analyses of two documented classes of lncRNA, cryptic unstable transcripts (CUTs) and stable unannotated transcripts (SUTs), revealed some differences. SUTs were most similar to mRNAs, with canonical cleavage and polyadenylation signals flanking their 3’ ends, and poly(A) tails bound by the poly(A)-binding protein Pab1. CUTs lacked these characteristics, and in comparison to SUTs bound more abundantly to Mtr4 and Trf4 and less so to Ski2, Xrn1 and Mex67. Furthermore, CUTs accumulated upon Hrp1 depletion, suggesting that Hrp1 functions non-canonically to promote CUT turnover. Mtr4, Trf4 and Nab2 also bound abundantly to promoter-proximal RNA fragments generated from ~1000 protein-coding genes. These fragments possessed short oligo(A) tails (hallmarks of nuclear surveillance substrates), were not bound to cytoplasmic factors, and apparently correspond to a population of ~150-200 nt promoter-proximal lncRNAs.
Notably, CRAC analyses of Mtr4 and Sto1 targets in yeast subjected to a media shift revealed widespread changes in the abundance and surveillance of mRNAs, promoter-proximal transcripts and CUTs, which at many loci were arranged in a complex transcriptional architecture. Overall, the transcriptome-wide binding analyses presented here reveal that lncRNAs diverge from mRNAs prior to export, and are predominantly retained in the nucleus. Transcript fate is apparently determined during 3’ end processing, with CUTs diverging from mRNAs early in transcription via a distinct termination pathway coupled to rapid turnover, and SUTs diverging during or shortly after cleavage and polyadenylation, making them more stable and perhaps prone to escape to the cytoplasm. Promoter-proximal transcripts might arise from termination associated with an early checkpoint in Pol II transcription. The diverse behaviours of lncRNAs arise from their association with distinct subsets of RNA-binding proteins, some of which perform different roles when bound to different types of transcript. In conclusion, my results provide the foundation for a mechanistic understanding of how distinct classes of noncoding Pol II transcripts are produced, and how they can perform diverse functions throughout the nucleus.