DNA isn’t the only decorated nucleic acid in the cell. Modifications to RNA molecules are much more common and are critical for regulating diverse biological processes.
By Dan Dominissini, Chuan He and Gidi Rechavi | January 1, 2016
For years, researchers described DNA and RNA as linear chains of four building blocks—the nucleotides A, G, C, and T for DNA; and A, G, C, and U for RNA. But these information molecules are much more than their core sequences. A variety of chemical modifications decorate the nucleic acids, increasing the alphabet of DNA to about a dozen known nucleotide variants. The alphabet of RNA is even more impressive, consisting of at least 140 alternative nucleotide forms. The different building blocks can affect the complementarity of the RNA molecules, alter their structure, and enable the binding of specific proteins that mediate various biochemical and cellular outcomes.
The large size of RNA’s vocabulary relative to DNA’s is not surprising. DNA is involved mainly with genetic information storage, while RNA molecules (mRNA, rRNA, tRNA, miRNA, and others) are engaged in diverse structural, catalytic, and regulatory activities, in addition to translating genes into proteins. RNA’s multitasking prowess, at the heart of the RNA World hypothesis implicating RNA as the first molecule of life, likely spurred the evolution of numerous modified nucleotides. These modified nucleotides enabled the diversified complementarity and secondary structures that allow RNA species to specifically interact with other components of the cellular machinery such as DNA and proteins.
The nucleotide building blocks of RNA contain pyrimidine or purine rings, and each position of these rings can be chemically altered by the addition of various chemical groups. Most frequently, a methyl (–CH3) group is tacked on to the outside of the ring. Other chemical additions such as acetyl, isopentenyl, and threonylcarbamoyl are also found added to RNA bases.
Among the 140 modified RNA nucleotide variants identified, methylation of adenosine at the N6 position (m6A) is the most prevalent epigenetic mark in eukaryotic mRNA. Identified in bacterial rRNAs and tRNAs as early as the 1950s, this type of methylation was subsequently found in other RNA molecules, including mRNA, in animal and plant cells as well. In 1984, researchers identified a site that was specifically methylated—the 3′ untranslated region (UTR) of bovine prolactin mRNA. 1 As more sites of m6A modification were identified, a consistent pattern emerged: the methylated A is preceded by A or G and followed by C (A/G—methylated A—C).
Although the identification of m6A in RNA is 40 years old, until recently researchers lacked efficient molecular mapping and quantification methods to fully understand the functional implications of the modification. In 2012, we (D.D. and G.R.) combined the power of next-generation sequencing (NGS) with traditional antibody-mediated capture techniques to perform high-resolution transcriptome-wide mapping of m6A, an approach we termed m6A-seq. 2 Briefly, the transcriptome is randomly fragmented and an anti-m6A antibody is used to fish out the methylated RNA fragments; the m6A-containing fragments are then sequenced and aligned to the genome, thus allowing us to locate the positions of methylation marks.
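The last step, locating methylation marks from the aligned fragments, can be sketched roughly as follows. This is a toy illustration and not the published m6A-seq analysis: it assumes alignment start positions are already available for the antibody-captured sample and for a non-enriched input control. The control sample, the window size, the pseudocount, the fold threshold, and the function names are all assumptions introduced here for the example. The sketch simply flags windows where captured fragments are over-represented.

```python
from collections import Counter

def window_counts(positions, window=100):
    """Count aligned fragment start positions per fixed-size window."""
    return Counter(pos // window for pos in positions)

def enriched_windows(ip_positions, input_positions, window=100,
                     min_fold=4.0, pseudocount=1.0):
    """Return (window_start, fold) for windows enriched in the captured sample.

    Hypothetical helper: fold enrichment is the depth-normalized ratio of
    captured (IP) to input counts, with a pseudocount to avoid dividing by zero.
    """
    ip = window_counts(ip_positions, window)
    ctrl = window_counts(input_positions, window)
    # Scale factor so the two libraries are comparable in depth.
    scale = len(input_positions) / max(len(ip_positions), 1)
    hits = []
    for w in sorted(ip):
        fold = (ip[w] * scale + pseudocount) / (ctrl.get(w, 0) + pseudocount)
        if fold >= min_fold:
            hits.append((w * window, round(fold, 2)))
    return hits

# Invented toy data: captured fragments pile up between positions 500 and 599.
ip_reads = [510, 520, 530, 540, 550, 560, 570, 580, 120, 905]
input_reads = [100, 210, 320, 430, 540, 650, 760, 870, 980, 15]
print(enriched_windows(ip_reads, input_reads))
```

Real analyses work on aligned read files and use statistical tests rather than a fixed fold cutoff; the point here is only the logic of comparing captured fragments against background.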
Analyzing the human transcriptome in this way, we identified more than 12,000 methylated sites in mRNA molecules derived from approximately 7,000 protein-coding genes. The transcripts of most expressed genes, in a variety of cell types, were shown to be methylated, indicating that m6A modifications are widespread. In addition, about 250 noncoding RNA sequences—including well-characterized long noncoding RNAs (lncRNAs), such as the XIST transcripts that have a key role in X-chromosome inactivation—are decorated by m6A. In almost all cases, the epigenetic mark was found on adenosines embedded in the predicted A/G—methylated A—C sequence. We found that this pattern was consistently preceded by an additional purine (A or G) and followed by a uracil (U), extending the known consensus sequence to A/G—A/G—methylated A—C—U. 2
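The extended consensus can be written as a simple pattern: purine, purine, A, C, U. The sketch below scans an RNA string for that pattern and reports the position of the central adenosine. The function name and the example sequence are invented for illustration, and a match marks only a candidate site; nothing here implies that every occurrence of the sequence is methylated.

```python
import re

# [AG][AG]ACU: two purines, the candidate methylated A, then C and U.
# The lookahead lets overlapping occurrences of the motif all be reported.
CONSENSUS = re.compile(r"(?=([AG][AG]ACU))")

def candidate_m6a_sites(rna):
    """Return 0-based indices of the central A in each consensus match."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [m.start() + 2 for m in CONSENSUS.finditer(rna)]

example = "CCGGACUUUAAACUGCACU"  # invented sequence, not from any real transcript
print(candidate_m6a_sites(example))
```

Intersecting such candidate positions with antibody-enriched regions is one plausible way to narrow a broad enrichment peak down to individual adenosines.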
At the macro level, we found that m6A methylation sites were enriched at two distinct landmarks. The highest relative representation of m6A was found in the stop codon–3′ UTR segment of the RNA, with nearly a third of such methylation found in this sequence just beyond a gene’s coding region. Within the coding regions of the RNA molecules, m6A enrichment mapped to unusually long internal exons; 87 percent of the exonic methylation peaks were found in exons longer than 400 nucleotides. (The average human exon is only 145 nucleotides in length.) This pattern of decoration suggests that m6A is involved in mediating the splicing of long-exon transcripts. RNAs transcribed from single-isoform genes were found to be relatively undermethylated, while transcripts that are known to have multiple isoforms, determined by alternative splicing patterns, were hypermethylated. 2 Moreover, specific alternative splicing types, such as intron retention, exon skipping, and alternative first or last exon usage, were highly correlated with m6A decoration. And silencing the m6A methylating protein METTL3 affected global gene expression and alternative splicing patterns in both human and mouse cells. 2
These findings clearly indicate the importance of m6A decoration in regulating the expression of diverse transcripts. Moreover, our parallel study of the human and mouse methylome by m6A-seq has uncovered a remarkable degree of conservation in both consensus sequence and areas of enrichment, further supporting the importance of m6A function. 2 But research into understanding how m6A marks themselves are regulated, and how this affects various cellular processes, is only just beginning.
Read more at The Scientist.