A gene needs to express itself in order to contribute to cellular functions. Gene expression enables the genetic information from DNA to be transcribed into an RNA molecule. However, RNA molecules are not naked in the cells; as soon as an RNA is transcribed, it becomes coated by RNA-binding proteins to form ribonucleoprotein complexes (RNPs). These RNPs then coordinate many stages of RNA processing, quality control, transport and regulation. The RNPs often involve dozens, if not hundreds, of proteins bound to an RNA molecule.
We develop techniques that integrate biochemistry and computational biology to obtain a comprehensive map of interactions between a specific protein and its RNA partners within our cells. We developed the individual-nucleotide resolution UV crosslinking and immunoprecipitation (iCLIP), and a related method called hiCLIP, which reveal the conformation of RNPs across the transcriptome.
We use these methods in collaboration with the group of Nicholas Luscombe to study how the sequence and structure of RNAs define the composition and function of RNPs.
Cells can change their gene expression by modulating the function of RNPs. Moreover, genetic studies have identified mutations that disrupt the normal function of RNPs. These mutations often cause neurological diseases, particularly motor neuron disease, also referred to as amyotrophic lateral sclerosis (ALS).
We study this disease in collaboration with the group of Rickie Patani by using induced pluripotent stem cells with specific genetic mutations, and differentiating them into motor neurons. We wish to understand how these mutations affect the assembly of protein-RNA complexes, thereby initiating the molecular cascade leading to disease. We study the following questions:
1) How do RNA-RNA and protein-RNA contacts define the assembly of RNPs, and thereby coordinate RNA processing and regulation?
2) How does evolution tinker with the RNA regulatory circuits? What is the role of transposable elements and non-canonical splicing in evolution?
3) How do protein-RNA complexes modulate the functions of neurons or glial cells during brain development, aging or neurodegenerative diseases?
4) How do mutations cause disease by disrupting the function of RNPs, and what treatments could ameliorate this?
Here are some of the RNA stories that we have explored:
Understanding protein-RNA complexes.
Techniques to identify RNA binding sites. We developed individual-nucleotide resolution UV crosslinking and immunoprecipitation (iCLIP) to quantify protein-RNA interactions in the whole transcriptome. We described the details of the iCLIP method, presented it in a video, and reviewed the transcriptomic techniques for studies of protein-RNA interactions. We also demonstrated that the cDNAs in iCLIP truncate at crosslink sites, and developed computational methods based on cDNA starts for analysis of iCLIP data.
Our published iCLIP sequencing data are available both in raw format (fastq files) and in processed format on the public server iCount.
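Because iCLIP cDNAs truncate at crosslink sites, the crosslinked nucleotide can be inferred from where each aligned cDNA starts. The following is a minimal sketch of that idea, not the code used by iCount: it assumes BED-style 0-based, half-open read coordinates and assumes the crosslink lies one nucleotide upstream of the cDNA start. The function names and the example reads are hypothetical.

```python
from collections import Counter


def crosslink_site(read_start, read_end, strand):
    """Return the 0-based crosslink position inferred from one aligned cDNA.

    Coordinates are 0-based and half-open [read_start, read_end), as in BED.
    Assumption: the crosslinked nucleotide is the one immediately upstream
    (in the direction of the transcript) of the first nucleotide of the cDNA.
    """
    if strand == "+":
        return read_start - 1
    if strand == "-":
        # On the minus strand the cDNA starts at read_end - 1,
        # so the upstream nucleotide is at read_end.
        return read_end
    raise ValueError(f"unknown strand: {strand!r}")


def count_crosslinks(reads):
    """Count cDNA starts per inferred crosslink site.

    reads: iterable of (chrom, read_start, read_end, strand) tuples.
    Returns a Counter keyed by (chrom, position, strand).
    """
    counts = Counter()
    for chrom, start, end, strand in reads:
        counts[(chrom, crosslink_site(start, end, strand), strand)] += 1
    return counts


# Hypothetical aligned cDNAs: two share a start site on the plus strand.
example_reads = [
    ("chr1", 100, 140, "+"),
    ("chr1", 100, 135, "+"),
    ("chr1", 200, 240, "-"),
]
example_counts = count_crosslinks(example_reads)
```

Counting cDNA starts rather than whole-read coverage is what gives the analysis its single-nucleotide resolution; a real pipeline would also need to collapse PCR duplicates before counting, which this sketch leaves out.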
Question-answer forum on the iCLIP method
Two journals dedicated an issue in 2014 to RNA-binding proteins. An issue of Genome Biology is dedicated to RBPome, and an issue of Methods is dedicated to protein-RNA interactions, both including manuscripts on CLIP and related experimental and computational methods.
RNA maps: how does the location of an RNA binding site instruct its function? We integrate transcriptomic data on protein-RNA interactions and their function, which can tell us how ribonucleoproteins (protein-RNA complexes) assemble at specific positions on their target transcripts and thereby regulate alternative splicing, mRNA decay or translation. We use iCLIP to assess where an RBP binds its target transcripts, while also assessing how this RBP controls pre-mRNA processing. Integration of these two approaches showed that most RBPs regulate alternative splicing according to genome-wide positional principles, or RNA splicing maps. For example, by integrating TIA iCLIP with its splicing analysis upon TIA knockdown, we were able to derive nucleotide-resolution RNA splicing maps of TIA proteins. We developed software (RNA motifs) that can derive RNA splicing maps by analysis of multivalent RNA motifs that are often bound by RBPs, and the web platform RNAexpress that can integrate diverse data and perform motif analyses to derive RNA maps for regulation of alternative polyadenylation and splicing.
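The positional principle behind an RNA splicing map can be illustrated with a small sketch. This is not the implementation of RNA motifs or RNAexpress; it assumes a hypothetical input of crosslink positions and regulated exons, each exon labelled by how it responds to the RBP, and reports, for every offset around a splice site, the fraction of exons in each class that carry a crosslink there.

```python
from collections import defaultdict


def rna_map(exons, crosslinks, window=50):
    """Build a simple positional map of binding around splice sites.

    exons: list of (chrom, strand, splice_site_pos, regulation) tuples, where
        regulation is a label such as "enhanced" or "silenced" (hypothetical).
    crosslinks: set of (chrom, strand, pos) tuples with crosslink positions.
    window: number of nucleotides examined on each side of the splice site.

    Returns {regulation: [fraction of exons with a crosslink at each offset]},
    with offsets running from -window (upstream) to +window (downstream)
    in the direction of the transcript.
    """
    hits = defaultdict(lambda: [0] * (2 * window + 1))
    totals = defaultdict(int)
    for chrom, strand, site, regulation in exons:
        totals[regulation] += 1
        for offset in range(-window, window + 1):
            # Flip the direction for minus-strand transcripts.
            pos = site + offset if strand == "+" else site - offset
            if (chrom, strand, pos) in crosslinks:
                hits[regulation][offset + window] += 1
    return {reg: [h / totals[reg] for h in hits[reg]] for reg in totals}


# Hypothetical data: binding downstream of enhanced exons,
# and upstream of a silenced exon on the minus strand.
example_exons = [
    ("chr1", "+", 1000, "enhanced"),
    ("chr1", "+", 2000, "enhanced"),
    ("chr1", "-", 5000, "silenced"),
]
example_crosslinks = {
    ("chr1", "+", 1010),
    ("chr1", "+", 2010),
    ("chr1", "-", 5020),
}
example_map = rna_map(example_exons, example_crosslinks, window=30)
```

Comparing the profiles of enhanced and silenced exons shows whether the position of binding relative to the exon predicts the direction of regulation.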
We also collaborate with the group of Richard Jenner in studies of RBPs that bind to nascent RNA to regulate transcription and chromatin.
Gregor Rot, Zhen Wang, Matteo Cereda, Melis Kayikci, Julian König, Kathi Zarnack, Nejc Haberman, Jan Attig
RNA map gives first comprehensive understanding of alternative splicing
The web platform RNAexpress enables analysis of 3' mRNA-Seq (produced by the Lexogen QuantSeq or other methods) to identify regulated polyA sites. This can be integrated with information on alternative exons and/or RNA binding sites determined by CLIP or motif analyses to derive RNA maps for regulation of alternative polyadenylation and splicing.
Understanding the regulation and function of cryptic splicing elements. We reviewed the studies that mapped the functional binding sites of RNA-binding proteins across the transcriptome, which have uncovered an unprecedented diversity of previously unknown non-canonical splicing events. These studies identified many cryptic events located far from the currently annotated exons and unconventional splicing mechanisms that have important roles in regulating gene expression. These non-canonical splicing events are also a major source of newly emerging transcripts during evolution, especially when they involve sequences derived from transposable elements. We study RBPs that are specialised for binding to these elements, which ensures their tight regulation and quality control. While mutations perturbing binding of RBPs to these elements can disrupt gene expression and lead to diseases, we found that they are also a major driving force for the emergence of new exons during evolution. The image on the right is from the cover of the journal.
Alu-derived exons. By identifying RNA binding sites of hnRNP C across the transcriptome with iCLIP, we have shown that hnRNP C binds to long uridine tracts, and thereby regulates splicing of alternative exons.
We have also found that hnRNP C can strongly repress inclusion of exons that are derived from Alu elements, so-called Alu-exons. Alu retrotransposable elements are specific to primate genomes, and they have probably played an important role in the evolution of primates, since they constitute 10% of the human genome. hnRNP C represses recognition of cryptic splice sites in Alu elements by displacing the splicing factor U2AF65 from the uridine tracts. Loss of hnRNP C leads to formation of thousands of harmful exons, and mutations within uridine tracts in Alu elements can cause many human diseases.
We have shown that the positive and negative regulatory forces are tightly coupled in the evolution of Alu-exons. In species where mutations made the splice sites of Alu-exons stronger, the uridine tracts are longer, which allows hnRNP C to act as a counteracting force. This allows the Alu-exons to remain in a harmless cryptic state over long evolutionary periods, during which they accumulate additional mutations. We hypothesise that the repressive function of hnRNP C prevents the damaging effects of immediate Alu exonization, and the length of uridine tracts represents a ‘molecular rheostat’. After Alu-exons accumulate many mutations, the uridine tracts gradually shorten, and as a result the Alu-exons start escaping from repression and contributing to new cellular functions. We hypothesise that the uridine tracts in Alu elements can buy the time needed for mutations to make beneficial changes, rather than disruptive ones, during the evolution of a species.
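Since the length of uridine tracts is the quantity that is proposed to act as the rheostat, it helps to see how it could be measured in a sequence. The sketch below is only an illustration: the function names are hypothetical, the minimum tract length of 4 is an arbitrary assumption rather than a threshold from our studies, and the example sequence is made up.

```python
import re


def uridine_tracts(sequence, min_length=4):
    """Find continuous uridine tracts in an RNA or DNA sequence.

    T is treated as U so that genomic sequence can be used directly.
    min_length is an arbitrary illustrative threshold (an assumption).
    Returns a list of (start, length) tuples with 0-based start positions.
    """
    rna = sequence.upper().replace("T", "U")
    pattern = re.compile("U{%d,}" % min_length)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(rna)]


def longest_uridine_tract(sequence):
    """Return the length of the longest uridine tract (0 if there is none)."""
    tracts = uridine_tracts(sequence, min_length=1)
    return max((length for _, length in tracts), default=0)


# Made-up sequence with a 5-nt tract and a 9-nt tract.
example_sequence = "ACG" + "U" * 5 + "GCA" + "T" * 9 + "AG"
example_tracts = uridine_tracts(example_sequence)
example_longest = longest_uridine_tract(example_sequence)
```

Applied to orthologous Alu elements from different species, a measure like this would allow tract length to be compared with the strength of the adjacent splice sites.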
Julian König, Kathi Zarnack, Mojca Tajnik, Jan Attig, Igor Ruiz de los Mozos, Federico Agostini
The guardian of the transcriptome
Regulating Alu element exonization
A hidden code in our DNA explains how new pieces of genes are made
Recursive splicing in long introns. Long introns contain hundreds of so-called ‘cryptic sequences’ that appear very similar to exons, but are not supposed to be used. The cellular machinery faces great challenges in distinguishing true exons from these cryptic sites. We found that cells sometimes select a cryptic exon that is present deep within a long intron, but later discard it, in a process called recursive splicing (see the paper here). Normally, recursive splicing removes this cryptic exon, allowing it to remain invisible. However, if the recursive site is preceded by other cryptic splicing events, then the exon is not removed, creating a ‘binary switch’ or checkpoint that can distinguish correct splicing events from the newly emerging cryptic events, which could be potentially damaging. Thus, long introns on the one hand enable the emergence of many cryptic splicing events during evolution, whereas recursive splicing ensures that this evolutionary tinkering does not disturb the primary mRNA that needs to be made from the gene. We observed this process happening in some of the longest genes that are expressed in the human brain, which are often implicated in autism or other neurodevelopmental disorders.
Chris Sibley, Warren Emmett, Lorea Blazquez, Andrea Elser
A new genetic switch uncovered in the long genes expressed in our brain
Splicing does the two-step
In a commentary, scientists express their fascination with introns.
Understanding the secondary structure of mRNAs. The secondary structure of mRNAs has important effects on their stability and translation. To understand the in vivo structure of full-length mRNAs, we developed a technique called hiCLIP to identify the connections that hook sections of an mRNA together, which are called RNA duplexes. We were amazed to find that mRNAs form thousands of such duplexes, and often these duplexes hook together very distant parts of mRNA molecules. We found that these duplexes interact with the double-stranded RNA binding protein Staufen 1. We also found that these RNA duplexes have less genetic variation in humans than surrounding areas of the mRNA, indicating that mutations could cause disease by disrupting the structure of mRNAs.
Yoichiro Sugimoto, Christina Militti, Flora Lee
Structure of genetic messenger molecules reveals key role in diseases
hiCLIP: New method finds structures of mRNA molecules
Detailed probing of RNA structure in vivo
Understanding the mechanisms of protein-RNA complexes in brain aging and neurodegeneration. Protein-RNA complexes play many important roles in the brain by regulating gene expression in many different ways, including alternative splicing, RNA transport and translation. We study the regulatory networks controlled by TDP-43 and FUS, two RNA-binding proteins that can cause amyotrophic lateral sclerosis when mutated. We showed that both proteins regulate alternative splicing of a functionally coherent set of transcripts, many of which encode proteins implicated in neurodegenerative disorders.
We have examined changes in gene expression in human postmortem brain samples to compare the effects of healthy aging with two neurodegenerative diseases: Alzheimer's disease and frontotemporal lobar degeneration (FTLD). We observed widespread changes in alternative splicing: most were specific to diseased samples, but some were common to aging and disease. The changes in glial-specific genes in particular appeared to be shared, with the decrease in oligodendrocyte-specific genes being most apparent. Therefore, we further analysed gene expression in three large cohorts of samples to examine changes in ten brain regions upon aging. Stratifying the gene expression by cell type, we found that astrocytes and oligodendrocytes diminish their regional identity upon aging. We also developed a machine learning method to analyse high-resolution images of brain sections, and applied it to a more limited number of samples. This indicates that the number of oligodendrocytes decreases, while less change is seen in the total number of neurons upon aging. However, the neurons with the largest cell bodies also appear to decline in number. On the other hand, we find a dramatic increase in the expression of microglia- and endothelial-specific genes in all brain regions upon aging. We hope that these findings will be of use for further studies of the cellular phase of aging and Alzheimer's disease. A table of the relevant genes and their differential expression upon aging in each brain region is available here.
James Tollervey, Boris Rogelj, Rickie Patani, Michael Briese, Lilach Soreq, Claire Hall, Martina Halleger, Frederique Rau
CLIPs of TDP-43 Provide a Glimpse Into Pathology, Alzheimer Research Forum
FUS and Friends: Two Studies Probe FUS’ RNA Partners
New Link Revealed Between Alzheimer's Disease and Healthy Aging
More Clues How the Brain’s ‘Other Cells’ Change As We Age
Aging Causes “Identity Crisis” in Glia
Scientists Could Identify a Person’s Age by Looking at These Cells in Their Brain
As the Brain Ages, Glial-Cell Gene Expression Changes Most
To Understand a Brain’s Age, Focus on More Than Neurons
A video on Glia: Could These Brain Cells Help Explain What Causes Dementia?