RNA Exosome Depletion Reveals Transcription Upstream of Active Human Promoters
Pascal Preker,1 Jesper Nielsen,2 Susanne Kammler,1* Søren Lykke-Andersen,1 Marianne S. Christensen,1 Christophe K. Mapendano,1 Mikkel H. Schierup,2 Torben Heick Jensen1
Studies have shown that the bulk of eukaryotic genomes is transcribed. Transcriptome maps are frequently updated, but low-abundant transcripts have probably gone unnoticed. To eliminate RNA degradation, we depleted the exonucleolytic RNA exosome from human cells and then subjected the RNA to tiling microarray analysis. This revealed a class of short, polyadenylated and highly unstable RNAs. These promoter upstream transcripts (PROMPTs) are produced 0.5 to 2.5 kilobases upstream of active transcription start sites. PROMPT transcription occurs in both sense and antisense directions with respect to the downstream gene. In addition, it requires the presence of the gene promoter and is positively correlated with gene activity. We propose that PROMPT transcription is a common characteristic of RNA polymerase II (RNAPII) transcribed genes with a possible regulatory potential.
1 Centre for mRNP Biogenesis and Metabolism, Department of Molecular Biology, C. F. Møllers Alle, Building 1130, Aarhus University, Denmark. 2 Bioinformatics Research Center, C. F. Møllers Alle, Building 1110, Aarhus University, Denmark.
To whom correspondence should be addressed. E-mail: thj{at}mb.au.dk
Recent high-throughput analyses have revealed that >90% of all human DNA is transcribed (1). The vast majority of these transcripts are noncoding, thus challenging the classical definition of what constitutes a gene and, by association, a promoter (2–4). Furthermore, additional short-lived RNAs might have escaped detection. With the aim of identifying such transcripts, we used RNA interference in HeLa cells to deplete hRrp40, a core component of the human 3' to 5' exoribonucleolytic exosome, one of the major RNA degradation complexes (fig. S1A) (5). This resulted in a severe processing defect of the known exosome substrate 5.8S ribosomal RNA (fig. S1B), demonstrating diminished exosome function. Oligo dT-primed, double-stranded cDNA from cells that had been treated with either a control [enhanced green fluorescent protein (eGFP)] or hRrp40 small interfering RNA (siRNA) was hybridized to an encyclopedia of DNA elements (ENCODE) tiling array, which covers a representative 1% of the human genome (1). Comparison of array data to public gene annotations revealed overall stabilization of mRNAs (exons in Fig. 1A), as expected. RNA from intronic and intergenic regions was largely unaffected, with the exception of a 1.5-kb region immediately upstream of transcription start sites (TSSs) that was stabilized 1.5-fold on average (Fig. 1A). The relative stabilization of RNA expressed from a 500-kb region exemplifies this: Four of the five genes in this region display peaks of stabilized RNA upstream of their annotated promoters (Fig. 1B).
Fig. 1. PROMPTs are produced immediately upstream of annotated TSSs and are degraded by the RNA exosome. (A) Relative stabilization of RNA from hRrp40 knockdown over control cells, sorted according to annotated genomic features (http://genome.ucsc.edu/cgi-bin/hgTracks) and normalized to the total signal over the entire ENCODE region. (B) PROMPT signature of a 500-kb ENCODE region (ENr323), showing the log2 transformed hRrp40-siRNA/eGFP-siRNA signal ratio (blue track) below the location of annotated genes (red bars) with their orientation of transcription indicated by arrows. The bottom track shows hRrp40-siRNA/eGFP-siRNA signal peaks (see supporting online material). (C) RT-qPCR analysis of 10 representative PROMPT regions. HeLa cells were treated with eGFP siRNA (control) or the experimental samples hRrp40, hRrp6, hRrp44, or both hRrp6 and hRrp44, as indicated. Mean values with standard deviations from at least three experiments are shown as fold increase in RNA levels of experimental over control samples. All data were normalized to an internal control, glyceraldehyde phosphate dehydrogenase (GAPDH) mRNA. For numbering of PROMPTs, see table S4.
To validate these results, we subjected RNA from exosome-depleted versus control cells to oligo dT-primed reverse transcription followed by quantitative polymerase chain reaction (RT-qPCR) analyses of a region upstream of 20 TSSs, all of which confirmed a statistically significant stabilization under hRrp40 knockdown conditions (Fig. 1C and fig. S2A). Depletion of an additional exosome component (hRrp46) resulted in similar levels of stabilization, whereas depletion of other factors involved in RNA turnover (hUpf1, hXrn1, hXrn2, hDcp2, PARN) had no effect (fig. S2B), indicating that promoter upstream transcripts (PROMPTs) are exosome-specific targets. Individual depletion of hRrp6 or hRrp44, the catalytically active exosome subunits, resulted in no or only modest stabilization. Depletion of both, however, caused levels of stabilization comparable to those observed upon depletion of hRrp40 (Fig. 1C and fig. S2A), suggesting that hRrp6 and hRrp44 act redundantly to degrade PROMPTs. This stabilization of PROMPTs in exosome-depleted cells is reminiscent of that of Saccharomyces cerevisiae cryptic unstable transcripts that, like PROMPTs, are also transcribed from nongenic regions (6).
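The normalization used in the RT-qPCR validation (fold increase of experimental over control samples, each normalized to GAPDH mRNA) can be illustrated with a minimal sketch. It assumes the common 2^-ΔΔCt approximation, which the text does not confirm as the quantification model actually used; the function name and all Ct values are hypothetical.

```python
def fold_change(ct_target_exp, ct_ref_exp, ct_target_ctrl, ct_ref_ctrl):
    """Relative RNA level of experimental over control, normalized to a reference.

    Assumption: 2^-ddCt model (product doubles every PCR cycle). The source
    only states that data were normalized to GAPDH mRNA, not which model
    was applied.
    """
    d_ct_exp = ct_target_exp - ct_ref_exp      # knockdown sample, relative to GAPDH
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl   # control sample, relative to GAPDH
    return 2.0 ** -(d_ct_exp - d_ct_ctrl)


# Hypothetical Ct values for one PROMPT amplicon and the GAPDH reference.
example = fold_change(ct_target_exp=26.0, ct_ref_exp=18.0,
                      ct_target_ctrl=29.0, ct_ref_ctrl=18.0)
print(example)
```

With these made-up values the PROMPT amplicon crosses threshold three cycles earlier in the knockdown than in the control while GAPDH is unchanged, which the model reads as an eightfold increase.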
To obtain an overview of the average RNA stabilization profile around all 1594 annotated ENCODE TSSs, we aligned array data from the hRrp40 and control knockdown experiments, as well as the ratio of the two, relative to each other (Fig. 2A, top). Because of the different levels of stabilization of exonic and intronic RNA (Fig. 1A), we only considered data derived from exonic sequences downstream of the TSSs (fig. S3). Moreover, because many genes have multiple TSS clusters (i.e., promoters) that may confound analyses, we also aligned array data from 64 selected genes with only one major TSS cluster (low-complexity genes) (Fig. 2A, bottom, and table S1). Both alignments revealed an average RNA stabilization profile over a 2-kb region upstream of the TSS with a peak around –1 kb (Fig. 2A). In control cells, RNA levels are near background, whereas they are greatly elevated upon hRrp40 depletion. RNA levels in the hRrp40-depleted cells drop to background levels nearing the TSS, indicating that stabilized transcripts are distinct from their neighboring mRNAs. Thus, PROMPTs constitute a class of unstable transcripts, and we refer to the PROMPT-encoding DNA as the "PROMPT region." Short RNAs produced around TSSs have previously been reported, most notably promoter-associated short RNAs, which were on average 0.5 kb on either side of the TSS (4). These are, however, physically separate from PROMPTs by several hundred base pairs (fig. S4). In contrast, a few verified PROMPT regions show weak signs of transcriptional activity in other data sets, such as scattered cap analysis of gene expression tags (markers of transcription initiation events) (7) and expressed sequence tags unassigned to known genomic features (fig. S5).
Fig. 2. PROMPT expression (i) maps to 0.5 to 2.5 kb upstream of TSSs, (ii) can occur in both orientations, and (iii) requires the gene promoter. (A) Composite RNA profiles upstream of all 1594 (top) or 64 low-complexity (bottom) TSSs. Raw (single-channel) data (smoothed over a 10-bp window) from hRrp40 siRNA–treated cells, control (eGFP) siRNA–treated cells, and their ratio are shown as indicated. The left y axis denotes values for raw data, and the right y axis denotes the log2-transformed ratio of the raw data, scaled to center at zero. Positions in base pairs of RNA signals relative to TSSs are shown on the x axes. (B) The sense (blue)/antisense (red) directionality of selected PROMPTs was determined by RT-qPCR with gene-specific primers (1 kb upstream of the TSS) in either orientation in combination with a T20VN primer that hybridizes to the 3' poly(A) tail. Fold increases relative to the lowest value in control cells (set to 1) are plotted. PROMPTs are ordered such that the one with the highest preference for sense transcription is at the top. (C) Generation of promoter-upstream transcription in nonhuman DNA. Plasmids containing the β-globin gene under control of a viral promoter (CMV) or a promoter-deleted control (ΔCMV) were transiently transfected into HeLa cells. Both constructs have an insertion of bacteriophage DNA (red bar) upstream and a strong SV40 poly(A) site (black box) downstream of the β-globin gene. RNA levels were analyzed by RT-qPCR. Read-through transcription from the β-globin promoter was measured with the use of two amplicons upstream of the bacteriophage DNA ("read through"). The "control" amplicon has no complementary sequence in the ΔCMV plasmid. Values on the y axis are percentages of GAPDH mRNA levels. The dashed box in the linear plasmid representation (top, not drawn to scale) encloses the region that is deleted in the ΔCMV construct. Mean values with standard deviations (n = 3) are shown.
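The composite profiles of Fig. 2A (TSS-aligned signals averaged across genes, smoothed, and shown as a log2 ratio centered at zero) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, each profile is assumed to be an equal-length list already aligned to its TSS, the smoother is a simple centered moving average, and "centered at zero" is implemented by subtracting the mean log2 ratio.

```python
import math


def composite(profiles):
    """Average signal per aligned position across equal-length TSS profiles."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def smooth(values, half_width):
    """Centered moving average over i-half_width..i+half_width (clipped at edges)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half_width): i + half_width + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def centered_log2_ratio(knockdown, control):
    """log2(knockdown/control) per position, shifted so the mean is zero.

    Assumption: mean-centering; the caption only says the ratio is
    "scaled to center at zero".
    """
    ratios = [math.log2(k / c) for k, c in zip(knockdown, control)]
    mean = sum(ratios) / len(ratios)
    return [r - mean for r in ratios]


# Toy data: two genes, three aligned positions each (made-up values).
knockdown_profiles = [[4.0, 8.0, 2.0], [4.0, 8.0, 2.0]]
control_profiles = [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]

kd = composite(knockdown_profiles)
ctrl = composite(control_profiles)
print(centered_log2_ratio(kd, ctrl))
```

On real tiling-array data the positions would be probe coordinates relative to each TSS, and the smoothing width would be chosen to match the probe spacing.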
We next examined whether PROMPTs were sense or antisense relative to the mRNA produced from the downstream positioned genes. Orientation-specific RT-qPCR performed on RNA from either hRrp40-depleted or control cells demonstrated that, regardless of directional preference, both sense and antisense transcripts were detectable in PROMPT regions (Fig. 2B). In the presence of actinomycin D, which inhibits spurious synthesis of potential second-strand cDNA artifacts (8), this bidirectionality of PROMPTs was still observed (fig. S6). Moreover, both sense and antisense RNAs were stabilized to a similar extent by hRrp40 depletion (Fig. 2B), demonstrating that both species are exosome substrates. When array data were aligned to the TSSs of PROMPT regions where either sense or antisense RNA production predominates, they displayed patterns similar to the average PROMPT profile (fig. S7). Taken together, these data suggest a complex pattern of RNA polymerase II (RNAPII) activity in either orientation upstream of individual gene promoters. This observation was supported by nonexhaustive rapid amplification of cDNA ends (RACE) analyses of eight PROMPT regions, which often revealed multiple 5' and 3' ends (fig. S8).
To investigate the requirements for transcription upstream of promoters, we transiently transfected HeLa cells with a plasmid containing the β-globin gene under control of the strong cytomegalovirus promoter (pCMV) that is preceded by 2.2 kb of bacteriophage DNA (Fig. 2C). This resulted in transcript production from the bacteriophage DNA, demonstrating that PROMPT-like transcription can be initiated independent of the underlying DNA sequence. Transcripts arising from the bacteriophage DNA region cannot be read-through products from transcription around the plasmid because β-globin transcript levels reach background immediately downstream of the transcription termination site. Again, 5'- and 3'-RACE analyses were employed to map some transcription start and end points, which substantiated the observation of dynamic and complex RNAPII activity in the region (fig. S9). Deletion of the CMV promoter resulted in the concomitant elimination of PROMPT and β-globin gene transcription (Fig. 2C and fig. S9). Thus, the generation of transcripts upstream of an active gene appears to depend on the gene promoter.
To further characterize the transcriptional activity and its origin in PROMPT regions, we compared PROMPT patterns to RNAPII occupancy, transcription factor binding, and chromatin modifications using public data sets generated by the ENCODE project (table S2). In two representative examples, the PROMPT region is covered by markers of active transcription, RNAPII and acetylated histone 3 (H3K9ac), whereas the transcription initiation factor TAF1 peaks at the TSS (Fig. 3A). The generality of this observation was examined by creating composite profiles of the 64 low-complexity regions encompassing PROMPT and TSS sequences. PROMPTs generally overlap with RNAPII, marks of active chromatin, and DNase hypersensitive sites (9, 10), but not with peaks of transcription initiation factors; e.g., TAF1 or E2F1 (10, 11) (Fig. 3B and fig. S10). Although this reinforces the concept of substantial transcription activity upstream of bona fide genes, the TSS-restricted localization of transcription initiation factors supports our conclusion from the CMV and promoter-deleted (ΔCMV) plasmids and argues against the presence of an independent PROMPT promoter.
Fig. 3. PROMPT regions are actively transcribed. (A) Details of transcript levels from this study compared with previously published ChIP-chip data for PROMPT and 5' regions of two representative genes. Genomic coordinates are shown on top in numbers of base pairs. (B) Composite profiles of RNA stabilization in the PROMPT regions of 64 low-complexity TSSs displayed as in Fig. 2A and compared with the indicated data sets.
A link between transcriptional activity in PROMPT and gene regions is further supported by scatter plots showing a strong positive correlation between total average RNAPII chromatin immunoprecipitation (ChIP) signal within the first 1.5 kb up- and downstream of all 1594 ENCODE TSSs (Fig. 4A). This relation is also evident from raw RNA expression data from the hRrp40 depletion experiment (Fig. 4B). With slopes of up to 0.7, these plots indicate that transcription activity in the PROMPT region is comparable to that in the beginning of the gene.
Fig. 4. Overall correlation of PROMPT- and gene-expression levels. (Left) Scatter plot of RNAPII distribution as measured by ChIP-chip over all 1594 TSSs in the ENCODE region (data taken from GEO, accession number GSE6391). Data were integrated over 1.5 kb before (y axis, "PROMPT") and after (x axis, "Gene Start") each TSS and plotted against each other. The slope of the linear regression is 0.68 with a P value of 10⁻³⁰⁰ (t test, product-moment correlation) and an r² value of 0.61 [degrees of freedom (df) = 1511]. (Right) Scatter plot of single-channel RNA microarray signals from hRrp40 siRNA-treated cells created as above with the exception that, in the gene, only data corresponding to exonic DNA were used to remove exon/intron biases (fig. S3). Statistical values are slope = 0.45, P value < 10⁻¹³⁷, and r² = 0.39 (df = 1420).
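The kind of analysis summarized in Fig. 4 (signal integrated over a fixed window on each side of every TSS, then a least-squares regression of the upstream "PROMPT" value on the downstream "Gene Start" value) can be sketched as below. The function names, the list-based signal representation, and the toy numbers are all hypothetical; the sketch shows only how a slope and r² of this kind are obtained, not the authors' actual code or data.

```python
def integrate_flanks(signal, tss_index, width):
    """Sum the signal over `width` positions before and after the TSS index."""
    upstream = sum(signal[max(0, tss_index - width): tss_index])
    downstream = sum(signal[tss_index: tss_index + width])
    return upstream, downstream


def linear_fit(x, y):
    """Ordinary least-squares slope of y on x and the squared Pearson correlation."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, r_squared


# Toy example: x = gene-start signal, y = PROMPT-region signal (made-up values).
gene_start = [1.0, 2.0, 3.0, 4.0]
prompt = [0.5, 1.0, 1.5, 2.0]
slope, r_squared = linear_fit(gene_start, prompt)
print(slope, r_squared)
```

A slope below 1 with a high r², as in this toy case, corresponds to PROMPT-region signal that scales with, but stays lower than, the signal at the start of the gene.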
Given their ubiquitous nature, do PROMPTs have a function? A few noncoding RNAs that have been reported to exert regulatory functions are located in potential PROMPT regions (12, 13). Likewise, a noncoding RNA directly upstream of the sphingosine-kinase 1 (SPHK1) gene, which affects the methylation status of CpG dinucleotides within its promoter (13), is also stabilized in hRrp40 knockdown cells (fig. S11A). It is therefore interesting to note that the methylation level of some CpG dinucleotides within the SPHK1 promoter region is increased upon hRrp40 depletion (fig. S11B). That PROMPTs more generally may affect promoter methylation is further indicated by the finding that for genes with similar expression levels, PROMPT levels are generally higher around promoters with a high CpG score (fig. S11C).
PROMPTs may arise wherever open chromatin presents itself, possibly as the byproduct of an as yet unexplored aspect of the mechanism of gene transcription. Evolution, being an opportunistic force, may then have co-opted at least some of these PROMPTs as part of regulatory mechanisms (fig. S11). One such molecular system could involve the control of CpG (de)methylation, an as of now poorly understood process (14). An alternative, but not mutually exclusive, possibility is that PROMPT transcription may have a more general function by providing reservoirs of RNAPII molecules, which can facilitate rapid activation of the downstream gene, and/or by serving to alter chromatin structure. Clearly, the generality of the PROMPT phenomenon hints at a more complex regulatory chromatin structure around the TSS than was previously anticipated.
15. We thank G. Pruijn for antibodies; D. Libri, A. Sandelin, and D. Schubeler for comments on the manuscript; and D. Riishøj and K. Jürgensen for technical assistance. This work was supported by the Danish National Research Foundation and the Danish Natural Science Research Council. Microarray data have been submitted to the Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/) under accession number GSE12431.
Received for publication 1 August 2008. Accepted for publication 12 November 2008.