
Science 3 May 2002:
Vol. 296. no. 5569, pp. 916 - 919
DOI: 10.1126/science.1068597


Large-Scale Transcriptional Activity in Chromosomes 21 and 22
Philipp Kapranov, Simon E. Cawley, Jorg Drenkow, Stefan Bekiranov, Robert L. Strausberg, Stephen P. A. Fodor, and Thomas R. Gingeras

Supplementary Material

S1. The following human cell lines were used in the study: A-375 (melanoma, ATCC no. CRL-1619); CCRF-CEM (acute lymphoblastic leukemia; T lymphoblast); COLO 205 (colorectal adenocarcinoma, ATCC no. CCL-222); FHs 738Lu (normal fetal lung fibroblasts, ATCC no. HTB-157); HepG2 (hepatoblastoma, ATCC no. HB-8065); Jurkat (acute T cell leukemia); NCCIT (teratocarcinoma, ATCC no. CRL-2073); NIH:OVCAR-3 (ovarian adenocarcinoma, ATCC no. HTB-161); PC3 (prostate adenocarcinoma, ATCC no. CRL-1435); SK-N-AS (neuroblastoma, ATCC no. CRL-2137); U-87 MG (astrocytoma, ATCC no. HTB-14). Jurkat and CCRF-CEM were obtained from Dr. Jacques Corbeil, Center for AIDS Research and Veterans Medical Research Foundation, University of California San Diego.

Separation of the RNAs present in the nucleus and cytoplasm was evaluated using commercially available high-density oligonucleotide arrays. Total RNA derived from cytosolic or nuclear fractions of each cell line was converted into single-stranded cDNA using random primers, fragmented with DNAse I and end-labeled with terminal transferase as described below, without the second-strand cDNA synthesis. This cDNA was hybridized to Affymetrix HG_U-95A arrays in duplicate experiments. Probe set 38446_at, selected to interrogate the X-chromosome inactivation gene (Xist) and present on the Hu 95A arrays (Affymetrix), was used to test the quality of the nuclear/cytoplasmic separation techniques. Analysis of nuclear and cytoplasmic RNA fractions from the Jurkat, CCRF-CEM, SK-N-AS, A-375, HepG2, NCCIT and FHs 738Lu cell lines indicated that expression of the Xist gene was detected only in the nuclear RNA fraction of the female-derived CCRF-CEM, SK-N-AS and A-375 cell lines. Expression of this gene was not detected in the nuclear fraction of male-derived cell lines, nor in the cytoplasmic RNAs obtained from any of the cell lines (data not shown). In addition, a number of cDNAs of unknown function containing LINE, HERV and other types of repeats as well as unique regions were frequently detected in the nuclear, but not the cytosolic, fraction in various cell lines (data not shown). Furthermore, separation of the nuclear and cytoplasmic RNA compartments allowed for the enrichment of low copy number RNAs. After the RNA enrichment that accompanied nuclear and cytoplasmic fractionation, expression of approximately 10-20% more of the total genes could be detected.

Total cytosolic RNA and its polyA+ fraction were prepared using RNeasy and Oligotex kits (Qiagen) following the manufacturer's instructions. mRNA was mixed with random hexamers (83.3 ng/µg of mRNA; Life Technologies) and the bacterial control transcripts (see below) and subjected to the following cycling conditions in a PE GeneAmp9600 PCR System: 70°C for 10 min and a 10 min ramp to 25°C, after which the 5x Superscript II First Strand buffer (Life Technologies), DTT and four dNTPs were added to final concentrations of 1x, 10 mM and 0.5 mM, respectively, followed by a 10 min incubation at 25°C. At this point, Superscript II RTase was added (200 Units/µg of mRNA; Life Technologies), followed by a 10 min ramp to 42°C and a 60 min incubation at 42°C. The volume of the first strand cDNA synthesis reaction was 20 µl for every 3 µg of mRNA. After inactivation of the RTase for 15 min at 70°C, the first strand cDNA was split into 20 µl aliquots and used as a template for the second strand cDNA synthesis using conditions described in the SuperScript Choice System for cDNA Synthesis manual (Life Technologies). After the second strand synthesis reaction, the mRNA template was degraded using a combination of RNAse A/T1 cocktail (Ambion) and RNAse H (Life Technologies). The second-strand synthesis reactions from each cell line were pooled, purified using the QIAquick PCR purification kit (Qiagen), ethanol-precipitated and subjected to a limited DNAse I (Epicenter Technologies) digest to generate fragments of 50-100 bp.
The cDNA was labeled in 70 µl using 100 units of terminal transferase (Roche) and 71.4 µM of Biotin-N6-ddATP for 2 hrs at 37°C, after which it was directly used for hybridization in the following mixture: 30 mM MES (Sigma M-2933); 74 mM MES·Na (Sigma M-3058); 3 M tetramethylammonium chloride (Sigma T-3411); 0.1 mg/ml herring sperm DNA (Life Technologies); 0.02% Triton X-100; 1X Eukaryotic Hybridization Controls (Affymetrix); 0.05 nM biotinylated control oligos 948 or 213 (Affymetrix). Typically, 1-2 µg of double-stranded labeled cDNA was used per hybridization on arrays that contained features measuring 14 x 14 microns. The chips were hybridized for 16-18 hours at 45°C. Washing was done using the antibody amplification protocol as described in the Affymetrix Expression Analysis Technical Manual. Chips were scanned on an Affymetrix Gene Array scanner using the highest PMT settings and a 2 µm pixel size. Each sample was hybridized in triplicate.

Since the cDNAs copied from RNA from this sub-fraction were labeled and used as targets for the arrays, careful attention was paid to the removal of possible genomic DNA contamination. As a control, cytosolic polyA+ RNA from the NCCIT and COLO 205 cell lines was treated with RNase-free DNAse I (2 Units/µg of mRNA; Roche) in the presence of 10 mM Tris-acetate (pH 7.5), 10 mM magnesium acetate, 50 mM potassium acetate and 1 Unit/µl ANTI-RNAse (Ambion) for 1 hour at 37°C. As a control for the DNAse I digest, the reaction was spiked with control DNAs (1 ng/µg of mRNA) corresponding to the plasmids containing the following segments from each of three bacterial controls: LYS 328-1344, PHE 2016-3331, THR 247-2231 (see below for a full description of these control genes). After the DNAse I digest, the mRNA was purified by phenol/chloroform extraction and ethanol precipitation and used for cDNA synthesis and hybridization to the Chrom21_22 and DGCR arrays as described above. The number of probes hybridizing within the known exons and outside of annotated regions was calculated and found not to differ significantly from that of the corresponding untreated samples (data not shown). As an additional control for genomic DNA contamination, total cytosolic RNA and its polyA+ fraction were pre-treated with DNAse-free RNAse (Roche) prior to RT-PCR reactions.


S2. The probe sequences were selected to minimize potential cross hybridization with characterized expressed transcripts, with duplicated sequences of chromosomes 21 and 22 and with repeated/low complexity sequences. To accomplish these goals, oligonucleotide probe sequences were selected using empirically based rules developed at Affymetrix and pruned against the Unigene 95 database and the chromosome 21 and 22 sequences for potential full or partial homologues. Candidate probe sequences residing in known repeat/low complexity regions were identified using Repeat Masker (http://repeatmasker.genome.washington.edu/RM/RepeatMasker.html) and rejected. Probe pairs on the Chrom 21_22 array interrogated the non-repeat genomic sequences at an average spacing of 35 bases. Further protection against cross hybridization of unintended target transcripts came from conservative use of the MM values, which are intended to measure cross hybridization levels (S3).


S3. A probe pair with background-subtracted perfect match intensity PM and mismatch intensity MM is called positive if the ratio PM/MM exceeds a ratio threshold R and the difference PM-MM exceeds a difference threshold D; otherwise it is termed negative. Varying the thresholds yields different levels of sensitivity and specificity. Maps were generated using R in the range 1.1 through 1.5 and D in the range 4Q through 12Q, where Q, the pixel variation within features belonging to the 2nd percentile value of probe intensities for the chip, is an estimate of noise variation.
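
The calling rule in S3 can be sketched in a few lines of Python. This is an illustration only, not the authors' software: the function name, the argument names and the example value of Q are hypothetical. The ratio test is written as PM > R*MM, which matches PM/MM > R whenever MM is positive; how non-positive background-subtracted MM values were handled is not stated in the text.

```python
def call_positive(pm, mm, r, d):
    """Return True when a probe pair passes both the ratio and difference tests.

    pm, mm: background-subtracted perfect match and mismatch intensities.
    r: ratio threshold R (1.1 through 1.5 in S3).
    d: difference threshold D (4Q through 12Q in S3).
    """
    # pm > r * mm is the same test as pm / mm > r when mm > 0,
    # and it avoids dividing by zero (an assumption of this sketch).
    return pm > r * mm and (pm - mm) > d


# Hypothetical noise estimate Q for one chip; D is expressed in units of Q.
q = 10.0
print(call_positive(pm=300.0, mm=100.0, r=1.3, d=12 * q))   # passes both tests
print(call_positive(pm=120.0, mm=100.0, r=1.3, d=12 * q))   # fails the ratio test
```

Requiring both tests means a pair with a high ratio but a small absolute difference (near the noise floor) is still called negative, and likewise for a large difference with a ratio close to 1.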


S4. Maps were improved by taking into account local probe behavior in a heuristic two-step process. In the first pass, runs of negative probe pairs in between positive probe pairs were reclassified as positive if the negative run was at most maxgap bases in length. In the second pass, runs of positive probe pairs of length less than minrun bases were reclassified as negative. The effect of these steps is to reduce the false negative and false positive rates. The values of maxgap and minrun used were 5 and 20, respectively.
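
A minimal Python sketch of this two-pass heuristic follows. It is not the authors' code, and it rests on one assumption the text does not settle: the length of a run in bases is taken as the distance from the first to the last probe position in the run, plus one. The names runs, span_in_bases and smooth_calls are hypothetical.

```python
def runs(calls):
    """Yield (start, end_exclusive, value) for each run of identical calls."""
    start = 0
    for i in range(1, len(calls) + 1):
        if i == len(calls) or calls[i] != calls[start]:
            yield start, i, calls[start]
            start = i


def span_in_bases(positions, start, end):
    # Assumption: run length = last probe position - first probe position + 1.
    return positions[end - 1] - positions[start] + 1


def smooth_calls(positions, calls, maxgap=5, minrun=20):
    """Apply the two passes of S4 to probe calls sorted by genomic position."""
    calls = list(calls)
    # Pass 1: fill short negative runs that lie between two positive runs.
    for start, end, value in list(runs(calls)):
        flanked = start > 0 and end < len(calls)
        if not value and flanked and span_in_bases(positions, start, end) <= maxgap:
            calls[start:end] = [True] * (end - start)
    # Pass 2: drop positive runs shorter than minrun bases.
    for start, end, value in list(runs(calls)):
        if value and span_in_bases(positions, start, end) < minrun:
            calls[start:end] = [False] * (end - start)
    return calls


# Example with one probe per base: a 3-base gap is filled, a 3-base island is removed.
positions = list(range(100, 140))
calls = [True] * 10 + [False] * 3 + [True] * 12 + [False] * 10 + [True] * 3 + [False] * 2
smoothed = smooth_calls(positions, calls)
```

The order of the passes matters: filling gaps first lets two short positive runs separated by a small gap merge into one run that can survive the minrun filter.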


S5. By fixing the R and D thresholds for any cell line experiment it was possible to calculate false positive (FP), specificity (Sp) and sensitivity (Sn) rates. Bacterial RNA transcripts containing specific sequence deletions were added to each polyA+ RNA sample. The following Bacillus subtilis genes/operons were used to estimate the FP rate: lys (LYS, 1612 bp, Acc. No. X17013); spo0B, obg, pheB, pheA (PHE, 3360 bp, Acc. No. M24537); thrC, thrB (THR, 2400 bp, Acc. No. X04603); jojC-birA (DAP, 6540 bp, Acc. No. L38424); trp operon (TRP, 2525 bp, Acc. No. K01391: bp 1883-4404). The entire sequences of these loci were tiled on the DGCR chip. For the Chrom 21_22 arrays, probes were picked approximately every 30 bp from the following regions of each gene/locus: LYS 328-1344; PHE 2016-3331; THR 247-2231; DAP 1357-3196; TRP 1-2517, using the same probe selection rules as for the rest of the genomic sequences. A polyadenylated transcript corresponding to a smaller portion of each of the five loci was generated to evaluate the sensitivity of the assay, while the bacterial regions outside of the spiked regions were used to determine the FP rates. The regions of each gene/locus corresponding to the spiked transcripts are: LYS 817-1344; PHE 2852-3331; THR 1221-2231; DAP 1357-2493; TRP 1-1261. The control bacterial transcripts were spiked into human polyA+ RNA preparations before the cDNA synthesis procedure at the following concentrations (copies/cell): LYS and PHE, 3; THR and DAP, 10; TRP, 30, assuming 300,000 different mRNA species in a human cell and an average transcript size of 1300 nt.
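
To make the copies/cell figures concrete, the following Python sketch converts them into an approximate mass of spike per microgram of polyA+ RNA. It is a back-of-the-envelope illustration under stated assumptions, not the authors' calculation: the figure of 300,000 is read here as the number of mRNA molecules per cell, each spiked transcript is assumed to be exactly as long as its listed region (end - start + 1, ignoring any polyA tail or vector sequence), and mass is taken as proportional to nucleotide count.

```python
# Assumptions taken from S5: 300,000 mRNAs per cell, 1300 nt average length.
MRNAS_PER_CELL = 300_000
AVERAGE_MRNA_NT = 1300

# Spiked regions from S5 as (start, end, copies per cell).
SPIKES = {
    "LYS": (817, 1344, 3),
    "PHE": (2852, 3331, 3),
    "THR": (1221, 2231, 10),
    "DAP": (1357, 2493, 10),
    "TRP": (1, 1261, 30),
}


def spike_pg_per_ug(copies_per_cell, spike_nt):
    """Picograms of spike per microgram of polyA+ RNA, by nucleotide mass ratio."""
    cell_mrna_nt = MRNAS_PER_CELL * AVERAGE_MRNA_NT
    mass_fraction = copies_per_cell * spike_nt / cell_mrna_nt
    return mass_fraction * 1e6  # 1 microgram = 1e6 picograms


def spike_table():
    """Map each control name to its estimated pg of spike per microgram of mRNA."""
    return {
        name: spike_pg_per_ug(copies, end - start + 1)
        for name, (start, end, copies) in SPIKES.items()
    }


for name, picograms in spike_table().items():
    print(f"{name}: {picograms:.2f} pg per microgram of polyA+ RNA")
```

Under these assumptions the spikes amount to picograms per microgram of polyA+ RNA, which is why they serve as a test of sensitivity for rare transcripts.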

False negative (FN) and sensitivity (Sn) rates for these array experiments were estimated using the present segments of the spiked bacterial RNA control transcripts and, for the DGCR array, exon sequences determined to be present in the polyA+ RNA samples extracted from each cell line by reverse transcriptase-mediated PCR (RT-PCR) amplification assays. A total of 52/99 exon regions were detected as being present in the extracted polyA+ RNA from each of three cell lines (A-375, HepG2, SK-N-AS). From these experiments, it was also possible to determine FP, Sn and Sp values for each cell line for a set of fixed R and D values (6). For the array interrogating each base in the chromosome 22 DGCR, Table S1A shows that at a 5% FP rate Sn ranged from 47-65% for the bacterial control sequences and from 15-26% for the human exonic RNA sequences. Table S1B provides similar data for the Chrom 21_22 array experiments at fixed R and D values. These data highlight the point that using the bacterial control sequences to evaluate Sn and Sp values may yield higher sensitivity estimates than using human exonic sequences. The differences between the bacterial and human Sn values can be attributed to differences in concentration between the bacterial and human targets, and to differences in the nucleotide composition and sequence of the two types of controls (human and bacterial) in terms of their interaction with competing RNA found in human cells.


S6. Maps with a given target false positive rate were generated by fixing the maxgap, minrun and D values, then adjusting R over the range 1.1 to 1.5 until the target false positive rate was reached in the bacterial controls. If the target rate was not achieved over the specified range of R, the value of R giving the closest false positive rate was used.
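
The search over R can be sketched as follows. This is a simplified illustration under stated assumptions rather than the authors' procedure: R is scanned on a grid with a hypothetical step of 0.01, the R whose false positive rate is closest to the target is returned (which also covers the case where the target is never reached), and the maxgap/minrun smoothing of S4 is left out for brevity. The function and argument names are hypothetical.

```python
def false_positive_rate(absent_pairs, r, d):
    """Fraction of control probe pairs in absent regions that are called positive.

    absent_pairs: list of (pm, mm) intensities for probes in the bacterial
    control regions that were not spiked into the sample.
    """
    positives = sum(1 for pm, mm in absent_pairs if pm > r * mm and (pm - mm) > d)
    return positives / len(absent_pairs)


def pick_ratio_threshold(absent_pairs, d, target_fp=0.05,
                         r_min=1.1, r_max=1.5, step=0.01):
    """Return the R on the grid whose FP rate is closest to target_fp.

    On ties, the smallest such R is returned (min keeps the first minimum).
    """
    n_steps = round((r_max - r_min) / step)
    candidates = [round(r_min + i * step, 10) for i in range(n_steps + 1)]
    return min(
        candidates,
        key=lambda r: abs(false_positive_rate(absent_pairs, r, d) - target_fp),
    )


# Made-up control data: 20 absent-region probe pairs with D fixed at 0.
absent = [(100.0, 100.0)] * 17 + [(125.0, 100.0)] * 2 + [(200.0, 100.0)]
chosen_r = pick_ratio_threshold(absent, d=0.0)
```

In the made-up data, raising R to 1.25 removes the two borderline pairs and leaves 1 of 20 called positive, which meets the 5% target.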


Table S1: Sensitivity and Specificity Estimates

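A. DGCR (22q11.2)¹

| Cell line | BacSp2² | BacSn³ | HumSn⁴ | pct.Pos⁵ | pct.PosUnq⁶ |
| --- | --- | --- | --- | --- | --- |
| A-375 | 0.857 | 0.487 | 0.167 | 21.72 | 14.561 |
| CCRF-CEM | 0.817 | 0.613 | 0.221 | 20.642 | 11.077 |
| COLO 205 | 0.820 | 0.652 | 0.185 | 18.772 | 8.279 |
| FHs 738Lu | 0.775 | 0.473 | 0.261 | 22.872 | 14.499 |
| HepG2 | 0.795 | 0.555 | 0.240 | 23.203 | 15.82 |
| Jurkat | 0.783 | 0.542 | 0.153 | 20.064 | 9.876 |
| NCCIT | 0.804 | 0.545 | 0.162 | 21.664 | 9.584 |
| NIH:OVCAR-3 | 0.785 | 0.504 | 0.243 | 20.721 | 10.908 |
| PC3 | 0.792 | 0.559 | 0.161 | 17.35 | 6.765 |
| SK-N-AS | 0.873 | 0.259 | 0.109 | 16.708 | 9.676 |
| U-87 MG | 0.822 | 0.641 | 0.187 | 18.76 | 7.335 |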

1. Estimates were made at a ~5% FP rate, with the exception of A-375 (FP=3%) and SK-N-AS (FP=1.4%). For each cell line, D was set to 12Q and R was selected to achieve the target FP; R ranged from 1.17 to 1.47 [S3-S6 (6)].
2. Bacterial specificity.
3. Bacterial sensitivity.
4. Human sensitivity.
5. Percent positive probes in the entire 360 kb DGCR.
6. Percent positive probes in non-repetitive sequences of the 360 kb DGCR.

For the bacterial controls: the FP rate was calculated as the proportion of probes called positive in the regions of the bacterial controls absent from the sample; BacSp2 was calculated from the formula TP/(TP+FP), where TP is the number of positive probes in the present regions of the bacterial controls and FP is the number of positive probes in the deleted regions of the bacterial controls; and BacSn was calculated from TP/(TP+FN), with FN being the number of negative probes in the present regions of the bacterial controls. For the human DGCR region: HumSn is the fraction of probes called positive within the 52 exons or parts of exons corresponding to the known genes (DGCR6, DGCR2 exons 6-10, DGS-I, DGS-H, DGS-A, SLC25A1 exons 1-4 and Clathrin) and one validated locus, RP8, shown to be present in the human cell lines using RT-PCR. The exact coordinates and descriptions of the regions used to calculate the HumSn rate can be found at http://www.netaffx.com/transcriptome/.
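
The three bacterial control statistics defined in this footnote can be written out as a short Python function. This is a sketch with hypothetical names; the count tn (negative probes in the absent regions) does not appear in the text and is introduced here only so that the FP rate can be expressed as a proportion. Note that TP/(TP+FP), the quantity the footnote calls BacSp2, is the formula more commonly known as precision or positive predictive value.

```python
def control_rates(tp, fn, fp, tn):
    """Compute the bacterial control statistics from probe counts.

    tp: positive probes in present (spiked) regions.
    fn: negative probes in present (spiked) regions.
    fp: positive probes in absent (deleted) regions.
    tn: negative probes in absent (deleted) regions (name assumed here).
    """
    return {
        "BacSp2": tp / (tp + fp),   # as defined in the footnote
        "BacSn": tp / (tp + fn),    # sensitivity
        "FP_rate": fp / (fp + tn),  # proportion of absent-region probes called positive
    }


# Made-up counts for illustration only.
rates = control_rates(tp=80, fn=20, fp=5, tn=95)
```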

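B. Chromosomes 21-22¹

| Cell line | BacSp2 | BacSn | BacFp | pct.Pos | pct.PosExn |
| --- | --- | --- | --- | --- | --- |
| A-375 | 0.941 | 0.711 | 0.046 | 0.062 | 0.272 |
| CCRF-CEM | 0.88 | 0.861 | 0.121 | 0.115 | 0.44 |
| COLO 205 | 0.858 | 0.864 | 0.148 | 0.121 | 0.445 |
| FHs 738Lu | 0.874 | 0.735 | 0.117 | 0.094 | 0.341 |
| HepG2 | 0.886 | 0.859 | 0.114 | 0.099 | 0.386 |
| Jurkat | 0.926 | 0.742 | 0.061 | 0.073 | 0.335 |
| NCCIT | 0.904 | 0.787 | 0.088 | 0.086 | 0.341 |
| NIH:OVCAR-3 | 0.86 | 0.817 | 0.139 | 0.107 | 0.433 |
| PC3 | 0.853 | 0.829 | 0.151 | 0.145 | 0.447 |
| SK-N-AS | 0.949 | 0.646 | 0.036 | 0.059 | 0.234 |
| U-87 MG | 0.839 | 0.854 | 0.17 | 0.127 | 0.44 |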

1. Thresholds were fixed for all cell lines at R=1.3 and D=12Q (17). The BacFp rate varies; see the footnote to Table S1A.


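Table S2: RT-PCR Verification of Array-Detected Transcripts¹

| Region Number | Name | PCR start² | PCR end² | PCR length | RT-PCR | Library³ | Other Locations⁴ | Accession # |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Chr21-1 | 41484371 | 41484656 | 285 | Yes | NIH:OVCAR-3 | Unique on 21 | BM873316 |
| 2 | Chr21-4 | 41539789 | 41540256 | 467 | Yes | N/D | Unique on 21 | BM873318 |
| 3 | Chr21-5-2 | 21333394 | 21334037 | 643 | Yes | HepG2 | Chr. 11, 18 | BM873319 |
| 4 | Chr21-6 | 21320916 | 21321771 | 855 | Yes | HepG2 | Chr. 5, 14 | BM873320 |
| 5 | Chr21-7 | 21471231 | 21471568 | 337 | Yes | HepG2 | Unique | BM873321 |
| 6 | Chr21-8 | 11773874 | 11774085 | 211 | Yes | HepG2 | Chr. 13, 17, 18 | BM873322 |
| 7 | Chr21-9 | 11604183 | 11604877 | 694 | Yes | HepG2 | Dup. on 21, Chr. 18 | BM873323 |
| 8 | Chr21-10 | 11538194 | 11538927 | 733 | Yes | HepG2 | Dup. on 21, Chr. 2 | BM873317 |
| 9 | Chr21-11 | 19259989 | 19260457 | 468 | | HepG2 | Unique | BM890561 |
| 10 | Chr21-12-1 | 14788711 | 14789087 | 376 | | HepG2 | Unique | BM890562 |
| | Chr21-12-2 | 14788871 | 14789252 | 381 | | HepG2 | Unique | BM890563 |
| 11 | Chr21-13 | 16969428 | 16969683 | 255 | | HepG2 | Unique | BM890564 |
| 12 | Chr21-14 | 17912096 | 17912477 | 381 | | HepG2 | Multiple, strong similarity to ribosomal L37 gene | BM890565 |
| 13 | Chr21-15 | 39127983 | 39129061 | 1078 | Yes | HepG2 | Unique | BM890566 |
| 14 | Chr21-16 | 13527346 | 13527870 | 524 | Yes | HepG2 | Unique | BM890567 |
| 15 | Chr21-17-1 | 24997648 | 24998018 | 370 | | HepG2 | Unique | BM890568 |
| | Chr21-17-2 | 24997898 | 24998325 | 427 | | HepG2 | Unique | BM890569 |
| 16 | Chr21-18 | 26632856 | 26633045 | 189 | | HepG2 | Unique | BM890570 |
| 17 | Chr21-19 | 23364973 | 23365515 | 542 | Yes | N/D | Unique | BM890571 |
| 18 | Chr21-20 | 21186011 | 21186690 | 679 | Yes | HepG2 | Unique | BM890572 |
| 19 | Chr21-21 | 19331943 | 19332313 | 370 | Yes | N/D | Unique | BM890573 |
| 20 | Chr21-22 | 18482854 | 18483206 | 352 | | HepG2 | Unique | BM890574 |
| 21 | Chr21-23 | 15532081 | 15532672 | 591 | | HepG2 | Unique | BM890575 |
| 22 | Chr21-24 | 17359969 | 17361292 | 1323 | | HepG2 | Unique | BM890576 |
| 23 | Chr21-25 | 17611334 | 17611677 | 343 | | HepG2 | Unique | BM890577 |
| 24 | Chr21-26 | 18143485 | 18143941 | 456 | | HepG2 | Unique | BM890578 |
| 25 | Chr21-27-1 | 18706335 | 18706620 | 285 | | HepG2 | Unique | BM890579 |
| | Chr21-27-2 | 18706600 | 18706858 | 258 | | HepG2 | Multiple | BM890580 |
| 26 | Chr21-28 | 18841437 | 18841903 | 466 | | HepG2 | Unique | BM890581 |
| 27 | Chr21-29 | 19630931 | 19631365 | 434 | | HepG2 | Unique | BM890582 |
| 28 | Chr21-30 | 38469242 | 38469840 | 598 | | HepG2 | Unique | BM890583 |
| 29 | Chr21-31 | 39235731 | 39235996 | 265 | | HepG2 | Unique | BM890584 |
| 30 | Chr22 DGCR-1-1 | 11463 | 11753 | 194 | Yes | N/T | Dup. on 22 | BM873324 |
| | Chr22 DGCR-1-2 | 15486 | 15973 | 487 | Yes | PC-3 | Dup. on 22 | BM873325 |
| | Chr22 DGCR-1-3 | 16627 | 17211 | 584 | Yes | N/D | Dup. on 22 | BM873326 |
| 31 | Chr22 DGCR-2-1 | 164261 | 164831 | 570 | Yes | N/D | Unique on 22 | BM873327 |
| | Chr22 DGCR-2-2 | 162186 | 163222 | 1036 | Yes | N/D | Unique on 22 | BM873328 |
| | Chr22 DGCR-2-3 | 165841 | 166370 | 529 | Yes | N/D | Unique on 22 | BM873329 |
| 32 | Chr22 DGCR-3-2 | 277304 | 277569 | 265 | Yes | NIH:OVCAR-3 and HepG2 | Unique on 22 | BM873330 |
| 33 | Chr22 DGCR-4-1 | 80480 | 80863 | 383 | Yes | N/D | Dup. on 22 | BM873331 |
| 34 | Chr22-5 | 37645595 | 37646222 | 627 | Yes | HepG2 | Unique | BM890585 |
| 35 | Chr22-6 | 37973605 | 37973908 | 303 | | HepG2 | Unique | BM890586 |
| 36 | Chr22-7-1 | 30078531 | 30078780 | 249 | | HepG2 | Unique | BM890587 |
| | Chr22-7-2 | 30078760 | 30079043 | 283 | | HepG2 | Unique | BM890588 |
| | Chr22-7-3 | 30079458 | 30080259 | 801 | | HepG2 | Unique | BM890589 |
| 37 | Chr22-8 | 34042605 | 34043192 | 587 | | HepG2 | Contains Alu and LTR, non-repetitive sequence unique | BM890590 |
| 38 | Chr22-9-1 | 34198948 | 34199302 | 354 | Yes | HepG2 | Unique | BM890591 |
| | Chr22-9-2 | 34199684 | 34200120 | 436 | | HepG2 | Unique | BM890592 |
| 39 | Chr22-10 | 23151780 | 23152082 | 302 | | HepG2 | Unique | BM890593 |
| 40 | Chr22-11 | 31838163 | 31838702 | 539 | | HepG2 | Unique | BM890594 |
| 41 | Chr22-12 | 24084616 | 24084940 | 324 | | HepG2 | Unique | BM890595 |
| 42 | Chr22-13 | 29463017 | 29463153 | 136 | | HepG2 | Unique | BM890596 |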

1. Several PCR primer pairs were designed for each selected region (locus) called positive by the chip and used to query cytosolic polyA+ RNA samples from the cell lines used in the mapping experiments or cDNA libraries prepared from cytosolic polyA+ RNA. Primers were typically picked at or near positive probes or contigs (in the case of the DGCR region), with a distance between forward and reverse primers on the order of 200-1000 bp. Typically, 3 to 15 primer pairs were designed for each locus, the size of which averaged ~1.4 kb. For the DGCR region (Chr22 DGCR), the 5% FP maps (see on-line supplemental References and Notes) were used for primer selection. For the Chromosome 21 regions (Chr21-11 through Chr21-23), the HepG2 cell line map with R=1.3 and D=12 was used. For all the remaining regions, a combined map of all eleven cell lines obtained with R=1.3 and D=12 was used. Coordinates of the region(s) within each interrogated locus are shown where positive product(s) were detected either in the cytosolic polyA+ samples using RT-PCR or in the cDNA libraries from the indicated cell lines.
2. The start and end of each such region are shown either in the coordinates of the sequence of the DGCR region tiled on the chip, for the Chr22 DGCR loci, or in the coordinates of the October 2000 freeze of the Golden Path sequence, for the Chr21 regions.
3. Positive PCR products were detected in the cDNA libraries made from the indicated cell lines.
4. Additional locations in the genome having sequences similar to the (RT-)PCR products, as shown by a BLAT search (http://genome.cse.ucsc.edu/cgi-bin/hgBlat). In all cases in which a homologue was identified elsewhere in the genome, the (RT-)PCR products were assigned to the sites interrogated on chromosomes 21 and 22 on the basis of SNPs specific to the chromosome 21 or 22 loci.

N/T: not tested; N/D: not detected.


Figure S3: Northern hybridization analyses of polyA+ cytosolic RNA obtained from 7 of the 11 cell lines (1: NIH:OVCAR-3; 2: Jurkat; 3: HepG2; 4: FHs 738Lu; 5: COLO 205; 6: CCRF-CEM; 7: A-375; 8: A-375 treated with DNAse I). 3-5 µg of cytosolic polyA+ RNA from each of the specified cell lines was loaded on the gel. The following DNA probes were used: (A) a cDNA derived from the Chr22 DGCR-3-2 region and represented by bp 277304-277569 of the DGCR sequence; and cDNAs spanning the entire validated regions (B) Chr22 DGCR-2-1; (C) Chr21-8 and (D) Chr22 DGCR-1-2. Each probe was labeled with [α-32P]-dCTP (Amersham) using the random hexamer labeling kit (Roche). Filters were hybridized in 0.5 M sodium phosphate buffer pH 7.2, 1% bovine serum albumin, 7% SDS at 65°C overnight. After hybridization, filters were successively washed at 65°C in 2X SSC, 0.1% SDS; 1X SSC, 0.1% SDS; and 0.3X SSC, 0.1% SDS, 15 min each wash, and exposed to X-ray film for 3 weeks.




Science. ISSN 0036-8075 (print), 1095-9203 (online)