1. Initial processing of raw data.
2. Pooled HL curves.
SUMMARY
Origins of replication are often identified through their ability to confer autonomous plasmid maintenance and are then confirmed by physical techniques such as two-dimensional (2-D) agarose gel electrophoresis. In principle, origin locations can also be inferred from the kinetics of replication of contiguous portions of the genome: a sequence that replicates before its neighbors must contain an origin of replication. The time in S phase at which a particular portion of the genome replicates can be determined using a variation of the Meselson-Stahl experiment (1). In this approach, yeast cells are grown for many generations in isotopically dense medium to label the DNA uniformly. The culture is arrested in late G1 phase and resuspended in isotopically light medium. Following release from the G1 block, the cells are further synchronized by a second arrest at the G1/S phase boundary, and upon release from this block, the cells enter a synchronous S phase. Chromosomal DNA from samples collected at different times in S phase is cut with a restriction enzyme and fractionated by cesium chloride density gradient centrifugation to separate the molecules carrying the two different density labels. These fractions are then hybridized with DNA probes for the genomic region of interest. The time at which a particular restriction fragment replicates can be determined by measuring the kinetics of its conversion from fully dense to hybrid density (1).
DETAILED DESCRIPTION
To map the dynamics of replication in the genome, culture samples were collected at eight times (0, 10, 14, 19, 25, 33, 44 and 60 minutes) in a synchronized S phase. Replicated HL (Heavy-Light) chromosomal DNA was separated from unreplicated HH (Heavy-Heavy) DNA by density gradient centrifugation. To determine the positions of the HL and HH DNA in the gradient and to assess the quality of the synchronization, the replication kinetics of a few representative restriction fragments were assessed as before (1) by slot blot hybridization, using a small portion of each gradient fraction (Fig. 1). The regions tested included those known to replicate at early (ARS305 (2)), intermediate (GAL3, adjacent to ARS1 (1)), or late (R11 (3)) times within S phase. Replication of these fragments was 75-80% complete by 60 minutes into S phase (Fig. 1). In addition, the kinetics of replication were similar to those observed previously (1-3).
Figure 1. Replication kinetics of marker sequences. To determine the positions of the HH and HL DNA in the CsCl gradient and to assess the quality of cell synchronization, slot blot hybridization was performed against gradient fractions using probes to regions that are known to replicate at early (ARS305, solid circles), intermediate (GAL3, adjacent to ARS1, open squares), or late (R11, near ARS501, open triangles) times within S phase. A high degree of synchronization was obtained, with approximately 75 to 80% of fragments from these origins entering the heavy-light fraction by 60 minutes into S phase.
For each time point, the unreplicated and newly replicated yeast chromosomal DNA was mixed with 100 ng of bacterial control DNA (see below), fragmented to an average size of about 50 base pairs with DNase I, and end-labeled with a biotinylated dideoxynucleotide. The HH and HL samples were then hybridized separately to high-density oligonucleotide arrays (Fig. 2). These arrays, which were originally designed for yeast gene expression analysis, contain 20 or more oligonucleotide probes for most annotated open reading frames in the yeast genome (157,112 different 25mer probes in total) and cover 21.8% of the non-repetitive regions of the yeast genome (4). Following hybridization, the biotinylated target was stained with a streptavidin-phycoerythrin conjugate and the arrays were then scanned to measure the fluorescence intensity at each array element. Grids were aligned to the scanned images and the hybridization intensities for each of the elements in the grid were determined by the 75th percentile method (for each element, the value was selected at which 75% of other pixel intensities, excluding outliers, were below that value) in the Affymetrix GeneChip software package.
Figure 2. Scanned image of a high-density oligonucleotide array containing probes to chromosomes I-IV. The array contains 20 or more 25mer oligonucleotide probes per gene, arranged in an order that generally reflects their position in the genome. In addition to probes designed to be perfectly complementary to the genome of yeast strain S288C, an equal number of probes containing a single base mismatch in the central region of the 25mer were synthesized at positions physically adjacent to the perfect match (PM) probe. These mismatch (MM) control probes are used for background subtraction. This array was hybridized with biotinylated heavy-light sample collected 25 minutes into S phase and stained and scanned as previously described (4). A and B show probes to regions on chromosomes III and IV that replicate at early (chromosome III) or late (chromosome IV) times within S phase. Not all probes hybridize with the same efficiency, but differences can be corrected by normalization.
Since the data for a single time point were collected from scanned images of five different arrays (A-E), the hybridization intensities were normalized using bacterial control genes as well as yeast genes (ACT1 and TBP1) that are present on all arrays. The hybridization intensities were scaled affinely (i.e., a linear function with a scale factor and an offset was applied) such that the controls had the same mean and standard deviation on each chip, thus correcting for both varying average background and signal strength. For background correction, the hybridization intensity of the mismatch probe was subtracted from that of the corresponding perfect match probe. The data were then normalized with respect to a genomic DNA signal obtained by hybridizing 10 µg of KK14-3a genomic DNA from an asynchronous culture collected as described previously (4). The purpose of this reference experiment was to remove sequence-specific artifacts and to correct for strain differences between S288c and KK14-3a (the strain used to collect HH and HL DNA).
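The affine scaling and mismatch subtraction described above can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the function names are hypothetical, the use of the population standard deviation is an assumption, and the choice of target mean and standard deviation (here taken from one chip's controls) is also an assumption. The normalization against the genomic DNA reference is not shown, because the text does not state which operation was used.

```python
from statistics import mean, pstdev

def affine_params(controls, target_mean, target_sd):
    """Return (scale, offset) so that scale * c + offset maps the control
    intensities onto the target mean and standard deviation."""
    scale = target_sd / pstdev(controls)          # assumption: population SD
    offset = target_mean - scale * mean(controls)
    return scale, offset

def normalize_chip(intensities, controls, target_mean, target_sd):
    """Apply the affine map derived from this chip's controls to all probes."""
    scale, offset = affine_params(controls, target_mean, target_sd)
    return [scale * v + offset for v in intensities]

def background_correct(pm, mm):
    """Subtract each mismatch (MM) intensity from its perfect match (PM) partner."""
    return [p - m for p, m in zip(pm, mm)]

# Illustrative numbers only: chip B reads its controls brighter than chip A.
controls_a = [100.0, 200.0, 300.0]
controls_b = [150.0, 350.0, 550.0]
rescaled_b = normalize_chip(controls_b, controls_b,
                            mean(controls_a), pstdev(controls_a))
```

After rescaling, the controls on chip B have the same mean and standard deviation as those on chip A, so probe intensities from the two chips can be compared on one scale.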
As a final filter, the remaining unreliable points were discarded: background-adjusted hybridization intensities in the top and bottom two percentiles were omitted from the analysis, and points that were polymorphic relative to S288c were discarded. After this step, approximately 100,000 different probes remained, with an average spacing of one probe every 120 base pairs. Only 8 gaps larger than 10 kb were identified. The precise coordinates of regions of low probe density are available for downloading at Low Probe Density Regions.
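A rank-based version of this filter might look like the sketch below. The function name, the `excluded` argument (standing in for probes flagged as polymorphic), and the decision to trim by rank rather than by an interpolated percentile value are all assumptions made for illustration.

```python
def keep_reliable(values, percent=2, excluded=()):
    """Return the indices of probes that survive the filter.

    Drops the lowest and highest `percent` of intensities (by rank) and any
    index listed in `excluded` (hypothetical: probes flagged as polymorphic).
    """
    n = len(values)
    k = n * percent // 100                        # probes to trim at each end
    order = sorted(range(n), key=lambda i: values[i])
    dropped = set(order[:k]) | set(order[n - k:]) | set(excluded)
    return [i for i in range(n) if i not in dropped]

# Illustrative input: 100 probes with intensities 0..99, probe 50 flagged.
kept = keep_reliable(list(range(100)), excluded={50})
```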
Finally, in order to identify origins, the mean of the hybridization intensities for the surrounding 10 kb (5 kb on each side) was calculated for points located every 500 bases along the genome for both the HL and HH data. These data, hereafter referred to as the raw data, are available for downloading at Rawdata Files.
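The windowed averaging can be sketched as below, assuming sorted probe coordinates, inclusive window boundaries, and a `None` result where a window contains no probes. These conventions and the function name are illustrative choices, not details given in the text.

```python
from bisect import bisect_left, bisect_right

def windowed_means(positions, intensities, chrom_length,
                   step=500, half_window=5000):
    """Mean intensity of probes within +/- half_window of each grid point.

    `positions` must be sorted ascending and aligned with `intensities`.
    Returns a list of (coordinate, mean or None) every `step` bases.
    """
    # Prefix sums let each window total be read off in constant time.
    prefix = [0.0]
    for v in intensities:
        prefix.append(prefix[-1] + v)

    out = []
    for x in range(0, chrom_length + 1, step):
        lo = bisect_left(positions, x - half_window)
        hi = bisect_right(positions, x + half_window)
        count = hi - lo
        out.append((x, (prefix[hi] - prefix[lo]) / count if count else None))
    return out

# Illustrative probes only: three near the left end, one at 20 kb.
smoothed = windowed_means([0, 1000, 2000, 20000], [1.0, 2.0, 3.0, 10.0], 20000)
```

The same function would be applied separately to the HL and the HH intensities for each timed sample.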
For each chromosomal coordinate x, a pooled HL (or % HL(total)) value was computed from the hybridization raw data (part I.1) as follows:
First, an aggregate or pooled HH value (bHH) and an aggregate HL value (bHL) were calculated by adding up the raw data for each of the eight timed samples 1 through 8:
The % HL(total) value for chromosomal coordinate x, denoted b(x), was then obtained by forming the fraction
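Written out, the two steps take the form below. The sums follow directly from the description; the form of the fraction is an assumption, namely that % HL(total) is the HL share of the total pooled signal, as its name suggests. The symbols HH_i(x) and HL_i(x), for the raw HH and HL values of timed sample i at coordinate x, are introduced here for illustration.

```latex
b_{HH}(x) = \sum_{i=1}^{8} \mathrm{HH}_i(x), \qquad
b_{HL}(x) = \sum_{i=1}^{8} \mathrm{HL}_i(x)

% Assumed form of the fraction (multiply by 100 to express it as a percentage):
b(x) = \frac{b_{HL}(x)}{b_{HH}(x) + b_{HL}(x)}
```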
The logic of the pooling strategy is as follows. DNA sequences that replicate early in S phase will begin accumulating in the HL fraction and be depleted from the HH fraction early in the time course. In contrast, late replicating sequences will persist in the HH fractions until late in S phase. Therefore, when the HH and HL hybridization values for any given chromosomal coordinate are summed across the timed series, the value of % HL(total) will be greater for early-replicating sequences than for late-replicating ones. Thus, % HL(total) is a measure of how early in S phase a particular sequence replicates, and is correlated with its time of half-maximal replication (trep). This intuitive conclusion was tested using data obtained from a density transfer experiment where percent replication was deduced by slot blot analysis as described previously (1). HH and HL values obtained from the slot blots for several different probes were each summed as described above, and the resulting % HL(total) values were compared to the trep values that had been computed previously for the same probes using the same slot blot data. As shown in Fig. 3, there is a tight, linear relationship between the trep values and the pooled % HL(total) values.
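The pooling logic can be illustrated with a short sketch. The two time courses below are invented for illustration and are not measured values, and the formula assumes that % HL(total) is the pooled HL signal divided by the pooled HH plus HL signal, expressed as a percentage.

```python
def pooled_hl_percent(hh_samples, hl_samples):
    """Pooled % HL for one chromosomal coordinate.

    Assumption: % HL(total) = 100 * sum(HL) / (sum(HH) + sum(HL)),
    with the sums taken over the timed samples.
    """
    b_hh = sum(hh_samples)
    b_hl = sum(hl_samples)
    return 100.0 * b_hl / (b_hh + b_hl)

# Invented HL signals at the eight sampling times (HH is the remainder of 100).
early_hl = [0, 20, 40, 60, 70, 75, 80, 80]
late_hl = [0, 0, 0, 5, 15, 35, 60, 75]
early = pooled_hl_percent([100 - v for v in early_hl], early_hl)
late = pooled_hl_percent([100 - v for v in late_hl], late_hl)
```

With these inputs the early-replicating sequence gives the larger pooled value, which is the ordering the argument above predicts.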
Figure 3. Correlation between trep values and pooled % HL(total) values. Trep values obtained from a density transfer experiment by slot blot hybridization of individual probe fragments were compared to % HL(total) values obtained from the same data as described in the text.
For example, the % HL(total) data for chromosome VI are plotted in Fig. 4. The data and % HL(total) curve for each chromosome are available at Pooled HL Data.
Figure 4. A plot of % HL(total). The plot represents 538 pooled data points for chromosome VI.
We have chosen to use this % HL(total) approach because our future microarray hybridizations will be done with pooled S phase samples. Instead of pooling HH and HL values after hybridization of samples to separate microarrays, cell samples taken at different times will be pooled prior to isolating the DNA and banding it in CsCl. This strategy greatly reduces the expenditure of material (including dense isotopes and microarrays) and time, and allows parallel processing of several independent cell cultures. The procedures and algorithms worked out here can be applied to those future experiments with no further adjustments.
1. R. M. McCarroll, W. L. Fangman, Cell 54, 505 (1988).
2. A. E. Reynolds, R. M. McCarroll, C. S. Newlon, W. L. Fangman, Mol Cell Biol 9, 4488 (1989).
3. B. M. Ferguson, B. J. Brewer, A. E. Reynolds, W. L. Fangman, Cell 65, 507 (1991).
4. E. A. Winzeler, D. R. Richards, A. R. Conway, A. L. Goldstein, S. Kalman, et al., Science 281, 1194 (1998).