A Good SNP Is Hard to Find

Hoping to home in on genes underlying common diseases such as atherosclerosis or cancer, scientists working on the Human Genome Project have been randomly collecting DNA variations that may serve as guideposts to these genes. But findings reported in the July Nature Genetics suggest that this strategy's payoff won't be around the corner: The most relevant single-nucleotide polymorphisms (SNPs), as these single-base variations are called, are too scarce to be picked up by random sampling.

The SNP stampede began in 1997 when the National Cancer Institute started looking for SNPs in and around some 3000 confirmed and suspected cancer genes. In January 1998, the National Human Genome Research Institute launched a similar project aimed at expanding the number of SNPs from a few thousand known today to about 100,000 in the next 3 years. And in April, 10 large drug companies, the Wellcome Trust philanthropy of Britain, and a handful of academic laboratories teamed up to form the SNP Consortium, or TSC, which will create a SNP archive encompassing some 300,000 SNPs within the next 2 years. Like J. Craig Venter's sequencing factory Celera Genomics in Rockville, Maryland, TSC will be collecting random data across the entire genome.

Instead of taking this whole-genome approach, two teams, one headed by human geneticist Aravinda Chakravarti of Case Western Reserve University in Cleveland, Ohio, and the other by Eric Lander, director of the genome center at the Whitehead Institute for Biomedical Research in Cambridge, Massachusetts, looked for SNPs in a set of some 200 genes related to hypertension and other multigene diseases, using DNA from more than 125 individuals. Both teams found that the hottest candidates to directly influence disease susceptibility--SNPs within the coding region of genes that alter the composition of the encoded protein--are very rare in the general population. "There seems to be a strong selection against any change in protein structure. [Most of these changes] have been weeded out in the course of evolution," says Chakravarti.

What's more, Lander's study reports that about 10% of the protein-altering SNPs seem to be specific to certain subpopulations, such as Asians or African-Americans. The bottom line, says Chakravarti, is that to discover the protein-altering SNPs "you'll have to take as large and diverse a sample population as possible."
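
Why rarity calls for large samples can be seen with a back-of-the-envelope calculation. The Python sketch below is an illustration only, not a method from either study. It assumes a single well-mixed population, two independently sampled chromosomes per person, and hypothetical allele frequencies that do not come from the reported data. The function names are likewise invented for this example.

```python
import math


def detection_probability(allele_frequency: float, individuals: int) -> float:
    """Chance that a variant shows up at least once in a random sample.

    Assumes each person contributes two independent chromosomes, each
    carrying the variant with probability `allele_frequency`.
    """
    chromosomes = 2 * individuals
    return 1.0 - (1.0 - allele_frequency) ** chromosomes


def individuals_needed(allele_frequency: float, target: float = 0.95) -> int:
    """Smallest sample size that reaches the target detection probability."""
    chromosomes = math.log(1.0 - target) / math.log(1.0 - allele_frequency)
    return math.ceil(chromosomes / 2)


if __name__ == "__main__":
    # Hypothetical frequencies, chosen only to show the trend.
    for frequency in (0.10, 0.01, 0.001):
        needed = individuals_needed(frequency)
        print(f"frequency {frequency}: about {needed} individuals for a 95% chance")
```

Under these assumptions, the required sample grows roughly in inverse proportion to the variant's frequency. A variant confined to one subpopulation is rarer still in a mixed sample, which is one way to read the call for diverse sampling.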