The happy winner. Lee Rowen of the Institute for Systems Biology won the GeneSweep pool with a guess of 25,947 genes.

Low Numbers Win GeneSweep Pool

COLD SPRING HARBOR, NEW YORK--The human genome has been sequenced, but calculating the number of genes it contains is taking more time. DNA experts have nonetheless decided that they know who made the best prediction: Lee Rowen. The Seattle sequencer and two of her colleagues were declared winners this week of GeneSweep, an informal contest begun here 3 years ago at a genome meeting at the Cold Spring Harbor Laboratory (CSHL), in which researchers tried to guess just how many protein-coding sequences it takes to make a human.

When he organized the pool, Ewan Birney of the European Bioinformatics Institute near Cambridge, U.K., was convinced that the answer would be in hand by now. Most estimates at the time turn out to have been high. Some ranged upward of 70,000 or 100,000 genes, and just a few researchers guessed the number might be half as many or even fewer.

As sequencers came closer to finishing the human genome, however, the gene count fell ever lower, and determining the actual number became ever more vexing. For example, Birney estimates that with today's stringent criteria and improved gene-prediction programs, there might be 24,500 genes, of which 3000 might be "pseudogenes" that don't produce proteins and therefore don't count, leaving roughly 21,500. But there's quite a bit of poorly explored DNA sequence, dubbed "dark matter," that may contain more genes. Faced with this complexity, Birney thought he'd have to postpone picking a winner.

But then he decided that, no matter what, "we're coming up with a sub-30,000 gene count," he reported at last week's CSHL genome symposium. And it turned out only a few bettors put their money on a number that low. Rowen, a sequencer at the Institute for Systems Biology in Seattle, Washington, was the closest--predicting 25,947 in 2001. So Birney decided that she should get half of the $1200 pool. Sharing the other half of the prize were Paul Dear of the U.K. Medical Research Council, who had guessed 27,462 in 2000, and Olivier Jaillon of Genoscope in Evry, France, with 26,500 in 2002.

Rowen credits sequencer Jean Weissenbach of Genoscope with influencing her prediction; years ago, he suggested that the number would be low. "At the time, everyone nearly fell off their chair," Rowen recalls. But now it seems undeniable that humans have protein-coding gene counts close to those of Caenorhabditis elegans or Arabidopsis.
