BOSTON--When the draft human genome sequence was completed last year, a computer analysis suggested the number of genes was shockingly small. Now, an experimental approach suggests the number may actually be much closer to the early prediction of 70,000 genes, according to a presentation 16 February at the annual meeting of the American Association for the Advancement of Science, publisher of ScienceNOW.
The all-day session started innocently enough. Eric Lander of the Whitehead Institute for Biomedical Research in Cambridge, Massachusetts, a leader of the Human Genome Project, told a standing-room-only audience that his best guess was still that there are about 32,000 human genes. That's based on annotation, the process of predicting which stretches of DNA contain genes by relying on specialized software to identify gene-like sequences in DNA.
Later in the day, Victor Velculescu mounted a small rebellion by raising the gene count. He and his colleagues, at Johns Hopkins University in Baltimore, have gone back to the lab to look for genes that annotation missed. Their technique, called serial analysis of gene expression (SAGE), works by tracking RNA molecules back to their DNA sources. After isolating RNA from various human tissues, the researchers copy it into DNA, from which they cut out a kind of genetic bar code of 10 to 20 base pairs. The vast majority of these tags are unique to a single gene. The tags can then be compared to the human genome to find out if they match up with genes discovered by annotation. Velculescu said that only half of the tags match annotated genes--evidence, he and his colleagues say, that annotation underestimated the human gene inventory by about half.
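The tag-matching logic described above can be illustrated with a minimal Python sketch. It is not the Johns Hopkins team's pipeline: the function names, the toy sequences, and the simple scale-up rule are all assumptions made for illustration, and the sketch ignores details such as searching both DNA strands.

```python
def match_fraction(tags, annotated_genes):
    """Return the fraction of tags found in at least one annotated gene.

    tags: list of short DNA strings (the "bar codes").
    annotated_genes: dict mapping a gene name to its DNA sequence.
    Both are hypothetical inputs for this sketch.
    """
    if not tags:
        raise ValueError("need at least one tag")
    matched = sum(
        1 for tag in tags
        if any(tag in seq for seq in annotated_genes.values())
    )
    return matched / len(tags)


def extrapolate_gene_count(annotated_count, fraction_matched):
    """Naive scale-up (an assumption of this sketch): if annotated genes
    account for only a fraction of the tags, divide the annotated count
    by that fraction."""
    if not 0 < fraction_matched <= 1:
        raise ValueError("fraction_matched must be in (0, 1]")
    return round(annotated_count / fraction_matched)


# Toy data invented for illustration; not real genes or real SAGE tags.
annotated_genes = {
    "geneA": "ATGGCCATGTTGACCTGAAGCTA",
    "geneB": "ATGCATGCCGTAGGATCCTTAAG",
}
tags = ["TTGACCTGAA", "CCGTAGGATC", "GGGTTTAAAC", "ACACACGTGT"]

fraction = match_fraction(tags, annotated_genes)
estimate = extrapolate_gene_count(32_000, fraction)
print(f"{fraction:.0%} of tags match; naive estimate: {estimate:,} genes")
```

With half of the toy tags matching, dividing the 32,000 annotated genes by 0.5 gives 64,000, which is the kind of arithmetic behind a count closer to the early 70,000 prediction.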
The reason for the disparity may be that the standard computer annotation method was largely developed for the genomes of simple (prokaryotic) organisms, not for the more complex sequences found in the genomes of humans and other eukaryotes. "We're still not very good at predicting genes in eukaryotes," said Claire Fraser of The Institute for Genomic Research in Rockville, Maryland. It's entirely possible that there could be more than 32,000 genes, and SAGE is an important approach to finding them, she says. "You absolutely have to go back into the lab and get away from the computer terminal."