Science 25 January 2008:
Vol. 319, no. 5862, pp. 473-476
DOI: 10.1126/science.1151532

Reports

Alignment Uncertainty and Genomic Analysis

Karen M. Wong,1 Marc A. Suchard,2 John P. Huelsenbeck3*

The statistical methods applied to the analysis of genomic data do not account for uncertainty in the sequence alignment. Indeed, the alignment is treated as an observation, and all of the subsequent inferences depend on the alignment being correct. This may not have been too problematic for many phylogenetic studies, in which the gene is carefully chosen for, among other things, ease of alignment. However, in a comparative genomics study, the same statistical methods are applied repeatedly on thousands of genes, many of which will be difficult to align. Using genomic data from seven yeast species, we show that uncertainty in the alignment can lead to several problems, including different alignment methods resulting in different conclusions.

1 Section of Ecology, Behavior and Evolution, University of California, San Diego, La Jolla, CA 92093, USA.
2 Department of Biomathematics, University of California, Los Angeles, Los Angeles, CA 90095, USA.
3 Department of Integrative Biology, University of California, Berkeley, Berkeley, CA 94720, USA.

* To whom correspondence should be addressed. E-mail: johnh{at}berkeley.edu

THIS ARTICLE HAS BEEN CITED BY OTHER ARTICLES:
StatAlign: an extendable software package for joint Bayesian estimation of alignments and evolutionary trees.
A. Novak, I. Miklos, R. Lyngso, and J. Hein (2008)
Bioinformatics 24, 2403-2404
Pervasive positive selection on duplicated and nonduplicated vertebrate protein coding genes.
R. A. Studer, S. Penel, L. Duret, and M. Robinson-Rechavi (2008)
Genome Res. 18, 1393-1402
Connect the dots: exposing hidden protein family connections from the entire sequence tree.
Y. Loewenstein and M. Linial (2008)
Bioinformatics 24, i193-i199
Recent developments in the MAFFT multiple sequence alignment program.
K. Katoh and H. Toh (2008)
Brief Bioinform 9, 286-298
Phylogeny-Aware Gap Placement Prevents Errors in Sequence Alignment and Evolutionary Analysis.
A. Loytynoja and N. Goldman (2008)
Science 320, 1632-1635

Science. ISSN 0036-8075 (print), 1095-9203 (online)