Science 21 April 2006:
Vol. 312. no. 5772, p. 367
DOI: 10.1126/science.1124180

Technical Comments

Response to Comment on "Phylogenetic MCMC Algorithms Are Misleading on Mixtures of Trees"

Elchanan Mossel1* and Eric Vigoda2*

We presented a tree mixture in which Markov chain Monte Carlo (MCMC) methods have an exponentially slow convergence rate. We expect that many other mixture scenarios will show slow convergence. Ronquist et al. show that Metropolis-coupled MCMC (MC3) converges quickly on our mixture. However, they presented no theoretical or systematic experimental evidence determining the type of mixtures where MC3 or other methods are efficient.

1 Department of Statistics, University of California at Berkeley, Berkeley, CA 94720, USA.
2 College of Computing, Georgia Institute of Technology, Atlanta, GA 30332, USA.

* To whom correspondence should be addressed. E-mail: mossel{at}stat.berkeley.edu; E-mail: vigoda{at}cc.gatech.edu

Ronquist et al. (1) claim that our results (2) depend critically on having exactly equal mixtures, but this is not correct. For a range of proportions of the two trees, there will be multiple local maxima that are not connected by a nearest neighbor interchange (NNI) transition, and the mixing time will be exponentially slow.
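
The effect of such a bottleneck can be illustrated on a toy chain that is far simpler than the tree-space chain analyzed in (2). The sketch below is a stand-in under stated assumptions, not the construction from (2): states 0..2k lie on a path, the two ends play the role of the two local maxima, moves between adjacent states stand in for NNI transitions, and the parameter n is a hypothetical proxy for the amount of data. It computes the bottleneck ratio of the cut between the two peaks and the standard lower bound 1/(4 Phi) on the mixing time, which grows exponentially in n for this toy landscape.

```python
import math

def mixing_time_lower_bound(n, k=3):
    """Lower bound 1/(4*Phi) on the mixing time of a toy Metropolis chain.

    All names and the landscape itself are illustrative assumptions.
    """
    # Unnormalized weights on the path 0..2k: peaks at both ends, valley at k.
    weights = [math.exp(n * abs(x - k) / k) for x in range(2 * k + 1)]
    total = sum(weights)
    pi = [w / total for w in weights]

    # S is the left peak region {0, ..., k-1}; by symmetry pi(S) < 1/2.
    pi_S = sum(pi[:k])

    # The only edge leaving S is (k-1) -> k. The chain proposes each
    # neighbour with probability 1/2 and accepts with the Metropolis ratio.
    accept = min(1.0, weights[k] / weights[k - 1])
    flow = pi[k - 1] * 0.5 * accept

    phi = flow / pi_S          # bottleneck ratio of the cut
    return 1.0 / (4.0 * phi)   # standard lower bound on the mixing time

if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        print(n, mixing_time_lower_bound(n))
```

In this toy setting the bound is at least exp(n)/2, so doubling n roughly squares the number of steps required. The same qualitative behavior holds if the two peaks have unequal heights, as long as both remain separated by the valley.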

We agree with Ronquist et al. that mixtures present a challenge to most phylogenetic approaches. However, there is an important difference between methods that return "Fail" when the model is specified incorrectly and methods that find an incorrect tree, especially if this incorrect tree is assigned a high "confidence value." We believe that distance-based methods like those described in (3–5) will not be misled by mixtures of two trees. In such cases, the methods should output "Fail" instead of any specific tree or a distribution on trees.

Ronquist et al. consider standard heuristic approaches, also suggested in (2), for overcoming the possible perils of Markov chain Monte Carlo (MCMC) algorithms on mixtures, namely multiple starting points, Metropolis-coupled MCMC, or specifying a mixture model. The experimental results reported in (1) suggest that these methods may be adequate to tackle mixtures in some scenarios.
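
To make the Metropolis-coupled heuristic concrete, the following minimal sketch runs it on a toy two-peak landscape. Everything here is an illustrative assumption rather than the setup used in (1) or (2): the path of states 0..2k, the parameter n, the function names, and the choice of two chains with inverse temperatures 1.0 and 0.1. A heated chain sees a flattened landscape and crosses the valley easily; occasional state swaps with the cold chain let the cold chain reach the second peak.

```python
import math
import random

def log_weight(x, n, k):
    # Toy landscape (hypothetical): peaks at states 0 and 2k, valley at k.
    return n * abs(x - k) / k

def metropolis_step(x, beta, n, k, rng):
    # Propose a neighbouring state; proposals off the path are rejected.
    y = x + rng.choice((-1, 1))
    if y < 0 or y > 2 * k:
        return x
    log_ratio = beta * (log_weight(y, n, k) - log_weight(x, n, k))
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        return y
    return x

def run(n, k, betas, steps, seed=0):
    """Return the set of states visited by the cold chain (betas[0] == 1)."""
    rng = random.Random(seed)
    states = [0] * len(betas)   # every chain starts on the left peak
    visited = {0}
    for _ in range(steps):
        states = [metropolis_step(x, b, n, k, rng)
                  for x, b in zip(states, betas)]
        if len(betas) > 1:
            # Propose swapping the states of two adjacent temperatures.
            i = rng.randrange(len(betas) - 1)
            diff = (betas[i] - betas[i + 1]) * (
                log_weight(states[i + 1], n, k)
                - log_weight(states[i], n, k))
            if diff >= 0 or rng.random() < math.exp(diff):
                states[i], states[i + 1] = states[i + 1], states[i]
        visited.add(states[0])
    return visited

if __name__ == "__main__":
    print("single chain:", sorted(run(30, 3, [1.0], 5000)))
    print("coupled chains:", sorted(run(30, 3, [1.0, 0.1], 5000)))
```

With these settings the single chain stays near its starting peak, while the cold chain of the coupled pair reaches the opposite peak. This is a property of the toy landscape and the chosen temperatures only; it says nothing about tree space.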

However, the applicability of these methods on some small examples does not guarantee their success in other settings. In particular, these methods might fail for some range of branch lengths or for large trees. We believe that theoretically provable results should be weighted more heavily than limited experiments. We thus argue that much more theoretical and experimental work is needed before MCMC methods can be safely used in mixture settings.

Our tree mixture example was the first result on the efficiency or inefficiency of MCMC methods for phylogenetic reconstruction. Currently, there are no results showing fast convergence of MCMC methods or Metropolis-coupled MCMC for any class of examples. Even in the idealized setting where character data is generated from a pure distribution (i.e., no mixture), it is unclear whether MCMC methods are always efficient.

Building on our work, Stefankovic and Vigoda (6) recently showed refined mixture examples with slow convergence. In their example, the mixture has a common topology and only varies in the substitution rates. They also show a simple mixture example of two trees with a common topology, which generates a distribution that is identical to a mixture distribution from a different topology. Hence, no methods can determine the correct topology, not even those that infer a mixture.


References and Notes


Received for publication 5 January 2006. Accepted for publication 23 March 2006.






Science. ISSN 0036-8075 (print), 1095-9203 (online)