Reader Comments

Referee comments: Referee 2 (Barbara Holland)

Posted by PLOS_ONE_Group on 10 Apr 2008 at 15:43 GMT

Referee 2's review (Barbara Holland):

**********
N.B. These are the comments made by the referee when reviewing an earlier version of this paper. Prior to publication the manuscript has been revised in light of these comments and to address other editorial requirements.
**********

Review of the original submission:
Woolley et al. conduct a simulation study to assess how well various algorithms perform at reconstructing evolutionary histories in an intraspecies setting where some recombination may occur.

In traditional phylogenetic simulation studies, the accuracy of tree estimation methods is typically measured in one of two ways. You either ask
Is the whole tree recovered correctly?
or
What proportion of the edges (splits) are recovered correctly?

Things get much more complicated when you compare networks with networks - I suspect that it's a topic that several papers could be written about in its own right.

The authors consider networks as lists of unique trees (where a network that happens to be a tree is just a list of length one). Their measures of accuracy and precision compare the list of trees implied by the network T on which the data were generated to the list of trees implied by the network N reconstructed by some method of interest.

I think this is an interesting paper that is worthy of publication, but it could be improved by a more careful choice (and motivation of that choice) of the measures of accuracy.

Comments:

The introduction
1. Vriesendorp and Bakker published a review of network methods in Taxon in 2005. I know that Vriesendorp conducted a simulation study of various network methods as part of her thesis, so it might be worth enquiring whether this has appeared (or is about to appear) in a journal.

2. The authors talk about measuring accuracy sensu [8]; it would be worth expanding on this a bit here, as the measurement of accuracy is really central to this paper.


Materials and Methods - data simulation
3. I think it would be clearer to say in the text that you are using the Jukes-Cantor model with a gamma distribution of rates across sites, instead of just citing the papers.

Materials and Methods - measures of performance
4. When discussing previous work in this area it might be worth citing Holland et al. 2007 (in Syst. Biol.), who measure the accuracy of Q-imputation networks by considering type I and type II errors: splits that appear in the reconstructed network but not in the generating tree, and splits that appear in the generating tree but not in the reconstructed network.

5. Was the idea of comparing lists of splits ever considered? Each of the model phylogenies embeds one or more trees, which each display a set of splits. In the case of SD, NN, MED, RMD (and NJ and MP) the split systems generated by the methods could be directly compared to the splits of the model phylogeny (see the sketch below).
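For concreteness, here is a minimal sketch of what such a direct split comparison could look like (the split representation and function name are my own illustration, not taken from the manuscript); it separates splits present in the reconstruction but absent from the model phylogeny from splits of the model phylogeny that the reconstruction misses, in the spirit of the type I/type II errors mentioned in comment 4:

    # Sketch only: each split is encoded as a frozenset of taxon labels
    # (one consistently chosen side of the bipartition).
    def compare_split_systems(model_splits, estimated_splits):
        model, estimated = set(model_splits), set(estimated_splits)
        false_positives = estimated - model   # in the estimate but not the model ("type I")
        false_negatives = model - estimated   # in the model but missed ("type II")
        recovered = model & estimated
        return recovered, false_positives, false_negatives

    # Toy example on taxa A-F:
    model = {frozenset("AB"), frozenset("ABC"), frozenset("EF")}
    estimate = {frozenset("AB"), frozenset("CD"), frozenset("EF")}
    rec, fp, fn = compare_split_systems(model, estimate)
    print(len(rec), len(fp), len(fn))   # 2 1 1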

6. Split systems don't have unique visual representations. For the methods CD, NN, MED and RMD is it clear that two different representations of the same split system must always embed the same trees?

7. I think there is a typo on page 6, FTO is defined as the fraction of topologies in N that exist in T, but I think it should be the fraction of topologies in T that exist in N. Otherwise the following definition of CTO doesn't make sense. The same applies for FTR and CTR.

8. It might be useful to motivate the discussion of these measures in terms of true/false positives and negatives. Then you could talk about accuracy as being true positives / (true positives + false negatives), and precision as being true positives / (true positives + false positives); see the sketch below.
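To make this concrete, here is a rough sketch of how the topology-list comparison could be phrased in these terms (assuming each topology can be reduced to some canonical, hashable form, e.g. a sorted Newick string; this is my assumption for illustration, not the authors' implementation):

    def accuracy_and_precision(true_topologies, inferred_topologies):
        T = set(true_topologies)      # topologies embedded in the generating phylogeny
        N = set(inferred_topologies)  # topologies embedded in the reconstruction
        tp = len(T & N)               # true positives
        fn = len(T - N)               # false negatives
        fp = len(N - T)               # false positives
        accuracy = tp / (tp + fn) if tp + fn else 0.0   # fraction of T's topologies found in N
        precision = tp / (tp + fp) if tp + fp else 0.0  # fraction of N's topologies that are in T
        return accuracy, precision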

9. The FTR and CTR measures only consider a tree to be right if the branch lengths are exactly correct. These measures seem too extreme to be very useful. They don't do a good job of discriminating between the different methods in the situation with high recombination - all the methods have accuracy zero! I suggest that the authors scrap these measures and replace them with something that captures the fact that some branch lengths can be closer to the truth than others without either being exactly right; one possibility is sketched below.
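One possibility, offered only as an illustration rather than a prescription, is a branch-score-style distance in the spirit of Kuhner and Felsenstein's branch score, where each split contributes the squared difference of its branch lengths and a split missing from one phylogeny counts as having length zero there:

    import math

    # Sketch only: true_lengths and est_lengths map split -> branch length.
    def branch_score(true_lengths, est_lengths):
        all_splits = set(true_lengths) | set(est_lengths)
        return math.sqrt(sum(
            (true_lengths.get(s, 0.0) - est_lengths.get(s, 0.0)) ** 2
            for s in all_splits))

A measure like this (or a normalised variant of it) degrades gracefully as estimated branch lengths drift from the truth, rather than jumping straight to zero accuracy.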

10. Page 7, the authors could be more explicit about what they mean by the word resolution in the network context; presumably the fewer trees a network embeds, the more resolved it is. This is sensible, but worth emphasising more, as it is possibly a bit counter-intuitive if you are used to the tree context where more branches means more resolution. How are multifurcations treated in this context? I.e., if MP returned the star tree, would this be considered to embed all possible trees in the same way that a fully connected network does (the scale of "all possible trees" is illustrated below)? I think that Wilkinson and Thorley discuss some of these ideas in their 2004 Syst. Biol. paper on the information content of trees.
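As a small aside on that scale (my own illustration, not from the manuscript): the number of unrooted binary topologies on n taxa is (2n-5)!!, so a star tree, or a fully connected network, on even 10 taxa is compatible with over two million topologies.

    # Number of unrooted binary tree topologies on n taxa: (2n-5)!!
    def num_unrooted_topologies(n):
        count = 1
        for k in range(4, n + 1):   # adding the k-th taxon multiplies the count by (2k-5)
            count *= 2 * k - 5
        return count

    print(num_unrooted_topologies(10))   # 2027025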

Results
11. pg 9. A minor point, but why not use the more standard rho for the correlation?

12. pg 9. Sometimes you use "mean" and sometimes "average"; you may as well pick one and be consistent.

13. page 10: when referring to the data sets by number, include a reference to Table 1, where they are described.

14. Instead of saying NN became useless at higher recombination rates, it would be fairer to say that it became very imprecise.

15. I think it would be better to talk about accuracy and inaccuracy w.r.t. the FTO and FTR measures and precision/imprecision w.r.t. the CTO and CTR measures.

Review of the first revised manuscript:
I think this is an important paper that should hopefully stimulate research into designing better network methods - they are clearly needed!

A couple of minor typos I noticed:

pg 10 line 11 from the bottom: do -> due
pg 12 line 6 from the bottom: lease -> least