Reader Comments

ERM model and 'average random' trees

Posted by Graham853 on 08 Mar 2009 at 13:02 GMT

My comments refer to rooted, unlabelled binary trees.

There is a lack of clarity throughout this paper about the meanings of 'symmetrical trees' and 'average random' trees. (These terms appear in the first paragraph under Results, for example.) In the caption for Figure 1, one curve is for 'totally symmetrical, random average' trees. But totally symmetrical trees are not random! For each tree size, there is just one totally symmetrical tree: the tree in which, starting at the root, every split into subtrees is as even as possible. No one has suggested this totally symmetrical tree as a null model for evolutionary trees. It is an extreme type of tree, like the pectinate tree, which may sometimes be produced by a stochastic model.

The author states 'On average, a totally balanced tree is also expected from Yule's equal-rates Markov model' (page 7). This is not true: for large trees, the ERM model produces a totally symmetrical tree only with a very small probability. For example, if there are 7 tips (and so 13 nodes in total), only 1/9 of the trees from the ERM model will be totally symmetrical, and this probability rapidly decreases with increasing tree size.
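The 1/9 figure can be checked with a short recursion. The sketch below is illustrative only and rests on stated assumptions: under the ERM model the root split of an n-tip tree puts i tips on the left with i uniform on 1..n-1, the two subtrees are independent ERM trees, and 'totally symmetrical' means every split is as even as possible. The function name `prob_totally_symmetrical` is hypothetical and does not come from the paper.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def prob_totally_symmetrical(n: int) -> Fraction:
    """Probability that an ERM (Yule) tree with n tips has every split as even as possible.

    Assumption: the left subtree size at each split is uniform on 1..n-1,
    and the two subtrees are independent ERM trees.
    """
    if n < 1:
        raise ValueError("a tree needs at least one tip")
    if n <= 2:
        # One or two tips admit a single shape, which is trivially symmetrical.
        return Fraction(1)
    small, large = n // 2, n - n // 2
    # An even n has one balanced split (n/2, n/2); an odd n has two mirror-image ones.
    ways = 1 if small == large else 2
    return (
        Fraction(ways, n - 1)
        * prob_totally_symmetrical(small)
        * prob_totally_symmetrical(large)
    )


if __name__ == "__main__":
    for tips in (4, 7, 8, 16, 32):
        print(tips, prob_totally_symmetrical(tips))
```

Under these assumptions the recursion gives 1/3 for 4 tips, 1/9 for 7 tips and 1/63 for 8 tips, which illustrates how quickly the probability falls as the tree grows.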

It is possible to add extinctions to an ERM model in different ways, which can produce more, or less, or the same amount of imbalance as the ERM model. Likewise a biased sampling of taxa might produce more or less imbalance, depending on the nature of the bias.

RE: ERM model and 'average random' trees

Altaba replied to Graham853 on 26 Jun 2009 at 12:59 GMT

The point is well taken, but my answer is: yes and no.

It is true that the ERM model will produce highly unbalanced trees in virtually all cases. However, these trees are generated and classified with labelled leaves. Thus, they fall somewhat outside the paper's scope.

At any rate, the paper does not suggest anywhere that a null model of simulated evolution should yield perfectly symmetrical trees. This, I agree, would be rather silly.

Now, when only the topology of trees is to be considered, and so only geometrical symmetry matters, perfect symmetry remains in a central position throughout any large set of random model iterations. Starting with a simple two-taxa tree, which is symmetrical by definition, every random step will move the resulting tree in geometrical space. Over many iterations, and on average, symmetry will occupy a central position, but geometrically only.

In Figure 1, it is this geometric average that is referred to. Obviously, the analytical expectation from Yule's model or any other ERM derivation will not fall along this line. Actually, the number of two-leaved tips ("cherries") for a perfectly symmetrical tree is n/2, n being the total number of leaves. This is not so for trees obtained from Yule's model; in this case, the expected number of "cherries" is considerably lower: n/3.
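The n/3 expectation can be reproduced exactly with a small recursion. This is a sketch under assumptions rather than a definitive derivation: it takes the root split of a Yule tree with n leaves to be uniform on 1..n-1 with independent subtrees, so the expected cherry count satisfies E[C_n] = 2/(n-1) * (E[C_1] + ... + E[C_(n-1)]) for n >= 3, with E[C_1] = 0 and E[C_2] = 1. The function name `expected_cherries` is hypothetical.

```python
from fractions import Fraction


def expected_cherries(n_max: int) -> dict:
    """Exact expected number of cherries in a Yule tree for each leaf count up to n_max.

    Assumption: the root split is uniform on 1..n-1 and the subtrees are independent.
    """
    expected = {1: Fraction(0), 2: Fraction(1)}
    running_sum = Fraction(1)  # sum of expected[1] and expected[2]
    for n in range(3, n_max + 1):
        # A tree with three or more leaves has no cherry at the root itself,
        # so its cherries are those of the two subtrees.
        expected[n] = Fraction(2, n - 1) * running_sum
        running_sum += expected[n]
    return expected


if __name__ == "__main__":
    table = expected_cherries(16)
    for n in (4, 8, 16):
        # n // 2 is the cherry count of the perfectly symmetrical tree when n is a power of two.
        print(n, "Yule:", table[n], "perfectly symmetrical:", n // 2)
```

For 16 leaves, for instance, the recursion gives 16/3 expected cherries against 8 in the perfectly symmetrical tree, which is the gap between the two curves described above.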

So, if one wished to use this graph to compare the performance of different tree-generating algorithms, the picture becomes a little more complex. Analytically average trees obtained by random models would fall along a line located higher in the graph.

I think this is a useful comment that surely points to further developments.

No competing interests declared.