Reader Comments

Referee Comments: Referee 1

Posted by PLOS_ONE_Group on 20 May 2008 at 10:36 GMT

Referee 1's review:

**********
N.B. These are the comments made by the referee when reviewing an earlier version of this paper. Prior to publication, the manuscript has been revised in light of these comments and to address other editorial requirements.
*********

The authors aim to address the question of what allows sympatric occurrence of closely related species. They use two sister species of non-biting midges with a largely sympatric holarctic distribution as a model system. Sampling these species in one year from 34 sites in an area of 40 km x 60 km, they first assess population genetic structure. They find that both species conform to a distinct genotype, using both nuclear and organellar markers. They subsequently correlate the relative frequency of one of the two species at each site with a set of 38 environmental variables. They find that summer precipitation in particular explains the relative occurrence of that species.

Several issues with the study are of great concern to me, and I propose that they are grounds for rejecting the manuscript for publication. Below I deal with the major and minor issues, more or less in order of importance.

First of all, although the authors set out to address the question of what allows sympatric occurrence of closely related species, the study design does not entirely allow for this. Instead, the authors show what could explain the fact that the two sister species do NOT co-occur at equal frequencies. This is a different issue. A study that investigates co-existence of sister species should focus on the ecology of the two species within sites and examine how the available niche space is partitioned between them. Although this is alluded to in the discussion, it is not actually tested. The coarseness of part of the data used (i.e. at the 0.5 min level) would not allow for such detailed within-site analysis.
One aspect allowing co-existence of sister species is reproductive isolation. Although this aspect of the biology of the species is addressed in the present study, I do not find it the most exciting aspect of this sister species pair, particularly since previous studies had already suggested pre-zygotic isolation in the field and demonstrated almost complete post-zygotic isolation in the laboratory. Not finding isolation under laboratory conditions calls for studies on isolation in nature. The opposite, however (i.e. finding almost complete isolation in the lab), does not particularly call for isolation studies in nature, as it would be extremely surprising if such isolation did not operate in nature.

Secondly, I have several problems with the methodology applied, listed below:
1. This study samples the study organisms during only one season. The variables that are ultimately suggested to explain the observed pattern are ones that have been shown to vary extensively over the last decades (e.g. precipitation values). This may result in strong year-to-year differences in the population dynamics of the study organisms. Sampling during one season may simply reflect extreme environmental conditions during that year (e.g. 2003 was an unusually hot year in western Europe, where the organisms were sampled). Using the sampled individuals as a proxy for stable population numbers (which is an implicit assumption in this study) may not be valid.
2. Similarly, the sampling within sites is also rather meager. In the Material & Methods section it is stated that a single 1 m x 1 m plot was sampled within each site. This means that there is no way of assessing within-site variation. As the distribution data are subsequently used in some of the tests in the manuscript, I deem it necessary to get an idea of within-site variation in frequency of distribution.
3. The entire sampling is carried out in a 40 km x 60 km area. This is not a good sample of the distribution range of these species, which is reported to be holarctic. To test whether the variables found to explain the observed pattern are really the most important ones, a larger part of the ranges of both species should be investigated, ideally spanning the entire distribution ranges. This would also better justify the study of macroecological (i.e. climatic) variables, which are more likely to leave a signature across a wider geographical range than within the plot chosen by the authors. In this light it is somewhat surprising that the species have a largely sympatric distribution. Surely, if macroecological variables play such a big role in shaping frequencies of distribution within the relatively small plot used by the authors, then, at sites where these variables take more extreme values, complete absence of one of the two species would be expected. This would lead to a substantial portion of non-overlapping distribution ranges.
4. This study sampled only one season within one year. Given that the climatic variables could affect different parts of the organism's life cycle differently, it would have been more informative to sample throughout the year at each life cycle. This would then have allowed for more detailed discussion as to how the variables found to be significant affect the organisms and the resulting differences in frequency of distribution.
5. The authors use the relative frequency (of only one of the two species!) to correlate with the 38 selected variables. This assumes that the distribution of that species is entirely shaped by competitive interactions. I would have found it informative to repeat the analyses using absolute numbers for both species, because the relative frequency may still be very high although the actual number of individuals may be very low. This is obviously relevant and should also be taken into account.
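The concern in point 5 can be illustrated with a minimal sketch. The site names and counts below are hypothetical and are not taken from the manuscript; the helper `relative_frequency` is likewise only an illustration of how a relative frequency hides the underlying sample size.

```python
# Hypothetical per-site counts (species_a, species_b); NOT data from the manuscript.
sites = {
    "site_1": (3, 1),      # very few individuals
    "site_2": (300, 100),  # many individuals, same ratio
    "site_3": (0, 40),     # species A absent
}


def relative_frequency(count_a, count_b):
    """Share of species A among all individuals at a site (None for an empty site)."""
    total = count_a + count_b
    return count_a / total if total else None


# Print relative frequency next to the absolute counts it is derived from.
for name, (a, b) in sites.items():
    print(name, "rel. freq. A:", relative_frequency(a, b), "counts:", a, b)
```

In this toy example the first two sites have the same relative frequency although one rests on 4 individuals and the other on 400, which is why reporting and analysing the absolute numbers for both species would be informative.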

Should the manuscript be accepted for publication after all, I would like to raise the following issues for the authors to consider in order to improve readability.

The general issue raised in the paper is put in a context of rather old literature (references from 1859, 1942, 1957, 1967, 1991). Although these include some classical pieces of work, there is surely more recent literature on the subject that could be cited (e.g. the discussion around phylogenetic niche conservatism; Wiens, Evolution 2004). This should ideally be referred to in the introduction as well.

The notion that the two study species are 'frequently found together at the same sites' suggests that sympatry is the rule. This is contradictory to what is stated in Hägele (1999): "In Europe they can be observed in geographically separated areas, but sympatric populations occur rarely in Germany and The Netherlands..."
This disagreement should be addressed.

There is at present no justification in the manuscript as to why the 38 environmental variables are selected in this study. It would make sense to link these variables to aspects of the organism's biology using references. Hägele (1999) suggests for instance: "...Eco-physiological differences concern temperature and anaerobiotic resistence as well as reactions pressure. It seems as if C. thummi would prefer a more saprobiont habitat than C. piger."

The first question raised at the end of the introduction ("what is the degree of reproductive isolation in the field?") is only partly addressed by looking at genetic differentiation at the larval stage. Pre-zygotic isolation is thereby left unaddressed. This should be made explicit in the question.

In the Material & Methods, I find the categorization of "Species identification and co-occurrence" under one heading confusing. These two aspects of the study do not form a natural unit.

In the Material and Methods, for the biologically meaningful climatic parameters, it is not clear from which period these are taken. Obviously, if they are linked to distribution patterns found in one year, they should ideally be taken from the relevant year when the sampling was done. Given that some are 'mean annual temperature' it is likely that these are means from a larger number of years. At least the relevant value for the year of sampling for these variables should be given (as is done for the soil characteristics which are measured in the same year as the sampling was done).

In figure 1 it would be informative to put circles around species (as inferred from DNA taxonomy).

In figure 3 I can only identify 28 pie charts, whereas in table 1 and in the text of Material and Methods 34 sampling sites are identified. Also, if a black line in a grey chart represents a mixed population (which I assume), much more than about half of the sampling sites contain co-occurring assemblages.

I am surprised by the results of the Fisher's exact test: looking at the pie charts, I would have expected a non-significant result. I assume that the null hypothesis is that each species occurs at a frequency of 50% at each site. I am not convinced this is a biologically meaningful hypothesis.
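To make the question about the null hypothesis concrete, here is a minimal sketch of one conventional setup, which may or may not be what the authors did: a 2 x 2 table of sites classified by presence or absence of each species, tested against a null of independent occurrence rather than 50% frequencies. The function name and the counts are assumptions for illustration only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are no more probable than the observed one
    # (small tolerance guards against floating-point ties).
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Hypothetical site counts, NOT from the manuscript:
# rows = species A present / absent, columns = species B present / absent.
p_value = fisher_exact_two_sided(10, 12, 9, 3)
print("two-sided p-value under independence:", p_value)
```

Stating explicitly which table and which null were used would allow readers to judge whether the reported significance is biologically meaningful.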

Where the discussion states that 'the two taxa conform to several species concepts', it would be interesting to also consider the species concept that was originally used to describe the two species.

The discussion attributes the finding that the two species occur syntopically less often to competitive interaction, but this pattern could simply be the result of stochastic variation. Larger sample sizes, as well as ecological experiments, are necessary to demonstrate such interactions. This should at least be mentioned.

I have problems with the claim that 'this study is the first to demonstrate ecological partitioning among the species pair in the field.' This would require a longer-term study as well as a closer look at how the habitat is partitioned within sites, which ultimately determines how the landscape is partitioned.

In Table 1 it would be nice to see how many individuals of EACH species are found within site, not just the total number of individuals of both species added up.

I may misunderstand Table 3, but it seems that for variables such as Mean Prec Wettest Quarter the q-value is actually non-significant at the 0.05 level (q = 0.051). If I understand correctly, q is used as a correction for multiple testing. The resulting values, however, seem then to be misinterpreted as being significant.
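The threshold point can be sketched as follows. The manuscript's exact q-value procedure is not known to me, so this example uses Benjamini-Hochberg adjusted p-values as a stand-in; the function names and the p-values are hypothetical and do not reproduce Table 3.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_significant(q, alpha=0.05):
    """A corrected value counts as significant only if it does not exceed alpha."""
    return q <= alpha


# Hypothetical raw p-values, NOT taken from Table 3.
raw = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw, bh_adjust(raw)):
    print("p =", p, "adjusted =", q, "significant:", is_significant(q))

# The value quoted in the review: 0.051 lies above the 0.05 threshold.
print("q = 0.051 significant at 0.05:", is_significant(0.051))
```

Whatever correction was actually applied, a corrected value of 0.051 should be reported as non-significant at the 0.05 level, or the threshold used should be stated explicitly.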