Reader Comments (4)
Comment from another reviewer of this paper
Posted by deevybee on 31 May 2013 at 07:38 GMT
I was interested to see the comment by Jon Brock because I was one of the initial reviewers of this paper and I identified many of the same methodological problems, plus some other issues with data analysis. I am aware that a third reviewer also raised some of the same points. It is therefore surprising to see the paper published in PLOS One, given the journal's emphasis on methodological rigour.
The field of ERP is blighted by the fact that it yields large multidimensional datasets, giving ample opportunity for post hoc selection of measures. This has been accepted practice in the field, and I put my hand up as someone who has in the past adopted this approach. However, there is growing awareness that it fills the literature with non-replicable findings. Kilner (2013) has reported simulations that demonstrate how this approach readily yields false positives. This difficulty is compounded if the goal of the research is to relate the ERP to behavioural measures, and a large battery of tests is administered from which a subset is selected on the basis that they give significant results. I agree with Jon Brock that the ERP measure used by the authors in Figure 2 is not the obvious choice for a predictor, either on a priori grounds or on the basis of results from Phase 1. The subdivision of the autism group using a median split on one of the many available measures in Phase 1 is also questionable.
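To make the false-positive point concrete, here is a minimal sketch of what happens when the best of several candidate measures is picked after the fact. It is not Kilner's simulation. It assumes the measures are independent and that no true effect exists, so each p-value is uniform on [0, 1]. The function names and the choice of 1, 5 and 20 candidate measures are purely illustrative.

```python
import random


def analytic_familywise_rate(n_measures, alpha=0.05):
    """Chance that at least one of n independent null tests has p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_measures


def simulated_familywise_rate(n_measures, alpha=0.05, n_studies=20000, seed=0):
    """Simulate studies with no true effect, keeping only the smallest p-value.

    Assumption: under the null each p-value is uniform on [0, 1] and the
    measures are independent. Real ERP measures are correlated, so the
    inflation in practice will differ from this idealised case.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # Post hoc selection: report whichever measure looks best.
        best_p = min(rng.random() for _ in range(n_measures))
        if best_p < alpha:
            hits += 1
    return hits / n_studies


if __name__ == "__main__":
    for k in (1, 5, 20):
        print(k, round(analytic_familywise_rate(k), 3),
              round(simulated_familywise_rate(k), 3))
```

Under these assumptions the nominal 5% error rate applies only when a single measure is fixed in advance; with 20 candidate measures, roughly two studies in three would report a "significant" result even though nothing is there.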
The authors use the words 'reliable' and 'robust' to describe the associations that they report, but this does not seem justified. ERP indices are usually unreliable in individuals, particularly in young children, where the signal-to-noise ratio is often low and only a small number of trials can be averaged. If the authors could show that their ERP measure gives consistent results on repeated testing, this would greatly increase confidence in the findings, though we would still need to see p-values of correlations adjusted for the number of statistics computed.
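As an illustration of what adjusting p-values for the number of statistics computed can look like, here is a minimal sketch using Holm's step-down procedure. The comment does not name a specific correction, so Holm is an assumption chosen as one common option; the function name and the example p-values are hypothetical and are not taken from the paper.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values, in the same order as the input.

    The smallest p-value is multiplied by m, the next by m - 1, and so on.
    A running maximum keeps the adjusted values monotone, and each value
    is capped at 1.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values from four correlations (illustrative only).
    raw = [0.01, 0.04, 0.03, 0.20]
    print(holm_adjust(raw))
```

In this made-up example, three of the four raw p-values fall below 0.05, but only the smallest remains below 0.05 after adjustment.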
I appreciate that studies like this are extremely hard to do: it can take many years to recruit even a small sample of children, and to persuade families to stay with the study through to follow-up. The goal of identifying early predictors of outcome is important both theoretically and clinically. However, anyone who has worked in this area knows that data from ERP paradigms are usually messy, and clear-cut associations between an ERP index and behaviour are the exception rather than the rule, particularly when, as in this case, statistical power is very low.
Kilner, J. M. (2013). Bias in a common EEG and MEG statistical analysis and how to avoid it. Clinical Neurophysiology(0). doi: http://dx.doi.org/10.1016...
RE: Comment from another reviewer of this paper
Are you saying your concerns were ignored by the editor? In a way worse than has happened to you in conventional (not large-fees-charged-to-author) journals?
RE: RE: Comment from another reviewer of this paper
Yes, it's clear that various methodological concerns were ignored by the editor, though, as Jon Brock pointed out, the paper's path through the editorial process was prolonged and difficult, with a change of editor in the course of it. The reviewers were not, as I recall, recommending rejection, but they did want the authors to be far more cautious in their claims.
I should add, though, that I don't think that reviewers are always right, and so I don't think an editor countermanding reviewers is necessarily a bad thing. But, given that at least two reviewers had similar concerns, I felt it was important that readers should be aware of the criticisms, so they could judge for themselves how serious they are. It is a good feature of PLOS One that it does allow for this kind of post-publication commentary.
Reflecting on the reasons for the disagreements, I think there are two factors at play in this case. First, if you have very eminent authors arguing against reviewers, it can be difficult for an editor to decide who is right. Second, as the Kilner paper I cited notes, in the field of ERP research, post hoc data exploration is accepted practice, and so it might seem unduly harsh to object to it, when everyone is doing it. However, this is how bad practice becomes entrenched, which is why I think it is important to take a stand. The key question is whether the results reported in this paper are replicable - and given that there is a suggestion that they might be used to give a prognosis to young children with autism, that is not just of scientific interest.
I've been doing some data simulations of my own, and I hope in future to write more about this. What I've done to date bolsters my concern about a high rate of false positives arising from unconstrained data analysis - see Ioannidis (2005) p e124 (corollary 4).
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124. doi: 10.1371/journal.pmed.0020124