Reader Comments

Publishing everything is fundamentally suboptimal for scientific progress

Posted by jcfdewinter on 21 Jan 2014 at 11:06 GMT

Joost C. F. de Winter & Riender Happee

We highly appreciate the commentary article by Van Assen et al. [1] and we are honoured that they reviewed and extended our simulations.

Van Assen et al. conclude that “publishing everything is superior to selective publishing”, a conclusion which is neither new nor unexpected. Numerous writers have emphasized that selective publication has to be eliminated, and a motto of PLOS ONE is to publish everything that is technically sound.

Let us first explain ‘publishing everything’ versus ‘selective publication’ from a broad perspective. Suppose there is a room of 1,000 people, and you would like to gauge the people’s opinions on a particular subject. The persons in the room can use a microphone to state their opinion. What would be the most effective way to get a reliable average: (1) let 40 randomly selected persons state their opinion one by one, or (2) let someone with an extreme view speak, then let another person speak who strongly disagrees with the first person, let a third person speak who strongly disagrees with the first two, and so forth, until 40 people have given their opinion? The first approach (i.e., publish everything) will result in many similar views, whereas the second approach (i.e., selective publication) will yield extremely opposing views (as well as an exciting evening for the audience). The second approach will lead to more rapid convergence because a self-correcting mechanism is at work. Proponents of the first approach may argue that the sequential use of a microphone creates an artificial scarcity (cf. [2]) and that it is preferable to gather the opinions of all 1,000 persons in the room. However, this would be a tedious procedure and could never work perfectly, because there will always be people who do not want to share their opinion.

Popper [3] stated that “there is something like a law of diminishing returns from repeated tests” (p. 240). High-school classroom replications of Newton’s Principia are usually not published in the scientific literature, because scientists and editors aspire to publish new results. This is also apparent in our simulations [4], which showed that the probability of publication diminishes when a phenomenon becomes more firmly established. Our selective publication approach was inspired by a step response (http://en.wikipedia.org/w...), which is qualitatively analogous to the Proteus phenomenon of alternating research findings. According to our simulations, publishing ‘different’ results can be effective for the cumulative growth of scientific knowledge [4].

Let us now look at the simulations themselves. Van Assen et al. reproduced our results while applying a random-effects instead of a fixed-effect meta-analysis (the latter simply averages the observed effects). Their approach hardly affects the publish everything approach, but yields wide confidence intervals for the selective publication approach. Van Assen et al. “applied random-effects meta-analysis, because random-effects meta-analysis is generally recommended when the underlying population effect may be heterogeneous”. The debate about random versus fixed effects meta-analysis has a long history. The PRISMA guidelines (which are also adopted by PLOS ONE) state: “There is no consensus about whether to use fixed- or random-effects models, and both are in wide use.” [5].
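To make the difference between the two estimators concrete, here is a minimal sketch in MATLAB/Octave; the effect sizes and standard errors below are purely illustrative and are not taken from either set of simulations:

    % Fixed-effect vs. DerSimonian-Laird random-effects estimate for k observed
    % effects d with standard errors se (illustrative numbers only).
    d  = [0.9 -0.3 0.8 -0.2 0.7];   % alternating, 'Proteus-like' published effects
    se = [0.2  0.2  0.2  0.2  0.2]; % their standard errors
    w  = 1 ./ se.^2;                % inverse-variance weights
    E_fixed = sum(w .* d) / sum(w); % fixed-effect estimate (weighted average)
    k = numel(d);
    Q = sum(w .* (d - E_fixed).^2);         % Cochran's Q
    C = sum(w) - sum(w.^2) / sum(w);
    tau2 = max(0, (Q - (k - 1)) / C);       % DerSimonian-Laird between-study variance
    w_re = 1 ./ (se.^2 + tau2);             % random-effects weights
    E_random  = sum(w_re .* d) / sum(w_re); % random-effects estimate
    se_random = sqrt(1 / sum(w_re));        % its standard error (wide when tau2 is large)

With equal standard errors the two point estimates coincide, but the random-effects standard error grows with tau2; this is exactly why Van Assen et al. obtain wide confidence intervals under selective publication.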

When thinking critically about random-effects versus fixed-effect meta-analyses, one has to conclude that they are not competing but rather different ways of summarizing data [6]. Hedges and Vevea [6] state: “There may be situations in which the fixed-effect analysis is appropriate even when there is substantial heterogeneity of results (e.g., when the question is specifically about a particular set of studies that have already been conducted).” Selectively published research findings are erratic (i.e., the Proteus phenomenon) and have a highly bimodal distribution, because effects that are close to the true effect remain in the file drawer, whereas effects that deviate from the true effect are likely to be published [4]. Van Assen et al. are aware that the observed heterogeneity is a side effect of selective publication, as they state that “the heterogeneity under selective publishing does not reflect heterogeneity of the underlying population effect size, but is an artifact of selective publishing”. The fact that artifactual variance is present can be known beforehand, because meta-analysts and journal editors are aware that selective publication exists. It is not clear why Van Assen et al. use an underpowered random-effects model that assumes a normal distribution, when the observed results are extremely non-normally distributed and the method is known to severely overestimate the true between-study variance. Van Assen et al.’s approach is akin to applying a t-test to data known to contain severe outliers, and then arguing that the visible effect is not statistically significant because the variance is so high. Why not use the ‘best’ test in terms of Type I and Type II errors?

Our simulations as well as those by Van Assen et al. used a constant true effect of d = 0.3. Although Van Assen et al. introduced random-effects meta-analysis, they did not actually test what happens when the true effect varies from study to study. It is interesting that when the true effect is variable, the selective publication approach automatically yields a higher publication rate than when the true effect is constant. For example, if the effects have a mean of 0.3 and a standard deviation of 0.3 (i.e., simply change line 16 of our code from d=Etrue+randn/se into d=Etrue+0.3*randn+randn/se, and keep everything else the same), then not 1 in 20 studies, but 1 in 2.4 studies is published. At the 40th publication, the SD of Emeta for the publish everything approach is 0.052 and the SD for the selective publication approach is 0.043. That is, selective publication gives results that are closer to the average true effect, at a reasonable cost (40 studies for publishing everything versus 95 studies for selective publishing).
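For readers without access to our script, the experiment described above can be paraphrased as follows (MATLAB/Octave). This is a minimal sketch, not the original code of [4]: the significance threshold, the sampling noise term sd_obs (which plays the role of randn/se in the quoted line 16), and the variable names are simplifying assumptions made here for illustration.

    % Minimal sketch of selective publication with a variable true effect.
    % Assumption: a study is published only when its observed effect differs
    % significantly (|z| > 1.96) from the current meta-analytic estimate Emeta
    % (the adaptive null hypothesis), and Emeta is the mean of published effects.
    mu_true = 0.3;  sd_true = 0.3;   % variable true effect, as described above
    sd_obs  = 0.2;                   % assumed sampling SD of an observed effect
    npub_target = 40;
    published = [];  nstudies = 0;  Emeta = 0;
    while numel(published) < npub_target
        nstudies = nstudies + 1;
        Etrue = mu_true + sd_true*randn;       % study-specific true effect
        d = Etrue + sd_obs*randn;              % observed effect
        if abs(d - Emeta)/sd_obs > 1.96        % sufficiently 'different' result
            published(end+1) = d;
            Emeta = mean(published);           % update the adaptive null hypothesis
        end
    end
    fprintf('1 in %.1f studies published; Emeta = %.3f\n', ...
        nstudies/numel(published), Emeta)

Running such a sketch many times, and comparing the spread of Emeta across runs with that of a plain publish-everything average, reproduces the kind of comparison reported above.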

We are pleased that Van Assen et al. acknowledge that our work has three merits:
1) Our model can explain the Proteus phenomenon of rapidly alternating research findings.
2) Our model shows that selective publication yields an accurate estimate after only a few publications.
3) Our model demonstrates how biases occur when the null hypothesis is not adapted.

We would like to add a fourth merit:
4) Our simulations illustrate that scientific publication can be described as a dynamic process, where researchers attempt to falsify each other’s work.

We reiterate that selective publication has certain benefits, but in situations where data are scarce and expensive, it is probably more defensible to publish all studies. We also believe that selective publishing will not work in the current scientific climate of many research fields. As long as pressure exists to report only positive effects and to ignore subsequent refutations, rules and procedures (such as pre-registration of all trials) will be needed to prevent null effects from disappearing into the file drawer. However, we believe that such rules and procedures are patchwork, as they do not address the root causes of research bias.

In summary, we offered a new view on the dynamics of scientific publishing, and explained the counter-intuitive idea that selective publication can yield a robust and accurate effect size estimate if scientists and editors continuously adapt their null hypothesis according to the state of the art [4]. Van Assen et al. used a random-effects meta-analysis to cope with heterogeneous effects, but in our view this choice is not well substantiated. Selective publication actually performs better when true effects are variable rather than constant.

We hope that this exchange has stimulated researchers to look at the broader picture of selective publication versus publishing everything.


References
[1] Van Assen MALM, Van Aert RCM, Nuijten MB, Wicherts JM (2014) Why publishing everything is more effective than selective publishing of statistically significant results. PLoS One 9: e84896. doi: 10.1371/journal.pone.0084896
[2] Young NS, Ioannidis JP, Al-Ubaydli O (2008) Why current publication practices may distort science. PLoS Med 5: e201. doi: 10.1371/journal.pmed.0050201
[3] Popper KR (1963) Conjectures and refutations: The growth of scientific knowledge. London: Routledge.
[4] De Winter J, Happee R (2013) Why selective publication of statistically significant results can be effective. PLoS One 8: e66463. doi: 10.1371/journal.pone.0066463
[5] Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gøtzsche PC, et al. (2009) The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: Explanation and elaboration. PLoS Med 6: e1000100. doi: 10.1371/journal.pmed.1000100
[6] Hedges LV, Vevea JL (1998) Fixed- and random-effects models in meta-analysis. Psychol Methods 3: 486–504. doi: 10.1037/1082-989X.3.4.486

Competing interests declared: We have published results that are disputed in this work

RE: Publishing everything is fundamentally suboptimal for scientific progress

hgremmels replied to jcfdewinter on 01 Feb 2014 at 13:42 GMT

Firstly, I’d like to congratulate both author groups (i.e., JCFdW & RH as well as MvA, RvA, MN & JW) on their interesting papers, and I would like to commend JCFdW & RH for daring to take an unorthodox position.

Not being particularly experienced in this subject myself, I struggle to see ‘real-world’ scenarios where the findings of JCFdW and RH would apply, i.e., where selective publication would be advantageous. To take their example of a room full of 1000 people, JCFdW and RH argue that it would be more efficient to let a few extreme (and contradicting) opinions be heard than to take a random sample. While I agree with this, it does require either prior knowledge of all opinions or a prior interrogation step (before access to the microphone) to sample the opinions in the room. If such an interrogation step is included, figure 5 in JCFdW & RH shows that already ca. 37 interrogations would be required to identify 4 diverging opinions. This would only be advantageous if the relative cost of interrogation (or of conducting a given experiment) is very low compared to the cost of speaking (publishing), as was already hinted at by MvA, RvA, MN & JW.
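To make this trade-off explicit, a back-of-envelope sketch (MATLAB/Octave) could compare how many publications a fixed budget buys under each regime; all cost figures and the budget below are hypothetical placeholders, not data from either paper:

    % Hypothetical cost comparison: publications afforded by a fixed budget.
    c_exp = 2000;    % assumed cost of one study ("interrogation")
    c_pub = 4000;    % assumed cost of one publication ("speaking")
    r = 37/4;        % studies per publication under selective publishing (my reading of figure 5)
    B = 1e6;         % assumed total budget
    n_pub_everything = B / (c_exp + c_pub);    % every study is published
    n_pub_selective  = B / (r*c_exp + c_pub);  % most studies stay in the file drawer
    fprintf('Publish everything: %.0f publications (= %.0f studies)\n', ...
        n_pub_everything, n_pub_everything)
    fprintf('Selective: %.0f publications (from %.0f studies)\n', ...
        n_pub_selective, r*n_pub_selective)

Whether the fewer, ‘more informative’ selective publications outweigh the experiments left unpublished then depends on how cheap an interrogation is relative to speaking, which is the condition stated above.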

It seems to me that the metaphor carries over to academia, and that selective publishing would only be advantageous if a) there is already a large number of known experimental outcomes (perhaps from high-throughput studies), or b) publishing is very costly compared to conducting the actual experiments. I wonder if the authors can think of applied scenarios where the above would hold true? Are the costs of preparing a manuscript (salaries, computer costs, publication fee (1)) known per discipline?

Again, my compliments to the authors for starting an interesting discussion,

Sincerely,

Hendrik Gremmels

1. R. Van Noorden, Nature 495, 426–429 (2013).

No competing interests declared.

RE: RE: Publishing everything is fundamentally suboptimal for scientific progress

jcfdewinter replied to hgremmels on 08 Feb 2014 at 21:52 GMT

Thanks for your comment.

The costs of publishing are high.

As you know, PLOS ONE charges $1350 to publish an article. Van Noorden (2013) reports that the average cost to the publisher is around $3500–4000 per article. The publication costs per article may well exceed the cost of the research itself (e.g., survey research).

These costs do not even include the time and resources spent by authors, reviewers, volunteer editors, university administration (e.g., repositories, accounting), and optional English language editing services. Indeed, researchers spend a large portion of their time writing papers. As early as 1979, Latour and Woolgar observed that researchers spend more time producing papers than making discoveries. Ioannidis et al. (2010) recently coined the term “writing incontinence”, and lamented that in some cases reviewers cumulatively spend even more time on a paper than the authors themselves.

The number of published articles per year increases exponentially, with “no indications that the growth rate has decreased in the last 50 years” (Larsen & Von Ins, 2010). For example, a search in ScienceDirect reveals that the number of published papers was 493,545, 526,380, 555,342, and 619,130 in 2010, 2011, 2012, and 2013, respectively. How long can such exponential growth be sustained? It may lead to an incredible burden on reviewers and editors, or to an inevitable decline in research quality.
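Based solely on the ScienceDirect counts quoted above, the implied growth rate can be computed as follows (MATLAB/Octave):

    counts = [493545 526380 555342 619130];      % papers per year, 2010-2013
    growth = (counts(end)/counts(1))^(1/3) - 1;  % mean annual growth over three year-steps
    doubling_years = log(2) / log(1 + growth);
    fprintf('Mean annual growth: %.1f%%; doubling time: %.1f years\n', ...
        100*growth, doubling_years)
    % Roughly 8% per year, i.e., output would double about every nine years.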

In conclusion, it seems quite reasonable to question the idea that ‘publishing everything’ is effective. Maximizing the transmission of information per paper seems a plausible alternative.

References

Ioannidis, J., Tatsioni, A., & Karassa, F. B. (2010). Who is afraid of reviewers’ comments? Or, why anything can be published and anything can be cited. European Journal of Clinical Investigation, 40, 285-287.

Larsen, P. O., & Von Ins, M. (2010). The rate of growth in scientific publication and the decline in coverage provided by Science Citation Index. Scientometrics, 84, 575-603.

Latour, B., & Woolgar, S. (1979). Laboratory life: The social construction of scientific facts. Princeton University Press.

Van Noorden, R. (2013). The true cost of science publishing. Nature, 495, 426-429.

Competing interests declared: Joost de Winter is author of:

de Winter, J., & Happee, R. (2013). Why selective publication of statistically significant results can be effective. PLOS ONE, 8, e66463.