A potted history of peer review - Part 2
So far we have dived into the historical origins of the peer review concept and the earliest attempts at an external referee system used by the Royal Society. Even in this early exploration, we have encountered issues with peer review that persist today.
The first part of this series concluded with the beginnings of peer review, but we still hadn’t arrived at peer review as we know it in the 21st century. So, what exactly are we talking about when we say peer review? Traditional peer review involves 2-3 external “peers” who provide anonymous reports that are, for the most part, shared only among the reviewers, editor and authors. These peers are primarily researchers who run research groups, are prominent in their fields and are largely located in Western Europe and North America.
World War 2, the 1970s and a Congressional showdown
Prior to World War 2, the use of external referees was an optional practice for both funding bodies and journals, and organisations that did not engage in it suffered no real reputational harm. At this time, journals were primarily run by scientific societies, often at a loss. After the War, science received a boom in funding, and with it the communication of science rapidly expanded. One of the key figures in the expansion of journals was Robert Maxwell, who is largely responsible for the creation of the current journal system as a highly profitable business - and who will be the topic of a future article.
With such significant growth, internal refereeing was becoming increasingly burdensome for editors. In the 1950s, for example, the editorial board of Science complained that “refereeing and suggesting revisions for hundreds of technical papers is neither the best use of their time nor pleasant, satisfying work”. The journal soon adopted external refereeing, outsourcing this role to uncompensated peer experts.
However, despite this, peer review was still not common practice during the 1950s and 60s. In the US, the significant increase in government-funded science attracted greater public attention and fostered the notion that science should be held accountable to members of Congress. This came to a head in the 1970s, when three members of Congress (two Republicans and one Democrat) began to publicly attack grants awarded by the NSF as frivolous and a waste of taxpayer money. This poor decision making, they argued, justified greater Congressional oversight of which grants were funded. In 1975 a hearing was called to examine the NSF’s grant review procedures. The NSF called numerous scientists to defend its procedures, almost all of whom now described refereeing as indispensable for science. The NSF thus successfully resisted significant Congressional oversight, securing autonomy for science so long as it was assessed by peers.
By the 1970s, the term peer review had become the dominant term for refereeing in the US. This new emphasis on peer review was quickly copied by journals, and opinion began to shift away from relying on editorial judgement towards systematic peer review. As this occurred, peer review came to be viewed as a system that ensured quality and trustworthiness in science. The journal Nature adopted peer review for all articles in 1973, although this was not without detractors. The Nobel prize-winning biologist Max Perutz stated that “as all papers sent to Nature are checked by members of the board, peer review is unnecessary”. Indeed, papers as prominent as Watson and Crick’s description of the DNA double helix had previously been published in Nature without peer review. Even as late as 1989, peer review was not standard across all journals. Editors of The Lancet, a British journal, were proud that “reviewers [were] advisors not decision makers”.
The future of peer review: change or same old?
Fast forward to 2025 and peer review is firmly commonplace. It is also faltering under the strain of an exponentially growing scientific literature. Since the 1970s, numerous studies have investigated peer review: its efficacy, its equity and its cost. There has even been a Cochrane review, which concluded that there is limited evidence for the effectiveness of peer review in improving biomedical manuscripts. Since that review, studies comparing preprints with their peer-reviewed versions have repeatedly found very limited differences between the two. AI is further exposing the limitations and failure points of the current peer review system.
So, if peer review is a failed experiment, one that neither substantially improves studies nor protects the literature from poor science, what does the future hold?
There is a range of experiments focussed on improving peer review. Some target the process itself, some utilise AI or crowd approaches, and others aim to improve the efficacy of peer review. There is also a growing effort in post-publication sleuthing, termed forensic scientometrics.
Experiments that focus on the process of peer review often review preprints exclusively and function as independent services. Many of these aim to increase transparency and reduce some of the inefficiencies of the peer review system. eLife, for example, is a proponent of the Publish, Review, Curate (PRC) model, in which all peer review occurs post-publication. In addition, eLife transparently shares all peer review reports, an approach remarkably similar to the original efforts and intentions of William Whewell back in the 1830s. Other examples of independent peer review services include Rapid Reviews, Review Commons, MetaROR and Peer Community In. These services review preprints and provide a preprint-review “package” that can be used by journals, reducing the time it takes for authors to receive decisions and speeding up publication. A significant issue with many of these efforts is that they improve only the process of peer review, not its efficacy or the ability of reviewers to detect fraud or gross defects. Additionally, the PRC model is highly similar to traditional publishing, with only minor improvements, which means that many of the problems with publishing are replicated in this model.
Other experiments in peer review look to expand who can perform it. The shift to the term “peer reviewer” also heralded a shift in who could perform an acceptable review: largely those running research groups, the “peers” of a manuscript. In recent years, however, some initiatives have challenged this notion and increased equity in the peer review process. One example is PREreview, a community-focused organisation that empowers early career researchers to publicly share reviews of preprints. Another is utilising a “crowd” of reviewers rather than the 2-3 reviewers the traditional system relies on; this crowd approach was pioneered by the journal Synlett.
So is the future of peer review much the same, or are changes coming? The answer is most likely a mix of both: an improved peer review process that is more efficient and streamlined, but one that does not improve the efficacy of peer review or protect the literature from poor-quality or fraudulent studies. For that, we need a much broader range of trust indicators (a discussion for a future blog post).