Research Preprints

Discussing early research outputs

Whither preprints?

June 15, 2017 mrittman

In recent times there has been a proliferation of preprint servers and a much larger uptake on the part of authors. Few objections have been raised to this trend and I wonder whether this is because almost everyone in the scholarly bubble sees preprints as fitting into their own world view: traditionalists see them as no threat to journals and as supportive of the editorial process. Those seeking change see the potential for preprints to replace journals, or at least greatly alter the status quo.

In this post I want to spell out some of the possible future scenarios. The most likely future, of course, is that no single one of these visions will dominate: different disciplines will come to their own conclusions and it should be up to research communities to decide.

The main issue in looking at different scenarios is how one moves a piece of research from the tentative/draft phase into the corpus of accepted literature, or whether such a distinction is even necessary.

Here are four possible scenarios for preprints in the future.

The status quo

Here, journals stay as the guardians of accepted research and operate as they do currently. Preprints are a tool for getting early feedback and making some results known ahead of time but are viewed as very much inferior to journal articles and not considered essential to the publication process.

This situation seems to prevail for physicists, even though they regularly post to arXiv: journal publication after peer review is still very important, especially when it comes to promotion and funding. Despite recent moves for acceptance of preprints in grant applications, e.g. by the NIH, there haven’t been similar announcements about job applications or promotion. The attitudes of universities and other research institutions are probably critical for moving away from the status quo or maintaining things as they are.

Preprints plus journals

In this scenario journals continue as they are but preprints are also recognised as first class research objects. Preprints can be cited, used in hiring and promotion decisions, and included in grant applications. They are read with a healthy amount of scepticism but cited where appropriate, and they fix the first reporting of research results (i.e., they provide scoop protection).

From what I can gather, this is the aim of ASAPbio. Gaining recognition from a broad range of institutions is a key element here and the major difference between this and the previous scenario. For publishers, journal publication goes on as usual, but it could lead them to modify how they solicit articles and give options for slower, more thorough peer review (see, e.g. this post from Kent Anderson at the Scholarly Kitchen).

Preprints disappear

It is not a scenario I hope for, but it is possible that in some fields preprints will never catch on. Either not enough is communicated about their benefits, or there may be specific areas with a specific objection. A field (although I struggle to think of one) where speed of publication is not important stands to gain less from wider use of preprints. Some may argue that fields where individual papers can have very large impact may require greater validation before making research public. Having to put the lid on scandals like the vaccine-autism link controversy on a regular basis could cause headaches for scientists. There are good reasons I don’t think it would play out like that, but I’ll leave them for another discussion.

The main way to avoid this scenario is clear articulation of benefits and proper engagement with reasonable objections to preprints. So far, the discussion I’ve seen has been mostly high quality and polite; long may it remain that way!

Overlay journals

A compelling argument is the idea that peer review is costly and inefficient and preprints offer a low-cost and effective alternative. Some kind of validation, such as community-organised peer review, can be made after a preprint has been put online. This has already happened (e.g. Tim Gowers’ Discrete Analysis), and Andrew Gelman has proposed a super-arXiv overlay journal.

It is, of course, unlikely that publishers and editors would readily agree to such a move and a significant cultural shift needs to take place for this eventuality to prevail, even within a single field. The main objection from publishers is about protecting income: a shift from the $5000 per article average income now to something around $100 would put almost every publisher out of business based on current practices, for-profit and non-profit alike. However, some fields with limited funding and where fee-based open access is viewed with scepticism may find this a very attractive proposal.

A step further questions the necessity of journals at all. Why does the opinion of two or three reviewers and one editor provide a solid validation of research? Peer review has been frequently questioned, but this option would require a new way to validate and test results. I am not aware of any serious proposals in this direction, but I think it’s a space well worth watching in the coming few years.


Which is the best of these scenarios? Unlike the debate on open access, I don’t think this needs to become binary and polarized. Simply posting a preprint does not favour any of the above: the main differences are about how the rest of the research ecosystem values and validates a preprint. It is certainly possible that multiple scenarios can exist side-by-side. It is an exciting time for preprints and I hope that discussing these kinds of options moves higher up the agenda of decision-makers in scholarly research communication.

Open Preprints

May 19, 2017 mrittman

One of the things that most surprised me when putting together the list of preprint servers for this site is that a large number don’t explicitly list any licensing or copyright information, and some routinely use very restrictive licenses. Coming from a publishing background, this was very surprising.

Subscription publishing relies on content ownership and enforcement of strict copyright conditions: If the publisher doesn’t own the copyright, the articles could be distributed by anyone. The open access movement turned this logic on its head. Someone (often the authors or their funder) pays for value added services including peer review, copy editing, hosting and distribution, but anyone is allowed to distribute the final article. The copyright and licensing terms are among the most significant features distinguishing open access and legacy publishers.

Licensing for open access is very important. Most open access publishers have gravitated towards the Creative Commons licenses, and in particular CC BY. Although there are differing views, simply being free to read is generally accepted as insufficient for open access. It also requires rights for reuse, in whole or in part. This means that for the strictest definitions of open access, not even all Creative Commons licenses are sufficient.

The preprint paradigm developed before open access and at the very beginning of the internet, when licensing conditions were not such a contentious area. As a result, many preprints were not, and still aren’t, open access compatible. If no terms are stated, as with a number of preprints I’ve seen, the default is an all rights reserved license, which means distribution and reuse are not permitted.

For authors, the lack of open access for preprints doesn’t matter much and I don’t believe many know or care a great deal about it. I have never received a request to use a license other than CC BY for a preprint, and only on about three occasions in the last four years for open access journals (over about 70,000 articles). Most authors I speak to like the ideals of open access, even if they have issues with how it works in practice. It is the rest of the research community (ironically including many of the same people) that stands to lose out. Data mining, especially, becomes a legal minefield if reuse rights are not clear. Use of figures in lectures, blogs and journal articles is problematic. Simply sharing a copy with a few colleagues could be illegal. The tragic recent case of Diego Gomez shows that this is not just a hypothetical argument. The goal for preprints to widely disseminate work is limited by the lack of a clear license. Even more, if the increasing use of preprints has aspirations to be seen as part of the push for open science, the current haphazard approach to licensing isn’t going to work.

Some have justified offering a range of licenses on the grounds that it makes preprints more inclusive and the time will come for moving forward on this issue. I think this overestimates the risks and underestimates the benefits of open access, and I am yet to see a timescale for the transition. Others seem to be unaware of the issue: I recently saw a definition declaring that all preprints are open access. To reach the full potential of preprints, they should be in step with open access and aim to be fully integrated into the growing calls for open and transparent science. At the very least, I would challenge those advocating for the use of preprints to decide which side of the fence they sit on.

Where do we go from here?

May 4, 2017 mrittman

In recent times there has been a proliferation of preprint servers and much larger uptake by researchers, particularly in biology. Few objections have been raised to the concept of preprints and I wonder whether this is because most see preprints as reinforcing their own position: traditionalists see them as no threat to journals, and as supportive of the editorial process by providing feedback ahead of submission. Those at the more radical end of the spectrum see the potential to overthrow the system and ask why we need journals any more.

In this post I want to lay out four possible future scenarios for preprints. Most likely, as currently, different disciplines will take different routes and all of the scenarios may co-exist. This is not a debate that needs to polarise and finish at one end point. It is up to research communities to decide what works for them.

The main issues up for discussion are how to move a tentative piece of work into the corpus of accepted literature, and how preprints fit into the research cycle. Here are four possible scenarios:

1. The status quo

Here, journals stay as the guardians of accepted research and operate as currently. Preprints are a tool for early announcement that bypasses the often slow review and editorial process, and allows researchers to get early feedback on their work. This is more or less how physics has worked for a long time. In fact, I have heard it said that there is no great need for open access physics journals as everything is on arXiv.

This is the likely scenario for the immediate future as it doesn’t rock the boat. Researchers are generally conservative about changes to publishing and uptake of preprints in new disciplines is likely to be slow.

The status of preprints here is somewhat below journal articles. They are a kind of untested grey matter. This scenario places a great deal of faith in the efficacy of peer review and editorial decision-making, which has been much criticized. It seems a missed opportunity if preprints do not contribute to assessing research at least to some extent.

2. Preprints disappear

This is a scenario that I don’t really want to think about, but is a possibility. If there is a lack of widespread uptake of preprints and they are not recognised as valuable by funding bodies or in research assessment, they will become a burden to researchers and will gradually disappear: no new preprints will be added. The current signs point against this scenario, but only in certain fields. It is possible that preprints will never gain traction in other fields. There may be arguments against preprints, for example where clinical recommendations or patent applications are involved.

The way to avoid this endpoint involves some lobbying of influential organisations, as well as reaching a critical mass of use for preprints. Tangible benefits for moderate effort need to be demonstrated.

3. Overlay journals

An overlay journal is one that directly publishes preprints. The editorial process takes place once the preprint is online.

In this scenario, the lines between preprints and journal articles become blurred. There is an editorial process, but it focuses on assessing the preprint directly, not a separately submitted piece of work. F1000 are essentially running this system, and Tim Gowers has set up the journal Discrete Analysis as an overlay on arXiv.

This approach has the potential to dramatically reduce the cost of publishing, and bring transparency to the process and more control to authors. It also has the potential for established publishers to take control of the preprint process and tie authors into their platform. Which preprint server to post on could become almost as agonizing as which journal to submit to. I suspect there will also be unintended consequences when preprints start to be used to assess individual performance and metrics are applied.

An advantage of this approach, though, is that it puts preprints more concretely into the research cycle. This would be a good strategy to promote for supporters of open science.

4. No journals

For those opposed to the role of publishers as arbiters and gatekeepers of research dissemination, the idea of getting rid of journals altogether is quite appealing. In this scenario, preprints are published and readers can make their own mind up whether they are any good. Preprints replace journal articles. With flexibility to update preprints at any time, research should become more self-correcting than it is currently. Most researchers already rely on search engines and related algorithms to find work for them, so tagging an article as belonging to a specific journal is obsolete.

On the other hand, how does someone new to the field rate articles in this scenario? Some papers will get a lot of attention, whereas a great deal of incorrect, uninteresting research will be left untouched, unread and wrong. At least with journals every article goes through a checking process, even if it has some flaws.

To conclude, each of these scenarios has strengths and weaknesses and, as I said above, there is no one-size-fits-all solution. The main question to ask is whether they strengthen research output and lead to effective, creative work.

What is a preprint?

March 16, 2017 mrittman

With the rise of new preprint servers, and especially multiple offerings in the same discipline, some effort should be put into thinking about what it is that makes a preprint a preprint. This post is my take on the issue.

A preprint is about three aspects: content, availability and timing.

Content

The preprints we are concerned with are additions to the research literature. To state the obvious, any work following the scientific method should qualify. The question then is how widely the net should be thrown to include other article types. It should be uncontroversial to include research articles, reviews and essays, which form the backbone of output in science and the humanities. However, the literature includes a lot more: editorials, opinions, comments and so on. In addition, the concept of micropublication has been suggested, i.e. publishing a single part of a traditional paper, such as only the methods, results or discussion. With the current publishing paradigm, one could suggest that anything that could be published in a journal could be made a preprint, but this is unsatisfactory as the role of journals might unexpectedly change and journals have individual policies. It also excludes work at a more preliminary stage. I think it is useful to split the literature into 1) research: hypothesis-driven investigation and 2) grey literature: informed conversation about research (written by researchers). Both could be considered for preprints, but in practice the difference should be highlighted via an assigned article type.


Availability

A preprint should be available to anyone. I would qualify this by saying that open access is desirable but not necessary: A basic definition of a preprint should permit a broad range of copyright and licensing criteria. Would it be acceptable to paywall a preprint? I would argue strongly against this option and it seems quite pointless, but I don’t think it should disqualify something from being classed as a preprint.


Timing

Preprints are about reporting work at the earliest possible stage. The “pre” in the name is there because they come before validation by the research community.

The primary mode of validation currently is peer review and journal publication, but the definition shouldn’t be restrictive. New processes of confirming results could emerge in the future and should be connected to preprints. Grey literature is usually not peer reviewed, so editorial review and publication is sufficient to count as validation.

Should postprints or accepted versions of papers be mixed in with preprints? The difference between a preprint and a postprint should be about when the first version is put online. In practice, it is acceptable to update a preprint with a new version, including a peer-reviewed one. The lack of a fixed end point is, to my mind, a strength of preprints and should be permitted within a working definition. As outlined below, there are situations where it is critical to know if something has been peer reviewed, so there should be a differentiation between the two.


In summary, I would define a preprint as a piece of research made publicly available before it has been validated by the research community.

Not to be confused with the above, and a topic for a future post, is the question of what a preprint server is. Not all preprints appear on a preprint server and not everything that appears on a preprint server is necessarily a preprint.

Does it matter?

A number of preprint servers don’t publish preprints strictly according to this definition, for example by allowing publication of an abstract without full text, or permitting uploads of post-prints or accepted versions. I don’t think most scholars care a great deal about this, but it is important in some circumstances. For preprint aggregators, funders, journalists, medical practitioners and in research assessments it is much more important to know what has not been peer-reviewed and whether something is simply an opinion as opposed to reporting research outcomes. For this reason, the distinction between research and grey literature, and preliminary and reviewed work should be made clear.

In the spirit of preprints, I would be interested in feedback on the definition and how it can be improved. Please comment or get in touch by some other means.

A List of Preprint Servers

March 9, 2017 mrittman

Last week I put a rough version of the list of preprint platforms live, responding to a request on Twitter from Jessica Polka. I’ve now filled in most of the gaps and put it into a Google sheet, which seems the best way to display the information at present. In the future I aim to use something more fancy that will span the page and can be filtered and sorted.

I hope it will be a useful resource to authors considering options for where to place their preprint and anyone interested in an overview of the state of preprints.

Putting the list together was an interesting exercise and revealing in several respects. Here are a few observations that I made.

Firstly, there are not that many preprint servers: the list runs to 19 at the moment. More than half of those listed (including OSF-based servers individually) started in the last year. When you compare it to the number of journals it is minuscule, even in disciplines where preprints have played a large role for many years. I intend to exclude institutional repositories, of which I suspect there will be a great many that post preprints. There are already lists of them elsewhere and authorship is limited to those affiliated with the institution.

A major lack in most preprint servers is long-term archiving. Excepting those based at CERN, I only found one with a statement about archiving on their website (CORE from MLA Commons). This should be a high priority for those operating preprint platforms, but there appear to be few clear solutions at present.

Also lacking is a business model that does not rely on backing by one or a handful of bodies. SSRN uses a model where institutions or readers pay for extra services. Authorea charges for use of their platform (although there is a free option). Funding from a larger organisation is fine, as long as institutions are willing to pay in the long term, but it relies to some extent on good will and some servers will likely look at alternative models in the coming years.

The background to preprint servers is varied, arising from libraries, publishers, societies, author services etc. Each puts an emphasis on different aspects and the rigour in submission checks, licensing information, information for authors, inclusion of non-preprint material and so on varies. In my experience, most authors don’t particularly pay attention to these aspects, but they may play a role in integrating preprints more formally into research evaluation. Funders and universities do care more about the details. A discussion on basic requirements for preprints and an interest in whether a consensus can be achieved was one of the motivating factors for my setting up Research Preprints.

Finally, the pervasiveness of the PDF is evident. The convenience for publication and human reading wins hands down. At the moment other formats don’t get a look in, which is fine in the short term. In the longer term, this could pose a significant challenge for text and data mining, especially when format is so varied.

Making Preprints Popular

February 23, 2017 mrittman

One of the points I labelled as critical for preprints in my previous post was that they should be quickly adopted by various disciplines. Putting an e-print on arXiv is normal in a number of disciplines. On the other hand, while increasing numbers of preprints are being made available in other disciplines, for example biology, they remain a small fraction of the overall number of papers published in the field.

Advocates of preprints should aim for the numbers to expand quickly. If use of preprints is not rapidly normalized and they are ignored by the majority of researchers it will become ever harder to drive continued interest. What strategies could advocates of preprints use and what are the end goals? This post focuses mainly on the former. There are, of course, different options. Here are a few strategies that I think are viable.

Option 1: Field-by-field stakeholder adoption

Resources can be focused on one field, be it biology, engineering, physical chemistry etc. Widespread adoption can be achieved via buy-in from a relatively small number of important stakeholders. On the other hand, resistance from just one of these groups could cause doubt and confusion and stall the process. The strategy here, which ASAPbio seems to have followed, is rapid take-up in a short space of time and to incorporate preprints into the research infrastructure through acceptance by funders, publishers and institutions. This strategy can be thought of as a top-down approach where the involvement of organizations is key to persuading researchers of the acceptability of preprints.

Option 2: Broad adoption

In this strategy, an increasing minority of researchers from multiple disciplines start posting preprints. This is really a bottom-up approach, with change driven by the habits of individual users. It is immediately less disruptive than option 1, but over time institutions will need to find a way to incorporate the needs of those making use of preprints, particularly when it comes to assessing impact and citations. There is a risk for those who use preprints if the pace of adaptation is too slow, however, as their efforts to preprint would not be recognised. They could end up with a significant chunk of their work being discounted.

Option 3: Bring out the big guns

A few highly influential individuals and/or institutions showcase the use of preprints or make use of them and demonstrate the benefits. Rather than a long-term strategy, this is a kick-starter to get others involved and get preprints on the agenda. It’s a big carrot for others to look at and follow their lead.

The reality is that a combination of options is likely to be followed. I’d be interested to hear from early adopters of preprints in physics as to what the main drivers were.

These strategies could plausibly apply for a number of new ideas, but what issues are particular to preprints?

First, preprints have a base to start from and examples to follow. They have proven influential in several disciplines and new fields should learn from arXiv, SSRN and others.

Second is the question of how disruptive preprints will be to established systems. Currently they work alongside the normal publishing process. There are future scenarios where preprints become so important that journals are less vital than today, or even irrelevant. Is that possible or desirable? Could preprints even enhance and add value to journal publications? Are there any unintended consequences of preprint/journal interactions?

There are many who see the current preprints boom as a positive step, but it is worth considering what comes after the first step and where the destination should be. I suspect there are differing views and I would very much welcome them in comments below.

Welcome to Research Preprints

February 16, 2017 mrittman

Welcome to Research Preprints. The aim of this site is to be a collaborative space to discuss ideas about how preprints can contribute to dissemination of research and be integrated into current practices.

And it needs your input.

I am looking for contributors. The platform should present views from researchers, those running preprint servers, publishers, librarians, indexing services, members of industries and services that rely on scholarly output and anyone else who feels they are a stakeholder. One-off and regular contributors are welcome. If you have a proposal for a post, please get in touch via the contact page.

I also invite you to comment on posts. Discussion should focus on the issues at hand and be constructive. Differing points of view are encouraged, but should be expressed with respect. Almost all participants in the discussion have the same goal: creation and dissemination of high quality research by the best possible means.

For those running preprint servers or aggregators, there is a page of this site dedicated to listing these services. Please provide a short description (up to 150 words).

The idea of putting research papers online before peer review is not new; however, the last 12 months have seen a growth in interest from fields outside its traditional strongholds of maths, physics, business and economics.

Expansion in the adoption of preprints raises questions, though: What is a preprint, anyway? What kind of status should preprints have in relation to the peer-reviewed literature? Are preprints a stepping stone on the way to a peer-reviewed article or could they become a new way of assessing and testing research? How sceptical should a reader be of what they find in a preprint? What are the risks associated with making preliminary work widely available?

There are questions about implementation: Should there be minimum criteria for a preprint server? How can preprints be run sustainably and could they be financially independent? Is there a business model for preprints, and should there be?

There are also issues around acceptance: How do preprints fit into the research ecosystem? What benefits can authors take from preprints and do they increase or decrease the quality of reported research?

These are just a few questions this platform has been created to address and I am looking forward to finding out some of the answers.