A few weeks ago there was some discussion in the Twittersphere about the ethical side of preprints. The focus was on what should happen when a preprint is found to be flawed and whether the same kind of retraction process as for articles can and should be applied.

Partly as a result of this, preprints were put on the agenda of the Committee on Publication Ethics (COPE), which hosted a discussion about preprint ethics and put out a public call for comments.

I was also recently invited to discuss the ethics of preprints at a meeting of the ISMTE along with Jennifer Lin from Crossref. I promised on Twitter that I would share the slides, but in the event it was a Q&A session, so you get this blog post instead.

What’s the fuss about preprint ethics?

Are preprint ethics just a discussion of ‘what-if’ scenarios or hypothetical musings? Not really. Preprints without methods sections have been published, and quickly updated. An author recently forwarded me an email from a journal that had rejected their article, after peer review and production, because it had already appeared as a preprint: a frustrating and entirely avoidable scenario. I also see requests to withdraw preprints for a variety of reasons, mainly from authors but occasionally also from third parties.

An approach to preprint ethics

To get a handle on preprint ethics, I find it useful to put issues into a few categories (this isn’t intended to be an exhaustive list):

1. Issues that are the same for preprints and journal articles: data manipulation, authorship, plagiarism (and self-plagiarism), copyright infringement, salami slicing, excessive self-citation, citation cartels…

2. Issues that are (more) specific to preprints: incomplete methods or data, multiple posting of preprints.

3. Issues specific to preprint servers: minimum standards of content/screening procedures, who should handle complaints, lack of universal ethical standards, limited resources to investigate complaints.

4. Issues where preprints meet articles: how to handle different versions, directing readers to the most reliable version, first reporting of results, citation of preprints in journal articles (how and when), copyright ownership and licensing.

There are just a couple of specific issues I’d like to briefly address here. I will likely cover some others in future posts: there is a lot to discuss and currently little consensus.


Removal of preprints once they are online is an area where there remain significant questions. Should they be removed at all? After all, they come with no guarantee of quality. Also, once online they are immediately picked up by various search engines and downloaded: removing them doesn’t make them go away.

In terms of removing preprints for quality issues, in my view it comes down to how authoritative a preprint is seen to be, which is a result of its standing in the community. It’s pretty easy to argue that misleading works considered to have some degree of authority should be removed. As preprints are in the grey area between discovery and validated knowledge, this is open to interpretation and likely to vary from discipline to discipline. Another argument is that, unlike journal articles, preprints can be rapidly updated or modified by the authors, so many identified problems can be fixed with a new version. Allowing comments on preprints can help in cases where the authors don’t wish to update: the reader can read the preprint alongside the comments and draw their own conclusions.

Once the decision to remove a preprint has been made, there is the question of how it is done. Should it simply disappear? Crossref advises against this for preprints with a DOI. If preprints are meant to be citable, then there should at least be a statement saying what the preprint was and that it has been removed. Should a reason be given for removal? I don’t know of a preprint server that does so. I suspect the reasons vary: a belief that it isn’t necessary for tentative work, an unwillingness to put out statements that could be questioned, a wish to protect the authors on whom they rely for submissions.


How, when and whether preprints should be cited has been covered elsewhere (one excellent post here). I just want to make the point that citations are a double-edged sword. Someone involved in the early stages of arXiv recalled to me that citations were one of the reasons that physicists started to use the service: extracting the bibliography and putting it on a forerunner of INSPIRE meant that by using a preprint service, physicists could improve their profile in the field. This, like most research reward systems, leads both to increased use and to the possibility of gaming. The cross-subject equivalent is Google Scholar, and I have seen preprints submitted with substantial numbers of self-citations, which appear to be an effort to boost Google Scholar citation counts. If the research is otherwise good, should excessive self-citation be enough to reject a preprint? Where to set the barrier is far from clear, with individual preprint servers or repositories currently left to set their own policy.


There is a need for further discussion about some of the issues above. Some of it needs to take place between those running preprint servers and some within author communities, perhaps led by societies or funders. Publishers should expect authors to declare preprint versions of their article at submission, and should clearly state their own policy on accepting work that has appeared as a preprint. I don’t think this is the last you will hear about these issues.

