Research Preprints

Discussing early research outputs

A short research project: (Where) do preprints fit in?

December 18, 2017 mrittman

How do preprints fit into the research ecosystem? This is the question motivating a short project I will carry out. I am particularly interested in how links work between preprints and other early stage outputs, such as code, data, pre-registrations, and blog posts.


The question of what a preprint is and why preprints are important is a common one. It is mainly addressed from the point of view of two players: the authors and the funding bodies. Since they are the ones who typically choose whether a preprint is posted, that is not a big surprise. However, a major beneficiary should be the research community at large.

Apart from setting behavioural norms, the community has little direct influence over what does and doesn’t appear as a preprint. However, the process by which preprints are posted can have a major effect on how useful they are to the wider research community. If preprints are not visible, linked, sharable and reusable, the research community loses out.

Getting it right when posting preprints can bring huge benefits. The idea behind this project is that preprints can act as a hub for early research outputs. Early stage outputs such as code, data or a pre-registered analysis often tell only part of the story: in isolation they leave a large number of questions unanswered. Where does someone go to fit all of the parts together? A preprint can pull together all the aspects of early stage outputs into a hypothesis-driven investigation with a clear logical structure, in a way that can be reproduced and understood by others.

The approach

To find out whether and how preprints can link early stage outputs, I will come at the question from three different angles:

What can be done?

I will look at preprint server submission systems to see whether it is possible to add links to various types of early stage output.

What is being done?

I will check published preprints to see whether authors are making use of the options available for linking their research to other outputs, or whether they find workarounds when those options are not available.

What could be done?

I will survey those running preprint servers about their attitudes towards linking data, code and other outputs to preprints.

The method

These are the categories of early stage research output I will look at:

  • Supplementary data (published directly with the preprint)
  • Data
  • Computer code
  • Previous versions of the same work
  • Later versions of the same work
  • Registered controlled trials
  • Pre-registered methods/analysis
  • Database accession IDs (e.g. Protein databank)
  • Website/blog
  • Social media accounts

The survey will consist of the following questions:

  • Name, email address (optional)
  • Preprint server
  • Main field(s) covered
  • Approximately how many preprints has the preprint server published in 2017?
  • When posting preprints, are authors able to add links to the following: [use the list above]
      ◦ In the submission system
      ◦ In the published version
      ◦ No, we do not permit links to external material
      ◦ By another method (please specify)
  • Further comments [text box]
  • What is your overall impression of how often authors use the options above: [Majority of preprints (>50%), some preprints (10-50%), rarely (<10%), never, n/a]
  • If you do not have the options above, are there any that you plan to add to your platform by the end of 2018? [Have already, planned, no plan]
  • Do you have an online policy about links to data and other early stage research output? [URL, comments]
  • Do you have any further comments about preprints and links to early stage research outputs? [text box]


The sample sizes will be small, so the outputs will mainly be qualitative. I will look at differences between fields, and between old and new preprint servers.

Considering the broad fields of biology, physics, chemistry, mathematics (including statistics and computing), social sciences, humanities, engineering, and earth sciences/geography, I will look at whether authors have an option for linking their preprint to each type of additional output.
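
As a rough illustration of how such a field-by-output comparison might be tabulated, here is a minimal Python sketch. It is not part of the project plan: the function name `tabulate`, the response format, and the example entries ("Server A" and so on) are hypothetical placeholders invented only to show the shape of the data. The output categories are the ones listed under "The method".

```python
from collections import defaultdict

# Output categories taken from the list in "The method".
OUTPUT_TYPES = [
    "Supplementary data",
    "Data",
    "Computer code",
    "Previous versions",
    "Later versions",
    "Registered controlled trials",
    "Pre-registered methods/analysis",
    "Database accession IDs",
    "Website/blog",
    "Social media accounts",
]


def tabulate(responses):
    """Count, per field and output type, how many servers allow a link."""
    counts = defaultdict(lambda: dict.fromkeys(OUTPUT_TYPES, 0))
    for response in responses:
        row = counts[response["field"]]  # creates an all-zero row if needed
        for output_type in response["linkable"]:
            if output_type not in row:
                raise ValueError(f"unknown output type: {output_type}")
            row[output_type] += 1
    return dict(counts)


# Placeholder responses (assumed format, not real survey data).
example_responses = [
    {"server": "Server A", "field": "biology", "linkable": ["Data", "Computer code"]},
    {"server": "Server B", "field": "biology", "linkable": ["Data"]},
    {"server": "Server C", "field": "physics", "linkable": ["Later versions"]},
]

table = tabulate(example_responses)
for field, row in table.items():
    # Show only the output types that at least one server supports.
    print(field, {name: n for name, n in row.items() if n})
```

With samples this small, a table like this would only be a starting point for the qualitative comparison between fields, not a statistical result.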


Do you want to get involved? In true preprint fashion, I’m looking for feedback on the project plan, so please either email me or make a comment below. I’m also looking for collaborators, so get in touch if you are interested.
The results will be presented in a flash talk and poster at the Open Science Conference in Berlin, February 2018. All data collected will be made public and there will, of course, be a preprint.
