Preserving

Making decisions about the long-term preservation of your research data involves thinking about retention periods, choosing file formats suitable for long-term preservation, and finding a place to deposit the data.

Where to deposit
Preserving the data generally means that the data should be deposited in a repository or archive during the project or shortly afterwards. There are many options, but in any case a record should be created in the SHU Research Data Archive (SHURDA) which points to the URL of the deposited data.

1. Deposit with the SHU Research Data Archive (SHURDA)

If your funder does not expect you to deposit your data in a designated data archive and your journal does not provide a facility to preserve your data that also meets your funder’s requirements, then you could use SHU’s institutional data repository.

2. Deposit with an external data archive

Some funders have set up data archives specifically for the curation and dissemination of data created as part of their funded programmes. Examples are ESRC’s UK Data Service and the seven data centres that NERC supports. Researchers are often expected to deposit their data in these designated data archives. Other research funders expect you to deposit your data in an institutional or subject-specific repository that is not supported by a research council.

Find your funders’ requirements

  • SHERPA/JULIET is a database that lists all research funders’ open access policies, including their rules for depositing in specific data archives

Find data archives

Some considerations when deciding on a repository

  • Are the repository’s terms and conditions acceptable?
  • Will your dataset be given a permanent DOI? A DOI provides a permanent link to your dataset that will never change, even if the website is redesigned or if you leave the university
  • What type of data does the repository accept and what is its subject focus? Is the repository used by the people in your discipline? Does the repository already have a good reputation in your field and is it recommended by your funder or your journal?
  • Does the repository allow you to describe your data sufficiently, so that it is easy to find and easy to cite by others?
  • Are access restrictions and embargoes permitted?
  • Is the archive established and well funded so that you can rely on it still preserving your data in 10 years’ time?
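
To illustrate why a DOI makes a stable link: a DOI is turned into a web address by appending it to the https://doi.org/ resolver, which then redirects to wherever the repository currently hosts the dataset. The sketch below is only an illustration. The function name is hypothetical, the DOI used in the usage note is made up, and the pattern is a loose sanity check rather than a full validation of DOI syntax.

```python
import re

# Loose check only (an assumption for this sketch): "10.", a registrant
# code of 4 to 9 digits, a slash, then a non-empty suffix without spaces.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Turn a DOI written in a common form into a doi.org link."""
    cleaned = doi.strip()
    # Strip the prefixes people commonly paste along with the DOI itself.
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"not a recognisable DOI: {doi!r}")
    return "https://doi.org/" + cleaned
```

For example, calling doi_to_url("doi:10.1234/example-dataset") with this made-up DOI gives the link https://doi.org/10.1234/example-dataset, which is the form to use when citing a dataset.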

If you are considering using an external data archive and need advice, please contact library-research-support@shu.ac.uk.

3. Submit your data to a journal

An increasing number of journals, such as those of the Nature Publishing Group and the Public Library of Science (PLOS), require that authors make their data promptly available to others without undue restrictions. These data must generally be available to readers from the date of publication, and must be provided to the editor and peer reviewers at submission. Some of these journals encourage data to be submitted as supplementary materials to the article; other journals require the data to be deposited and published in a repository. It is worth checking your journal’s data policy.

A note on project websites

You can also make your data available on your own project website, but this is generally not recommended. If you make your data available via a website, then you should also deposit the data in a discipline-based repository or SHU’s institutional data repository. Project websites offer little long-term sustainability for your data, and unless you put specific procedures in place, it may be difficult to control who uses your data and how they use it. Also, a dataset in a repository is usually far easier for your peers to find than one on an individual’s website.

Retention periods
SHU

The University’s Research Data Management policy stipulates that ‘data must be stored for a period at least as long as that required by any funder or sponsor of the research, any publisher of the research or as set out in the University’s Research and Knowledge Transfer Records Retention Schedule’.

The University Records Retention Schedule clarifies that primary data generated by research — both on paper and in electronic form, and both by staff and postgraduate students — should be kept for a period of

expiry of “privileged access” / embargo period + 10 years
OR
last date on which access to the data was requested by a third party + 10 years

Should an external funder stipulate a longer retention period, then the longer retention period shall apply. If the legal contract governing the research stipulates a longer or shorter period, then the retention period set out in the contract shall apply. For clinical and health studies that are funded by the Medical Research Council, retention periods are considerably longer: 20 years if consent of individuals/patients was obtained, 30 years if it wasn’t.
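
The retention rule above can be sketched as a small date calculation. This is only an illustration, not an official tool: the function and parameter names are hypothetical, and it assumes that the two starting points are combined by taking the later date, which the Schedule as quoted here does not spell out. It covers a longer funder period but ignores any period set by a legal contract.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years; 29 February falls back to 28 February."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def retention_end(embargo_expiry: date,
                  last_access_request: Optional[date] = None,
                  funder_years: int = 10) -> date:
    """Estimate the earliest date on which data could be disposed of."""
    # Assumption: use whichever starting point is later.
    start = embargo_expiry
    if last_access_request is not None and last_access_request > start:
        start = last_access_request
    # A funder period only changes the result when it exceeds 10 years.
    years = max(10, funder_years)
    return add_years(start, years)
```

For example, with an embargo that expired on 31 March 2020 and no later access request, retention_end(date(2020, 3, 31)) gives 31 March 2030. Passing funder_years=20 moves this to 31 March 2040.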

UKRI (formerly RCUK)

UKRI is considering the need for new research integrity policies. Meanwhile, the RCUK Policy and Guidelines on the Governance of Good Research Conduct continue to apply for AHRC, BBSRC, EPSRC, ESRC, MRC, NERC and STFC investments, until further notice. The RCUK Policy and Code of Conduct on the Governance of Good Research Conduct states that the researcher must ‘make relevant primary data and research evidence accessible to others for reasonable periods after the completion of the research: data should normally be preserved and accessible for ten years, but for projects of clinical or major social, environmental or heritage importance, for 20 years or longer’.

The research councils have varying requirements. You will find an overview of funders’ data sharing and retention policies in the table below. The DCC also provides information about funders’ data policies.

Overview of funders’ data sharing and retention policies

Funder | Minimum length of time data should be kept | Starting from | Where to be kept
AHRC | 3 years | within 3 months of project completion | archaeology grant holders to deposit in the Archaeology Data Service (http://ads.ahds.ac.uk/); for other subjects no archival service is provided
BBSRC | 10 years | no later than the release of main findings through publication, or after completion of project | no archival service is provided
EPSRC | 10 years | end of researcher ‘privileged access’ period or from last date on which access to the data was requested by a third party, whichever is later | no archival service is provided
ESRC | not stated | within 3 months after project completion | UK Data Service (http://ukdataservice.ac.uk/)
MRC | 10 years minimum but some data need to be kept longer (depending on the type of study) | in a timely manner but a limited and defined period of exclusive data use is reasonable | no archival service is provided
NERC | not stated | at the end of a project, or after a ‘reasonable period’ of exclusive use, normally a maximum of 2 years from the end of data collection | expected to deposit in a network of seven data centres (http://www.nerc.ac.uk/research/sites/data/)
STFC | 10 years but data that is not re-measurable should be kept ‘in perpetuity’ | within 6 months of publication | several data centres are in place but deposit is not mandated
NC3Rs | 10 years minimum but some data need to be kept longer (depending on the type of study) | in a timely manner but a limited and defined period of exclusive data use is reasonable | no archival service is provided
Cancer Research UK | 5 years following the end of a grant | no later than the acceptance for publication of the main findings. A limited period of exclusive use of data for primary research is reasonable | no archival service provided
Wellcome Trust | 10 years | on publication but opportunities for timely and responsible pre-publication sharing of data should also be maximised | no archival service provided

File formats
It is useful to consider which file formats you will use for your data, since the choice of file format affects long-term access to your data. All digital files depend upon hardware and software for access, and the file formats you choose may become obsolete in the future.

The safest option is to use open formats (such as comma-separated values or CSV) and not proprietary formats, although some proprietary formats (such as SPSS, PDF, Excel and Word) are widely used and likely to be accessible in the long term. Formats that enable long-term preservation and sharing of data are listed in this table of recommended formats from the UK Data Service.

You may use different file formats for creating and processing your data, depending on the hardware, software and staff expertise available, or on discipline-specific practices. In that case, you may need to consider converting from the original formats into formats that are suitable for preservation.
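
As a minimal sketch of such a conversion, the Python function below rewrites a delimited text export as UTF-8, comma-separated CSV using only the standard library. The function name, the semicolon delimiter and the Latin-1 source encoding are assumptions chosen for illustration; check how your own software exports data, and keep the original files alongside the converted copy.

```python
import csv
from pathlib import Path


def to_preservation_csv(src: Path, dst: Path,
                        delimiter: str = ";",
                        encoding: str = "latin-1") -> int:
    """Rewrite a delimited text export as UTF-8, comma-separated CSV.

    Returns the number of rows written, header row included.
    """
    rows = 0
    # newline="" lets the csv module handle line endings itself.
    with src.open(newline="", encoding=encoding) as fin, \
            dst.open("w", newline="", encoding="utf-8") as fout:
        writer = csv.writer(fout)
        for row in csv.reader(fin, delimiter=delimiter):
            writer.writerow(row)  # values containing commas are quoted
            rows += 1
    return rows
```

Comparing the returned row count with the number of lines in the source file is a quick check that nothing was lost in conversion.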

Costs
If you are applying for RCUK funding, any anticipated costs that you incur for preparing and ingesting the data into a repository or archive can be directly costed into your grant proposal. You should provide adequate justification for the costs. Also keep in mind that any expenditure must take place before the actual end date of your project.

For more information, see