Data and Software Unleashed
Written by Dr Jenni Adams, Open Research Manager at the University Library.
Making research data and software visible and reusable has a number of benefits both for individual researchers and the research community as a whole – for instance, it enables researchers to build on others’ research data, increase the reproducibility and verifiability of research findings, create opportunities for collaboration and speed up the progress of academic research.
The Unleash Your Data and Software competition invited research students and staff at The University of Sheffield to apply for an award of up to £5000 to make their research data or software more visible and reusable.
The competition forms part of the University’s FAIR roadmap – our plan to achieve the strategic priority to ‘create an open research culture that values a range of contributions’, ‘deliver […] best practice in research integrity and ethics’ and ‘adhere to the FAIR principles to the benefit of society’.
Funding was available for three categories of projects:
- Hidden Gems: Legacy data currently unavailable for reuse that could be made available through ORDA with the use of these funds.
- Crown Jewels: Innovative projects for the promotion of existing datasets in ORDA or other repositories using visualisations or events (such as symposia/expos) that increase the profile of the dataset and increase understanding of its potential for reuse.
- Helping software to meet the FAIR4RS principles: Enhancing the visibility, reproducibility and reusability of research code & software by making it open-source and publicly contributable; adding documentation, tests and citation metadata; refactoring to aid in usability.
We received a high number of applications for funding for a diverse range of projects.
A judging panel chaired by the University’s Research Practice Lead, Dr Tom Stafford, and including representatives from the Library, IT Services and Research Software Engineering, selected the following projects for funding after a competitive judging process:
Cataloguing the University of Sheffield Herbarium – Rosa Dunkley (Biosciences)
The University of Sheffield Herbarium is a biologically and geographically diverse collection of 10,000 dried plant samples spanning 200 years, which has been abandoned since the 1990s. This project will use open-source software to create a working, accessible catalogue suitable for digital upload, enabling the collection to be used for educational and research purposes.
SIPHER Project: Open-source Release of the Decision Support Tool – João Duro (ACSE)
An open-source release of a Decision Support Tool tasked with exploring the wellbeing impacts of policies, including their economic and equality outcomes. This will potentially benefit city councils and other decision-making bodies across the country, in addition to enabling scrutiny of the tool and promoting collaborative links with other institutions.
Gene expression in heart attack patients dataset – Aya Elwazir (Department of Infection, Immunity and Cardiovascular Disease)
Developing an app that provides a user-friendly interface to quickly visualise and analyse gene expression in heart attack survivors over time. This will generate usable data from existing research samples and will allow the underlying code to be reused on similar projects.
Improving open source PGFinder software for the structural analysis of peptidoglycan – Stephane Mesnage (Biosciences)
Improving PGFinder, an existing open-source software package that enables the structural analysis of bacterial peptidoglycan, a ubiquitous and essential component of the bacterial cell envelope which is paramount to understanding antibiotic resistance. The project aims to increase the software’s functionality and make it easier for other researchers to adopt.
3D polygon models of 5 Cistercian abbeys in Yorkshire – Michael Pidd (Digital Humanities Institute)
Making existing 3D polygon models of 5 Cistercian abbeys in Yorkshire, including Fountains Abbey, publicly accessible, and creating an immersive 3D environment which will give users the experience of being able to walk around a medieval monastery. This will provide an educational resource for schools and colleges as well as supporting new collaborations with researchers, heritage organisations and video game companies.
TopoStats software for microscopic data – Alice Pyne (Materials Science and Engineering)
The open-source TopoStats software automates the extraction of essential information from the world’s highest resolution microscopes. This project aims to increase the software’s fulfilment of the FAIR principles by adding tests, better documentation, better packaging, more trustworthy dependencies and refactored, more readable code.
CUREd database – aggregated Emergency Department attendance data – Joanna Sutton-Klein (SCHARR)
CUREd is a large database of linked routine health data from Yorkshire NHS emergency care services. Annual Emergency Department attendance rates for small geographical areas have been calculated from the data, allowing exploration of health inequalities between areas. This project will make the aggregated attendance rates and their associated metadata, protocol and code readily available to other researchers, creating opportunities for future research, and will present the data visually to local stakeholders and the public.
Data from working memory training study in young adults – Claudia von Bastian (Department of Psychology)
Reformatting, documenting and sharing data from a study which assessed the cognitive performance of 121 young adults before, immediately after, and 6 months after completing 20 sessions of working memory training.
Tatool-web support and interface for experimental code – Claudia von Bastian (Department of Psychology)
Tatool Web is a free, open-source research software package for implementing and conducting online and offline behavioural experiments (von Bastian et al., 2013), and is used by a growing number of colleagues from psychology, social sciences, and beyond. This project will increase the software’s accessibility, especially for novices, and improve the existing process for sharing experimental code to enhance research reproducibility.
Plant sciences: transparency & reproducibility of metabolomics analyses – Lizzy Parker (Biosciences)
Improving open-source software to facilitate the extraction of plant secondary metabolites, their analysis using liquid chromatography mass spectrometry (LCMS) and an associated workflow for analysis of these data. The project aims to improve the transparency and reproducibility of the analyses. This will automate previously laborious tasks, enable a more standardised approach and reduce dependence on proprietary software.
Coincidence reconstruction software (CoRe) to reconstruct antineutrino interactions – Liz Kneale (Physics)
The project aims to improve coincidence reconstruction software (CoRe) with multiple applications, including antineutrino detection for particle physics. It will refactor CoRe and the underlying software to make it easier to understand and customise before sharing it publicly and opening it to contributions. The refactored software could then be used for a variety of detector configurations, detection media and applications.
Living With Data dataset – Helen Kennedy (Sociological Studies)
The Living With Data dataset contains the opinions and perceptions of experts and everyday people on data systems and processes before COVID-19 and across the first two years of COVID-19 and its management. This project will ensure the dataset meets the highest technical standards of findability, accessibility, interoperability and reusability, archiving the dataset in ORDA as well as presenting it innovatively on the Living With Data website.
Thanks again to the judges who kindly volunteered their time and to everyone who entered the competition. The outputs and impacts of the projects will be showcased at an event in September 2022 – watch this space for further information.