Chemoinformatics research group

Chemoinformatics research involves the application of computer and information systems to problems in the field of chemistry.


Key research themes

We work in partnership with pharmaceutical and other chemical companies, as well as other academic departments in the University of Sheffield including Automatic Control and Systems Engineering, Chemistry and Neuroscience.

Our main areas of focus are the development and evaluation of virtual screening methods including 2D and 3D similarity searching; de novo design in which novel compounds are designed to fit various constraints; and the application of chemoinformatics techniques to drug discovery.

We have particular expertise in data mining, graph theory and evolutionary computing.

Current projects and research areas

Current research carried out by the Chemoinformatics research group focuses on three main areas:

  • Virtual screening methods – the computer-based prediction of the properties of compounds
  • De novo design – the design of novel compounds to fit various drug design constraints
  • Applications of chemoinformatics techniques to drug discovery problems. This is carried out through collaborations with the departments of Chemistry, Biomedical Sciences and Neuroscience at the University of Sheffield. We have particular expertise in algorithmic techniques such as data mining, graph theory and evolutionary computing.


Understanding Alzheimer’s and developing treatments (D3i4AD)

Alzheimer’s disease is the major cause of dementia with nearly 40 million sufferers globally and no known cure.

The Diagnostic and Drug Discovery Initiative for Alzheimer's Disease (D3i4AD) consortium is an EU-FP7 Marie Curie Industry-Academia Partnerships and Pathways (IAPP) project which aims to develop chemical biology tools to better understand the role of prion protein in Alzheimer’s disease and to harness this understanding to develop new chemical compounds for diagnostic and therapeutic applications.

The consortium partners are the University of Sheffield (Information School and the Department of Chemistry), the University of Lisbon, the University of Bari, Eli Lilly and Biofordrug.

Our role in the project is to apply chemoinformatics tools to model the interactions between prion protein and its binding partners, including amyloid-β, BACE, Fyn, Cav-1 and Tau, and to identify small molecules that can perturb those interactions.

More details can be found on the project's website.

Array design for lead optimisation in pharmaceutical research

Aim: To investigate and develop tools to assist medicinal chemists in the design of compound arrays during the lead optimisation stage of drug discovery. Lead optimisation is a complex, time-consuming task which chemists carry out to identify a candidate molecule to progress to clinical trials.

Sponsor: GlaxoSmithKline

Principal Investigator: Val Gillet
Co-investigators: Peter Willett

Re-using plant and waste feedstock

The BioHub project is a collaborative project funded by Innovate UK with partners from academia (the universities of Sheffield, Manchester and Liverpool) and industry (Unilever, British Sugar, Croda and Cybula) that aims to develop an Integrated Knowledge Management System (IKMS) to facilitate the design of commercially relevant products from biorenewable waste streams.

The IKMS comprises a data repository developed by colleagues at the University of Manchester. The repository uses semantic web technologies to store data about biorenewable feedstocks and their chemical components, together with the chemical reactions commonly used to convert feedstocks to new products.

At Sheffield we have developed a MultiObjective Search Tool (MOST) which takes inputs from the repository and identifies sequences of reactions that can result in the conversion of biorenewable ingredients to final products with well-defined, commercially valuable properties.
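As a rough illustration of what a multiobjective search involves, the sketch below keeps only those candidates that are not beaten on every objective by some other candidate (a Pareto filter). It is not the MOST implementation: the route names, the two scores and the choice of Pareto dominance are all assumptions made for the example.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and
    strictly better on at least one (all objectives are maximised)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the candidates that no other candidate dominates."""
    return {
        name: scores
        for name, scores in candidates.items()
        if not any(dominates(other, scores) for other in candidates.values())
    }


# Hypothetical reaction routes, each scored on two made-up objectives
# (for example a predicted property score and a feasibility score).
candidates = {
    "route_1": (0.9, 0.2),
    "route_2": (0.6, 0.7),
    "route_3": (0.5, 0.6),  # worse than route_2 on both objectives
    "route_4": (0.3, 0.9),
}

front = pareto_front(candidates)
print(sorted(front))
```

In this toy case route_3 is discarded because route_2 scores higher on both objectives, while the other three routes represent different trade-offs and are all retained.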

Colleagues at the University of Liverpool are involved in the synthesis of the designed molecules. An example use-case is the synthesis of novel bio-surfactants as detergents from waste streams produced from the processing of sugar beet.

Building industry standard tools

A key part of drug discovery is the virtual screening process which is used to identify molecules in a chemical database that are most likely to exhibit the required drug action, making them priority molecules for further investigations in the discovery process.
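The ranking idea can be shown with a minimal sketch of 2D similarity searching, in which database molecules are ordered by their Tanimoto coefficient against a known active. This is a generic illustration and not how GOLD, GASP or GALAHAD work. The molecule names and fingerprints are hypothetical, with each integer standing for a structural feature that is present in the molecule.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(query, database):
    """Return (name, score) pairs sorted from most to least similar."""
    scores = [(name, tanimoto(query, bits)) for name, bits in database.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: a query (known active) and a three-molecule database.
query = frozenset({1, 2, 3, 4})
database = {
    "mol_a": frozenset({1, 2, 3, 4, 5}),  # shares 4 of 5 distinct features
    "mol_b": frozenset({1, 2, 7, 8}),     # shares 2 of 6 distinct features
    "mol_c": frozenset({9, 10}),          # shares no features
}

ranking = rank_by_similarity(query, database)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Molecules at the top of the ranking would be prioritised for further investigation. In practice the fingerprints would be generated from chemical structures by a chemoinformatics toolkit rather than written by hand.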

This means that virtual screening increases the cost-effectiveness of drug discovery and, most importantly, that it helps to develop drugs for patient use more quickly.

Virtual screening has been a key area of research within the Information School for more than twenty years.

Our work has had global impact through developments that include three well-known computer programs which allow more effective virtual screening to be carried out for drug discovery.

The programs – GOLD, GASP and GALAHAD – focus upon two of the most important virtual screening techniques to make molecule identification easier and more effective. More specifically, the GOLD program identifies molecules that are a good fit to the three-dimensional shape of a target protein.

The GASP and GALAHAD programs have been designed to support the identification of structural features that are common to molecules which have previously been found to display the features required for a drug.

All three programs support quicker and more accurate drug discovery which is of significant value to the pharmaceutical industry and to patients.

The programs are commercially successful and are used extensively within the pharmaceutical industry and academia.

GOLD is used by around 60 pharmaceutical companies and more than 600 universities across the world, while GASP and GALAHAD users include leading chemical companies such as GlaxoSmithKline, DuPont, Bayer and Novo Nordisk.

The scientific literature contains examples of the use of these programs by major pharmaceutical companies to support drug discovery research in cancer (Genentech), heart disease (Pfizer and Procter & Gamble), cellular signalling (GSK), cognitive impairment such as schizophrenia and Alzheimer’s (Abbott), and obesity (Takeda).

GOLD is available from the Cambridge Crystallographic Data Centre.

GALAHAD and GASP are available from Certara.

Pharmacophore elucidation

Aim: To develop a program to identify structural features common to molecules that have already been shown to be biologically active. The GASP program was developed alongside the GOLD program and was subsequently commercialised by the US chemoinformatics company Tripos. A later collaboration with Tripos led to the development of a new program, GALAHAD, also distributed by Tripos, which incorporates an improved method for superimposing molecules. Both GASP and GALAHAD have been licensed by many of the world’s leading pharmaceutical companies. The group has continued to work in this area through collaboration with AstraZeneca and the Cambridge Crystallographic Data Centre.

Sponsor: AstraZeneca

Group members

Academic staff

Professor Val Gillet (Head of Group)

Professor Peter Willett (Emeritus)

Research staff

Dr Zied Hosni

PhD researchers

Terence Egbelo

James Middleton

Hanz Tantiangco

Funders and collaborators

Research carried out within this group is funded by a wide range of organisations:

  • AstraZeneca
  • Biotechnology and Biological Sciences Research Council (BBSRC)
  • GlaxoSmithKline
  • Engineering and Physical Sciences Research Council (EPSRC)
  • Eli Lilly
  • Lhasa Limited
  • European Commission
  • Parkinson's UK
  • Technology Strategy Board/Innovate UK
  • Unilever
  • Evotec

A global reputation

Sheffield is a research university with a global reputation for excellence. We're a member of the Russell Group: one of the 24 leading UK universities for research and teaching.