Professor Paul Clough

BEng (York), PhD (Sheffield)

Information School

Professor of Search and Analytics

p.d.clough@sheffield.ac.uk
+44 114 222 2664

Full contact details

Professor Paul Clough
Information School
Regent Court (IS)
211 Portobello
Sheffield
S1 4DP

Profile

I joined British Telecommunications Plc (BT) in 1991 on their technician training scheme and worked for 10 years at Adastral Park, BT’s research centre. During my training, I worked in various research groups at BT and studied telecommunications, electronics and software engineering at Suffolk College. I also worked on a range of projects developing novel hardware and software solutions, for problems ranging from telecommunications to information management.

In 1995 I was sponsored by BT to study Computer Science at the University of York and, after graduating, decided to pursue an academic career in computing. I left BT and joined the University of Sheffield in 1999, working as a Research Assistant (RA) in the Department of Computer Science, in collaboration with the Journalism Department and the British Press Association, on a project entitled “Measuring Text Reuse”.

During this time I also completed my PhD under the supervision of Prof. Yorick Wilks. Following my interests in natural language processing (NLP) and Information Retrieval (IR), I worked as an RA on a range of research projects until 2005, when I became a lecturer in the Information School. I now head the Information Retrieval research group in the Information School and continue to teach and research various aspects of data management and information storage and retrieval.

University Responsibilities

  • Head of Information Retrieval Research Group.
  • REF Coordinator.
  • Co-Director of Digital Societies Network.
  • Member of the Departmental Research Committee.
  • Programme coordinator for MSc Data Science.
  • Deputy programme coordinator for MSc Digital Library Management.
  • Module Coordinator:
    • Information Retrieval.
    • Information Systems in Health.
  • Staff Review and Development Scheme reviewer.

Research interests

My research interests focus on developing effective retrieval technologies that support users as they seek to fulfil their information needs. Specifically, I have carried out research in the areas of multilingual search, image retrieval, geo-spatial search, analysis of transaction logs, text re-use and plagiarism detection, and the evaluation of search systems.

I have published over 100 peer-reviewed articles, including a co-authored Springer book on multilingual information retrieval. My background in natural language processing, gained during my PhD, has allowed me to develop more sophisticated approaches to accessing information. In addition to developing techniques, I have built up an understanding of the users of information access systems and their information needs, and I take a more user-oriented view in my research.

A further theme of my research has been to create re-usable evaluation resources (corpora and test collections) for wider research communities such as computational linguistics and information retrieval. I have been involved in coordinating activities at three international evaluation campaigns: the Cross-Language Evaluation Forum (CLEF) in Europe, the Text Retrieval Conference (TREC) in the US and the Forum for Information Retrieval Evaluation (FIRE) in India.

I am head of the Information Retrieval research group.

Publications

Chapters

  • Clough P & Tsikrika T (2019) Multi-Lingual Retrieval of Pictures in ImageCLEF, Information Retrieval Evaluation in a Changing World (pp. 217-230). Springer International Publishing
  • Paramita ML, Aker A, Clough P, Gaizauskas R, Glaros N, Mastropavlos N, Yannoutsou O, Ion R, Ștefănescu D, Ceauşu A, Tufiș D et al (2019) Collecting Comparable Corpora, Using Comparable Corpora for Under-Resourced Areas of Machine Translation (pp. 55-87). Springer International Publishing
  • Babych B, Su F, Hartley A, Aker A, Paramita ML, Clough P & Gaizauskas R (2019) Cross-Language Comparability and Its Applications for MT, Using Comparable Corpora for Under-Resourced Areas of Machine Translation (pp. 13-53). Springer International Publishing
  • Aker A, Ion R, Mastropavlos N, Paramita M, Pinnis M, Ştefănescu D, Su F, Thurmair G, Irimia E, Ljubešić N, Kanoulas E et al (2019) Appendices, Using Comparable Corpora for Under-Resourced Areas of Machine Translation (pp. 291-323). Springer International Publishing
  • Russell-Rose T & Clough P (2016) Mining search logs for usage patterns, Text Mining and Visualization: Case Studies Using Open-Source Tools (pp. 153-172).
  • Clough P, Hall M, Goodale P & Stevenson (2015) Supporting Exploration and Use of Digital Cultural Heritage Materials: the PATHS Perspective In Ruthven I & Chowdhury GG (Ed.), Cultural Heritage Information Access and Management (pp. 197-220). Facet Publishing
  • Kanoulas E, Lupu M, Clough P, Sanderson M, Hall M, Hanbury A & Toms E (2014) Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): Preface In Kanoulas E, Lupu M, Clough P, Sanderson M, Hall M, Hanbury A & Toms E (Ed.), Information Access Evaluation. Multilinguality, Multimodality, and Interaction 5th International Conference of the CLEF Initiative, CLEF 2014, Sheffield, UK, September 15-18, 2014. Proceedings (pp. v-vi). Springer International Publishing
  • (2013) Building and Using Comparable Corpora Springer Berlin Heidelberg
  • Paramita ML, Guthrie D, Kanoulas E, Gaizauskas R, Clough P & Sanderson M (2013) Methods for Collection and Evaluation of Comparable Documents, Building and Using Comparable Corpora (pp. 93-112). Springer Berlin Heidelberg
  • Clough PD (2011) User-related issues in multilingual access to multimedia collections In Dobreva M, O'Dwyer A & Feliciati P (Ed.), User Studies for Digital Library Development
  • Clough P, Müller H & Sanderson M (2010) Seven Years of Image Retrieval Evaluation, ImageCLEF (pp. 3-18). Springer Berlin Heidelberg
  • Grubinger M, Nowak S & Clough P (2010) Data Sets Created in ImageCLEF, ImageCLEF (pp. 19-43). Springer Berlin Heidelberg
  • Sanderson M, Tang J, Arni T & Clough P (2009) What Else Is There? Search Diversity Examined, Lecture Notes in Computer Science (pp. 562-569). Springer Berlin Heidelberg
  • Clough P () Measuring text reuse in the news industry, Copyright and Piracy (pp. 247-259). Cambridge University Press

Teaching interests

I enjoy my role in helping students learn about topics related to data and information management at both undergraduate and postgraduate levels. Currently I coordinate modules in the Information School on the topics of Information Retrieval and Information Systems in Healthcare. I am also developing the new MSc Data Science programme that will launch in 2014 and for which I will be overall coordinator.

In 2010 I became coordinator of the Information Retrieval module and revised its content and methods of assessment. Martin White (Intranet Focus and visiting Professor in the Information School) mentioned the module in his O’Reilly book on Enterprise Search as an example of the type of training that teams supporting enterprise search should receive. I also deliver lectures on several other courses in the Information School, including Database Design, Digital Multimedia and Business Intelligence.

I have been invited to give guest lectures at external organisations and events on several occasions, including the 2013 European Summer School in Information Retrieval, the TrebleCLEF cross language search summer school in Pisa in 2009 and Universidad Nacional de Educación a Distancia in 2008. In 2008 I completed a Postgraduate Certificate in Higher Education (PGCHE) and since 2010 have been a Fellow of the UK Higher Education Academy.