Dr Mark Stevenson

Senior Lecturer
MSc Data Analytics Course Director
Deputy Director of Learning and Teaching (student experience)

Telephone: +44 (0) 114 222 1921

Member of the Natural Language Processing research group
Personal website: staffwww.dcs.shef.ac.uk/people/M.Stevenson/



Mark Stevenson is a Senior Lecturer in Computer Science. He is a member of the Natural Language Processing group, which he joined in 1995. His PhD, on Word Sense Disambiguation, was published as a monograph. He has been Principal Investigator of projects funded by a range of sources including the EU, EPSRC and Google. He was an EPSRC Advanced Research Fellow (2006-2011) and co-ordinator of the EU-funded project PATHS. He has also worked in a range of commercial and academic organisations including Reuters Ltd (where he was involved in the production and dissemination of the widely used Reuters Corpus), Adastral Park (British Telecom’s research lab) and the Center for the Study of Language and Information, Stanford University.

Other professional activities and achievements:

  • Area chair for EACL 2017 track “Document analysis including text categorization, topic models, and retrieval”
  • Winner of best paper award at CLEF 2004 (with Roland Roller)
  • Keynote speaker at RANLP 2013
  • Area chair for EMNLP 2013 track “semantics”
  • Assistant Director of Advanced Computing Research Centre
  • Co-ordinator of EU-funded project (PATHS)
  • Member of ACL SIGLEX board (2010-2013 and 2013-2016)
  • EPSRC Advanced Research Fellow (2006-2011)
  • Member of editorial board of Computational Linguistics (2008-2010)


Mark Stevenson’s research focusses on Natural Language Processing and Information Retrieval. Topics he has worked on include word sense disambiguation, Information Extraction, plagiarism/reuse detection, lexicon adaptation, cross-lingual information retrieval and exploratory search. His research includes applications of these technologies to a range of areas including biomedical journal articles (interpretation of documents, extraction of information from them and data mining information from corpora), cultural heritage (automatic organisation of corpora, exploratory search interfaces) and software testing (generation of realistic test suites).


Current Grants

  • Institute of Coding, HEFCE, 11/2017 - 03/2020, £957,000 as Co-PI
  • Data Analytics, Royal Academy of Engineering, 09/2017 - 09/2020, £30,000 as PI
  • Distinguishing Common and Proper Nouns, Industrial, 03/2011 - 12/2020, £31,847 as PI

Previous Grants

  • Digital Sensitivity Review, Industrial, 11/2018 - 03/2019, £39,880 as PI
  • Recommendation Algorithm, Industrial, 04/2017 - 10/2017, £60,600 as PI
  • HiDE: A Tool for Unrestricted Literature Based Discovery, Government, 01/2016 - 06/2016, £66,584 as PI
  • InPuT: Individual Profiling using Text Analysis, Government, 09/2014 - 09/2015, £10,746 as PI
  • Information Processing and Sensemaking: An Exploratory Search System for Document Collections, Government, 09/2014 - 08/2015, £77,840 as PI
  • Connected Marketplace, Industrial, 01/2014 - 08/2014, £5,000 as PI
  • PUMP: Developing a Data Set of Textual and Visual Topic Labels, EPSRC, 09/2013 - 10/2013, £1,540 as PI
  • Language Processing for Literature Based Discovery in Medicine, EPSRC, 06/2012 - 05/2015, £293,127 as PI
  • PATHS: Personalised Access to Cultural Heritage Spaces, EC FP7, 01/2011 - 12/2013, £709,407 as PI