Dr Diana Maynard
School of Computer Science
Senior Research Fellow
Deputy Head of the Natural Language Processing research group
 
   
  
    
d.maynard@sheffield.ac.uk
Regent Court (DCS)
Full contact details
        Dr Diana Maynard
School of Computer Science
Regent Court (DCS)
211 Portobello
Sheffield
S1 4DP
          
      
  
- Research interests
    - Information extraction
    - GATE
    - Social media analysis
    - Sentiment analysis
    - Online abuse and misinformation detection
    - Term recognition
    - Ontologies and semantic web
    - Freedom of the media
    - NLP for scientometrics
 
- Publications
    Books
-   The Chilling: A global study of online violence against women journalists. ICFJ.
					    
-   Natural Language Processing for the Semantic Web. Springer International Publishing. 
					    
-   Natural Language Processing for the Semantic Web. Morgan & Claypool Publishers. 
					    
-   Text Processing with Gate (Version 6). GATE. 
					    
-   Preface. 
					    
-   Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): Preface. 
					    
-   Preface. 
					    
-   Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): Preface. 
					    
    Journal articles
-  Cross-modal augmentation for few-shot multimodal fake news detection. Engineering Applications of Artificial Intelligence, 142, 109931.
					    
-  Similarity-aware multimodal prompt learning for fake news detection. Information Sciences, 647.
					    
-  Using natural language processing and artificial intelligence to explore the nutrition and sustainability of recipes and food. Frontiers in Artificial Intelligence, 3. 
					    
-  Classification aware neural topic model for COVID-19 disinformation categorisation. PLoS ONE, 16(2).
					    
-  Using ontologies to map between research data and policymakers’ presumptions: the experience of the KNOWMAK project. Scientometrics, 125(2), 1275-1290.
					    
-  Strengthening the monitoring of violations against journalists through an events-based methodology. Media and Communication, 8(1), 89-100.
					    
-  What matters most to people around the world? Retrieving Better Life Index priorities on Twitter. Technological Forecasting and Social Change, 137, 61-75.
					    
-  Pro-environmental campaigns via social media: analysing awareness and behaviour patterns. Journal of Web Science, 3(1).
					    
-  A framework for real-time semantic social media analysis. Journal of Web Semantics, 44, 75-88.
					    
-  Distantly Supervised Web Relation Extraction for Knowledge Base Population. Semantic Web Journal.
					    
-  Entity-Based Opinion Mining from Text and Multimedia, 65-86. 
					    
-  Analysis of named entity recognition and linking for tweets. Information Processing & Management, 51(2), 32-49. 
					    
-  Interlinking Documents Based on Semantic Graphs with an Application, 139-155. 
					    
-  The semantic web challenge 2012. Journal of Web Semantics, 24, 1-2. 
					    
-  The Semantic Web Challenge, 2011. Journal of Web Semantics. 
					    
-  Automatic detection of political opinions in tweets. Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 7117 LNCS, 88-99. 
					    
-  Automatic detection of political opinions in tweets. CEUR Workshop Proceedings, 718, 81-92.
					    
-  The semantic web challenge, 2010. Journal of Web Semantics, 9(3), 315. 
					    
-  Using lexico-syntactic ontology design patterns for ontology creation and population. CEUR Workshop Proceedings, 516, 39-52.
					    
-  NLP-based support for ontology lifecycle development. CEUR Workshop Proceedings, 514.
					    
-  Information extraction: Algorithms and prospects in a retrieval context. Computational Linguistics, 34(2), 315-317.
					    
-  NLP techniques for term extraction and ontology population. Frontiers in Artificial Intelligence and Applications, 167(1), 107-127. 
					    
-  REASE - The repository for learning units about the Semantic Web. New Review of Hypermedia and Multimedia, 13(2), 211-237. 
					    
-  Preface. IBM Syst. J., 45, 3-6.
					    
-  Corpus Linguistics and South Asian Languages: Corpus Creation and Tool Development. Literary and Linguistic Computing, 19(4), 509-524. 
					    
-  Evolving GATE to meet new challenges in language engineering. Natural Language Engineering, 10(3-4), 349-373. 
					    
-  Architectural Elements of Language Engineering Robustness. Journal of Natural Language Engineering, 8(2-3), 257-274. 
					    
-  TRUCKS: A Model for Automatic Multi-Word Term Recognition. Journal of Natural Language Processing, 8(1), 101-125.
					    
-  Similarity-Aware Multimodal Prompt Learning for Fake News Detection. SSRN Electronic Journal. 
					    
-  Should I Care about Your Opinion? Detection of Opinion Interestingness and Dynamics in Social Media. Future Internet, 6(3), 457-481. 
					    
-  Analysing and Enriching Focused Semantic Web Archives for Parliament Applications. Future Internet, 6(3), 433-456. 
					    
    Book chapters
-  Language Report English, Cognitive Technologies (pp. 127-130). Springer International Publishing
					    
-  Best Practice for Forensic Fishing: Combining Text Processing with an Environmental History View of Historic Travel Writing in Loch Lomond, Scotland, Unlocking Environmental Narratives: Towards Understanding Human Environment Interactions through Computational Text Analysis (pp. 133-160). Ubiquity Press 
					    
-  Preface (pp. V-VII). 
					    
-  Challenges in Analysing Social Media. In Dusa A, Nelle D, Stock G & Wagner G (Ed.), Facing the Future: European Research Infrastructures for the Humanities and Social Sciences Berlin: SCIVERO Verlag. 
					    
-  Natural language processing, Perspectives on Ontology Learning (pp. 51-67). 
					    
-  Documenting Contemporary Society by Preserving Relevant Information from Twitter In Weller K, Bruns A, Burgess J, Mahrt M & Puschmann C (Ed.), Twitter and Society USA: Peter Lang. 
					    
-  Term extraction using a similarity-based approach, Natural Language Processing (pp. 261-278). John Benjamins Publishing Company 
					    
    Conference proceedings
-  Increasing the Difficulty of Automatically Generated Questions via Reinforcement Learning with Synthetic Preference. NLP4DH 2024: 4th International Conference on Natural Language Processing for Digital Humanities, Proceedings of the Conference (pp 450-462)
					    
-  Dimensions of online conflict: towards modeling agonism. Findings of the Association for Computational Linguistics: EMNLP 2023 (pp 12194-12209). Singapore, 6 December 2023 - 6 December 2023.
					    
-  Development of a benchmark corpus to support entity recognition in job descriptions. Proceedings of the Thirteenth Language Resources and Evaluation Conference (pp 1201-1208). Marseille, France.
					    
-  Combining expert knowledge with NLP for specialised applications. Text, Speech, and Dialogue: 23rd International Conference on Text, Speech and Dialogue (TSD 2020), Vol. 12284 (pp 3-10). Brno, Czech Republic, 8 September 2020 - 8 September 2020.
					    
-  Comparing topic-aware neural networks for bias detection of news. Proceedings of 24th European Conference on Artificial Intelligence (ECAI 2020), Vol. 325 (pp 2054-2061). Santiago de Compostela, Spain, 29 August 2020 - 29 August 2020.
					    
-  Using ontologies to map between research and policy data: opportunities and challenges. Proceedings of the 17th International Conference on Scientometrics & Informetrics, Vol. 1 (pp 535-540). Rome, Italy, 2 September 2019 - 2 September 2019.
					    
-  Team Bertha von Suttner at SemEval-2019 Task 4: Hyperpartisan News Detection using ELMo Sentence Representation Convolutional Network. Proceedings of the 13th International Workshop on Semantic Evaluation (pp 840-844). Minneapolis, Minnesota, USA, 6 July 2019 - 6 July 2019.
					    
-  Team Bertha von Suttner at SemEval-2019 Task 4: Hyperpartisan News Detection using ELMo Sentence Representation Convolutional Network. Proceedings of the 13th International Workshop on Semantic Evaluation, June 2019 - June 2019. 
					    
-  Exploring knowledge production in Europe. The KNOWMAK tool. Proceedings of the 17th Conference of the International Society for Scientometrics and Informetrics (ISSI 2019), Vol. II (pp 2561-2562). Rome, Italy, 2 September 2019 - 2 September 2019.
					    
-  Adapted TextRank for Term Extraction: a generic method of improving automatic term extraction algorithms. Procedia Computer Science, Vol. 137 (pp 102-108). Vienna, Austria, 10 September 2018 - 10 September 2018.
					    
-  Twits, Twats and Twaddle: Trends in Online Abuse towards UK Politicians. Proceedings of the Twelfth International Conference on Web and Social Media (pp 600-603). California, USA, 25 June 2018 - 25 June 2018.
					    
-  Helping crisis responders find the informative needle in the tweet haystack. Proceedings of the 15th ISCRAM Conference (pp 649-662). Rochester, NY, USA, 20 May 2018 - 20 May 2018.
					    
-  Ontologies as bridges between data sources and user queries: the KNOWMAK project experience. Proceedings of Science, Technology and Innovation indicators 2017. Paris, 6 September 2017 - 6 September 2017.
					    
-  Comparing Attitudes to Climate Change in the Media using sentiment analysis based on Latent Dirichlet Allocation. Proc. of EMNLP Workshop "Natural Language Meets Journalism"
					    
-  Towards an infrastructure for understanding and interlinking knowledge co-creation in European research. CEUR Workshop Proceedings, Vol. 1878. Portoroz, Slovenia.
					    
-  Preface. The semantic web: 14th International Conference, ESWC 2017, Portorož, Slovenia, May 28 – June 1, 2017, Proceedings, Part II, Vol. 10250 (pp V-VII). Portorož, Slovenia, 28 May 2017 - 28 May 2017.
					    
-  GATE-time: Extraction of temporal expressions and events. Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016 (pp 3702-3708)
					    
-  Talking climate change via social media. Proceedings of the 8th ACM Conference on Web Science (pp 85-94) 
					    
-  Challenges of Evaluating Sentiment Analysis Tools on Social Media. Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016) (pp 1142-1148). Portorož, 23 May 2016 - 23 May 2016.
					    
-  Automated Content Analysis. Proceedings of the 10th International Conference on Ubiquitous Information Management and Communication (pp 1-6) 
					    
-  Extracting Relations between Non-Standard Entities using Distant Supervision and Imitation Learning. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (pp 747-757). Lisbon, Portugal, 17 September 2015 - 17 September 2015.
					    
-  Real-time Social Media Analytics through Semantic Annotation and Linked Open Data. Proceedings of the ACM Web Science Conference (pp 1-2) 
					    
-  Relation Extraction from the Web Using Distant Supervision (pp 26-41) 
					    
-  Introduction. SWAIE 2014: 3rd Workshop on Semantic Web and Information Extraction, Proceedings of the Workshop (pp III)
					    
-  Microblog-genre noise and impact on semantic annotation accuracy. HT 2013: Proceedings of the 24th ACM Conference on Hypertext and Social Media (pp 21-30)
					    
-  TwitIE: An Open-Source Information Extraction Pipeline for Microblog Text. Proceedings of the International Conference on Recent Advances in Natural Language Processing 
					    
-  Interlinking documents based on semantic graphs. Procedia Computer Science, Vol. 22 (pp 231-240) 
					    
-  Multimodal sentiment analysis of social media. CEUR Workshop Proceedings, Vol. 1110 (pp 47-58)
					    
-  Entity extraction and consolidation for social web content preservation. CEUR Workshop Proceedings, Vol. 912 (pp 18-29)
					    
-  Knowledge extraction and consolidation from social media (KECSM 2012): Preface. CEUR Workshop Proceedings, Vol. 895 (pp I-II)
					    
-  Large Scale Semantic Annotation, Indexing, and Search at The National Archives. LREC 2012 - Eighth International Conference on Language Resources and Evaluation (pp 3487-3494)
					    
-  Using events for content appraisal and selection in Web archives. CEUR Workshop Proceedings, Vol. 779 (pp 98-107)
					    
-  Motivating intelligent email in business: An investigation into current trends for email processing and communication research. 2009 IEEE Conference on Commerce and Enterprise Computing CEC 2009 (pp 476-482) 
					    
-  Evaluating Evaluation Metrics for Ontology-Based Applications: Infinite Reflection. LREC
					    
-  Benchmarking Textual Annotation Tools for the Semantic Web. Sixth International Conference on Language Resources and Evaluation, LREC 2008 (pp 20-25)
					    
-  Ontology-based information extraction for business intelligence. Semantic Web, Proceedings, Vol. 4825 (pp 843-856)
					    
-  Natural language technology for information integration in business intelligence. Business Information Systems, Proceedings, Vol. 4439 (pp 366-380)
					    
-  Metrics for evaluation of ontology-based information extraction. EON 2006: Evaluation of Ontologies for the Web, 4th International Workshop, located at the 15th International World Wide Web Conference, WWW 2006
					    
-  Metrics for evaluation of ontology-based information extraction. CEUR Workshop Proceedings, Vol. 179
					    
-  Creating tools for morphological analysis of Sumerian. Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2006 (pp 1762-1765)
					    
-  Ontology-based information extraction for market monitoring and technology watch. CEUR Workshop Proceedings, Vol. 137 (pp 33-42)
					    
-  Extracting a domain ontology from linguistic resource based on relatedness measurements. 2005 IEEE/WIC/ACM International Conference on Web Intelligence, Proceedings (pp 345-351) 
					    
-  A lightweight approach to coreference resolution for named entities in text. Anaphora Processing, Vol. 263 (pp 97-111) 
					    
-  Using parallel texts to improve recall in botany. Recent Advances in Natural Language Processing III, Vol. 260 (pp 237-246) 
					    
-  Multimedia indexing through multi-source and multi-language information extraction: the MUMIS project. Data & Knowledge Engineering, Vol. 48(2) (pp 247-264)
					    
-  Populating a database from parallel texts using ontology-based information extraction. Natural Language Processing and Information Systems, Vol. 3136 (pp 254-264)
					    
-  Automatic language-independent induction of gazetteer lists. Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004 (pp 709-712)
					    
-  Creation of reusable components and language resources for Named Entity Recognition in Russian. Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004 (pp 309-312)
					    
-  Automatic creation and monitoring of semantic metadata in a dynamic knowledge portal. Artificial Intelligence: Methodology, Systems, and Applications, Proceedings, Vol. 3192 (pp 65-74)
					    
-  Rapid customization of an information extraction system for a surprise language. ACM Trans. Asian Lang. Inf. Process., Vol. 2 (pp 295-300)
					    
-  Multilingual adaptations of ANNIE, a reusable information extraction tool. Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics  - EACL '03, Vol. 2 (pp 219-219), 12 April 2003 - 17 April 2003. 
					    
-  NE recognition without training data on a language you don’t speak. ACL Workshop on Multilingual and Mixed-language Named Entity Recognition: Combining Statistical and Symbolic Models. Sapporo, Japan
					    
-  GATE: A Unicode-based Infrastructure Supporting Multilingual Information Extraction. Proceedings of Workshop on Information Extraction for Slavonic and other Central and Eastern European Languages (IESL’03). Borovets, Bulgaria 
					    
-  OLLIE. Proceedings of the HLT-NAACL 2003 workshop on Software engineering and architecture of language technology systems - SEALTS '03, Vol. 8 (pp 17-24), 31 May 2003 - 31 May 2003.
					    
-  Experiments with geographic knowledge for information extraction. Proceedings of the HLT-NAACL 2003 workshop on Analysis of geographic references, Vol. 1 (pp 1-9), 31 May 2003.
					    
-  Access to multimedia information through multisource and multilanguage information extraction. Natural Language Processing and Information Systems, Vol. 2553 (pp 160-171)
					    
-  Using Human Language Technology for Automatic Annotation and Indexing of Digital Library Content (pp 613-625) 
					    
-  Adapting a robust multi-genre NE system for automatic content extraction. Artificial Intelligence: Methodology, Systems and Applications, Proceedings, Vol. 2443 (pp 264-273)
					    
-  GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications. Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics (ACL’02). Philadelphia, USA 
					    
-  A framework and graphical development environment for robust NLP tools and applications. ACL (pp 168-175)
					    
-  A Unicode-based environment for creation and use of language resources. Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002 (pp 66-71)
					    
-  Using GATE as an environment for teaching NLP. Proceedings of the ACL-02 Workshop on Effective tools and methodologies for teaching natural language processing and computational linguistics, Vol. 1 (pp 54-62), 7 July 2002 - 7 July 2002.
					    
-  Extracting information for automatic indexing of multimedia material. Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002 (pp 669-676)
					    
-  How feasible is the reuse of grammars for Named Entity Recognition? Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002 (pp 1412-1418)
					    
-  Using a text engineering framework to build an extendable and portable IE-based summarisation system. Proceedings of the ACL-02 Workshop on Automatic Summarization, Vol. 4 (pp 19-26), 11 July 2002 - 12 July 2002.
					    
-  GATE: an architecture for development of robust HLT applications. 40th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp 168-175)
					    
-  Developing reusable and robust language processing components for information systems using GATE. 13th International Workshop on Database and Expert Systems Applications, Proceedings (pp 223-227)
					    
-  Named Entity Recognition from Diverse Text Types. Recent Advances in Natural Language Processing 2001 Conference (pp 257-274). Tzigov Chark, Bulgaria
					    
-  GATE. Proceedings of the 40th Annual Meeting on Association for Computational Linguistics - ACL '02 (p 168), 7 July 2002 - 12 July 2002.
					    
-  Experience of using GATE for NLP R&D. Proceedings of the Workshop on Using Toolsets and Architectures To Build NLP Systems at COLING-2000. Luxembourg 
					    
-  Creating and using domain-specific ontologies for terminological applications. 2nd International Conference on Language Resources and Evaluation, LREC 2000
					    
-  Identifying terms by their family and friends. Proceedings of the 18th conference on Computational linguistics, Vol. 1 (pp 530-536), 31 July 2000 - 4 August 2000.
					    
-  Cross-lingual classification of crisis data. The Semantic Web – ISWC 2018, Vol. 11136 (pp 617-633). Monterey, CA, USA, 8 October 2018 - 8 October 2018.
					    
-  Understanding climate change tweets: an open source toolkit for social media analysis. Advances in Computer Science Research, 7 September 2015 - 9 September 2015. 
					    
-  Who cares about sarcastic tweets? Investigating the impact of sarcasm on sentiment analysis. LREC 2014 Proceedings. Reykjavik, Iceland, 26 May 2014 - 26 May 2014.
					    
    Preprints
-  Hostility Detection in UK Politics: A Dataset on Online Abuse Targeting MPs, arXiv.
					    
-  Dimensions of Online Conflict: Towards Modeling Agonism, arXiv. 
					    
-  Examining Temporal Bias in Abusive Language Detection.
					    
-  Similarity-Aware Multimodal Prompt Learning for Fake News Detection, arXiv. 
					    
-  Helping Crisis Responders Find the Informative Needle in the Tweet Haystack, arXiv. 
					    
-  Analysis of Named Entity Recognition and Linking for Tweets, arXiv. 
					    
-  A Framework for Real-Time Semantic Social Media Analysis. 
					    
-  Editorial - Semantic Web Challenge, 2010.
					    
 
					
- Research group
    Member of the Natural Language Processing research group.
- Grants
    Current grants
- Influencing policy work on human rights violations against journalists, Research England, 09/2024 - 06/2025, £34,667, as PI
- Toolkit for Analysing and Visualising Online Violence Against Female Journalists, EPSRC, 04/2024 - 03/2025, £45,363, as PI
- Atrium: Advancing FronTier Research In the Arts and hUManities, Horizon Europe, 01/2024 - 12/2027, £370,950, as PI
    Previous grants
- RISIS2: European Research Infrastructure for Science, technology and Innovation policy Studies 2, EC H2020, 01/2019 - 12/2022, £476,741, as Co-PI
- Visualising the environmental impacts of plant-based recipes in Europe, Research England, 12/2021 - 05/2022, £18,407, as PI
- Calculating the environmental impact of plant based recipes, Industrial, 01/2021 - 12/2021, £2,500, as PI
- Pilot project on developing and trialling a toolkit for strengthening national context monitoring of violations against journalists, Free Press, 06/2020 - 12/2020, £29,094, as Co-PI
- Pilot project on developing a database for the improved collection and systematisation of information on incidents of violations against journalists, Free Press, 04/2019 - 11/2019, £29,030, as Co-I
- The Intelligent Automation of Contract Analysis of Collateral Warranties, Innovate UK, 03/2019 - 08/2020, £114,552, as PI
- Social Understandings of Scale: The role of Print and Social Media in the EU Referendum Debate, British Academy, 01/2018 - 06/2019, £49,716, as Co-PI
- Improving the monitoring of violence against journalists, Free Press, 12/2017 - 10/2018, £26,589, as Co-I
- KNOWMAK: Knowledge in the making in the European society, EC H2020, 01/2017 - 12/2019, £196,654, as PI
- COMRADES: Collective Platform for Community Resilience and Social Innovation during Crises, EC H2020, 01/2016 - 12/2018, £257,000, as PI