Dr Nikos Aletras
Department of Computer Science
Lecturer in Natural Language Processing
Careers and Placements Officer
Member of the Natural Language Processing (NLP) research group
+44 114 222 1911
Department of Computer Science
Regent Court (DCS)
Nikos Aletras is a Lecturer in Natural Language Processing (NLP) in the Computer Science Department at the University of Sheffield, co-affiliated with the Machine Learning (ML) group. Previously, he was a research scientist at Amazon (Core ML and Alexa) and a research associate at UCL, Department of Computer Science, Media Futures Group. He completed a PhD in NLP at the University of Sheffield. His research interests are in NLP, Machine Learning and Data Science. He develops text analysis methods to solve problems in other scientific areas such as (computational) social and legal science.
Research interests
- Computational Social Science
- Legal NLP
- Data Science
- Machine Learning
Publications
- Unsupervised quality estimation for neural machine translation. Transactions of the Association for Computational Linguistics, 8, 539-555.
- Analyzing Political Parody in Social Media. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
- Evaluating topic representations for exploring document collections. Journal of the Association for Information Science and Technology, 68(1), 154-167.
- Predicting judicial decisions of the European Court of Human Rights: a Natural Language Processing perspective. PeerJ Computer Science, 2.
- Why are these similar? Investigating item similarity types in a large digital library. Journal of the Association for Information Science and Technology, 67(7), 1624-1638.
- Computing similarity between items in a digital library of cultural heritage. Journal of Computing and Cultural Heritage, 5(4).
- Extreme Multi-Label Legal Text Classification: A case study in EU Legislation.
- Studying User Income through Language, Behaviour and Affect in Social Media. PLoS ONE, 10(9).
- Complaint Identification in Social Media with Transformer Networks.
- Point-of-Interest Type Inference from Social Media Text.
- An Empirical Study on Large-Scale Multi-Label Text Classification Including Few and Zero-Shot Labels.
- LEGAL-BERT: The Muppets straight out of Law School.
Conference proceedings papers
- Quality In, Quality Out: Learning from Actual Mistakes. EAMT (pp 145-153)
- Analyzing Political Parody in Social Media. ACL (pp 4373-4384)
- Proceedings of the Natural Legal Language Processing Workshop 2020 co-located with the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD 2020), Virtual Workshop, August 24, 2020. NLLP@KDD, Vol. 2645
- Automatic Generation of Topic Labels. SIGIR (pp 1965-1968)
- Introduction to the NLLP 2020 workshop. CEUR Workshop Proceedings, Vol. 2645
- LEGAL-BERT: "Preparing the Muppets for Court". EMNLP (Findings) (pp 2898-2904)
- Automatic Generation of Topic Labels. Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval.
- Extreme Multi-Label Legal Text Classification: A Case Study in EU Legislation. Proceedings of the Natural Legal Language Processing Workshop 2019, June 2019.
- Neural Legal Judgment Prediction in English. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp 4317-4323). Florence, Italy, 28 July 2019 - 2 August 2019.
- Automatically identifying complaints in social media. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp 5008-5019). Florence, Italy, 28 July 2019 - 2 August 2019.
- Journalist-in-the-Loop: Continuous Learning as a Service for Rumour Analysis. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations, November 2019.
- Nowcasting the Stance of Social Media Users in a Sudden Vote: The Case of the Greek Referendum. CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management (pp 367-376), 22 October 2018 - 26 October 2018.
- Predicting Twitter User Socioeconomic Attributes with Network and Language Information. Proceedings of the 29th ACM Conference on Hypertext and Social Media (pp 20-24), 9 July 2018 - 12 July 2018.
- Multimodal Topic Labelling. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, April 2017.
- Labeling topics with images using a neural network. Advances in Information Retrieval: 39th European Conference on IR Research, ECIR 2017, Proceedings (pp 500-505). Aberdeen, UK, 8 April 2017 - 13 April 2017.
- Inferring the Socioeconomic Status of Social Media Users Based on Behaviour and Language (pp 689-695)
- TM 2015 -- Topic Models. Proceedings of the 24th ACM International on Conference on Information and Knowledge Management - CIKM '15, 18 October 2015 - 23 October 2015.
- A Hybrid Distributional and Knowledge-based Model of Lexical Semantics. Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics, June 2015.
- An analysis of the user occupational class through Twitter content. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (pp 1754-1764)
- Proceedings of the 2015 Workshop on Topic Models: Post-Processing and Applications, TM 2015, Melbourne, Australia, October 19, 2015. TM@CIKM
- Labelling Topics using Unsupervised Graph-based Methods. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014), Vol. 2 (pp 631-636)
- Measuring the Similarity between Automatically Generated Topics. Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, Volume 2: Short Papers, April 2014.
- Representing topics labels for exploring digital libraries. IEEE/ACM Joint Conference on Digital Libraries, 8 September 2014 - 12 September 2014.
- Predicting and Characterising User Impact on Twitter. Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics
- PATHS: A System for Accessing Cultural Heritage Collections. ACL (Conference System Demonstrations) (pp 151-156)
- UBC UOS-TYPED: Regression for Typed-similarity. *SEM 2013 - 2nd Joint Conference on Lexical and Computational Semantics, Vol. 1 (pp 132-137)
- Evaluating topic coherence using distributional semantics. Proceedings of the 10th International Conference on Computational Semantics, IWCS 2013 - Long Papers
- Representing topics using images. NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference (pp 158-167)
- PATHS - Exploring Digital Cultural Heritage Spaces. Theory and Practice of Digital Libraries 2012. Cyprus
- Computing Similarity between Cultural Heritage Items using Multimodal Features. Proceedings of the 6th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (pp 85-93). Avignon, France
- User-centred design to support exploration and path creation in cultural heritage collections. CEUR Workshop Proceedings, Vol. 909 (pp 75-78)
Research grants
- Responsible AI for Inclusive, Democratic Societies: A cross-disciplinary approach to detecting and countering abusive language online, ESRC, 02/2020 - 01/2023, £508,135, as Co-PI
- UKRI Centre for Doctoral Training in Speech and Language Technologies and their Applications, EPSRC, 04/2019 - 09/2027, £5,508,850, as Co-PI
- Bergamot: Browser-based Multilingual Translation, EC H2020, 01/2019 - 12/2021, £473,113, as Co-PI
- Innovating Next Generation Services Through Collaborative Design, ESRC, 12/2018 - 11/2020, £284,926, as Co-PI
- Journalist-in-the-Loop Machine Learning as a Service for Rumour Analysis, Google, 11/2018 - 12/2019, £44,642, as Co-PI
- Alexa Fellowship, Amazon, 08/2018 - 08/2021, £73,000, as PI