Dr Carolina Scarton

BSc, MSc, PhD

Department of Computer Science

Lecturer in Natural Language Processing

Outreach, Open Days and Headstart Officer

Member of the Natural Language Processing research group

c.scarton@sheffield.ac.uk
+44 114 222 1892

Full contact details

Dr Carolina Scarton
Department of Computer Science
Regent Court (DCS)
211 Portobello
Sheffield
S1 4DP

Profile

Carolina Scarton is a Lecturer in Natural Language Processing at the Department of Computer Science, University of Sheffield, UK. She is a member of the Natural Language Processing group and part of the GATE team.

Previously, she was an Academic Fellow (September 2019 to November 2021) and a Research Associate on two European projects: WeVerify (March 2019 to August 2019) and SIMPATICO (July 2016 to February 2019).

Qualifications

In 2017, she was awarded a PhD in Computer Science from the University of Sheffield, under the supervision of Professor Lucia Specia. Her PhD was funded by the EXPERT project (a Marie Curie Initial Training Network).

She also has an MSc and a BSc from the University of São Paulo, Brazil (awarded in 2013).

Her MSc was supervised by Dr Sandra Aluísio, and she was a member of the Interinstitutional Center for Computational Linguistics (NILC). Since 2018, she has been the Secretary of the European Association for Machine Translation (EAMT).

Research interests

Dr Scarton's research area is Natural Language Processing (NLP). She is particularly interested in text adaptation, machine translation, online misinformation detection and verification, evaluation of NLP task outputs, NLP applied to healthcare and robotics, and dialogue systems.

Publications

Preprints

  • Goldsack T, Zhang Z, Lin C & Scarton C (2022) Making Science Simple: Corpora for the Lay Summarisation of Scientific Literature.
  • Li Y, Scarton C, Song X & Bontcheva K (2022) Classifying COVID-19 vaccine narratives, arXiv.
  • Singh I, Li Y, Thong M & Scarton C (2022) GateNLP-UShef at SemEval-2022 Task 8: Entity-Enriched Siamese Transformer for Multilingual News Article Similarity, arXiv.
  • Vincent ST, Barrault L & Scarton C (2022) Controlling Formality in Low-Resource NMT with Domain Adaptation and Re-Ranking: SLT-CDT-UoS at IWSLT2022, arXiv.
  • Vincent ST, Barrault L & Scarton C (2022) Controlling Extra-Textual Attributes about Dialogue Participants: A Case Study of English-to-Polish Neural Machine Translation, arXiv.
  • Gow-Smith E, Madabushi HT, Scarton C & Villavicencio A (2022) Improving Tokenisation by Alternative Treatment of Spaces.
  • Madabushi HT, Gow-Smith E, Scarton C & Villavicencio A (2021) AStitchInLanguageModels: Dataset and Methods for the Exploration of Idiomaticity in Pre-Trained Language Models, arXiv.
  • Singh I, Bontcheva K & Scarton C (2021) The False COVID-19 Narratives That Keep Being Debunked: A Spatiotemporal Analysis, arXiv.
  • Jiang Y, Song X, Scarton C, Aker A & Bontcheva K (2021) Categorising Fine-to-Coarse Grained Misinformation: An Empirical Study of COVID-19 Infodemic, arXiv.
  • Singh I, Scarton C & Bontcheva K (2021) Multistage BiCross encoder for multilingual access to COVID-19 health information, arXiv.
  • Leite JA, Silva DF, Bontcheva K & Scarton C (2020) Toxic Language Detection in Social Media for Brazilian Portuguese: New Dataset and Multilingual Analysis, arXiv.
  • Scarton C, Silva DF & Bontcheva K (2020) Measuring What Counts: The case of Rumour Stance Classification, arXiv.
  • Alva-Manchego F, Martin L, Bordes A, Scarton C, Sagot B & Specia L (2020) ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations, arXiv.
  • Scarton C, Forcada ML, Esplà-Gomis M & Specia L (2019) Estimating post-editing effort: a study on human judgements, task-based and reference-based metrics of MT quality, arXiv.
  • Alva-Manchego F, Martin L, Scarton C & Specia L (2019) EASSE: Easier Automatic Sentence Simplification Evaluation, arXiv.
  • Forcada ML, Scarton C, Specia L, Haddow B & Birch A (2018) Exploring Gap Filling as a Cheaper Alternative to Reading Comprehension Questionnaires when Evaluating Machine Translation for Gisting, arXiv.
  • Jiang Y, Song X, Scarton C, Singh I, Aker A & Bontcheva K () Categorising Fine-to-Coarse Grained Misinformation: An Empirical Study of the COVID-19 Infodemic, Research Square Platform LLC.

Grants

Current Grants