Professor Thomas Hain

School of Computer Science

Professor of Speech and Audio Technology

Director of CDT in Speech and Language Technologies

Director of Liveperson Centre

Member of the Speech and Hearing (SpandH) research group

t.hain@sheffield.ac.uk

Full contact details

Professor Thomas Hain
School of Computer Science
Regent Court (DCS)
211 Portobello
Sheffield
S1 4DP
Profile

Thomas Hain obtained the degree 'Dipl.-Ing' in Electrical/Communication Engineering in 1994 from the University of Technology, Vienna. He then joined the Speech Technology Group at Philips Speech Processing, which he left in a senior position.

In 1997 he joined the Speech, Vision and Robotics Group at the Cambridge University Engineering Department as Research Associate and PhD Student. He took up a Lectureship at the SVR group in 2001.

In 2004 he joined the Speech and Hearing Group as a Lecturer in Computer Science. He was promoted to Senior Lecturer in 2008 and Reader in 2011.

Research interests

Thomas' research interests cover many areas in natural language processing, speech, audio and multimedia technology, machine learning, and complex system optimisation and design.

His interests include: large vocabulary continuous speech recognition, non-linear methods in speech processing, low bit-rate speech coding, machine learning, multi-modal systems, image classification, microphone arrays, system and resource optimisation.

Publications

Books

  • Young SJ, Evermann G, Gales MJF, Hain T, Kershaw D, Moore GL, Odell JJ, Ollason D, Povey D, Valtchev V & Woodland PC (2004) The HTK Book. Cambridge, England: Cambridge University Engineering Department.
  • Young S, Evermann G, Gales M, Hain T, Kershaw D, Liu X, Moore G, Odell J, Ollason D, Povey D, Ragni A et al. The HTK Book (for HTK Version 3.5, documentation alpha version). Cambridge University Engineering Department.

Journal articles

Book chapters

Conference proceedings

Reports

  • Close G, Hollands S, Goetze S & Hain T (2022) Clarity Prediction Challenge 1 Entry: Non-intrusive Speech Intelligibility Metric Prediction - Technical Report
  • el Hannani A & Hain T (2011) Data Dependence of Speech Decoder Parameters
  • Gibson M & Hain T (2011) Confidence-informed unsupervised Minimum Bayes Risk acoustic model adaptation
  • Hain T, Dines J & McCowan I (2006) Conversational multi-party speech recognition using remote microphones
  • Hain T, Woodland PC, Evermann G, Liu X, Moore GL, Povey D & Wang L (2003) Automatic Transcription of Conversational Telephone Speech. Development of the CU-HTK 2002 System

Theses

  • Hain T (2001) Hidden Model Sequence Models for Automatic Speech Recognition.
  • Hain T (1993) On the Use of Iterated Function Systems for Coding of Grayscale Images.

Datasets

Other

Preprints

Grants

  • UKRI Centre for Doctoral Training in Speech and Language Technologies and their Applications, EPSRC, 04/2019 - 09/2027, £5,508,850, as PI
  • VoiceBase Centre, VoiceBase Inc./Liveperson, 04/2018 - 03/2026, £2,488,691, as PI
  • WFST-based integration of ASR and MT in Spoken Language Translation, Industrial, 03/2014 - 12/2026, £63,588, as PI
  • Automatic voice conversion for transforming professional adult voice actors to artificial child voice actors, Innovate UK, 01/2021 - 01/2023, £173,605, as PI
  • MAUDIE: Multimedia Analysis for Unsupervised Dubbing In Entertainment, Innovate UK, 05/2018 - 07/2021, £393,115, as PI
  • TUTO II: Reading skills tutoring system, ITSLANGUAGE BV, 08/2017 - 12/2019, £121,439, as PI
  • Sound Source Separation Based on Deep Learning, Industrial, 05/2019 - 04/2020, £48,000, as PI
  • Acoustic correlates of emotions for automatic recognition, Industrial, 10/2018 - 09/2019, £48,900, as PI
  • Bridge Project, VoiceBase Inc., 09/2017 - 03/2018, £61,200, as PI
  • STATUS IV: Speech Technology and Translation Universal Survey, Defence Science and Technology Laboratory, 01/2017 - 10/2017, £60,000, as PI
  • TUTO: Reading skills tutoring system, ITSLANGUAGE BV, 09/2016 - 08/2017, £61,983, as PI
  • STATUS III: Speech Technology and Translation Universal Survey, Defence Science and Technology Laboratory, 01/2015 - 07/2016, £78,684, as PI
  • STATUS II: Speech Technology and Translation Universal Survey, Defence Science and Technology Laboratory, 11/2013 - 05/2014, £98,982, as PI
  • ItsLanguage, ITSLANGUAGE BV, 11/2012 - 03/2015, £68,333, as PI
  • German System Adaptation, ITSLANGUAGE BV, 11/2012 - 03/2015, £42,373, as PI
  • DocuMeet: Transcription, summarisation and documentation of meetings using advanced speech technologies, indexing and browsing capabilities, EC FP7, 11/2012 - 10/2014, £368,433, as PI
  • STATUS: Speech Technology and Translation Universal Survey, Defence Science and Technology Laboratory, 10/2012 - 08/2013, £73,726, as PI
  • A Joint Model of Spoken Language Translation, Google, 09/2011 - 12/2016, £43,014, as PI
  • Natural Speech Technology, EPSRC, 05/2011 - 07/2016, £1,798,665, as PI
  • Unsupervised Domain Adaptation, CISCO, 11/2010 - 04/2012, £121,745, as PI
  • AMIDA: Augmented Multi-party Interaction with Distance Access, EC FP6, 10/2006 - 12/2009, £467,074, as PI
  • AMIDA: Augmented Multi-party Interaction with Distance Access, EC FP6, 10/2006 - 12/2009, £345,350, as PI

Professional activities and memberships