webASR 2 — Improved Cloud Based Speech Technology

Thomas Hain, Jeremy Christian, Oscar Saz, Salil Deena, Madina Hasan, Raymond W.M. Ng, Rosanna Milner, Mortaza Doulaty, Yulan Liu

INTERSPEECH 2016, September 8–12, 2016, San Francisco, USA

This paper presents the most recent developments of the webASR service (www.webasr.org), the world’s first web-based, fully functioning automatic speech recognition platform for scientific use. Initially released in 2008, webASR has recently been expanded with three main goals in mind: to facilitate access through a RESTful architecture that allows easy use via either the web interface or an API; to exploit input metadata, when provided by the user, to improve system performance; and to extend the coverage of available systems beyond speech recognition. Several new systems for transcription, diarisation, lightly supervised alignment and translation are now available through webASR. Results on a series of well-known benchmarks (the RT’09, IWSLT’12 and MGB’15 evaluations) show that these webASR systems deliver state-of-the-art performance across these tasks.
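As a rough illustration of how a RESTful service of this kind is typically driven programmatically, the sketch below constructs (without sending) an HTTP request that submits an audio file for transcription. The endpoint path, parameter names, and token scheme here are hypothetical, not webASR’s documented API; the real interface is described at www.webasr.org.

```python
import urllib.parse
import urllib.request

# Hypothetical base URL; webASR's actual API paths may differ.
BASE_URL = "https://www.webasr.org/api"

def build_transcription_request(audio_name, system="transcription",
                                token="YOUR_API_TOKEN"):
    """Construct (but do not send) a POST request that would submit
    an audio file to a REST-style transcription endpoint.

    The 'system' parameter stands in for selecting one of the
    available services (transcription, diarisation, alignment,
    translation); its name is an assumption for this sketch.
    """
    params = urllib.parse.urlencode({"system": system, "file": audio_name})
    url = f"{BASE_URL}/jobs?{params}"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )

req = build_transcription_request("meeting.wav")
print(req.get_method())   # POST
print(req.full_url)
```

The point of separating request construction from submission is that the same job-description logic serves both the web interface and batch scripting against the API.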