Dr Yoshi Gotoh

PhD

Department of Computer Science

Lecturer

Student Projects Officer

Member of the Speech and Hearing (SpandH) research group

y.gotoh@sheffield.ac.uk
+44 114 222 1908

Full contact details

Dr Yoshi Gotoh
Department of Computer Science
Regent Court (DCS)
211 Portobello
Sheffield
S1 4DP

Profile

Yoshi is a lecturer in the Department of Computer Science. He has a first degree in Engineering from the University of Tokyo and a PhD from Brown University.

Research interests

Yoshi has been working in the field of speech and spoken language processing for years. His current interests include audio-visual processing, in particular video analysis and video information retrieval.

Publications

Journal articles

Conference proceedings papers

  • Algadhy R, Gotoh Y & Maddock S (2019) 3D visual speech animation using 2D videos. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp 2367-2371). Brighton, 12 May 2019 - 17 May 2019.
  • Al Ghamdi M & Gotoh Y (2019) Graph-based correlated topic model for motion patterns analysis in crowded scenes from tracklets. British Machine Vision Conference 2018, BMVC 2018
  • Al Ghamdi M & Gotoh Y (2018) Graph-based correlated topic model for trajectory clustering in crowded videos. IEEE Winter Conference on Applications of Computer Vision (pp 1029-1037), 12 March 2018 - 14 March 2018.
  • Khan MUG, Gotoh Y & Nida N (2017) Medical image colorization for better visualization and segmentation. Medical Image Understanding and Analysis, Vol. 723 (pp 571-580)
  • AlHarbi N & Gotoh Y (2016) Natural language descriptions of human activities scenes: corpus generation and analysis. 5th Workshop on Vision and Language. Berlin
  • Algadhy R, Gotoh Y & Maddock S (2016) Analysis of visemes in the GRID corpus. Abstract of UKspeech
  • Masrani A & Gotoh Y (2016) Overlapped interest and the impact of visual and audio information in the human perception. Abstract of UKspeech
  • Wahla SQ, Waqar S, Ghani Khan MU & Gotoh Y (2016) The University of Sheffield and University of Engineering & Technology, Lahore at TRECVID 2016: Video to text description task. 2016 TREC Video Retrieval Evaluation, TRECVID 2016
  • Alvi M, Khan M, Gotoh Y, Sadiq M & Aslam M (2015) University of Engineering & Technology, Lahore and The University of Sheffield at TRECVID 2015: instance search. TREC Video Retrieval Evaluation Workshop
  • Masrani A & Gotoh Y (2015) Corpus generation and analysis: incorporating audio data towards curbing missing information. Proceedings of KDWEB
  • Al Harbi N & Gotoh Y (2015) Describing spatio-temporal relations between object volumes in video streams. AAAI Workshop - Technical Report, Vol. WS-15-14 (pp 2-8)
  • Amanat S, Khan M, Nida N & Gotoh Y (2014) The University of Sheffield and University of Engineering & Technology, Lahore at TRECVID 2014: instance search task. TREC Video Retrieval Evaluation Workshop
  • Al Ghamdi M & Gotoh Y (2014) Manifold matching with application to instance search based on video queries. ICISP. Cherbourg, 30 June 2014.
  • Al Ghamdi M & Gotoh Y (2014) Alignment of nearly-repetitive contents in a video stream with manifold embedding. ICASSP. Firenze
  • Al Ghamdi M & Gotoh Y (2014) Video clip retrieval by graph matching. ECIR. Amsterdam
  • Al Harbi N & Gotoh Y (2013) Spatio-temporal human body segmentation from video stream. CAIP. York
  • Al Harbi N & Gotoh Y (2013) Action recognition: spatio-temporal human body region tracking approach. CAIP - REACTS workshop. York
  • Al Ghamdi M & Gotoh Y (2013) Spatio-temporal manifold embedding for nearly-repetitive contents in a video stream. CAIP. York
  • Khan MUG, Bashir K, Shah AA, Zhang L, Gotoh Y, Khan PI & Amiruddin M (2013) The University of Sheffield, Harbin Engineering University and University of Engineering & Technology, Lahore at TRECVID 2013: Instance search & semantic indexing. 2013 TREC Video Retrieval Evaluation, TRECVID 2013
  • Al Ghamdi M, Khan M, Zhang L & Gotoh Y (2012) The University of Sheffield and Harbin Engineering University at TRECVID 2012: Instance Search. TRECVID
  • Khan M, Zhang L & Gotoh Y (2011) Human focused video description. ICCV - VECTaR workshop. Barcelona
  • Khan M, Zhang L & Gotoh Y (2011) Towards coherent natural language description of video streams. ICCV - SIG workshop. Barcelona
  • Zhang L, Khan M & Gotoh Y (2011) Video scene classification based on natural language description. ICCV - ARTEMIS workshop. Barcelona
  • Chantamunee S & Gotoh Y (2010) Nearly-repetitive video synchronisation using nonlinear manifold embedding. ICASSP. Dallas
  • Chantamunee S & Gotoh Y (2008) University of Sheffield at TRECVID 2008: Rushes Summarisation and Video Copy Detection. TRECVID
  • Chantamunee S & Gotoh Y (2008) Shot alignment in pre-production video. MLMI. Utrecht
  • Chantamunee S & Gotoh Y (2007) University of Sheffield at TRECVID 2007: Shot Boundary Detection and Rushes Summarisation. TRECVID
  • Kolluru B & Gotoh Y (2007) Speaker Role Based Structural Classification of Broadcast News Stories. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 (pp 141-144)
  • Kolluru B & Gotoh Y (2007) Relative Evaluation of Informativeness in Machine Generated Summaries. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 (pp 145-148)
  • Kolluru B, Christensen H & Gotoh Y (2005) Multi-stage compaction approach to broadcast news summarisation. Interspeech. Lisbon
  • Kolluru B & Gotoh Y (2005) On the subjectivity of human authored short summaries. ACL Workshop: Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarizati. Ann Arbor
  • Christensen H, Kolluru BK, Gotoh Y & Renals S (2005) Maximum entropy segmentation of broadcast news. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5 (pp 1029-1032)
  • Kolluru B, Christensen H & Gotoh Y (2004) Decremental feature-based compaction. DUC Workshop. Boston
  • Christensen H, Kolluru BK, Gotoh Y & Renals S (2004) From text summarisation to style-specific summarisation for broadcast news. ADVANCES IN INFORMATION RETRIEVAL, PROCEEDINGS, Vol. 2997 (pp 223-237)
  • Christensen H, Gotoh Y, Kolluru B & Renals S (2003) Are extractive text summarisation techniques portable to broadcast news? ASRU'03: 2003 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING ASRU '03 (pp 489-494)
  • Kolluru B, Christensen H, Gotoh Y & Renals S (2003) Exploring the style-technique interaction in extractive summarization of broadcast news. ASRU'03: 2003 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING ASRU '03 (pp 495-500)
  • Gotoh Y & Renals S (2003) Statistical language modelling. TEXT- AND SPEECH-TRIGGERED INFORMATION ACCESS, Vol. 2705 (pp 78-105)
  • Christensen H, Gotoh Y & Renals S (2001) Punctuation Annotation Using Statistical Prosody Models. Proceedings of the ISCA Workshop on Prosody in Speech Recognition and Understanding (pp 35-40)
  • Gotoh Y & Renals S (2000) Sentence boundary detection in broadcast speech transcripts. ISCA ASR Workshop. Paris
  • Gotoh Y & Renals S (2000) Variable word rate n-grams. 2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI (pp 1591-1594)
  • Renals S & Gotoh Y (1999) Integrated transcription and identification of named entities in broadcast speech. Eurospeech. Budapest
  • Gotoh Y & Renals S (1999) Statistical annotation of named entities in spoken audio. ESCA Workshop: Accessing Information in Spoken Audio. Cambridge
  • Gotoh Y, Renals S & Williams G (1999) Named entity tagged language models. ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI (pp 513-516)
  • Gotoh Y & Renals S (1997) Document space models using latent semantic analysis. Eurospeech. Rhodes
  • Adcock J, Gotoh Y, Mashao D & Silverman HF (1996) Microphone-array speech recognition via incremental MAP training. ICASSP. Atlanta
  • Gotoh Y & Silverman HF (1996) Incremental ML estimation of HMM parameters for efficient training. ICASSP. Atlanta
  • Gotoh Y, Hochberg MM, Mashao D & Silverman HF (1995) Incremental MAP estimation of HMMs for efficient training and improved performance. ICASSP. Detroit
  • Gotoh Y, Hochberg MM & Silverman HF (1994) Using MAP estimated parameters to improve HMM speech recognition performance. ICASSP. Adelaide
  • Khan M, Al Harbi N & Gotoh Y () Natural language descriptions for video streams. V&L Net Workshop. Sheffield, December 2012.
  • Al Ghamdi M, Zhang L & Gotoh Y () Spatio-temporal SIFT and its application to human action classification. ECCV - VECTaR workshop. Firenze, October 2012.
  • Al Ghamdi M, Al Harbi N & Gotoh Y () Spatio-temporal video representation with locality-constrained linear coding. ECCV - ARTEMIS workshop. Firenze, October 2012.
  • Khan M, Zhang L & Gotoh Y () Generating coherent natural language annotations for video streams. ICIP. Orlando, September 2012.
  • Khan M & Gotoh Y () Natural language descriptions of visual scenes: corpus generation and analysis. EACL workshop. Avignon, April 2012.
  • Khan M & Gotoh Y () Describing video contents in natural language. EACL workshop. Avignon, April 2012.
  • Al Harbi N & Gotoh Y () Natural language descriptions for human activities in video streams. INLG17 proceedings, 4 September 2017 - 7 September 2017.

Working papers

  • Urban J, Hilaire X, Hopfgartner F, Villa R, Jose JM, Chantamunee S & Gotoh Y (2006) Glasgow University at TRECVID 2006. TRECVID 2006 - Text REtrieval Conference TRECVid Workshop, 363-367.

Grants

Current Grants

  • Multimedia Analysis for Unsupervised Dubbing In Entertainment (MAUDIE), InnovateUK, 04/2018 to 03/2021, £393,115, as Co-PI

Previous Grants

Professional activities

Member of the Speech and Hearing research group