Publication in conference proceedings
Transcribing and annotating speech corpora for speech recognition: A three-step crowdsourcing approach with quality control
Annika Hämäläinen (Hämäläinen, A.); Fernando Pinto Moreira (Moreira, F. P.); Jairo Avelar (Avelar, J.); Daniela Braga (Braga, D.); Miguel Sales Dias (Dias, M. S.)
Proceedings of the 1st AAAI Conference on Human Computation and Crowdsourcing, HCOMP 2013
Year (definitive publication)
2013
Language
English
Country
United States of America
More Information
Web of Science®

This publication is not indexed in Web of Science®

Scopus

Times Cited: 2

(Last checked: 2024-11-15 08:37)

Google Scholar

Times Cited: 4

(Last checked: 2024-11-17 09:28)

Abstract
Large speech corpora with word-level transcriptions annotated for noises and disfluent speech are necessary for training automatic speech recognisers. Crowdsourcing is a lower-cost, faster-turnaround, highly scalable alternative to expert transcription and annotation. In this paper, we showcase our three-step crowdsourcing approach, which is motivated by the importance of accurate transcriptions and annotations.
Keywords
Automatic speech recognition, Speech corpora, Transcription, Annotation, Crowdsourcing
  • Computer and Information Sciences - Natural Sciences
  • Electrical Engineering, Electronic Engineering, Information Engineering - Engineering and Technology
  • Languages and Literature - Humanities
