Transcribing and annotating speech corpora for speech recognition: A three-step crowdsourcing approach with quality control
Proceedings of the 1st AAAI Conference on Human Computation and Crowdsourcing, HCOMP 2013
Year (definitive publication)
2013
Language
English
Country
United States of America
More Information
Web of Science®
This publication is not indexed in Web of Science®
Abstract
Large speech corpora with word-level transcriptions annotated for noises and disfluent speech are necessary for training automatic speech recognisers. Crowdsourcing is a lower-cost, faster-turnaround, highly scalable alternative to expert transcription and annotation. In this paper, we showcase our three-step crowdsourcing approach, which is motivated by the importance of accurate transcriptions and annotations.
Keywords
Automatic speech recognition, Speech corpora, Transcription, Annotation, Crowdsourcing
Fields of Science and Technology Classification
- Computer and Information Sciences - Natural Sciences
- Electrical Engineering, Electronic Engineering, Information Engineering - Engineering and Technology
- Languages and Literature - Humanities