Transcribing and annotating speech corpora for speech recognition: A three-step crowdsourcing approach with quality control

Annika Hämäläinen; Fernando Pinto Moreira; Jairo Avelar; Daniela Braga; Miguel Sales Dias

Publication in conference proceedings

Proceedings of the 1st AAAI Conference on Human Computation and Crowdsourcing, HCOMP 2013

Year (definitive publication)

2013

Language

English

Country

United States of America

Web of Science®

This publication is not indexed in Web of Science®

Scopus

Times Cited: 2

(Last checked: 2026-06-08 13:22)

Google Scholar

Times Cited: 6

(Last checked: 2026-06-02 16:06)

Overton

This publication is not indexed in Overton

Abstract

Large speech corpora with word-level transcriptions annotated for noises and disfluent speech are necessary for training automatic speech recognisers. Crowdsourcing is a lower-cost, faster-turnaround, highly scalable alternative to expert transcription and annotation. In this paper, we showcase our three-step crowdsourcing approach, which is motivated by the importance of accurate transcriptions and annotations.

Keywords

Automatic speech recognition, Speech corpora, Transcription, Annotation, Crowdsourcing

Fields of Science and Technology Classification

Computer and Information Sciences - Natural Sciences
Electrical Engineering, Electronic Engineering, Information Engineering - Engineering and Technology
Languages and Literature - Humanities

Publication Identifiers

Scopus (source: Ciência_Iscte)	2-s2.0-84899503030
DOI (source: author)	10.1609/hcomp.v1i1.13102
Scopus (source: author)	2-s2.0-84899503030
Ciência_Iscte ID	ci-pub-16895
Handle (source: Ciência-IUL)	http://hdl.handle.net/10071/27869

Other Publication Details

Online Publication Year	2013
Publisher	AAAI Press
Indexes	Scopus
ISSN	--
ISBN	978-1-57735-607-3 (print)
Volume	WS-13-18
Pages	30-31
Total Pages	2
Peer Reviewed	Yes
Dissemination Mean	Both (printed and digital)
Editors	Hartmann, B., and Horvitz, E.
Event Title	1st AAAI Conference on Human Computation and Crowdsourcing, HCOMP 2013
Event Organizer	Association for the Advancement of Artificial Intelligence
City	Palm Springs, California, USA
Event Type	Conference
Event Classification	International
Event Year	2013
Event Publication Type	Position Paper