
Export Publication

The publication can be exported in the following formats: APA (American Psychological Association) reference, IEEE (Institute of Electrical and Electronics Engineers) reference, BibTeX, and RIS.

Export Reference (APA)

Batista, F., Moniz, H., Trancoso, I., Mamede, N., & Mata, A. I. (2012). Extending automatic transcripts in a unified data representation towards a prosodic-based metadata annotation and evaluation. Journal of Speech Sciences, 2(2), 115-138.

Export Reference (IEEE)

F. M. Batista et al., "Extending automatic transcripts in a unified data representation towards a prosodic-based metadata annotation and evaluation", in Journal of Speech Sciences, vol. 2, no. 2, pp. 115-138, 2012.

Export BibTeX

@article{batista2012_1782822791122,
	author = "Batista, F. and Moniz, H. and Trancoso, I. and Mamede, N. and Mata, A. I.",
	title = "Extending automatic transcripts in a unified data representation towards a prosodic-based metadata annotation and evaluation",
	journal = "Journal of Speech Sciences",
	year = "2012",
	volume = "2",
	number = "2",
	pages = "115--138",
	url = "http://www.journalofspeechsciences.org/"
}

Export RIS

TY - JOUR
TI - Extending automatic transcripts in a unified data representation towards a prosodic-based metadata annotation and evaluation
T2 - Journal of Speech Sciences
VL - 2
IS - 2
AU - Batista, F.
AU - Moniz, H.
AU - Trancoso, I.
AU - Mamede, N.
AU - Mata, A. I.
PY - 2012
SP - 115
EP - 138
SN - 2236-9740
UR - http://www.journalofspeechsciences.org/
AB - This paper describes a framework that extends automatic speech transcripts in order to accommodate relevant information coming from manual transcripts, the speech signal itself, and other resources, like lexica. The proposed framework automatically collects, relates, computes, and stores all relevant information together in a self-contained data source, making it possible to easily provide a wide range of interconnected information suitable for speech analysis, training, and evaluating a number of automatic speech processing tasks. The main goal of this framework is to integrate different linguistic and paralinguistic layers of knowledge for a more complete view of their representation and interactions in several domains and languages. The processing chain is composed of two main stages, where the first consists of integrating the relevant manual annotations in the speech recognition data, and the second consists of further enriching the previous output in order to accommodate prosodic information. The described framework has been used for the identification and analysis of structural metadata in automatic speech transcripts. Initially put to use for automatic detection of punctuation marks and for capitalization recovery from speech data, it has also been recently used for studying the characterization of disfluencies in speech. It was already applied to several domains of Portuguese corpora, and also to English and Spanish Broadcast News corpora.
ER -
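
The RIS record above is a sequence of two-character tags, each followed by a hyphen and a value, opened by TY and closed by ER. As a minimal sketch of how such a record could be read programmatically, the Python snippet below collects the tagged lines into a dictionary. The function name `parse_ris` and the choice to fold untagged lines into the preceding value are assumptions made for illustration; they are not part of the export page or of any particular RIS library.

```python
import re
from collections import defaultdict

# A two-character tag, whitespace, a hyphen, then an optional value.
# One or two spaces before the hyphen are both accepted.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])\s+-\s?(.*)$")


def parse_ris(text):
    """Parse RIS text into a list of records (tag -> list of values).

    Hypothetical helper: untagged lines are treated as a continuation
    of the previous value, which handles abstracts wrapped over lines.
    """
    records = []
    current = None   # record being built, or None outside TY..ER
    last_tag = None  # tag that a continuation line would extend
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:
            continue
        match = RIS_LINE.match(line)
        if match is None:
            if current is not None and last_tag is not None:
                current[last_tag][-1] += " " + line.strip()
            continue
        tag, value = match.group(1), match.group(2).strip()
        if tag == "TY":
            # TY opens a new record.
            current = defaultdict(list)
            current[tag].append(value)
            last_tag = tag
        elif tag == "ER":
            # ER closes the record.
            if current is not None:
                records.append(dict(current))
            current = None
            last_tag = None
        elif current is not None:
            # Repeated tags such as AU accumulate in order.
            current[tag].append(value)
            last_tag = tag
    return records
```

Every value is kept in a list because tags such as AU legitimately repeat, once per author, and the order of authors matters for citation.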