Using conditional random fields to predict pitch accents in conversational speech

Michelle L. Gregory, Yasemin Altun

Research output: Contribution to journal, Conference article, peer-review

44 Scopus citations

Abstract

The detection of prosodic characteristics is an important aspect of both speech synthesis and speech recognition. Correct placement of pitch accents aids in more natural-sounding speech, while automatic detection of accents can contribute to better word-level recognition and better textual understanding. In this paper we investigate probabilistic, contextual, and phonological factors that influence pitch accent placement in natural, conversational speech in a sequence labeling setting. We introduce Conditional Random Fields (CRFs) to the pitch accent prediction task in order to incorporate these factors efficiently in a sequence model. We demonstrate the usefulness and the incremental effect of these factors in a sequence model by performing experiments on hand-labeled data from the Switchboard Corpus. Our model outperforms the baseline and previous models of pitch accent prediction on the Switchboard Corpus.

Original language: English
Pages (from-to): 677-683
Number of pages: 7
Journal: Proceedings of the Annual Meeting of the Association for Computational Linguistics
State: Published - 2004
Externally published: Yes
Event: 42nd Annual Meeting of the Association for Computational Linguistics, ACL 2004 - Barcelona, Spain
Duration: Jul 21 2004 - Jul 26 2004
