Content-based access to spoken audio

Koumpis, Konstantinos; Renals, Steve

dc.contributor.author

Koumpis, Konstantinos

en

dc.contributor.author

Renals, Steve

en

dc.coverage.spatial

24

en

dc.date.accessioned

2006-05-09T11:48:05Z

dc.date.available

2006-05-09T11:48:05Z

dc.date.issued

2005

dc.description.abstract

The amount of archived audio material in digital form is increasing rapidly, as advantage is taken of the growth in available storage and processing power. Computational resources are becoming less of a bottleneck in digitally recording and archiving vast amounts of spoken material, including television and radio broadcasts and individual conversations. However, listening to this ever-growing amount of spoken audio sequentially is too slow, and the bottleneck will become the development of effective ways to access content in these voluminous archives. The provision of accurate and efficient computer-mediated content access is a challenging task, because spoken audio combines information from multiple levels (phonetic, acoustic, syntactic, semantic and discourse). Most systems that assist humans in accessing spoken audio content have approached the problem by performing automatic speech recognition, followed by text-based information access. These systems have addressed diverse tasks including indexing and retrieving voicemail messages, searching for broadcast news, and extracting information from recordings of meetings and lectures. Spoken audio content is far richer than what a simple textual transcription can capture, as it has additional cues that disclose the intended meaning and the speaker’s emotional state. However, the text transcription alone still provides a great deal of useful information in applications. This article describes approaches to content-based access to spoken audio with a qualitative and tutorial emphasis. We describe how the analysis, retrieval and delivery phases contribute to making spoken audio content more accessible, and we outline a number of outstanding research issues. We also discuss the main application domains and try to identify important issues for future developments. The structure of the article is based on a general system architecture for content-based access, which is depicted in Figure 1. Although the tasks within each processing stage may appear unconnected, the interdependencies and the sequence with which they take place vary.

en

dc.format.extent

512353 bytes

en

dc.format.mimetype

application/pdf

en

dc.identifier.citation

IEEE Signal Processing Magazine, 22(5), September 2005.

dc.identifier.uri

http://hdl.handle.net/1842/935

dc.language.iso

en

dc.publisher

IEEE

en

dc.title

Content-based access to spoken audio

en

dc.type

Article

en

Files

Original bundle

Name: koumpis-spm05.pdf
Size: 500.34 KB
Format: Adobe Portable Document Format

This item appears in the following Collection(s)

CSTR publications