Content-based access to spoken audio
dc.contributor.author
Koumpis, Konstantinos
en
dc.contributor.author
Renals, Steve
en
dc.coverage.spatial
24
en
dc.date.accessioned
2006-05-09T11:48:05Z
dc.date.available
2006-05-09T11:48:05Z
dc.date.issued
2005
dc.description.abstract
The amount of archived audio material in digital form is increasing rapidly, as
advantage is taken of the growth in available storage and processing power.
Computational resources are becoming less of a bottleneck in digitally recording and
archiving vast amounts of spoken material, including television and radio broadcasts and
individual conversations. However, listening to this ever-growing amount of spoken
audio sequentially is too slow, and the bottleneck will become the development of
effective ways to access content in these voluminous archives. The provision of
accurate and efficient computer-mediated content access is a challenging task,
because spoken audio combines information from multiple levels (phonetic, acoustic,
syntactic, semantic and discourse). Most systems that assist humans in accessing
spoken audio content have approached the problem by performing automatic speech
recognition, followed by text-based information access. These systems have
addressed diverse tasks including indexing and retrieving voicemail messages,
searching for broadcast news, and extracting information from recordings of
meetings and lectures. Spoken audio content is far richer than what a simple textual
transcription can capture, as it carries additional cues that disclose the intended meaning
and the speaker’s emotional state. However, the text transcription alone still provides a
great deal of useful information in applications.
This article describes approaches to content-based access to spoken audio with a
qualitative and tutorial emphasis. We describe how the analysis, retrieval and
delivery phases contribute to making spoken audio content more accessible, and we
outline a number of outstanding research issues. We also discuss the main
application domains and try to identify important issues for future developments. The
structure of the article is based on a general system architecture for content-based
access, which is depicted in Figure 1. Although the tasks within each processing
stage may appear unconnected, the interdependencies and the sequence in which
they take place vary.
en
dc.format.extent
512353 bytes
en
dc.format.mimetype
application/pdf
en
dc.identifier.citation
IEEE Signal Processing Magazine, 22(5), September 2005.
dc.identifier.uri
http://hdl.handle.net/1842/935
dc.language.iso
en
dc.publisher
IEEE
en
dc.title
Content-based access to spoken audio
en
dc.type
Article
en
Files
Original bundle
- Name:
- koumpis-spm05.pdf
- Size:
- 500.34 KB
- Format:
- Adobe Portable Document Format