Automatic topic segmentation and labeling in multiparty dialogue
Proceedings of the First IEEE/ACM Workshop on Spoken Language Technology (SLT)
Date
04/11/2010
Author
Hsueh, Pei-Yun
Moore, Johanna D.
Abstract
This study concerns how to segment a scenario-driven multiparty
dialogue and how to label these segments automatically.
We apply approaches that have been proposed for identifying
topic boundaries at a coarser level to the problem of
identifying agenda-based topic boundaries in scenario-based
meetings. We also develop conditional models to classify segments
into topic classes. Experiments in topic segmentation
show that a supervised classification approach that combines
lexical and conversational features outperforms the unsupervised
lexical chain-based approach, achieving 20% and 12%
improvements in segmenting top-level and sub-topic segments,
respectively. Experiments in topic classification suggest
that it is possible to automatically categorize segments
into appropriate topic classes given only the transcripts. Training
with features selected using the log-likelihood ratio improves
the results by 13.3%.
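
As an illustration of the feature selection step mentioned above, the following is a minimal sketch of ranking words by a log-likelihood ratio score for one topic class. It assumes the common G^2 formulation over a 2x2 contingency table (word vs. other words, target class vs. other classes); the abstract does not state which formulation the authors used. The function names, the (label, tokens) input layout, and the toy data are hypothetical.

```python
import math
from collections import Counter


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 contingency table (assumed formulation).

    k11: occurrences of the word inside the target class
    k12: occurrences of the word outside the target class
    k21: occurrences of other words inside the target class
    k22: occurrences of other words outside the target class
    """
    total = k11 + k12 + k21 + k22
    if total == 0:
        return 0.0
    row1, row2 = k11 + k12, k21 + k22
    col1, col2 = k11 + k21, k12 + k22
    # Pair each observed count with its expected count under independence.
    cells = [
        (k11, row1 * col1 / total),
        (k12, row1 * col2 / total),
        (k21, row2 * col1 / total),
        (k22, row2 * col2 / total),
    ]
    # Cells with an observed count of zero contribute nothing to the sum.
    return 2.0 * sum(o * math.log(o / e) for o, e in cells if o > 0)


def select_features(segments, target_class, top_n):
    """Return the top_n words ranked by G^2 for target_class.

    segments: iterable of (label, tokens) pairs (hypothetical layout).
    """
    in_class, out_class = Counter(), Counter()
    for label, tokens in segments:
        (in_class if label == target_class else out_class).update(tokens)
    n_in, n_out = sum(in_class.values()), sum(out_class.values())
    scores = {}
    for word in set(in_class) | set(out_class):
        k11, k12 = in_class[word], out_class[word]
        scores[word] = log_likelihood_ratio(k11, k12, n_in - k11, n_out - k12)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


if __name__ == "__main__":
    toy_segments = [
        ("budget", ["cost", "price", "cost", "the"]),
        ("design", ["button", "shape", "the", "button"]),
        ("budget", ["price", "cost", "the"]),
    ]
    print(select_features(toy_segments, "budget", 3))
```

Note that G^2 is two-sided: a word that is strongly under-represented in the target class scores as highly as one that is over-represented, while words spread evenly across classes (such as function words) score near zero.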