dc.contributor.author: Abberley, Dave
dc.contributor.author: Renals, Steve
dc.contributor.author: Cook, Gary
dc.contributor.author: Robinson, Tony
dc.coverage.spatial: 6 [en]
dc.date.accessioned: 2006-05-15T12:08:40Z
dc.date.available: 2006-05-15T12:08:40Z
dc.date.issued: 1998
dc.identifier.citation: NIST Special Publication 500-240: The Sixth Text REtrieval Conference (TREC-6). pp.747-752. [en]
dc.identifier.uri: http://trec.nist.gov/pubs/trec6/t6_proceedings.html
dc.identifier.uri: http://hdl.handle.net/1842/1070
dc.description.abstract: The THISL spoken document retrieval system is based on the Abbot Large Vocabulary Continuous Speech Recognition (LVCSR) system developed by Cambridge University, Sheffield University and SoftSound, and uses PRISE (NIST) for indexing and retrieval. We participated in full SDR mode. Our approach was to transcribe the spoken documents at the word level using Abbot, indexing the resulting text transcriptions using PRISE. The LVCSR system uses a recurrent network-based acoustic model (with no adaptation to different conditions) trained on the 50 hour Broadcast News training set, a 65,000 word vocabulary and a trigram language model derived from Broadcast News text. Words in queries which were out-of-vocabulary (OOV) were word spotted at query time (utilizing the posterior phone probabilities output by the acoustic model), added to the transcriptions of the relevant documents and the collection was then re-indexed. We generated pronunciations at run-time for OOV words using the Festival TTS system (University of Edinburgh). [en]
dc.format.extent: 48798 bytes
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.publisher: Department of Commerce, National Institute of Standards and Technology [en]
dc.title: The THISL Spoken Document Retrieval System [en]
dc.type: Conference Paper [en]

