Speech and Multimedia Information Retrieval



    Abstract

    We have been developing a spoken document retrieval system based on speech transcription. Because raw audio-visual material is difficult to search and browse, the usefulness of spoken document collections is limited. This system enables users to search spoken documents as easily as they search text. In this research, we focus on the out-of-vocabulary (OOV) problem. If out-of-vocabulary words are present in the queries and the corpus, a word-based system will not be sufficient. To address this problem, we consider a phonetic-based approach.


  • System architecture


    • Speech Recognition for Spoken Documents
      • Words not in the dictionary cannot be recognized and may be replaced by in-vocabulary words
      • Long audio files : sentence boundary detection
      • Vocabulary selection : reduce out-of-vocabulary rate
      • Adaptation : acoustic model (MAP, MLLR) and language model (general model + topic-based model)
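The effect of vocabulary selection on the out-of-vocabulary rate can be illustrated with a minimal sketch. It assumes the simplest selection rule, keeping the most frequent words of a training text, which this page does not state as the method actually used. The function names and the toy sentences are hypothetical.

```python
from collections import Counter


def select_vocabulary(training_tokens, size):
    """Keep the `size` most frequent words of the training text (assumed rule)."""
    counts = Counter(training_tokens)
    return {word for word, _ in counts.most_common(size)}


def oov_rate(vocabulary, test_tokens):
    """Fraction of test tokens that are not in the vocabulary."""
    if not test_tokens:
        return 0.0
    missing = sum(1 for token in test_tokens if token not in vocabulary)
    return missing / len(test_tokens)


# Toy data, invented for illustration only.
training = "the news said the storm hit the coast near pohang".split()
test = "the storm hit pohang harbor".split()

small = select_vocabulary(training, 1)   # only "the"
large = select_vocabulary(training, 9)   # every distinct training word

print(oov_rate(small, test))  # 4 of 5 test tokens are missing
print(oov_rate(large, test))  # only "harbor" is missing
```

A larger vocabulary lowers the OOV rate here, but a word that never occurs in the training text ("harbor") stays out of vocabulary whatever the size, which is why a separate OOV strategy is still needed.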



    • Indexing & Search
      • Ad-hoc model (early Google model)
      • Soft indexing (Chelba et al., 2005) : position-specific posterior lattices (PSPL)
      • SOFT-HIT : (document id, position, posterior probability, context)
      • Document relevance using soft hits : full document score is a weighted linear combination of N-gram scores
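The soft-hit record and the weighted combination of N-gram scores listed above can be sketched as follows. This is an illustration under stated assumptions, not the system's implementation: each N-gram score is taken to be the log of one plus the expected number of matches, the index layout (`build_index`) is hypothetical, and the weights and posterior values are made-up placeholders.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SoftHit:
    doc_id: str
    position: int
    posterior: float
    context: str = ""  # kept to mirror the tuple above; unused in this sketch


def build_index(hits_by_word):
    """Hypothetical layout: word -> doc_id -> position -> posterior."""
    index = defaultdict(lambda: defaultdict(dict))
    for word, hits in hits_by_word.items():
        for hit in hits:
            index[word][hit.doc_id][hit.position] = hit.posterior
    return index


def ngram_score(index, doc_id, query, n):
    """Sum over query N-grams of log(1 + expected number of matches)."""
    score = 0.0
    for i in range(len(query) - n + 1):
        first = index.get(query[i], {}).get(doc_id, {})
        expected = 0.0
        for pos, prob in first.items():
            # Multiply in the posteriors of the following query words
            # at the following positions; a missing hit contributes 0.
            for offset in range(1, n):
                later = index.get(query[i + offset], {}).get(doc_id, {})
                prob *= later.get(pos + offset, 0.0)
            expected += prob
        score += math.log1p(expected)
    return score


def document_score(index, doc_id, query, weights):
    """Weighted linear combination of the 1-gram, 2-gram, ... scores."""
    return sum(w * ngram_score(index, doc_id, query, n)
               for n, w in enumerate(weights, start=1))


# Made-up soft hits for two documents.
index = build_index({
    "speech": [SoftHit("d1", 0, 0.9), SoftHit("d2", 3, 0.4)],
    "retrieval": [SoftHit("d1", 1, 0.8), SoftHit("d2", 7, 0.5)],
})
query = ["speech", "retrieval"]
weights = [1.0, 2.0]  # placeholder weights for 1-grams and 2-grams

for doc in ("d1", "d2"):
    print(doc, document_score(index, doc, query, weights))
```

In this toy index, document d1 contains the two query words at adjacent positions and so receives a 2-gram contribution, while d2 contains them far apart and is scored on 1-grams only.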



    • Out-of-vocabulary problem


      • OOV query : map to phonetic representation
      • Phonetic transcription : from word lattice using G2P
      • Search : Using dynamic matching
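One way to picture the search step above is approximate matching of the query's phone sequence against a document's phonetic transcription. The sketch below assumes that "dynamic matching" can be modeled as edit distance computed by dynamic programming, with the match allowed to start and end anywhere in the document; the page does not specify the actual matching procedure or its costs. The function name and the phone strings are hypothetical.

```python
def best_match_cost(query_phones, doc_phones):
    """Minimum edit distance between the query and any span of the document."""
    # Row 0 is all zeros: a match may start at any document position for free.
    previous = [0] * (len(doc_phones) + 1)
    for i, q in enumerate(query_phones, start=1):
        current = [i]  # i query phones left unmatched before the document starts
        for j, d in enumerate(doc_phones, start=1):
            substitution = previous[j - 1] + (0 if q == d else 1)
            deletion = previous[j] + 1      # query phone missing in the document
            insertion = current[j - 1] + 1  # extra phone in the document
            current.append(min(substitution, deletion, insertion))
        previous = current
    # The match may also end anywhere, so take the best cell of the last row.
    return min(previous)


# Made-up phone sequences: the document contains the query with one phone changed.
query = "p ow hh ae ng".split()
document = "dh ax s ih t iy ah v p ow hh aa ng k ao r iy ax".split()

print(best_match_cost(query, document))
```

A document would be returned when this cost falls below a chosen threshold, so an OOV query can still be found even when the recognizer's phonetic transcription differs from the query in a few phones.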



    • Contact





      San 31, Hyoja-Dong, Pohang, 790-784, Korea
      Phone: +82 54 279 5581 | Fax: +82 54 279 2299