Spoken Language Understanding
A central problem in natural language processing is understanding the intention behind a user's utterance. We have been developing Spoken Language Understanding (SLU) techniques for speech dialogue systems that provide sufficient robustness and flexibility. We present a new language model adaptation method for recovering incomplete inputs from speech. Several machine learning methods have been applied in our supervised and unsupervised SLU systems to extract semantic concepts from users’ utterances. We have also developed several supporting techniques, including linguistic feature processing, information extraction, and relational data learning, to build these SLU systems.
- Natural Language Understanding
- Main Goal: The main goal of the Natural Language Understanding unit is to extract information from the user's utterance and to construct a semantic frame.
- Architecture
- Feature Extraction
- Syntactic Structure based Features: Parse Tree Kernel, Parse Path, Predicate Word
- Semantic Word based Features: Unigram Feature, Bigram Feature, Word-Pair Feature
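The word-based features listed above can be illustrated with a minimal sketch. The function name `extract_word_features`, the feature string format, and the reading of "word-pair" as any two words that co-occur in order within the utterance are all assumptions made for illustration; the source does not define them.

```python
from itertools import combinations


def extract_word_features(utterance):
    """Return unigram, bigram, and word-pair features as a set of strings.

    Hypothetical helper: feature naming and the word-pair definition
    are assumptions, not the system's actual feature templates.
    """
    tokens = utterance.lower().split()
    features = set()
    # Unigram features: one per token.
    for tok in tokens:
        features.add(f"uni={tok}")
    # Bigram features: one per pair of adjacent tokens.
    for left, right in zip(tokens, tokens[1:]):
        features.add(f"bi={left}_{right}")
    # Word-pair features (assumed): any two tokens in order, adjacent or not.
    for left, right in combinations(tokens, 2):
        features.add(f"pair={left}|{right}")
    return features


features = extract_word_features("show me flights to Boston")
```

Syntactic features such as parse paths would need a parser and are left out of this sketch.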
- Feature based Prediction
- The system predicts semantic frames using a pre-built model. Models can be constructed with a probabilistic network (CRF or Maximum Entropy model), a Structured Support Vector Machine (SSVM), or an ensemble of these.
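To make feature-based prediction concrete, here is a minimal sketch of inference with a Maximum Entropy (log-linear) model. Only the prediction step is shown; training is omitted. The function name `maxent_predict`, the dialog act labels, and the weight values are hypothetical and chosen by hand for illustration.

```python
import math


def maxent_predict(features, weights, labels):
    """Return P(label | features) under a log-linear model.

    weights maps (label, feature) pairs to real-valued weights;
    pairs that are absent contribute zero.
    """
    scores = {
        label: sum(weights.get((label, f), 0.0) for f in features)
        for label in labels
    }
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores.values())
    exp_scores = {label: math.exp(s - max_score) for label, s in scores.items()}
    z = sum(exp_scores.values())
    return {label: v / z for label, v in exp_scores.items()}


# Hand-set, hypothetical weights (a trained model would learn these).
weights = {
    ("request", "uni=show"): 1.5,
    ("request", "uni=flights"): 1.0,
    ("greeting", "uni=hello"): 2.0,
}
probs = maxent_predict({"uni=show", "uni=flights"}, weights, ["request", "greeting"])
best = max(probs, key=probs.get)
```

A CRF or SSVM would score whole label sequences rather than a single label, which matters for named entity tagging.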
- Semantic Frame
- A semantic frame consists of user intention information: the Dialog Act and the Named Entities.
- Dialog Act: The action that the user expects the system to perform.
- Named Entities: The values that the Dialog Management system requires to handle the user's inquiries.
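One possible way to represent such a frame in code is sketched below. The class name `SemanticFrame`, its field names, the `missing_slots` helper, and the example slot names are assumptions for illustration, not the system's actual data structure.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticFrame:
    """Hypothetical container for the output of language understanding."""

    dialog_act: str  # the action the user expects from the system
    named_entities: dict = field(default_factory=dict)  # slot name -> value

    def missing_slots(self, required):
        """Return the required slot names that were not filled."""
        return [slot for slot in required if slot not in self.named_entities]


frame = SemanticFrame(
    dialog_act="request_flight",
    named_entities={"to_city": "Boston"},
)
# A dialog manager could use this to decide what to ask the user next.
missing = frame.missing_slots(["from_city", "to_city"])
```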
- Robust Speech Understanding
- Error-Corrective Language Model (ECLM) Adaptation is an adaptation
framework with an error-handling method that improves the accuracy of speech
recognition and the performance of spoken language applications. The proposed
approach exploits domain-specific language variations and recognition
environment characteristics to provide robustness and adaptability for a
spoken language system.
- Adaptation framework
- Channel Modeling with Fertility: Capturing channel characteristics with
a word-to-word substitution model and a word fertility model; MLE for the
substitution model and several discounting methods (absolute, Good-Turing,
Kneser-Ney)
- Exploiting Linguistic Knowledge in Language Model Adaptation using
whole-sentence exponential models (including MCMC sampling, Gaussian prior
smoothing)
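The substitution model in the channel modeling step can be sketched as follows: a maximum likelihood estimate of P(recognized word | intended word) from aligned word pairs, with optional absolute discounting. The function name `train_substitution_model`, the toy alignment data, and the way reserved mass is returned are assumptions; the fertility model, Good-Turing, and Kneser-Ney are not covered here.

```python
from collections import Counter, defaultdict


def train_substitution_model(aligned_pairs, discount=0.0):
    """Estimate P(recognized | intended) from (intended, recognized) pairs.

    With discount=0.0 this is plain MLE. With 0 < discount < 1, absolute
    discounting subtracts `discount` from every observed count and reserves
    the freed probability mass for unseen substitutions.
    Returns {intended: (probabilities, reserved_mass)}.
    """
    pair_counts = defaultdict(Counter)
    for intended, recognized in aligned_pairs:
        pair_counts[intended][recognized] += 1

    model = {}
    for intended, recognized_counts in pair_counts.items():
        total = sum(recognized_counts.values())
        probs = {
            word: (count - discount) / total
            for word, count in recognized_counts.items()
        }
        reserved = discount * len(recognized_counts) / total
        model[intended] = (probs, reserved)
    return model


# Toy alignment between reference words and recognizer output (made up).
pairs = [
    ("boston", "boston"), ("boston", "boston"), ("boston", "austin"),
    ("to", "to"), ("to", "two"),
]
mle = train_substitution_model(pairs)
smoothed = train_substitution_model(pairs, discount=0.5)
```

How the reserved mass is redistributed over unseen substitutions is a separate design choice that this sketch leaves open.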
- ECLM adaptation is a MAP (maximum a posteriori) process:
- The Semantic Frame Extractor analyzes the output of the speech
recognition component and assigns a meaning representation that can be
used by the dialogue manager. In many current spoken dialogue systems, the
meaning of the utterance is derived directly from the recognized string
using a “semantic frame”.
- Procedure
- Input: Utterances
- Output: Meaningful concepts
- The linguistic feature generator selects useful syntactic, semantic, and
cognitive features based on POS tagging, chunking, and parsing.
- Machine learning techniques train models that represent linguistic
characteristics, based on Maximum Entropy models, Conditional Random Fields,
and Neural Networks.
- Main goal
- Resolving ambiguity in natural language (lexical, sense, and structural
ambiguity)
- Robust handling of ill-formed spoken input
- Unsupervised Spoken Language Understanding
- Dialog Corpus
- Essential resource for data-driven spoken dialog systems
- Should be labeled with a semantic annotation
- Human annotation process
- Requires tedious human effort
- Goal
- We address the SLU problem in an unsupervised manner
- Overall procedure