Semi-Supervised Information Extraction



    Abstract

    We are developing an automatic content extraction system for information agents. To reduce the cost of building the resources that automatic content extraction requires, with minimal loss in performance, we present a semi-supervised information extraction approach. We aim to improve not only the precision of the results but also the coverage of the method.


  • Automatic Content Extraction
    • Extracting structured content-bases from natural language documents


  • A Semi-supervised Information Extraction Approach

    • Input : Natural Language Documents, Seed Instances
    • Output : Structured Content-bases, such as RDBs or Semantic Web Ontologies
    • Context Pattern Induction is based on simple surface templates.
    • Candidate Instance Extraction is based on a sequence alignment model.
    • Content Assessment is based on a redundancy-based assessment scheme.
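
The three components listed above can be sketched as a small pipeline. This is only an illustrative sketch, not the system's actual implementation: the page does not specify the template format, the alignment model, or the assessment formula, so everything below is an assumption. In particular, the function names, the `<X>`/`<Y>` slot markers, the use of Python's `difflib.SequenceMatcher` as a stand-in for the sequence alignment model, the `min_ratio` and `min_support` thresholds, and the sample sentences are all hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher

def induce_patterns(sentences, seeds):
    """Surface templates (assumed form): replace seed arguments with slot markers."""
    patterns = set()
    for sent in sentences:
        for x, y in seeds:
            if x in sent and y in sent:
                patterns.add(sent.replace(x, "<X>").replace(y, "<Y>"))
    return patterns

def extract_candidate(pattern, sentence, min_ratio=0.5):
    """Align pattern tokens with sentence tokens; read slot fillers off the alignment.

    difflib is used here only as a simple stand-in for a sequence alignment model.
    """
    p, s = pattern.split(), sentence.split()
    matcher = SequenceMatcher(None, p, s, autojunk=False)
    if matcher.ratio() < min_ratio:
        return None  # sentence is too dissimilar to the template
    slots = {}
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # A slot token aligned against non-matching sentence tokens is a filler.
        if tag == "replace" and i2 - i1 == 1 and p[i1] in ("<X>", "<Y>"):
            slots[p[i1]] = " ".join(s[j1:j2])
    if "<X>" in slots and "<Y>" in slots:
        return (slots["<X>"], slots["<Y>"])
    return None

def assess(candidates, min_support=2):
    """Redundancy-based assessment (assumed form): keep candidates seen often enough."""
    counts = Counter(candidates)
    return {cand: n for cand, n in counts.items() if n >= min_support}

def run(sentences, seeds, min_support=2):
    """Pattern induction, then candidate extraction, then assessment."""
    patterns = induce_patterns(sentences, seeds)
    candidates = []
    for sent in sentences:
        for pat in patterns:
            cand = extract_candidate(pat, sent)
            if cand is not None:
                candidates.append(cand)
    return assess(candidates, min_support)

# Hypothetical EPG-style input: one seed instance and a few documents.
SEEDS = [("Friends", "NBC")]
SENTENCES = [
    "Friends airs on NBC tonight",
    "Friends is broadcast by NBC",
    "Lost airs on ABC tonight",
    "Lost is broadcast by ABC",
    "Heroes airs on NBC tonight",
]

accepted = run(SENTENCES, SEEDS)
print(accepted)
```

In this toy input, the pair (Lost, ABC) is supported by two different templates and is accepted, while (Heroes, NBC) is seen only once and is filtered out at the default `min_support` of 2. Raising the threshold trades coverage for precision, which is the balance the abstract describes.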


  • Demo Video
    • A demo video of the semi-supervised information extraction system, applied to extracting an EPG content-base, is available for download.


  • Publications
    • Seokhwan Kim, Minwoo Jeong, Gary Geunbae Lee, Kwangil Ko, Zino Lee. An alignment-based approach to semi-supervised relation extraction including multiple arguments. Proceedings of the fourth Asian Information Retrieval Symposium (AIRS 2008), Harbin, Jan 2008 [PDF]
    • Seokhwan Kim, Minwoo Jeong, Gary Geunbae Lee. A semi-supervised method for efficient construction of statistical spoken language understanding resources. Proceedings of the Interspeech 2007-Eurospeech, Antwerp, Aug 2007 [PDF]









 

San 31, Hyoja-Dong, Pohang, 790-784, Korea
Phone: +82 54 279 5581 | Fax: +82 54 279 2299