iDocument: Using Ontologies for Extracting and Annotating Information from Unstructured Text

Benjamin Adrian, Jörn Hees, Ludger van Elst, Andreas Dengel

KI 2009: Advances in Artificial Intelligence. German Conference on Artificial Intelligence (KI-2009), September 15-18, Paderborn, Germany, Vol: 5803, Pages: 249-256, Springer-Verlag, Heidelberg, 2009
Due to the huge amount of text data in the WWW, annotating unstructured text with semantic markup is a crucial topic in Semantic Web research. This work formally analyzes the incorporation of domain ontologies into information extraction tasks in iDocument. Ontology-based information extraction exploits domain ontologies with formalized and structured domain knowledge for extracting domain-relevant information from un-annotated and unstructured text. iDocument provides a pipeline architecture, an extraction template interface and the ability of exchanging domain ontologies for performing information extraction tasks. This work outlines iDocument's ontology-based architecture, the use of SPARQL queries as extraction templates and an evaluation of iDocument in an automatic document annotation scenario.

BibTeX:

@inproceedings{adrian2009idocument,
       abstract = {Due to the huge amount of text data in the WWW, annotating unstructured text with semantic markup is a crucial topic in Semantic Web research. This work formally analyzes the incorporation of domain ontologies into information extraction tasks in iDocument. Ontology-based information extraction exploits domain ontologies with formalized and structured domain knowledge for extracting domain-relevant information from un-annotated and unstructured text. iDocument provides a pipeline architecture, an extraction template interface and the ability of exchanging domain ontologies for performing information extraction tasks. This work outlines iDocument's ontology-based architecture, the use of SPARQL queries as extraction templates and an evaluation of iDocument in an automatic document annotation scenario.},
       month = {9}, 
       year = {2009}, 
       title = {{iDocument}: Using Ontologies for Extracting and Annotating Information from Unstructured Text},
       booktitle = {KI 2009: Advances in Artificial Intelligence},
       volume = {5803}, 
       pages = {249--256},
       publisher = {Springer-Verlag},
       address = {Heidelberg},
       author = {Benjamin Adrian and Jörn Hees and Ludger van Elst and Andreas Dengel},
       keywords = {Information Extraction, Ontology, Semantic Web, Semantic Annotation, Natural Language Processing},
}