Ex information extraction system

General description

Ex is an information extraction (IE) system based on extraction ontologies, developed by the Knowledge Engineering Group (KEG) at UEP since 2006. Extraction ontologies are used to extract both standalone named entities (standalone attributes) and instances (groups of attributes which "belong together"). The advantage of this technology is that it can combine multiple sources of extraction knowledge, which should lower the amount of training data required. Ex can be used for extraction from heavily structured (e.g. tabular) documents, semi-structured documents and free-text documents.

For a domain of interest, the user writes an extraction ontology. An extraction ontology is structurally similar to a conventional domain ontology. However, it reflects the way information is presented on the web rather than the inherent state of affairs, and it is extended with extraction knowledge that can be used to identify the described objects in text. An extraction ontology can be viewed as a set of attribute definitions, class definitions and axiom definitions.
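To make the "attributes, classes and axioms" view more concrete, the following Java sketch models a tiny extraction ontology in memory. It is only an illustration: the names ExtractionOntologySketch, AttributeDef, AxiomDef, ClassDef and contactClass are hypothetical and are not part of Ex, and a single regular expression per attribute stands in for the much richer extraction knowledge an actual extraction ontology can carry.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical data model for illustration; none of these names come from Ex.
final class ExtractionOntologySketch {

    // An attribute definition: a name plus extraction knowledge
    // (assumed here to be one regular expression over the value).
    record AttributeDef(String name, Pattern valuePattern) {
        boolean matches(String text) {
            return valuePattern.matcher(text).matches();
        }
    }

    // An axiom definition: a named constraint over a candidate instance.
    record AxiomDef(String name, Predicate<List<String>> constraint) {}

    // A class definition: attributes that "belong together", plus axioms.
    record ClassDef(String name, List<AttributeDef> attributes, List<AxiomDef> axioms) {
        // Checks the values positionally against the attributes,
        // then checks every axiom on the whole candidate instance.
        boolean accepts(List<String> values) {
            if (values.size() != attributes.size()) {
                return false;
            }
            for (int i = 0; i < values.size(); i++) {
                if (!attributes.get(i).matches(values.get(i))) {
                    return false;
                }
            }
            for (AxiomDef axiom : axioms) {
                if (!axiom.constraint().test(values)) {
                    return false;
                }
            }
            return true;
        }
    }

    // Builds an example "Contact" class with two attributes and one axiom.
    static ClassDef contactClass() {
        AttributeDef name = new AttributeDef(
                "name", Pattern.compile("[A-Z][a-z]+ [A-Z][a-z]+"));
        AttributeDef email = new AttributeDef(
                "email", Pattern.compile("[\\w.]+@[\\w.]+\\.[a-z]{2,}"));
        AxiomDef distinct = new AxiomDef(
                "nameDiffersFromEmail", v -> !v.get(0).equals(v.get(1)));
        return new ClassDef("Contact", List.of(name, email), List.of(distinct));
    }
}
```

With these assumptions, ExtractionOntologySketch.contactClass().accepts(List.of("Jane Doe", "jane.doe@example.org")) evaluates to true, while a candidate whose second value lacks an "@" is rejected. The sketch requires a Java version with records (16 or later), which is newer than the minimum Ex itself needs.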

Development of Ex is ongoing; on this website we publish development snapshots when we feel the code is stable enough to be useful. The code is written in Java.
Ex is distributed under the LGPL license.

More information on Ex is downloadable here (newer material listed first):

Distributions for download

(Ex requires Java 1.5 or higher)

Contact

Martin Labsky, labsky at seznam dot cz

Marek Nekvasil, marek.nekvasil at gmail dot com

Vojtech Svatek, svatek at vse dot cz

This research is supported by the EC projects MedIEQ and K-Space.

Last modified: April 19, 2012.