The linguistic engine EXTRAKT is a complete set of functions for natural language processing (NLP). It serves as the basis for monolingual and multilingual (cross-lingual) applications in various domains, such as indexing, lemmatisation, language identification and text classification.
In most cases, EXTRAKT is used as an add-on for handling search requests in Internet search engines, library systems or shop systems.
Development of EXTRAKT began in 1990. It was first used as the German component of the multilingual full-text retrieval system EMIR (European Multilingual Information Retrieval), which was probably the first multilingual full-text retrieval system worldwide.
EXTRAKT's main components are its dictionaries, which hold most of the linguistic information. Access to this information is very fast, so EXTRAKT works at high speed and can handle even very large numbers of documents.
Within the above-mentioned EMIR project, the decision was taken to use only full-form dictionaries. This was a rather uncommon choice at the time, but it proved to be the right one (it goes back to Prof. Christian Fluhr of the French CEA): a single dictionary look-up returns the desired information from a given dictionary.
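The idea of a full-form dictionary can be illustrated with a minimal Python sketch. This is not EXTRAKT's actual data format or API: the names `Entry`, `FULL_FORMS`, `lookup` and `lemmatise`, as well as the handful of German word forms, are hypothetical and chosen only for illustration. The point is that every inflected form is stored as its own key, so one look-up yields the lemma and part of speech without any rule-based morphological analysis at query time.

```python
from typing import Dict, List, NamedTuple


class Entry(NamedTuple):
    """Linguistic information attached to one surface form (illustrative only)."""
    lemma: str
    pos: str


# Hypothetical toy full-form dictionary: each inflected form is a separate key.
FULL_FORMS: Dict[str, List[Entry]] = {
    "haus": [Entry("Haus", "NOUN")],
    "hauses": [Entry("Haus", "NOUN")],
    "häuser": [Entry("Haus", "NOUN")],
    "häusern": [Entry("Haus", "NOUN")],
    "ging": [Entry("gehen", "VERB")],
    "gegangen": [Entry("gehen", "VERB")],
}


def lookup(word: str) -> List[Entry]:
    """Return all entries for a word form with a single dictionary access."""
    return FULL_FORMS.get(word.lower(), [])


def lemmatise(text: str) -> List[str]:
    """Replace each whitespace-separated token by its first lemma, if known."""
    lemmas = []
    for token in text.split():
        entries = lookup(token)
        # Unknown tokens are passed through unchanged.
        lemmas.append(entries[0].lemma if entries else token)
    return lemmas
```

In this sketch, `lookup("Häusern")` finds the entry for the lemma "Haus" directly. The trade-off of the approach is a larger dictionary in exchange for constant-time access per word form.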