Theme Extraction
Creating key relationships between documents
At the core of the Active Navigation solution is a rich underlying model of key relationships between the pieces of unstructured information within the system.
This model is created by a process of extracting the themes or concepts that best represent what the specific pieces of information are discussing. This is achieved using Active Navigation's Link Editor or Offline Themer products. The extracted themes then underpin all the Active Navigation features including search, Dynamic Link Injection and Summarization.
To avoid the main problem of full-text search engines, namely too many irrelevant results, theme extraction does not extract every word in the document. Instead, Active Navigation uses Natural Language Processing techniques rather than simple pattern matching, word-level string comparison or n-gram counting. The extraction process draws on linguistic knowledge such as morphology, syntax and semantics, and performs a contextual analysis so that each theme carries meaning rather than being just a word in a document. This ensures that concepts such as:
150 year old wine
antibiotic resistance marker genes
motte and bailey castle
are each extracted as a single theme or concept. It also ensures that
Labrador Tourism
has more context than
Labrador
which could refer to a breed of dog rather than a region of Canada.
Although great emphasis is placed on extracting phrases, the process is not reliant on a pre-defined list of names and places or thesauri to extract its themes.
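The difference between word-level matching and phrase-level themes can be illustrated with a minimal sketch. This is not Active Navigation's extraction process, which is not described in detail here; it is a deliberately crude stand-in that splits text into candidate phrases at stopwords and punctuation. The names candidate_phrases, word_tokens and STOPWORDS, and the sample sentences, are assumptions made for illustration only.

```python
import re

# Tiny illustrative stopword list (an assumption, not a product resource).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "in", "by", "to", "for", "on", "with"}


def word_tokens(text):
    """Word-level view: every lowercase word, as a full-text index would see it."""
    return re.findall(r"[a-z0-9]+", text.lower())


def candidate_phrases(text):
    """Phrase-level view: runs of consecutive non-stopwords, kept together."""
    phrases = []
    # Punctuation always ends a phrase, so process each fragment separately.
    for fragment in re.split(r"[.,;:!?()]", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                # A stopword closes the phrase collected so far.
                if current:
                    phrases.append(" ".join(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(" ".join(current))
    return phrases


if __name__ == "__main__":
    region = "Labrador Tourism is promoted by the province."
    dog = "The Labrador is a popular breed of dog."

    # Word-level matching cannot tell the two documents apart.
    print("labrador" in word_tokens(region), "labrador" in word_tokens(dog))

    # Phrase-level candidates keep "labrador tourism" as one unit.
    print(candidate_phrases(region))
    print(candidate_phrases(dog))
```

In this sketch a word-level search for "labrador" matches both sentences, while only the first yields the phrase "labrador tourism". The split has no knowledge of morphology, syntax or semantics, which is the gap that the linguistic analysis described above is intended to fill.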
Active Navigation believes in allowing editorial or expert involvement in determining what is provided to the end users of the system. If the information provider wishes to intervene in the process of theme extraction then this is easily achieved.
The Active Navigation system is capable of operating in multiple languages. Please contact us for language availability. Active Navigation has also provided unique theme extraction processes for specific business problems.