The authors present a general architecture for a domain- and language-specific application for the concise storage and presentation of information retrieved from a wide spectrum of web-based information sources. The proposed architecture was shaped by the particular challenges of a knowledge-intensive domain: gathering relevant documents from the internet, mining the knowledge content of unstructured textual information, and meeting demands for context-driven, multi-faceted, up-to-date querying and presentation of the required information. It was also shaped by the intricacies of the Hungarian language, which call for special solutions to a number of linguistic problems.
The proposed system is being developed in the framework of the international EUREKA project Information and Knowledge Fusion (IKF), whose overall aim is the design and implementation of new intelligent knowledge warehousing environments that would allow advanced knowledge management in various application domains [1]. The Hungarian IKF project (IKF-H) concentrates on developing a financial advisory application that transforms data from heterogeneous and unstructured Hungarian-language information sources into an integrated internal knowledge repository. This repository would serve as a decision support system for Hungarian banks and revenue services.
In order to surpass the performance of a typical information retrieval system, the process of human information retrieval is being studied and, at least partially, followed. Even the shallowest analysis of human performance shows that its advantage lies mainly in (1) the use of linguistic competence and (2) the benefits of background knowledge. Since linguistic techniques are rapidly being added to implemented information retrieval systems, the construction, mapping and incorporation of background knowledge becomes the biggest challenge. A suitable and efficient way to represent part of a human's background knowledge is the use of ontologies [2]. One of the main goals of the IKF information retrieval system is to create a well-defined ontology that can be integrated with several document analysis techniques (indexing and searching, linguistic parsing, etc.) to increase the performance of the whole information retrieval and extraction process.
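As an illustration only, the following minimal sketch shows one common way an ontology can be combined with indexing and searching: expanding a query with related concepts before looking it up in an inverted index. It is not the IKF implementation. The toy ontology, the sample documents and the function names (expand_query, build_index, search) are all hypothetical and chosen for this example.

```python
from collections import defaultdict

# Hypothetical toy ontology: a concept mapped to directly related terms.
# A real ontology would hold typed relations; this is an assumption made
# only to keep the example short.
ONTOLOGY = {
    "loan": {"mortgage", "credit"},
}

# Hypothetical sample documents, keyed by document id.
DOCUMENTS = {
    1: "The bank raised mortgage rates",
    2: "New credit rules for companies",
    3: "Weather report for Budapest",
}


def expand_query(terms, ontology):
    """Return the query terms plus their directly related ontology terms."""
    expanded = set()
    for term in terms:
        expanded.add(term)
        expanded |= ontology.get(term, set())
    return expanded


def build_index(documents):
    """Build a simple inverted index: lower-cased token -> set of doc ids."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(query_terms, index, ontology):
    """Return ids of documents matching any ontology-expanded query term."""
    hits = set()
    for term in expand_query(query_terms, ontology):
        hits |= index.get(term, set())
    return hits


if __name__ == "__main__":
    index = build_index(DOCUMENTS)
    print(search(["loan"], index, ONTOLOGY))  # with background knowledge
    print(search(["loan"], index, {}))        # plain keyword match
```

In this sketch, a query for "loan" finds no document by plain keyword matching, while the expanded query also matches the documents that mention "mortgage" or "credit". This is the kind of gain from background knowledge that the paragraph above describes.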
Another way to utilise background knowledge in the retrieval system is to model some aspects of how a human searches for and extracts information. The proposed system contains an autonomous document retrieval and analysis subsystem that applies efficient document searching and information extraction techniques using a source-model-based approach [3].
In the framework of the Hungarian IKF project, a prototype system is under development in order to implement our ideas in a real-world application.
[1] EUREKA Project, “IKF - Information and Knowledge Fusion,” March 2000.
[2] N. Guarino, “Formal Ontology in Information Systems,” in N. Guarino (ed.), Formal Ontology in Information Systems: Proceedings of FOIS'98, Trento, Italy, 6-8 June 1998. IOS Press, Amsterdam, pp. 3-15.
[3] P. Varga, T. Mészáros, Cs. Dezsényi, T. P. Dobrowiecki, “An Ontology-based Information Retrieval System,” The 16th International Conference on Industrial & Engineering Applications of Artificial Intelligence and Expert Systems, Loughborough, UK, June 23-26, 2003.