SILK: Application Integration on a Logical Basis


Szeredi Péter <szeredi@iqsoft.hu>

IQSOFT Rt.

Benkő Tamás <benko@iqsoft.hu>

IQSOFT Rt.

Krauth Péter <pkrauth@kfki.com>

KFKI Számítástechnikai Rt.



Introduction

SILK (System Integration via Logic and Knowledge, IST-1999-11135) is an EU IST Framework 5 project, with the goal of developing methods and tools for application integration, more specifically for the integration of heterogeneous information systems. Partners in the project are The National Institute for Research and Development in Informatics (Romania), Industrial Development and Education Centre (Greece), IQSOFT Intelligent Software Co. Ltd. (Hungary, co-ordinator), and Matra Systèmes & Information (France).

The project started in January 2000 and will be completed in September 2002.

Objectives

The objective of the SILK project is to develop a knowledge management tool-set for software system integration tasks. The SILK tool-set contains tools supporting both the linking of heterogeneous information sources (mediation) and their transformation into a more homogeneous form (integration).

The project aims at proving the applicability of logic programming and intelligent agent technology as the implementation paradigms for the knowledge management tool-set. The project will also prove the practical relevance of the approach by developing several prototype applications, in both SME and large enterprise environments.

Further specific objectives include extensibility, use of emerging standards, support for technically untrained users, and integration with the business processes of the enterprise.

Development

The development of the SILK tool-set is driven by the requirements of the end-users on the application side and by the state-of-the-art techniques from the field of knowledge management on the technology side.

Following the requirements analysis phase, the agent oriented architecture and interfaces of the SILK tool-set are designed. Work then proceeds by developing the SILK meta- and query-languages, suitable for describing the databases and applications, both on abstract and technical levels.

The tool-set proper consists of two parts. The Mediator tools support the coupling of heterogeneous information sources so that complex queries involving various databases/applications can be posed and answered. The Integrator tools analyse the information assets of the enterprise and help to transform them into a more homogeneous form.

The implementation of the tools is based on logic programming, agents and constraint techniques, ensuring high level knowledge representation, declarative interface descriptions and learning capabilities. This positions our solution above competing approaches by supporting semantic integration through generic modelling of the information assets.

Special emphasis is put on extensibility of the tool-set, by supporting standard knowledge sharing formats, access to external knowledge, addition of new information sources and third party tools.

Overview

The SILK tool set

The SILK tool-set provides a uniform interface for accessing diverse information sources, such as relational and object-oriented data bases, semi-structured data stored e.g. as XML documents, as well as data available through procedural queries.

The SILK tool-set supports the integration of several independently developed information sources. This integration process involves building a united, global model of the application areas in question, and mapping the entities of this model to those of the individual data source models.

The SILK tool-set makes it possible for users to formulate queries in their own terms, by using the concepts described in the global model and relying on properties of this model. In this way composite queries spanning several information sources can be posed, which would not be possible otherwise.

When independent information sources are integrated, redundancy and synchronisation issues arise. The SILK tool-set supports the identification of redundant data sources and of inconsistent or incompatible data, and helps in the process of synchronising these data sources. A better solution to such problems is to eliminate the redundancy by restructuring the integrated system. The SILK system supports this restructuring activity by providing help in the design of a better integrated information system structure.
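The kind of inconsistency check described above can be illustrated with a minimal sketch. It assumes two sources that redundantly store the same attribute under a shared key; the function `find_inconsistencies` and the sample `hr` and `directory` data are hypothetical and are not part of the SILK tool-set.

```python
def find_inconsistencies(source_a, source_b, key, attribute):
    """Return (key, value_a, value_b) for records stored in both sources
    whose redundant `attribute` values disagree."""
    index_b = {row[key]: row for row in source_b}
    conflicts = []
    for row in source_a:
        other = index_b.get(row[key])
        if other is not None and row[attribute] != other[attribute]:
            conflicts.append((row[key], row[attribute], other[attribute]))
    return conflicts


# Hypothetical sample data: the phone number is stored redundantly.
hr = [{"emp": 7, "phone": "555-0100"}, {"emp": 8, "phone": "555-0101"}]
directory = [{"emp": 7, "phone": "555-0100"}, {"emp": 8, "phone": "555-0199"}]

conflicts = find_inconsistencies(hr, directory, key="emp", attribute="phone")
```

A report like `conflicts` would only be the starting point for synchronisation: deciding which source owns the data remains a task for the people carrying out the integration.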

The structure of the SILK system

The main components of the SILK system and its user roles are shown in Figure 1.

Figure 1: Context diagram of a SILK integrated application system

The main components of SILK are the Mediator, the Integrator and the Wrapper. They communicate either by directly invoking each other, or through the SILK Knowledge Base (KB).

The SILK KB contains models, both on conceptual and information source level, and mappings between the models. The mappings, together with the constraints associated with them, describe the links and relationships between the corresponding entities of the models.

The Mediator provides the means for the transformation of queries formulated in terms of user-specific or global models into queries of the underlying individual information sources. This component also caters for query planning and optimisation.
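To make the Mediator's role more concrete, the following minimal sketch rewrites a query phrased in global-model terms into per-source look-ups and merges the partial answers. It is an illustration under stated assumptions only: the names `SOURCES`, `GLOBAL_TO_SOURCE` and `mediate`, and the `crm` and `billing` sources, are hypothetical, and the SILK Mediator itself is implemented in Prolog, not Python.

```python
# Hypothetical in-memory "information sources" that use different
# local attribute names for the same global concepts.
SOURCES = {
    "crm": [
        {"cust_id": 1, "cust_name": "Acme"},
        {"cust_id": 2, "cust_name": "Globex"},
    ],
    "billing": [
        {"client": 1, "amount_due": 250.0},
        {"client": 2, "amount_due": 0.0},
    ],
}

# Assumed mapping: global attribute -> list of (source, local attribute).
GLOBAL_TO_SOURCE = {
    "customer.id": [("crm", "cust_id"), ("billing", "client")],
    "customer.name": [("crm", "cust_name")],
    "customer.balance": [("billing", "amount_due")],
}


def mediate(wanted, key="customer.id"):
    """Answer a query for the global attributes in `wanted`,
    joining the per-source answers on the global `key`."""
    result = {}
    for source, key_attr in GLOBAL_TO_SOURCE[key]:
        # Sub-query for this source: the local attributes it can supply.
        local = {g: a for g in wanted
                 for s, a in GLOBAL_TO_SOURCE[g] if s == source}
        for row in SOURCES[source]:
            merged = result.setdefault(row[key_attr], {key: row[key_attr]})
            for global_attr, local_attr in local.items():
                merged[global_attr] = row[local_attr]
    return [result[k] for k in sorted(result)]


answer = mediate(["customer.name", "customer.balance"])
```

The caller never mentions `cust_name` or `amount_due`; only the mapping table knows the local names. Query planning and optimisation, which the Mediator also handles, are left out of this sketch.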

The Integrator tools support the creation of a better integrated overall information system. They provide services for importing, editing and creating models, for building the mappings, for synchronisation and restructuring.

The Wrapper component caters for unifying diverse information sources. It provides an interface for querying the information sources as well as an interface for obtaining (meta-) information about the information sources.
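The two-sided Wrapper interface described above (data queries plus meta-information) might look roughly as follows. This is a sketch under assumptions: the class and method names (`Wrapper`, `describe`, `query`, `InMemoryWrapper`) are hypothetical and do not reproduce the actual SILK wrapper API; Python is used here purely for illustration.

```python
from abc import ABC, abstractmethod


class Wrapper(ABC):
    """Hypothetical uniform interface to one information source."""

    @abstractmethod
    def describe(self):
        """Return meta-information: entity name -> list of attribute names."""

    @abstractmethod
    def query(self, entity, **conditions):
        """Return the rows of `entity` that satisfy all equality conditions."""


class InMemoryWrapper(Wrapper):
    """Toy wrapper over a dict of tables, standing in for a real source."""

    def __init__(self, tables):
        self._tables = tables

    def describe(self):
        # Attribute names are taken from the first row of each table.
        return {name: sorted(rows[0]) if rows else []
                for name, rows in self._tables.items()}

    def query(self, entity, **conditions):
        return [row for row in self._tables[entity]
                if all(row.get(k) == v for k, v in conditions.items())]


wrapper = InMemoryWrapper({
    "orders": [{"id": 1, "status": "open"}, {"id": 2, "status": "closed"}],
})
schema = wrapper.describe()
open_orders = wrapper.query("orders", status="open")
```

Keeping the meta-information call separate from the data call lets a component inspect a source's structure without reading any of its data.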

The table below describes the main characteristics of the user roles shown in Figure 1.


Role / Tasks (related to the SILK application system) / Benefits (of using SILK technology)

Knowledge User

  Tasks: The Knowledge User is the end-user of the SILK application system, e.g. a manager of an enterprise. The Knowledge User queries various pieces of information contained in the distributed sources in aggregate, as if they were stored in one complex information system. The Knowledge User specifies domain/enterprise/user specific concepts.

  Benefits: Facility of "mediation" for business appraisal purposes, through high-level uniform access to heterogeneous content.

IS User

  Tasks: The IS User is not involved directly in the integration process or in the use of a SILK integrated application system. The IS User uses the existing/legacy systems which are part of the evolving integrated system (SILK application system). The IS User corrects erroneous data in existing systems.

  Benefits: Using improved information systems, with modern implementation techniques and valid content of the subsystems.

System Engineer

  Tasks: The System Engineer carries out the required structural modifications in the existing/legacy information systems. The System Engineer enhances existing systems to allow for synchronisation.

  Benefits: Comprehensive and well structured view of the system, with relevant information about the current situation and about the future system needs. Practical help in improving interoperability.

Knowledge Engineer

  Tasks: The Knowledge Engineer is responsible for the integration/evolution process using SILK technology. The Knowledge Engineer analyses the information sources and the SILK Knowledge Base. The Knowledge Engineer maintains the SILK Knowledge Base.

  Benefits: Equipped with a set of useful tools to manage the use of enterprise-wide knowledge, from the tacit concepts down to hard data in ISs.

The technology used within the SILK project

One of the most crucial decisions made in the first phase of the project has been the selection of the main technologies used in the implementation of SILK.

We have chosen the object-oriented approach as the basis of modelling. Both the wrapper interface and the basic model storage format in SILK are object-oriented. This is in line with the main trends of the software industry, and allows easy import/export of models from/to industrial modelling tools, such as Rational Rose.

The second main technology used in SILK, as implied by the full name of the project, is logic programming. We use the main logic programming language, Prolog, and its constraint extensions, such as Constraint Handling Rules (CHR) and CLP(R), constraint reasoning over the domain of real numbers, for implementing the Mediator and Integrator components. However, the knowledge engineer is not forced to use Prolog or CHR. Instead, he or she can describe the relationships in the Object Constraint Language (OCL, [1]), which is closely related to the Unified Modelling Language standard.

The third main technology used in SILK is agent technology. Correspondingly, the Wrapper components are implemented in Java. The communication between the Prolog and Java components is performed using Jasper, the Prolog/Java interface of SICStus Prolog [2], the logic programming system used in SILK.

The application of SILK technology

The application of SILK technology can be characterised by identifying the context which calls for specific integration technologies, by describing the process of integration, i.e. how the SILK tool set is utilised, and by identifying the primary benefits of using this technology.

The context

The context that may require the application of SILK technology can be described by the following two assumptions:

  1. The Integration Assumption: There are stakeholder enterprises, which would like to integrate, in a way, their existing information systems without reimplementing these systems. There can be more than one enterprise involved in such an integration activity.

Justification: Today and increasingly in the future, it is recognised that there is a need for creating inter-enterprise systems. This is a direct consequence of the trends to implement business systems using open, widely used networking technology (Internet), i.e. to develop electronic business (e-business) systems. These systems are not necessarily constrained to a single enterprise; instead, enterprises typically form coalitions or alliances to pursue common business objectives, and to develop e-business systems to support the realisation of these objectives.

  2. The Inconsistency Assumption: The participating enterprises own information sources (usually embedded in their information systems to be integrated). The content of these information sources is typically overlapping and inconsistent with each other, whereas the structure representing overlapping and/or interrelated information is incomplete or even contradictory.

Justification: It is important to note that integration problems cannot easily be avoided in the long run, and it is better to consider them natural consequences of limited resources, changing business needs and evolving technologies. There will always be large-scale efforts to integrate disparate information sources; however, these efforts can also be expected to be seriously constrained by available resources and allowable time, and to result in only partially integrated systems in enterprises. This phenomenon is reinforced by the recent trend towards inter-enterprise systems, which increases the probability that incompatible information systems have to be integrated. The uncontrolled redundancy in content and the mismatched structures across information systems are the main sources of integration problems. These problems are further amplified by the different technologies used to implement these systems (different types of databases, programming languages, and standards). Enterprises cannot solve their system integration problems once and for all; on the contrary, such problems will keep appearing in different forms and will continue to challenge the enterprises.

The process

If the context is valid, the SILK tool set can be used for supporting an integration process but not for replacing it. The table below describes the main activities of such a process:


Activity / Subactivities

  1. Identify enterprises, business domains, user bases and information sources (ISs) to integrate.

  2. Create/refine the SILK knowledge base. Subactivities: define the ISs and their connections in the SILK knowledge base; define domain/enterprise/user concepts, categorisation and associations in the SILK knowledge base.

  3. Identify and prepare a set of SILK tools to form a SILK application.

  4. Analyse ISs for data consistency and redundancy.

  5. Prepare ISs for integration. Subactivities: clean up data if appropriate; prepare for synchronisation; identify owners for redundant data.

  6. Operate the SILK application system. Subactivities: define and execute user queries; synchronise changes to redundant data.

  7. Evaluate the results of the operation. Subactivities: identify missing concepts or concepts to refine; identify an existing, or develop a new, IS (see footnote 1); restructure an existing, or develop a new, IS.

The benefits

The use of SILK technology benefits enterprises that would like to improve the degree of integration of their information systems, in the following ways:

  1. It provides a cost effective and relatively fast approach to bridge the interoperability gap between information systems by

  2. It supports evolution of information systems by

The technical direction of the SILK project

The Knowledge Base

The Knowledge Base of the SILK system is a hierarchical structure of models and mappings between these models (see Figure 2).


Figure 2: The structure of the SILK Knowledge Base

The Knowledge Base is represented as a set of facts and rules in the SILK Knowledge Representation Language: SILan. The capabilities of SILan are close to the Unified Modelling Language (UML) enhanced with OCL (Object Constraint Language) and with some other knowledge constructs required for the functioning of the SILK system (e.g. mappings, derivation and construction rules). In fact, UML models, exported in XMI (XML Metadata Interchange) format can subsequently be transformed to SILan and thus imported to SILK.

A model can be positioned in one of the three layers of the Knowledge Base:

A description of a model provides three views of the modelled world, and correspondingly contains three types of information:

Two kinds of models are distinguished with respect to their roles in the integration process: local and global models. Local models refer to the view of a specific user group within the enterprise, describing their concepts and the information systems used by them. There can also be local models describing a technical domain used in the enterprise, or the organisation and management of the enterprise. The aim of the integration process supported by SILK is to provide a uniform view of the enterprise. This goal is achieved through the creation of one or more global conceptual models, which provide a unified view of all or some of the local conceptual models. Analogously, a global IS model brings together some or all logical IS models, and forms the basis for the restructuring of the information sources.

There are two other main types of constructs in the Knowledge Base:

A very important aspect of SILK is the presence of mappings between the models. A mapping is a set of links or relationships connecting corresponding model elements (objects, attributes, operations etc.). The elements linked by a mapping need not be identical; in such a case the mapping is annotated with a constraint, which describes the transformation between the linked elements. Mappings are a crucial part of the SILK technology: they are the basis of mediation as well as of the integration process.
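A mapping whose links carry a transformation can be sketched as follows. This is a minimal illustration, assuming a link is a pair of element names plus an optional conversion function; `Link`, `apply_mapping` and the salary example are hypothetical, and in SILK itself such constraints are expressed in SILan or OCL rather than in Python.

```python
from dataclasses import dataclass
from typing import Callable


def identity(value):
    """Default transformation for linked elements that are identical."""
    return value


@dataclass(frozen=True)
class Link:
    """One link of a mapping: a source element, a target element, and the
    transformation (standing in for the constraint) between them."""
    source_element: str
    target_element: str
    transform: Callable = identity


def apply_mapping(links, source_object):
    """Translate an object of the source model into target-model terms."""
    return {link.target_element: link.transform(source_object[link.source_element])
            for link in links}


# Hypothetical example: the local model stores a monthly salary,
# while the global model expects an annual one.
mapping = [
    Link("emp_name", "Employee.name"),
    Link("monthly_salary", "Employee.annualSalary", lambda m: m * 12),
]
global_view = apply_mapping(mapping, {"emp_name": "Kovacs", "monthly_salary": 1000})
```

The sketch is one-directional. A declarative constraint, as used in logic programming, could in principle be evaluated in both directions, which a plain Python function cannot do.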

The need for control rules has been identified for three different purposes:

High level architecture

The SILK system has three main subsystems (see Figure 3):

Figure 3: The architecture of the SILK system

The Status of the SILK project

In November 2000, the SILK project had its first review meeting, where a proof of concept demo was successfully presented. This demo used small problems to demonstrate the viability of our approach. We are currently working on the refinement of the design of the architecture and on the design and implementation of better methods for data independent verification of models, model analysis, and querying.

Conclusions

Good progress has been made in the first year of the SILK project. We have implemented several proof-of-concept prototype tools and successfully demonstrated these at the review meeting. We have explored the application environments for SILK, collected and analysed the requirements of the end users. We are currently carrying out a detailed design of the SILK tool set, the first implementation of which is due by the end of year 2001.

References

  1. OMG Unified Modeling Language Specification, version 1.3, June 1999. http://www.omg.org/uml/

  2. SICStus Prolog version 3.8.5 manual. http://www.sics.se/sicstus.html

  3. The OIL language. On-To-Knowledge, IST project IST-1999-10132. http://www.ontoknowledge.org/oil/index.shtml

Footnote 1: Developing a new IS is outside of the SILK integration process. However, SILK tools (Integrator) may support this activity by providing a first-cut model for the new IS.
