d) Working group (comment on the choice of the research group)
RNDr. Michal Laclavík, PhD. - focuses on knowledge modeling, ontologies, semantic annotation and text and electronic communication processing via semantic technologies
Ing. Ladislav Hluchý, PhD. - his principal research interests are the semantic web and semantic grid; and knowledge oriented technologies
prof. Ing. Igor Mokriš, CSc. - his research topic is neural networks. Recently he has been dealing with information retrieval from text documents through neural networks, while supervising a doctoral student in this field.
Ing. Giang Nguyen, PhD. - text processing, knowledge presentation via knowledge modeling
Ing. Zoltán Balogh - deals with knowledge technologies such as case based reasoning, semantic web services, service oriented architectures and similarity measures in ontologies
Ing. Marián Babík - his research areas are semantic web services, service oriented architectures and distributed knowledge bases and memories.
Ing. Radoslav Forgáč - Dealing with neural networks, he has submitted a doctoral thesis in this subject.
Ing. Emil Gatial - conducts research in visualization and administration of knowledge and knowledge ontologies
Mgr. Martin Šeleng - carries out research in XML technologies and their interconnection to ontology and semantics as well as in stochastic and statistic modeling and statistic evaluation of information and knowledge relevance.
Ing. Adrián Tóth - his research interest is knowledge technologies, semantic web services and service oriented architectures.
Ing. Lenka Skovajsová - is a doctoral student, and her research topic is information retrieval from text document in a natural
e) Description of applied methods and their explanation
The work within the project will include the design and analysis of neural network models for information retrieval from text documents. Vector models have been applied most frequently in this area because they map naturally onto the structure of a neural network. Among them, the vector space model has been used most often, in particular for retrieval from text document collections by spreading activation neural networks. A major drawback of this approach for large collections is the large dimension of the keyword-document matrix, which translates directly into the structural complexity of the neural network, mainly the number of neurons in its layers.
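The vector space model described above can be illustrated with a minimal sketch. It assumes tf-idf weighting and a single activation step from a keyword layer to a document layer; the function names, the weighting scheme and the toy documents are illustrative assumptions, not the project's actual models.

```python
import math
from collections import Counter


def build_keyword_document_matrix(docs):
    """Return (vocabulary, weights); weights[term][d] is a tf-idf weight.

    Assumption: whitespace tokenization and a smoothed idf, chosen only
    to keep the sketch short.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(docs)
    # Number of documents in which each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = {}
    for term in vocabulary:
        idf = math.log(n_docs / doc_freq[term]) + 1.0
        weights[term] = [tokens.count(term) * idf for tokens in tokenized]
    # Normalize every document column to unit length.
    for d in range(n_docs):
        norm = math.sqrt(sum(weights[t][d] ** 2 for t in vocabulary))
        if norm > 0:
            for t in vocabulary:
                weights[t][d] /= norm
    return vocabulary, weights


def retrieve(query, weights, n_docs):
    """Spread activation once from query keywords to document nodes."""
    query_terms = Counter(query.lower().split())
    activation = [0.0] * n_docs
    for term, count in query_terms.items():
        if term in weights:
            for d in range(n_docs):
                activation[d] += count * weights[term][d]
    ranking = sorted(range(n_docs), key=lambda d: activation[d], reverse=True)
    return ranking, activation


docs = [
    "neural networks for text retrieval",
    "semantic web ontologies",
    "neural networks and ontologies",
]
vocabulary, weights = build_keyword_document_matrix(docs)
ranking, activation = retrieve("semantic web", weights, len(docs))
```

The keyword layer has one node per vocabulary entry and the document layer one node per document, so the weight structure grows with both the vocabulary and the collection size, which is the dimensionality problem noted above.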
For large collections, information retrieval could therefore be based on the latent semantic indexing model and on models built on principal component analysis and independent component analysis, which are best suited to reducing the dimension of the document space. It can be expected, however, that reducing the dimension of the document space impairs the addressing of access to individual documents and thereby reduces the effectiveness of the information retrieval algorithm. For this reason a suitable hybrid representation of documents needs to be found, one that allows documents to be sorted and stored in domain databases and that also lends itself to dimension reduction of the information retrieval space.
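As a reminder of how latent semantic indexing reduces the document space, the standard formulation can be written as follows. The symbols ($A$, $m$, $n$, $k$, $q$) are notation introduced here for illustration and do not come from the project text.

```latex
% A is the m x n keyword-document matrix (m keywords, n documents).
% Singular value decomposition, truncated to the k largest singular values:
A = U \Sigma V^{\mathsf{T}}, \qquad
A \approx A_k = U_k \Sigma_k V_k^{\mathsf{T}}, \qquad k \ll \min(m, n)

% A document column a_j and a query vector q (both in R^m) are mapped
% into the same k-dimensional space:
\hat{d}_j = \Sigma_k^{-1} U_k^{\mathsf{T}} a_j, \qquad
\hat{q} = \Sigma_k^{-1} U_k^{\mathsf{T}} q

% Documents are ranked by cosine similarity in the reduced space:
\operatorname{sim}(q, d_j) =
  \frac{\hat{q}^{\mathsf{T}} \hat{d}_j}{\lVert \hat{q} \rVert \, \lVert \hat{d}_j \rVert}
```

In this representation each document is described by $k$ coordinates rather than $m$ keyword weights, so individual keywords no longer address documents directly; this is one way to read the loss of addressing mentioned above.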
For distributed organizational knowledge bases, the design will concentrate on semantic web standards such as RDF/RDFS and OWL, which have been applied successfully in several application domains. The project will analyze and extend ontologies and the methodology of their development with respect to distributed systems. It will also analyze semantic peer-to-peer technologies and their deployment in an organizational environment. For explicit knowledge modeling with rules and scripts, the project will adhere to standards for rule description, such as SWRL and RuleML, and to common scripting languages such as Python, PHP, Perl and Ruby.
Documents and electronic documentation will be processed with semantic text annotation in an effort to define a document and communication context. This context will be compared with the user context, and relevant information and knowledge will be provided on the basis of the matches found between the two. Both contexts will be described by an ontological model to be designed using methodologies such as CommonKADS.
The ontological approach can also be applied to the domain representation of a large document collection: the document representation space is divided into smaller domain areas, each homogeneously focused on a given subject area. This simplifies the structure of the neural networks that perform document retrieval and access by means of keywords and phrases in natural language contexts.
The results of the algorithms run on the existing and newly developed models will be evaluated using statistical and stochastic methods, which will show whether the results meet expectations.
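The text does not name specific evaluation measures. As one possible illustration, the sketch below computes precision, recall and average precision, which are commonly used to assess retrieval results; their choice here, the function names and the sample document identifiers are assumptions.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: document ids returned by the algorithm.
    relevant:  document ids judged relevant (hypothetical ground truth).
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def average_precision(ranking, relevant):
    """Mean of the precision values at each rank where a relevant document appears."""
    relevant_set = set(relevant)
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant_set:
            hits += 1
            total += hits / rank
    return total / len(relevant_set) if relevant_set else 0.0


# Example with made-up document ids.
precision, recall = precision_recall([1, 2, 3, 4], [2, 4, 5])
ap = average_precision([2, 1, 4], [2, 4])
```

Averaging such per-query values over a set of test queries gives figures that can be compared across models, for example before and after dimension reduction.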