Informatics and Applications

2014, Volume 8, Issue 2, pp 130-144


  • L.A. Kuznetsov


The paper outlines a technology for determining the degree of similarity between information objects represented as text or graphic images. Objects are formalized by probabilistic models: the structure of each model is defined by an algebra over a minimal set of the object's graphic components, and the quantitative characteristics of that structure are probability distributions on the algebra. The amount of information in an object is estimated by its entropy, and the similarity measure for information objects is likewise based on entropy. The paper describes a method for estimating the proximity of text and graphic objects and gives several examples of implemented estimation algorithms. The developed method is shown to be more efficient than the methods described in the literature. The technology for forming images of information objects and comparing their semantic content is universal and can be adapted to the meaningful characteristics of the objects being analyzed.
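
The general idea of an entropy-based similarity measure can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not give the formula, so the sketch assumes that an object is reduced to a sequence of elementary components (here, words or characters), that its model is the empirical probability distribution of those components, and that similarity is one minus the Jensen-Shannon divergence, a standard entropy-based quantity substituted here for illustration. The function names `distribution`, `entropy`, and `js_similarity` are hypothetical.

```python
import math
from collections import Counter


def distribution(components):
    """Empirical probability distribution over an object's components."""
    counts = Counter(components)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


def entropy(p):
    """Shannon entropy in bits of a distribution given as {component: prob}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)


def js_similarity(p, q):
    """Similarity in [0, 1]: 1 minus the Jensen-Shannon divergence (base 2).

    Assumption: this stands in for the paper's entropy-based measure,
    which the abstract does not specify.
    """
    support = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2 over the joint support.
    m = {c: 0.5 * (p.get(c, 0.0) + q.get(c, 0.0)) for c in support}
    jsd = entropy(m) - 0.5 * (entropy(p) + entropy(q))
    # Clamp to guard against tiny floating-point excursions.
    return 1.0 - max(0.0, min(1.0, jsd))


if __name__ == "__main__":
    a = distribution("the cat sat on the mat".split())
    b = distribution("the cat lay on the rug".split())
    print(f"entropy(a) = {entropy(a):.3f} bits")
    print(f"similarity(a, b) = {js_similarity(a, b):.3f}")
```

Under these assumptions, identical distributions score 1 and distributions with no shared components score 0. For graphic objects, the component sequence could be replaced by any discrete encoding of image elements, which is again an assumption rather than something the abstract specifies.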

