• Systems and Means of Informatics

    2014, Volume 24, Issue 1, pp 224-243


    • M.M. Gershkovich
    • T.K. Birukova


    An approach to identifying informational objects (IOs) in automated information systems used for data collection, storage, and processing is presented. Such systems consist of multiple nodes and acquire data from multiple sources. In the majority of cases, the data array of an information system takes the form of a continuously filled event log. Each event record includes characteristics of the event's participant (the IO) and of the event's conditions. To solve analytical problems related to IOs, one must first identify them, i.e., define the set of IOs that are, with a certain probability, the same entity. The paper defines the typical IO identification tasks that arise in the development of large-scale information systems: IO fusion and IO clustering, that is, forming an aggregate of IOs that are similar with respect to certain criteria. The identification task is closely connected to the task of identifying links between IOs, since the probability that IOs are identical is higher if each IO is associated with another object. Methods for solving these tasks are presented, the special features of IO identification in a flow of events are studied, and a correlation search method for detecting associations between IOs is described. A method for comparing proper names that takes into account probable distortions (phonetic and transcriptional) and misprints is presented. The efficacy of using Cyrillic and Latin first name and second name blocks simultaneously for personal identification is substantiated, and methods for transliteration from Cyrillic to Latin and vice versa are presented.
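The abstract does not give the details of the name comparison method, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a simplified, hypothetical Cyrillic-to-Latin table (`CYR_TO_LAT`) and substitutes plain Levenshtein edit distance for the paper's treatment of phonetic and transcriptional distortions; the function names `transliterate`, `levenshtein`, and `name_similarity` are illustrative.

```python
# Assumption: a simplified transliteration table chosen for illustration only;
# it is not the scheme described in the paper.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "e",
    "ж": "zh", "з": "z", "и": "i", "й": "i", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "iu", "я": "ia",
}


def transliterate(name: str) -> str:
    """Lowercase the name and map Cyrillic letters to Latin; keep other characters."""
    return "".join(CYR_TO_LAT.get(ch, ch) for ch in name.lower())


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance (insertions, deletions, substitutions), row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two names after mapping both to Latin."""
    a, b = transliterate(a), transliterate(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    print(name_similarity("Иванов", "Ivanov"))
    print(name_similarity("Ivanov", "Ivanof"))
```

Comparing both spellings in a single alphabet lets one score cover Cyrillic and Latin records alike; a threshold on that score (to be chosen empirically) would then decide whether two records are candidates for the same IO.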

