Institute of Informatics Problems of the Russian Academy of Sciences

«Systems and Means of Informatics»
Scientific journal
Volume 33, Issue 1, 2023

Abstract and Keywords

SELF-TIMED PIPELINE WITH VARIABLE STAGE NUMBER
  • I. A. Sokolov
  • Yu. A. Stepchenkov
  • Yu. G. Diachenko
  • N. V. Morozov
  • D. Yu. Diachenko

Abstract: The article considers the problem of improving the performance of self-timed circuits. As in synchronous circuits, an effective way to improve performance is to use a pipeline that implements multistage processing of input data. The article analyzes possible options for dynamically reducing the number of actively operating stages under certain conditions determined by the value of the processed data or by an external signal. The estimates show that the efficiency of an optionally variable number of pipeline stages depends on the number of bypassed stages and on the probability of the event that allows the bypass. In particular, replacing two successive pipeline stages with one parallel stage becomes expedient if it occurs in at least 63% of data processing operations, and bypassing two or more stages reduces the pipeline's average latency if it occurs in at least 43% of operations.

Keywords: self-timed circuit; pipeline; bypassing; multiplexing; latency; performance
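
Illustrative sketch (not from the article): the dependence of the average latency on the bypass probability can be pictured with a deliberately simple model. The model is an assumption: every stage has the same delay, every operation pays a fixed multiplexer overhead for the bypass logic, and a bypassed operation skips a fixed number of stages. The function names and parameters are hypothetical, and this model is not expected to reproduce the 63% and 43% thresholds reported by the authors.

```python
def average_latency(n_stages, stage_delay, bypassed, p_bypass, mux_overhead):
    """Mean latency when `bypassed` stages are skipped with probability p_bypass.

    Assumed model (not the authors'): all operations pay mux_overhead for the
    bypass multiplexer; a bypassed operation skips `bypassed` equal stages.
    """
    full = n_stages * stage_delay + mux_overhead
    short = (n_stages - bypassed) * stage_delay + mux_overhead
    return p_bypass * short + (1.0 - p_bypass) * full


def break_even_probability(stage_delay, bypassed, mux_overhead):
    """Bypass probability at which the variable pipeline matches the fixed one.

    The fixed pipeline has latency n_stages * stage_delay, so the saving
    p * bypassed * stage_delay must cover mux_overhead.
    """
    return mux_overhead / (bypassed * stage_delay)


# Hypothetical numbers: 4 stages of 1.0 time unit, 2 stages bypassed,
# multiplexer overhead of 0.5 time units.
p_star = break_even_probability(stage_delay=1.0, bypassed=2, mux_overhead=0.5)
latency_at_break_even = average_latency(4, 1.0, 2, p_star, 0.5)
```

Under these assumptions the bypass pays off only when the enabling event is more frequent than `p_star`; a larger overhead or fewer bypassed stages raises that threshold.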

THE PARALLEL CORPORA PERSPECTIVE ON STUDYING CONTRASTIVE PUNCTUATION
  • V. A. Nuriev
  • M. G. Kruzhkov

Abstract: The paper presents the parallel corpora perspective on studying contrastive punctuation. Different languages often use the same punctuation marks in quite different ways, which poses a challenge for the translator. Differences between the punctuation systems of various languages are studied in contrastive punctuation, a rather underdeveloped branch of linguistics. The translator needs an accurate understanding of these differences in order to see how particular and individualized the punctuation is in the source text and to decide how to render it in the target language. To this end, the translator can take advantage of various information resources that result from the joint efforts of information science, computational linguistics, and corpus-based translation studies. One such resource is the supracorpora databases developed at the Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences. They store parallel texts (in different languages) from the Russian National Corpus and allow users to collect massive empirical data. The paper shows how this information resource can be used to research contrastive punctuation. It also draws some preliminary conclusions on the functioning of several punctuation marks in French and Russian.

Keywords: punctuation; contrastive studies; translation; literary translation; asymmetry between languages; corpus-based translation studies; supracorpora database; parallel corpus; French; Russian

INTEGRATION CAPACITIES OF SUPRACORPORA DATABASES
  • A. A. Durnovo
  • O. Yu. Inkova
  • V. A. Nuriev

Abstract: The paper centers on the integration capacities of supracorpora databases developed at the Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences. It shows how three databases are integrated with one another: the Supracorpora database of hierarchical logical-semantic relations (SDBH LSR), the Database of parallel texts (DBT), and the Supracorpora database of connectives (SDBC). The information system of hierarchical logical-semantic relations (ISHLSR) uses a specially designed database (SDBH LSR) in which annotations of logical-semantic relations are presented as trees, i.e., directed connected acyclic graphs where nodes contain data and edges depict the subordination between nodes. Along with the SDBH LSR, the ISHLSR uses data from the DBT and the SDBC. Such integration makes it possible to combine the methodological strengths of informatics, contrastive and corpus linguistics, and the theory and practice of translation without losing sight of the factors that may adversely affect the validity and reliability of the final data.

Keywords: supracorpora database; integration of databases; multilingual corpus; parallel corpus; corpus-based information resources; translation studies; contrastive linguistics; machine translation
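
Illustrative sketch (not from the article): the tree representation mentioned in the abstract, with nodes that carry data and edges that express subordination, can be pictured as follows. The class name `RelationNode`, its fields, and the relation labels are hypothetical and do not describe the actual SDBH LSR schema.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class RelationNode:
    """One node of an annotation tree; `label` is the data stored in the node."""
    label: str
    children: List["RelationNode"] = field(default_factory=list)
    parent: Optional["RelationNode"] = field(default=None, repr=False, compare=False)

    def add_child(self, child: "RelationNode") -> "RelationNode":
        """Attach `child` below this node, keeping the structure a tree."""
        if child.parent is not None:
            raise ValueError("node already has a parent")
        # Walk up from this node: the child must not be one of its ancestors.
        node: Optional["RelationNode"] = self
        while node is not None:
            if node is child:
                raise ValueError("edge would create a cycle")
            node = node.parent
        child.parent = self
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "RelationNode"]]:
        """Yield (depth, node) pairs in depth-first order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# Hypothetical labels, used only to show the shape of the structure.
root = RelationNode("logical-semantic relations")
causal = root.add_child(RelationNode("causal"))
causal.add_child(RelationNode("condition"))
outline = [(depth, node.label) for depth, node in root.walk()]
```

Allowing at most one parent per node and rejecting edges to an ancestor is what keeps the graph directed, connected, and acyclic.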

USING A DATABASE OF STRUCTURAL TRANSFORMATIONS TO EXTRACT MULTICOMPONENT TERMINOLOGICAL UNITS
  • Yu. I. Butenko

Abstract: The paper presents the basic principles of using a database of structural models of terminological phrases for the extraction and alignment of English and Russian multicomponent terms. The formal structure of terminological units and the structural models of multicomponent terms in English and Russian are described. Structural transformations in translation are defined, and the main models of structural transformations of English- and Russian-language terms in the translation of scientific and technical texts are highlighted. It is substantiated that the use of structural models of terminological word combinations requires taking into account the specific features of each language under analysis. An approach to the extraction and alignment of terminological units in parallel scientific and technical texts is proposed; it consists of five main stages. The approach is illustrated by examples of processing a parallel scientific and technical text in Russian and English.

Keywords: structural model; multicomponent term; formal structure of the term; structural models of terminological phrases in translation; Russian-language terminology; English-language terminology

SEMANTIC INTERPRETATIONS OF HIGH NORMAL FORMS OF RELATIONS IN A RELATIONAL DATABASE
  • V. A. Ivanov
  • M. Yu. Konyshev
  • S. V. Smirnov
  • O. V. Tarakanov
  • V. O. Tarakanova
  • S. V. Usovik

Abstract: The results of semantic modeling of the processes of eliminating redundancy and protecting the relation of a relational database from update anomalies by improving the process of its normalization are presented. The semantic equivalents of the requirements of the normal forms of the relational database relations were established which increase the efficiency of the normalization algorithm. A theorem was formulated and proved that guarantees an unambiguous determination of the functional dependence of an attribute on a potential key. Refinements of the relational relation normalization algorithm were formulated. The ways of applying the proposed interpretations are given. Practically applicable procedures are shown that remove the contradictions of designing a rational database structure. The main findings are formulated in relation to factual databases. The proposed mechanisms are independent of the database management system used.

Keywords: relational database; normalization of relation; potential key; nonkey attribute; functional dependence; multivalued dependence
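
Illustrative sketch (not from the article): the dependence of attributes on a potential (candidate) key, which the abstract discusses, is usually checked with the textbook attribute-closure algorithm. The code below shows that standard algorithm, not the authors' theorem or their refined normalization procedure; the relation and its functional dependencies are hypothetical.

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by `attrs`.

    `fds` is a list of (lhs, rhs) pairs, each an iterable of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply a dependency when its left side is already covered.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_candidate_key(attrs, all_attrs, fds):
    """True if `attrs` determines every attribute and no proper subset does."""
    attrs, all_attrs = set(attrs), set(all_attrs)
    if closure(attrs, fds) != all_attrs:
        return False
    # Minimality: dropping any single attribute must break the key property.
    return all(closure(attrs - {a}, fds) != all_attrs for a in attrs)


# Hypothetical relation R(student, course, teacher).
ALL = {"student", "course", "teacher"}
FDS = [
    (("student", "course"), ("teacher",)),
    (("teacher",), ("course",)),
]
keys = [k for k in ({"student", "course"}, {"student", "teacher"}, {"student"}, ALL)
        if is_candidate_key(k, ALL, FDS)]
```

In this example both {student, course} and {student, teacher} are candidate keys, so `course` depends on a proper part of one key; relations of this kind are the usual motivation for the higher normal forms.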

CLASSIFICATION PROBLEM IN CONDITIONS OF DISTORTED CAUSE-AND-EFFECT RELATIONSHIPS
  • A. A. Grusho
  • N. A. Grusho
  • M. I. Zabezhailo
  • A. A. Zatsarinny
  • E. E. Timonina
  • S. Ya. Shorgin

Abstract: The paper considers a model for classifying a stream of objects by the presence or absence of a certain property A in each object O. It is assumed that there are M bijective transformations of the objects arriving for classification and that the stream contains objects obtained from O by one of these transformations. For each object O, it is known that it contains property A, which causes the known objects B1, B2, ..., Bk to appear in information spaces I1, I2, ..., Ik. This means that property A can only be detected by observing the consequences B1, B2, ..., Bk. The problem is to determine, for each object in the stream, the presence or absence of the transformed cause A in it. Algorithms for checking this are constructed both for the case where a description of the characteristics of A is available and for the case where such a description is absent.

Keywords: finite classification task; cause-and-effect relationships; machine learning in distortions conditions

METHODS OF CLASSIFYING THE DISTANCE LEARNING SYSTEM USERS IN THE MODEL OF CONSTRUCTING THEIR PERSONALIZED LEARNING STRATEGIES
  • Ya. G. Martyushova
  • T. A. Mineyeva
  • A. V. Naumov

Abstract: The article examines the problem of adapting a distance learning system to its contingent of users by constructing personalized learning strategies from their previous test results. The main part of the suggested model is the classification of users by various criteria of academic progress. A comparative analysis of the results of applying different classifiers for this purpose is presented.
The following types of classifiers were used: Bayes classifier, logistic regression, k-nearest neighbors algorithm, decision tree, random forest, boosting, and a bootstrap aggregating classifier that uses a majority vote as the voting scheme.
The article presents the results of a numerical experiment using data on the operation of the MAI distance learning system CLASS.NET.

Keywords: finite classification task; cause-and-effect relationships; machine learning in distortions conditions
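
Illustrative sketch (not from the article): of the classifiers listed in the abstract on distance learning users, bootstrap aggregating with a majority vote is easy to show in a few lines. The base learner (a 1-nearest-neighbor rule), the feature layout, the class labels, and the data are all assumptions made for illustration; they are not taken from CLASS.NET.

```python
import random
from collections import Counter


def nearest_neighbor_predict(train, x):
    """Label of the training point closest to x (squared Euclidean distance)."""
    closest = min(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
    return closest[1]


def bagging_predict(train, x, n_models=25, seed=0):
    """Bootstrap aggregating: each model votes, the most frequent label wins."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        # Bootstrap sample: draw len(train) points with replacement.
        sample = [rng.choice(train) for _ in train]
        votes[nearest_neighbor_predict(sample, x)] += 1
    return votes.most_common(1)[0][0]


# Hypothetical data: (share of correct answers, share of tasks attempted).
TRAIN = [
    ((0.90, 0.95), "on_track"), ((0.85, 0.90), "on_track"),
    ((0.80, 0.85), "on_track"), ((0.95, 0.90), "on_track"),
    ((0.30, 0.40), "needs_support"), ((0.25, 0.50), "needs_support"),
    ((0.35, 0.30), "needs_support"), ((0.20, 0.35), "needs_support"),
]
label = bagging_predict(TRAIN, (0.88, 0.92))
```

An odd number of models avoids ties between two classes; with more classes a tie-breaking rule would have to be chosen explicitly.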

EFFICIENT COMPUTATIONS IN MATRIX FACTORIZATION WITH MISSING COMPONENTS
  • M. P. Krivenko

Abstract: The paper is devoted to the effective implementation of matrix factorization in the presence of missing components into a product of two lower rank matrices. The problem of estimating the parameters of the adopted data model is solved by multidimensional optimization. In practice, the large sizes of the matrices and vectors included in iterative algorithms give rise to the curse of dimensionality. It is proposed to drastically reduce the complexity of matrix operations by presenting them in block-diagonal form. The article substantiates the possibility of casting individual matrices to a block-diagonal form and describes the rules for block-by-block singular value decomposition of matrices. The results of block-by-block processing are illustrated by the example of data matrix factorization of different sizes and with different probabilities of missing components. The time for estimating parameters can be reduced by several orders of magnitude compared to the processing of matrices in the usual representation.

Keywords: lower rank matrix approximation; singular decomposition; missing data; ALS algorithm; block-diagonal representation of a matrix
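
Illustrative sketch (not from the article): the ALS algorithm named in the keywords alternates least-squares updates of the two factors using observed entries only. The code below is a rank-one simplification written for illustration; it does not implement the authors' block-diagonal representation or their block-by-block singular value decomposition, and the function name and data are hypothetical.

```python
def als_rank1(X, n_iter=100):
    """Fit X[i][j] ~ u[i] * v[j] on observed entries; None marks a missing one."""
    n_rows, n_cols = len(X), len(X[0])
    u = [1.0] * n_rows
    v = [1.0] * n_cols
    for _ in range(n_iter):
        # Update each u[i] with v fixed: a one-dimensional least-squares problem.
        for i in range(n_rows):
            obs = [j for j in range(n_cols) if X[i][j] is not None]
            den = sum(v[j] ** 2 for j in obs)
            if den > 0:
                u[i] = sum(X[i][j] * v[j] for j in obs) / den
        # Update each v[j] with u fixed, again over observed entries only.
        for j in range(n_cols):
            obs = [i for i in range(n_rows) if X[i][j] is not None]
            den = sum(u[i] ** 2 for i in obs)
            if den > 0:
                v[j] = sum(X[i][j] * u[i] for i in obs) / den
    return u, v


# Hypothetical data: outer product of (1, 2, 3) and (2, 1, 4) with three gaps.
X = [
    [2.0, 1.0, None],
    [4.0, None, 8.0],
    [None, 3.0, 12.0],
]
u, v = als_rank1(X)
filled = [[u[i] * v[j] for j in range(3)] for i in range(3)]
```

Each row update uses only that row's observed entries and each column update only that column's, so the updates within one half-step are independent of one another.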

ON THE PERTURBATION BOUNDS AND THEIR APPLICATION FOR SOME QUEUEING MODELS
  • I. A. Kovalev

Abstract: Service (queueing) models described by continuous-time Markov chains are considered. One of the known methods is used to study perturbations and to obtain quantitative perturbation bounds for (inhomogeneous) continuous-time Markov chains with a finite or countable state space. Several specific models are considered, and perturbation bounds for various characteristics of such systems are obtained. Bounds are also considered that can be useful for solving management-related tasks, namely, those associated with changing the power of the flow of requirements or the server power so that the average number of requirements in the system stays within specified limits. A numerical example is considered.

Keywords: nonstationary service systems; Markov models; perturbation bounds; queuing systems; flow power; server power
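
Illustrative sketch (not from the article): the management task mentioned in the abstract, keeping the average number of requirements within given limits by adjusting the flow power, can be pictured on a much simpler model than the one studied. The assumption here is a stationary single-server queue with a finite capacity (M/M/1/N) and constant rates, whereas the article deals with inhomogeneous chains and perturbation bounds. Function names and numbers are hypothetical.

```python
def mean_in_system(arrival_rate, service_rate, capacity):
    """Stationary mean number of requirements in an M/M/1/N queue."""
    rho = arrival_rate / service_rate
    # Stationary probabilities are proportional to rho**k, k = 0..capacity.
    weights = [rho ** k for k in range(capacity + 1)]
    total = sum(weights)
    return sum(k * w for k, w in enumerate(weights)) / total


def max_arrival_rate(service_rate, capacity, limit, tol=1e-9):
    """Largest arrival rate keeping the mean number in system at or below `limit`.

    Uses bisection; valid because the mean grows with the arrival rate.
    Requires 0 < limit < capacity.
    """
    lo, hi = 0.0, 100.0 * service_rate
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mean_in_system(mid, service_rate, capacity) <= limit:
            lo = mid
        else:
            hi = mid
    return lo


# Hypothetical numbers: capacity 4, service rate 1.0, target mean of 2.0.
rate = max_arrival_rate(service_rate=1.0, capacity=4, limit=2.0)
```

With equal arrival and service rates the five states are equally likely, so the mean is exactly half the capacity; the bisection recovers that arrival rate for a target mean of 2.0.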

THE SOFTWARE-DEFINED NETWORKING IN CONVERGED AND HYPERCONVERGED INFRASTRUCTURES
  • V. B. Egorov

Abstract: The converged (CI) and hyperconverged (HCI) infrastructures are widely discussed now; nevertheless, the terms CI and HCI themselves have not obtained a generally accepted definition. In conditions of uncertainty with the interpretation of the terms, this article attempts to clarify the role of the network in such infrastructures as well as to evaluate the perspective of implementing their internal networks on the software-defined networking (SDN) principles.
The centralized control of all infrastructural components from a single console as a distinctive feature of the CI meets the basic SDN principles. However, this condition is necessary but insufficient and a typical CI network can only be considered as a step towards the SDN. A full-fledged implementation of SDN becomes feasible in the HCI, where the HCI supplier is actually forced, due to hyperconvergence peculiarities, to implement in his product a software-defined storage (SDS) and is able to supplement the product with an SDN network. For ordinary data center owners, the purchase of a hyperconverged infrastructure may be not only the easiest way to acquire an effective SDS but almost the only feasible one to get a ready SDN.

Keywords: converged infrastructure; hyperconverged infrastructure; software-defined networking; CI; HCI; SDN

COMPETENCE CENTERS FOR ARTIFICIAL INTELLIGENCE AND THE NATIONAL TECHNOLOGY INITIATIVE
  • A. P. Suchkov

Abstract: The task of achieving technological sovereignty can be solved by consolidating efforts at the state level, highlighting priority areas and tasks for the development of innovative technologies and identifying ways to achieve them.
In this regard, the autonomous nonprofit organization "Agency for Strategic Initiatives for the Promotion of New Projects" is recommended to prepare a strategic plan for the development of the National Technology Initiative (NTI) for the long term and proposals for monitoring its compliance. The article discusses the issues of forming a holistic picture of the totality of the functions and tasks of the NTI in the interests of this strategic plan and in relation to the problems of artificial intelligence technologies.

Keywords: National Technology Initiative; competence centers; artificial intelligence technologies

THEORY OF S-SYMBOLS: CONCEPTUAL FOUNDATIONS
  • V. D. Ilyin

Abstract: The proposed theory of S-symbols is an extended generalization of the theory of S-modeling. It is considered as a part of the methodological support for the development of artificial intelligence systems in the S-environment (including knowledge systems, systems of S-modeling of tasks and program design, etc.).
The S-environment based on interconnected S-systems (symbols, codes, and signals) serves as the infrastructural basis for the implementation of information technologies for various purposes. The article presents the first part (out of four) of the description of the theory. The rationale for expediency and basic concepts (S-symbol, S-code, S-signal, etc.) are given. The kinds and types of S-(symbols, codes, and signals) are defined. Equivalence, order, and membership relations are introduced, defined on S-systems (symbols, codes, and signals). Definitions are accompanied by examples.

Keywords: theory of S-symbols; S-symbol; S-code; S-signal; S-environment; artificial intelligence; information technology

HYPOTHESIS FORMATION MECHANISM IN THE TECHNOLOGY OF CONCRETE HISTORICAL INVESTIGATION SUPPORT
  • I. M. Adamovich
  • O. I. Volkov

Abstract: The article continues the series of works devoted to the technology for supporting concrete historical research. The technology is based on the principles of co-creation and crowdsourcing and is designed for a wide range of users who are not professional historians or biographers. The article further develops the technology by integrating a mechanism that automatically identifies potentially promising areas of research. The proposed approach is to automatically fill information gaps in a set of facts describing the object of research on the basis of incomplete induction. The base for inductive generalization is analyzed and the ways of forming it are shown. The possibility of using for this purpose the data imputation procedure commonly used in data analysis and machine learning tasks is substantiated. The methods of data imputation are analyzed in connection with the features of the technology and the specifics of concrete historical research. The analysis showed that it is expedient to build the mechanism for automatic hypothesis formation on the classification tree method of data imputation based on the CHAID (Chi-squared Automatic Interaction Detection) algorithm.

Keywords: concrete historical investigation; distributed technology; formation of hypotheses; information gap; data imputation

BASIC INFORMATION TECHNOLOGIES, INFORMATION WARFARE, AND MULTIDIMENSIONAL HIERARCHICAL TERRITORIAL SOVEREIGNTY: STAGES OF GLOBAL-SPACE COEVOLUTION
  • S. N. Grinchenko

Abstract: From the standpoint of informatics-cybernetic modeling of the process of development of a self-controlling hierarchical-network system of Humankind, the parallelism of the processes of global-space coevolution of basic information technologies (BIT), "multidimensional hierarchical territorial sovereignty" (MHTS), and information warfare (IW) is considered. The period of formation by the subjects of: (i) "courtyards"/families of Hominoidea of signal postures/sounds/movements BIT corresponds to the development of "pre-pre-sovereignty" of the communities of these territories (family specificity) and "pre-pre-IW pre-pre-deception of the enemy"; (ii) "settlements" of Homo erectus of mimics/gestures BIT corresponds to the development of "pre-sovereignty" of communities of these territories (generic identity) and "pre-IW pre-deception of the enemy"; (iii) "environments" of Homo sapiens-1 of speech/language BIT corresponds to the development of the MHTS "linguistic tiers" of the corresponding proto-civilizations and "IT of deception of the enemy and the collapse of his 'environment'"; (iv) "superregions" of Homo sapiens-2 of writing/reading BIT corresponds to the development of the MHTS "cultural-state tiers" of local civilizations and IW for the collapse of the enemy's "superregions"; (v) "supercountries" of Homo sapiens-3 of text replication/printing BIT corresponds to the development of the MHTS "economic tiers" of subcontinental civilizations and IW to destroy the enemy's "supercountries"; (vi) the created Planetary Civilization of Homo sapiens-4 of local computers BIT corresponds to the development of its MHTS "high-tech tiers" and IW for the collapse of its elements formed by the enemy; (vii) the created Civilization of the Near-Earth Space of Homo sapiens-5 of telecommunications/networks BIT corresponds to the development of its MHTS "information tiers" and IW for the collapse of its elements formed by the enemy; (viii) the emerging Civilization of the Intermediate Cosmos of Homo sapiens-6 with a promising nano-BIT corresponds to the development of its MHTS "personal-cosmic tiers" and IW for the collapse of its elements formed by the enemy; etc. It is noted that each new stage of this coevolutionary process does not cancel the results of the previous one but complements and complicates them.

Keywords: basic information technologies; information weapon; information security; multidimensional hierarchical territorial sovereignty; systemic coevolution; informatics-cybernetic model; self-controlling hierarchical-network system of Humankind; principle of systemic cumulation; principle of system consistency