Institute of Informatics Problems of the Russian Academy of Sciences
Russian Academy of Sciences

«INFORMATICS AND APPLICATIONS»
Scientific journal
Volume 16, Issue 4, 2022

Abstract and Keywords

SYNCHRONOUS AND SELF-TIMED PIPELINE'S RELIABILITY ESTIMATION
  • I. A. Sokolov  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • Yu. A. Stepchenkov  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • Yu. G. Diachenko  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • Yu. V. Rogdestvenski  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: Self-timed (ST) circuitry is an alternative to synchronous circuitry. Owing to their hardware redundancy, ST circuits have a number of advantages over their synchronous counterparts. The article investigates the immunity of ST and synchronous circuits to single short-term soft errors, taking into account the hardware redundancy of ST circuits. Thanks to their indication subcircuit, ST circuits are able to detect a soft error that appears as an inversion of a logical cell's output state and to suspend the operation of the circuit until the soft error disappears. Thus, ST circuits mask a single soft error and prevent distortion of the data processing result. Implementing each bit of the pipeline stage register with a modified hysteretic trigger, which prevents sticking in the antispacer, masks almost all soft errors in the combinational part of the pipeline stage. The DICE-like implementation of this trigger reduces the sensitivity of the ST register to internal soft errors by a factor of 4. Quantitative estimates of failure tolerance show a clear advantage (by a factor of 2.5-9.4) of the ST pipeline over its synchronous counterpart.

Keywords: self-timed circuit; soft error; failure tolerance; pipeline; indication; probabilistic estimate

TOTAL APPROXIMATION ORDER FOR MARKOV JUMP PROCESS FILTERING GIVEN DISCRETIZED OBSERVATIONS
  • A. V. Borisov  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The note continues the investigation of the numerical approximation of Markov jump process filtering given both counting and diffusion observations with multiplicative noise. The filtering estimates are approximated using observations that have previously been discretized in time. In contrast to the previous algorithms, which limit the number of Markov state transitions that occur during the time discretization interval, the new estimates are free of these restrictions and are constructed via a unified scheme. The note presents an upper bound for the approximation accuracy as a function of the observation system parameters, the applied numerical integration scheme, the time discretization step, and the filtering moment. A numerical example illustrates the sublinear character of the bound with respect to the latter argument.

Keywords: Markov jump process; optimal filtering; diffusion and counting observations; multiplicative observation noise; numerical approximation accuracy

UNBIASED THRESHOLDING RISK ESTIMATE WITH TWO THRESHOLD VALUES
  • O. V. Shestakov  Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, M. V. Lomonosov Moscow State University, 1-52 Leninskie Gory, GSP-1, Moscow 119991, Russian Federation, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation, Moscow Center for Fundamental and Applied Mathematics, M.V. Lomonosov Moscow State University, 1 Leninskie Gory, GSP-1, Moscow 119991, Russian Federation

Abstract: Problems of noise reduction in signals arise in many application areas. In cases where the signals are not stationary, noise suppression methods based on the wavelet transform and thresholding procedures have proven effective. These methods are computationally efficient and adapt well to the local features of the signals.
The most common types of thresholding are hard and soft thresholding. However, when using hard thresholding, estimates with large variance are obtained, and soft thresholding leads to additional bias. In an attempt to get rid of these shortcomings, various alternative types of thresholding have been proposed in recent years. This paper considers a thresholding procedure with two thresholds which behaves as soft thresholding for small values of wavelet coefficients and as hard thresholding for the large ones. For this type of thresholding, an unbiased estimate of the mean-square risk is constructed and its statistical properties are analyzed. An algorithm for calculating the threshold values that minimizes this estimate is described.

Keywords: wavelets; thresholding; unbiased risk estimate
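
The relation between the thresholding rules mentioned in the abstract can be sketched as follows. Hard and soft thresholding are the standard definitions. The function `two_threshold` and its parameters `t1` and `t2` are assumptions made for illustration (soft thresholding below `t2`, identity above it) and are not claimed to be the exact rule analyzed in the paper.

```python
def hard_threshold(x: float, t: float) -> float:
    """Keep the coefficient unchanged if its magnitude exceeds t, else zero it."""
    return x if abs(x) > t else 0.0


def soft_threshold(x: float, t: float) -> float:
    """Zero small coefficients and shrink the remaining ones toward zero by t."""
    if abs(x) <= t:
        return 0.0
    return x - t if x > 0 else x + t


def two_threshold(x: float, t1: float, t2: float) -> float:
    """Hypothetical hybrid rule: soft thresholding (with t1) for |x| <= t2,
    no shrinkage (as in hard thresholding) for |x| > t2."""
    if not 0 <= t1 <= t2:
        raise ValueError("expected 0 <= t1 <= t2")
    if abs(x) > t2:
        return x
    return soft_threshold(x, t1)


if __name__ == "__main__":
    for coeff in (0.5, 1.5, 3.0, -3.0):
        print(coeff, hard_threshold(coeff, 1.0), soft_threshold(coeff, 1.0),
              two_threshold(coeff, 1.0, 2.0))
```

In this sketch, large coefficients pass through without the bias that soft thresholding would add, while coefficients just above `t1` are shrunk continuously from zero.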

GENERALIZATION OF A METHOD FOR STRAIGHTENING COEFFICIENTS DISTORTED DUE TO MULTICOLLINEARITY IN REGRESSION MODELS WITH DIFFERENT DEGREES OF EXPLANATORY VARIABLES CORRELATION
  • M. P. Bazilevskiy  Irkutsk State Transport University, 15 Chernyshevskogo Str., Irkutsk 664074, Russian Federation

Abstract: When constructing regression models, one of the main problems is multicollinearity. This negative phenomenon distorts the regression coefficients and, in particular, their signs. Earlier, to solve the problem of multicollinearity, a method for straightening distorted coefficients was developed which is based on the construction of a fully connected linear regression model. One of the conditions for its applicability is a close correlation between all pairs of explanatory variables. In real applied problems, however, this condition is rarely met: most often, the explanatory variables correlate with each other to different degrees. The authors propose a new iterative algorithm for the method of straightening distorted coefficients. A feature of the algorithm is that it combines the advantages of both traditional multiple models and new fully connected regressions. The developed algorithm is universal and can be used to construct a regression equation with any structure of the correlation matrix. The new algorithm has been successfully applied to model freight transportation by rail in the Irkutsk region.

Keywords: regression analysis; correlation; multicollinearity; method for straightening distorted coefficients; fully connected linear regression model

ON BOUNDS OF THE STATIONARY WAITING TIME EXTREMAL INDEX IN M/G/1 SYSTEM WITH MIXTURE SERVICE TIMES
  • I. V. Peshkova  Petrozavodsk State University, 33 Lenina Prosp., Petrozavodsk 185910, Russian Federation, Karelian Research Centre of the Russian Academy of Sciences, 11 Pushkinskaya Str., Petrozavodsk 185910, Russian Federation

Abstract: It is proved that if the original stationary sequence has an m-component mixture distribution with stochastically ordered components, limit distributions exist for the maxima of all components, and the normalizing sequences are ordered, then the extremal index of the original sequence lies between the extremal indexes of the smallest and largest components. This result is used to estimate the extremal index of the stationary waiting time in an M/G/1 queueing system in which the service time is given by an m-component mixture distribution. An example of an M/Hm/1 system with hyperexponential service time is considered. Using the exact simulation approach, estimates of the extremal index of the stationary waiting time in the M/H2/1 system are obtained.

Keywords: extreme value distributions; extremal index; queueing system; stochastic ordering
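
As an illustration of the setting, the waiting times in an M/H2/1 queue can be generated with the classical Lindley recursion W(n+1) = max(0, W(n) + S(n) - T(n)). This is a minimal sketch of a plain simulation started from an empty system; it is not the exact simulation approach used in the paper, and the function names and parameter values are assumptions.

```python
import random


def sample_h2(rng: random.Random, p: float, mu1: float, mu2: float) -> float:
    """Draw a hyperexponential (two-component mixture) service time:
    rate mu1 with probability p, rate mu2 otherwise."""
    rate = mu1 if rng.random() < p else mu2
    return rng.expovariate(rate)


def simulate_waiting_times(n: int, lam: float, p: float, mu1: float,
                           mu2: float, seed: int = 0) -> list:
    """Return the waiting times of the first n customers of an M/H2/1 queue
    that starts empty, computed with the Lindley recursion."""
    load = lam * (p / mu1 + (1 - p) / mu2)
    if load >= 1:
        raise ValueError("the queue is unstable: load must be below 1")
    rng = random.Random(seed)
    waits = []
    w = 0.0  # the first customer finds an empty system
    for _ in range(n):
        waits.append(w)
        service = sample_h2(rng, p, mu1, mu2)
        interarrival = rng.expovariate(lam)
        w = max(0.0, w + service - interarrival)
    return waits


if __name__ == "__main__":
    waits = simulate_waiting_times(10_000, lam=0.5, p=0.3, mu1=1.0, mu2=3.0)
    print("largest waiting time observed:", max(waits))
```

A sequence produced this way could serve as input to an extremal index estimator, after discarding an initial warm-up segment so that the remaining values are close to stationary.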

OPTIMAL CONTROL OF A QUEUE-LENGTH DEPENDENT ADDITIONAL SERVER IN GI/M/1 QUEUE
  • Ya. M. Agalarov  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: A GI/M/1 queue is considered in which an additional server is available for serving customers from the queue. The additional server can be turned on and off depending on the current queue length.
The quantity optimized is the long-run total cost per unit time, equal to the difference between the amount paid for service and the losses due to customer waiting and additional server depreciation. The case of finite queue capacity, in which the losses also account for lost customers, is considered as well. It is proved that the cost function considered is unimodal. Necessary and sufficient conditions are given for the existence of the decision point (queue length) at which application of the additional server is optimal. A simple algorithm for controlling the decision point, requiring only observations of the cost function value, is provided.

Keywords: queuing system; redundancy; management; optimization
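
Unimodality is what makes a search based only on cost function values possible. The sketch below locates the maximizer of a strictly unimodal function over integer decision points by comparing neighboring values; it illustrates the general idea only and is not the author's control algorithm. The function name `argmax_unimodal` and the toy objective are assumptions.

```python
from typing import Callable


def argmax_unimodal(f: Callable[[int], float], lo: int, hi: int) -> int:
    """Return the integer in [lo, hi] that maximizes f, assuming that f
    strictly increases up to its maximum and strictly decreases after it."""
    if lo > hi:
        raise ValueError("empty search interval")
    while lo < hi:
        mid = (lo + hi) // 2
        if f(mid) < f(mid + 1):
            lo = mid + 1  # still on the increasing branch
        else:
            hi = mid      # at the maximum or on the decreasing branch
    return lo


if __name__ == "__main__":
    # Toy objective with its maximum at queue length 7 (hypothetical values).
    print(argmax_unimodal(lambda k: -(k - 7) ** 2, 0, 20))
```

The search needs about log2(hi - lo) pairs of observations. With noisy observations of the cost function, each comparison would have to be based on averaged measurements.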

ON THE OPTIMAL ANTENNA DEPLOYMENT FOR SUBTERAHERTZ V2X COMMUNICATIONS
  • E. A. Machnev  Peoples' Friendship University of Russia (RUDN University), 6 Miklukho-Maklaya Str., Moscow 117198, Russian Federation
  • V. A. Beschastnyi  Peoples' Friendship University of Russia (RUDN University), 6 Miklukho-Maklaya Str., Moscow 117198, Russian Federation
  • D. Yu. Ostrikova  Peoples' Friendship University of Russia (RUDN University), 6 Miklukho-Maklaya Str., Moscow 117198, Russian Federation
  • Yu. V. Gaidamaka  Peoples' Friendship University of Russia (RUDN University), 6 Miklukho-Maklaya Str., Moscow 117198, Russian Federation, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • S. Ya. Shorgin  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: Subterahertz (sub-THz, 100-300 GHz) communication is expected to provide very high data transfer rates in 6G systems. However, the coverage area of base stations (BS) will be very limited, since the signal attenuates strongly with distance and is easily blocked by any object in the signal path. Thus, BSs will need to be deployed densely, which is costly. To reduce the deployment density of BSs, a mechanism was proposed for relaying the signal using vehicles (V2V). This relaying method allows various options for the location of the antenna on vehicles, which raises the question of finding the optimal location. In this work, guided by the IEEE 802.15.3d specification and measurements of the signal propagation level at a frequency of 300 GHz, the authors developed a mathematical model for comparing multihop signal relay systems with different antenna locations. The authors consider the following quality of service indicators: coverage, BS availability, and data transfer rate. The results show that the windshield transmitter location has a lower data rate but more coverage, while the bumper and engine levels show similar performance. A windshield location is recommended, as it is less sensitive to the rate of technology integration and has a larger coverage area.

Keywords: 5G; New Radio; V2V; V2X; multihop communications
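
The strong attenuation with distance can be illustrated with the standard free-space path loss formula, 20 log10(4 pi d f / c). This is a rough sketch only: it ignores molecular absorption, blockage, and antenna gains, and it is not the measurement-based propagation model used by the authors. The function name and the example distances are assumptions.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def free_space_path_loss_db(distance_m: float, frequency_hz: float) -> float:
    """Free-space path loss in dB between isotropic antennas."""
    if distance_m <= 0 or frequency_hz <= 0:
        raise ValueError("distance and frequency must be positive")
    ratio = 4 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT_M_S
    return 20 * math.log10(ratio)


if __name__ == "__main__":
    for d in (1, 10, 100):
        loss = free_space_path_loss_db(d, 300e9)
        print(f"{d:>4} m at 300 GHz: {loss:.1f} dB")
```

Under this formula, every doubling of the distance adds about 6 dB of loss, and moving from 3 GHz to 300 GHz at a fixed distance adds 40 dB.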

FUZZY AVERAGING OPERATORS IN THE PROBLEM OF AGGREGATING FUZZY INFORMATION
  • V. L. Khatskevich  N. E. Zhukovsky and Y.A. Gagarin Air Force Academy, 54a Old Bolsheviks Str., 394064 Voronezh, Russian Federation

Abstract: The problem of aggregating fuzzy information by constructing fuzzy averaging operators is considered. Weighted fuzzy averages of systems of fuzzy numbers are studied, and a class of nonlinear fuzzy averages of systems of fuzzy numbers is introduced which extends the general class of dissipative numerical averages to fuzzy numbers. The properties of the corresponding averaging operators, which are "fuzzy" analogues of the characteristic properties of scalar aggregating functions, are established. This provides a justification for the use of the introduced fuzzy averaging operators in the problem of aggregation of fuzzy information. Here, the result of aggregating fuzzy information given by a set of fuzzy numbers is understood as a fuzzy number that reflects the essential features of this set.

Keywords: averaging fuzzy operators; aggregation of fuzzy information
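
A weighted fuzzy average can be illustrated for the simplest case of triangular fuzzy numbers, where standard fuzzy arithmetic reduces the weighted average to a componentwise one. This is a minimal sketch under that assumption; the class `TriangularFuzzyNumber` and the function `weighted_average` are hypothetical names, and the nonlinear averages introduced in the paper are not covered.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TriangularFuzzyNumber:
    """Fuzzy number with support [left, right] and membership 1 at peak."""
    left: float
    peak: float
    right: float

    def __post_init__(self) -> None:
        if not self.left <= self.peak <= self.right:
            raise ValueError("expected left <= peak <= right")


def weighted_average(numbers: Sequence[TriangularFuzzyNumber],
                     weights: Sequence[float]) -> TriangularFuzzyNumber:
    """Weighted average of triangular fuzzy numbers with nonnegative weights,
    which are normalized here so that they sum to one."""
    if len(numbers) != len(weights) or not numbers:
        raise ValueError("need equally many numbers and weights")
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative with a positive sum")
    shares = [w / total for w in weights]
    return TriangularFuzzyNumber(
        left=sum(s * n.left for s, n in zip(shares, numbers)),
        peak=sum(s * n.peak for s, n in zip(shares, numbers)),
        right=sum(s * n.right for s, n in zip(shares, numbers)),
    )


if __name__ == "__main__":
    a = TriangularFuzzyNumber(0.0, 1.0, 2.0)
    b = TriangularFuzzyNumber(2.0, 3.0, 4.0)
    print(weighted_average([a, b], [1.0, 1.0]))
```

The sketch shows the fuzzy analogue of idempotency: averaging copies of the same fuzzy number returns that number, whatever the weights.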

ON THE COMPLEXITY OF LOGICAL CLASSIFICATION LEARNING PROCEDURES
  • E. V. Djukova  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • A. P. Djukova  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The complexity of logical analysis of integer data is investigated. For special tasks of searching data for frequent and infrequent elements, on whose solution logical supervised classification procedures are based, asymptotics of the typical number of solutions are given. The technique for obtaining these estimates is based on methods for obtaining similar estimates for the intractable discrete problem of constructing (enumerating) irredundant coverings of an integer matrix, formulated in the paper as the problem of finding "minimal" infrequent elements. The new results mainly concern the metric (quantitative) properties of frequent elements. The obtained estimates for the typical number of frequently occurring fragments in precedent descriptions allow one to conclude that the use of algorithms for finding such fragments at the stage of training logical classifiers of the "Kora" type is promising.

Keywords: attribute; frequent elementary fragment; infrequent elementary fragment; monotone dualization; irredundant covering of integer matrix; supervised classification; classifier of "Kora" type

TECHNOLOGY FOR CLASSIFICATION OF CONTENT TYPES OF E-TEXTBOOKS
  • A. V. Bosov  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • A. V. Ivanov  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The problem of automatic classification of the educational content of an e-learning system, represented by tasks or practical examples, is addressed. A promising direction in the development of e-learning systems is the assessment of the quality of educational content. The need for such an assessment motivates the creation of an automated classifier. The main idea is to model the content as an object with two properties: a textual description in natural language and a set of formulas in the TeX scientific typesetting language. Using tasks from an electronic textbook on the theory of functions of a complex variable, a data set was prepared and labeled in accordance with this model. Four text classification algorithms were trained: a naive Bayes classifier, logistic regression, and single-layer and multilayer feedforward neural networks. For these classifiers, a number of experiments were carried out comparing the classification accuracy obtained with text content only, formula content only, and the full model. The experiments provided not only a formal comparison of the algorithms but also showed the fundamental advantage of the full model: when both the textual description and the TeX representation of the formulas are used, the classification accuracy significantly exceeds that of the one-factor algorithms, which confirms the readiness of the technology for practical application.

Keywords: e-learning system; training content; classification tasks and algorithms; content quality assessment; machine learning
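
The two-property content model can be sketched as a small feature extractor that produces token counts from the text only, from the formulas only, or from both. This is an illustrative sketch: the class `ContentItem`, the tokenization rules, and the `text:`/`tex:` prefixes are assumptions, not the authors' implementation, and the resulting counts would still have to be fed to a classifier such as naive Bayes.

```python
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContentItem:
    """Hypothetical model of a task: description text plus TeX formulas."""
    text: str
    formulas: List[str] = field(default_factory=list)


WORD_RE = re.compile(r"[a-z]+")
# TeX commands such as \int, single letters, numbers, and other symbols.
TEX_TOKEN_RE = re.compile(r"\\[A-Za-z]+|[A-Za-z]|\d+|[^\sA-Za-z\d]")


def extract_features(item: ContentItem, use_text: bool = True,
                     use_formulas: bool = True) -> Counter:
    """Count prefixed tokens so that text and formula features never collide."""
    counts = Counter()
    if use_text:
        counts.update("text:" + w for w in WORD_RE.findall(item.text.lower()))
    if use_formulas:
        for formula in item.formulas:
            counts.update("tex:" + t for t in TEX_TOKEN_RE.findall(formula))
    return counts


if __name__ == "__main__":
    item = ContentItem("Compute the integral", [r"\int_0^1 z\,dz"])
    print(extract_features(item))
```

Switching `use_text` and `use_formulas` reproduces the three experimental settings named in the abstract: text only, formulas only, and the full model.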

ON THE SCIENTIFIC PARADIGM OF INFORMATICS: THE CLASSIFICATION HIGH LEVEL OF ITS OBJECTS
  • I. M. Zatsman  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The approach of A. Solomonik to structuring the scientific paradigm of a "mature" science is considered. According to his approach, the description of such a science should include four components (philosophical foundations; axiomatics; classification of its objects; and system of terms) which can be developed separately but are combined into a single integral structure. Within the framework of this approach, it is proposed to begin the description of the paradigm of informatics by clarifying its position in united science (= science ∪ humanities) and constructing the high level of the classification of its objects. To position informatics, it is proposed to develop the idea of Denning and Rosenbloom about grouping scientific disciplines into Four Great Scientific Domains. To build the high level of the classification, Kristen Nygaard's idea of distinguishing objects of mental nature (concepts of human knowledge) from sensory-perceived objects is used. The purpose of the paper is to begin the description of the scientific paradigm of informatics based on the approach of A. Solomonik and the development of the ideas of Denning, Nygaard, and Rosenbloom, with the construction of the high level of the classification.

Keywords: scientific paradigm; scientific paradigm components; united science; classification informatics objects

MODEL AND TECHNOLOGY FOR DISCOVERING NEW TERMS IN MEDICAL TEXTS
  • I. M. Zatsman  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • O. V. Zolotarev  Russian New University, 22 Radio Str., Moscow 105005, Russian Federation
  • A. K. Khakimova  Russian New University, 22 Radio Str., Moscow 105005, Russian Federation
  • Gu Dongxiao  Hefei University of Technology, 193 Tunxi Road, Hefei, Anhui 230009, P.R. China

Abstract: The model of information technology for discovering new terms in medical texts, which belongs to the previously defined class of informatics' medium models, is considered. In the conducted experiment, the MeSH (Medical Subject Headings) dictionary, created and updated by the National Library of Medicine (USA), is used to determine the novelty of terms. New terms emerge because medical papers and other scientific texts present new knowledge about the studied diseases, methods of their treatment, and medicines used that has not yet been reflected in medical dictionaries and thesauri. In information systems of medical institutions, the proposed technology makes it possible to regularly update the profiles of the studied diseases corresponding to their subject domain. The aim of the paper is to describe the medium model of information technology for updating terminological profiles of diseases.

Keywords: medium models in informatics; medical texts; terminological profile; discovering new terms in texts

ABOUT THE SECURE ARCHITECTURE OF A MICROSERVICE-BASED COMPUTING SYSTEM
  • A. A. Grusho  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • N. A. Grusho  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • M. I. Zabezhailo  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • D. V. Smirnov  Sberbank of Russia, 19 Vavilov Str., Moscow 117999, Russian Federation
  • E. E. Timonina  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • S. Ya. Shorgin  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The paper discusses a network-centric system with a microservice architecture in which, for simplicity, all microservice computers are identical. Each microservice computer may fail or receive malicious code. The maximum negative impact on a microservice computer is a calculation error that provides the consumer with a wrong result. The tasks of detecting failed microservice computers and detecting microservice computers with malicious code are considered. In solving these tasks, elements of training are used. Correctly solved problems (conditions, source data, and correct answers) are accumulated in the memory of the control system. This means that one can restart any task with an already known correct result. The article also uses ideas and results of the present authors on ensuring information security with the use of metadata. Depending on the assumptions about the possible actions of malicious code, two classes of secure computing algorithms are built in the context of its possible impact on intermediate results in the flow of solved problems. The second class of algorithms works under the assumption that malicious code can correctly calculate the solution to the current problem with probability p and introduce distortion into the result with probability 1 - p. The authors consider three types of distortions that malicious code can introduce and that allow one to find the true solution either exactly or with a low probability of error.

Keywords: information security; secure computing under malicious code conditions; microservice architecture
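
The effect of restarting tasks with known answers can be quantified with a short calculation. If a compromised computer returns the correct result with probability p on each task, and the checks are independent and indistinguishable from ordinary tasks (both are assumptions of this sketch, not statements from the paper), then it passes k checks with probability p^k and is detected with probability 1 - p^k. The function names are hypothetical.

```python
def detection_probability(p: float, checks: int) -> float:
    """Probability that at least one of `checks` control tasks exposes a
    computer that answers correctly with probability p on each task."""
    if not 0.0 <= p <= 1.0 or checks < 0:
        raise ValueError("expected 0 <= p <= 1 and checks >= 0")
    return 1.0 - p ** checks


def checks_needed(p: float, target: float) -> int:
    """Smallest number of control tasks that reaches the target detection
    probability; requires p < 1 and 0 < target < 1."""
    if not 0.0 <= p < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("expected 0 <= p < 1 and 0 < target < 1")
    checks = 1
    while detection_probability(p, checks) < target:
        checks += 1
    return checks


if __name__ == "__main__":
    for p in (0.5, 0.9, 0.99):
        print(p, checks_needed(p, 0.99))
```

The closer p is to 1, the more stored tasks have to be replayed before a distortion is observed with the required confidence.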

LOGICAL RELATIONAL MODEL OF DATA STRUCTURES FOR PROBLEM SOLVING IN LAND USE MANAGEMENT
  • D. O. Briukhov  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • S. A. Stupnikov  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The paper belongs to the theoretical foundations of agroinformatics. More precisely, it concerns the development of methods and tools for the automation of land use management. A logical model of data structures intended for problem solving in the subject area is proposed. A registry of agroecological groups and types of lands, organizing the diversity of their geomorphological, soil, and agrochemical conditions, a registry of crops and cultivars, and a registry of agrotechnologies as complexes of technological operations for the management of crop production processes are formalized as sets of relations of the relational data model and relationships between them. Application of the logical model is demonstrated on several generic problems of land use management. Each problem is implemented as a declarative query expression over the logical model. The model is implemented in a relational database management system.

Keywords: logical relational model; land use management; problem solving
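
The idea of expressing a land use problem as a declarative query over relational registries can be sketched with an in-memory SQLite database. The schema, table names, and sample rows below are hypothetical and serve only as an illustration; they do not reproduce the registries formalized in the paper.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with hypothetical registries."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE land_types (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE crops (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE crop_suitability (
            land_type_id INTEGER REFERENCES land_types(id),
            crop_id INTEGER REFERENCES crops(id),
            PRIMARY KEY (land_type_id, crop_id)
        );
        INSERT INTO land_types VALUES (1, 'plain'), (2, 'slope');
        INSERT INTO crops VALUES (1, 'wheat'), (2, 'oats'), (3, 'clover');
        INSERT INTO crop_suitability VALUES (1, 1), (1, 2), (2, 3), (2, 2);
    """)
    return conn


def crops_for_land_type(conn: sqlite3.Connection, land_type: str) -> list:
    """Declarative query: which crops are registered as suitable for a land type?"""
    rows = conn.execute(
        """
        SELECT c.name
        FROM crops AS c
        JOIN crop_suitability AS s ON s.crop_id = c.id
        JOIN land_types AS lt ON lt.id = s.land_type_id
        WHERE lt.name = ?
        ORDER BY c.name
        """,
        (land_type,),
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(crops_for_land_type(build_demo_db(), "slope"))
```

The query states what is wanted (crops linked to a land type) and leaves the evaluation order to the database engine, which is the sense in which such problem formulations are declarative.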

UNIFIED MODEL OF NATIONAL DATA: DEVELOPMENT SCENARIOS
  • A. P. Suchkov  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The problem of information interaction between heterogeneous information systems is considered. The view is taken that this problem must be solved by creating, implementing, and maintaining unified data models within the main sections of the subject areas. In the long term, the scope must be enlarged to encompass the entire subject area of information interaction on a national scale. Based on the ontological approach, the author proposes a solution to the problem of finding effective and optimal ways to form national data models. Scenarios for the integration of departmental systems are considered as well.

Keywords: information interaction; unified data model; ontology; integration scenarios