Institute of Informatics Problems of the Russian Academy of Sciences

«INFORMATICS AND APPLICATIONS»
Scientific journal
Volume 10, Issue 1, 2016

Abstract and Keywords.

DATA ACCESS CHALLENGES FOR DATA INTENSIVE RESEARCH IN RUSSIA.
  • L. A. Kalinichenko Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation; Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, GSP-1, Moscow 119991, Russian Federation
  • A. A. Volnova Space Research Institute of the Russian Academy of Sciences, 84/32 Profsoyuznaya Str., Moscow 117997, Russian Federation
  • E. P. Gordov Siberian Center for Environmental Research and Training, Institute of Monitoring of Climatic and Ecological Systems of the Siberian Branch of the Russian Academy of Sciences, 10/3 Akademicheski Av., Tomsk 634055, Russian Federation
  • N. N. Kiselyova A. A. Baikov Institute of Metallurgy and Materials Science of the Russian Academy of Sciences, 49 Leninsky Av., GSP-1, Moscow 119991, Russian Federation
  • D. A. Kovaleva Institute of Astronomy of the Russian Academy of Sciences, 48 Pyatnitskaya Str., Moscow 119017, Russian Federation
  • O. Yu. Malkov Institute of Astronomy of the Russian Academy of Sciences, 48 Pyatnitskaya Str., Moscow 119017, Russian Federation
  • I. G. Okladnikov Siberian Center for Environmental Research and Training, Institute of Monitoring of Climatic and Ecological Systems of the Siberian Branch of the Russian Academy of Sciences, 10/3 Akademicheski Av., Tomsk 634055, Russian Federation
  • N. L. Podkolodnyy Center for Bioinformatics, Federal Research Center Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, 10 Acad. Lavrentyeva Av., Novosibirsk 630090, Russian Federation
  • A. S. Pozanenko Space Research Institute of the Russian Academy of Sciences, 84/32 Profsoyuznaya Str., Moscow 117997, Russian Federation
  • N. V. Ponomareva Research Center of Neurology, 80 Volokolamskoe Shosse, Moscow 125367, Russian Federation
  • S. A. Stupnikov Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • A. Z. Fazliev Integrated Information Systems Center, Institute of Atmospheric Optics of the Siberian Branch of the Russian Academy of Sciences, 1 Acad. Zuev Sq., Tomsk 634055, Russian Federation

Abstract: The goal of this survey is to analyze global trends in the development of massive data collections and related infrastructures in order to evaluate the opportunities for shared use of such collections in research, decision making, and problem solving in various data intensive domains (DIDs) in Russia. The representative set of DIDs selected for the survey includes astronomy, genomics and proteomics, neuroscience (human brain investigation), materials science, and Earth sciences. For each of these DIDs, the strategic initiatives (or large projects) in the USA and Europe aimed at creating big data collections and the respective infrastructures planned up to 2025 are briefly overviewed. The information technology projects aimed at developing infrastructures that support access to and analysis of such data collections are also briefly overviewed. The set of large data collections included in the survey and expected to be created soon is intended as a reference point for the design and development of research infrastructures for data management and analysis, making them compatible with foreign open research infrastructures. In particular, the data collections considered in the survey, the goals of their creation, and the research planned on their basis make it possible to proceed to the design and implementation of advanced components of research infrastructures, such as conceptualization facilities for the application domains investigated in data intensive research, the respective metamodels, and components intended for data reuse and for reproducing programs and workflows.

Keywords: fourth paradigm; data intensive domains; research infrastructures; data collections; big data

CO-LENDING SYSTEMIC RISK ANALYSIS OVER HETEROGENEOUS DATA COLLECTIONS.
  • S. A. Stupnikov Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • D. O. Briukhov Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • N. A. Skvortsov Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The paper considers an approach to solving the problem of co-lending systemic risk analysis over heterogeneous data collections in a combined virtual and materialized integration environment. The problem belongs to the data-intensive domain of financial macromodeling. Virtual integration is implemented using the subject mediation technology. Materialized integration is implemented using the Hadoop open-source software framework for distributed storage and processing of large datasets accompanied by the Hive system intended for relational warehousing over Hadoop.

Keywords: co-lending systemic risk; problem solving; data integration; heterogeneous data collections

ORTHOGONAL SUBOPTIMAL FILTERS FOR NONLINEAR STOCHASTIC SYSTEMS ON MANIFOLDS.
  • I. N. Sinitsyn Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The authors developed the synthesis theory of suboptimal filters (SOF) based on the normal approximation method (NAM), statistical linearization method (SLM), orthogonal expansions method (OEM), and quasi-moment method (QMM) for nonlinear differential stochastic systems on manifolds (MStS) with Wiener and Poisson noises. Exact optimal (for mean square error criteria) equations for MStS with Gaussian noises in observation equations for the one-dimensional a posteriori characteristic function are derived. Problems of approximate solution of the exact equations are discussed. Accuracy and sensitivity equations are presented. A test example for the nonlinear scalar differential equation with additive and multiplicative noises is given. Some generalizations are mentioned.

Keywords: a posteriori one-dimensional distribution; coefficient of orthogonal expansion; first sensitivity function; normal approximation method; normal suboptimal filter; orthogonal expansion method; orthogonal suboptimal filter; quasi-moment method; quasi-moment; statistical linearization method; stochastic system on manifolds; suboptimal filter; Wiener white noise

ANALYTICAL MODELING OF DISTRIBUTIONS IN STOCHASTIC SYSTEMS ON MANIFOLDS BASED ON ELLIPSOIDAL APPROXIMATION.
  • I. N. Sinitsyn Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • V. I. Sinitsyn Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: Accuracy and sensitivity problems for algorithms of structural analytical modeling of one- and multidimensional distributions based on the method of ellipsoidal approximation (MEA) and the method of ellipsoidal linearization (MEL) are considered. General algorithms of ellipsoidal stochastic analysis in nonlinear stochastic systems on manifolds (MStS) with Wiener and Poisson noises are developed. Special attention is paid to MStS with additive noises. Accuracy and sensitivity equations are presented. For two-dimensional nonlinear circular MStS, accuracy and sensitivity equations are derived. Equations make it possible to calculate probability moments up to the fourth order. Algorithms based on orthogonal expansion of one-dimensional density are compared with algorithms based on MEA (MEL). Some generalizations are given.

Keywords: accuracy equations; analytical modeling; method of ellipsoidal linearization (MEL); method of ellipsoidal approximation (MEA); sensitivity equations; stochastic system on manifold (MStS)

METAPROGRAMMING TO INCREASE MANUFACTURABILITY OF LARGE-SCALE SOFTWARE-INTENSIVE SYSTEMS.
  • S. P. Kovalyov Institute of Control Sciences, Russian Academy of Sciences, 65 Profsoyuznaya Str., Moscow 117997, Russian Federation

Abstract: An approach is proposed to reduce the design costs of large-scale software-intensive systems by applying modern metaprogramming technologies. Model-driven engineering and aspect-oriented software development are considered the most advanced among such technologies. Methods for scaling these technologies are presented so that they can be applied efficiently as the target system grows, via closure with regard to basic structural relations. Design of mathematical software for smart electric grids is considered as a case study for practical application of the approach. Principles of a mathematical apparatus, based on category theory, for constructing, analyzing, and optimizing technological design procedures are described. The process of designing a generator of computational software components for large-scale systems using category-theoretical methods is outlined.

Keywords: large-scale software-intensive systems; metaprogramming; megamodel; category theory; colimit; model driven engineering; aspect-oriented software development; smart grid

BAYESIAN QUEUEING AND RELIABILITY MODELS: A PRIORI DISTRIBUTIONS WITH COMPACT SUPPORT.
  • A. A. Kudryavtsev Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, GSP-1, Moscow 119991, Russian Federation; Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: This work is the latest in a series of articles devoted to the study of Bayesian queueing and reliability models. The paper presents relations for the distribution function and the density of the quotient ρ of two independent random variables with a priori distributions with compact support; the variables are interpreted as a parameter "obstructing" the functioning of the system and a parameter "conducing" to the functioning of the system. The life cycle of many real systems is described in terms of ρ; for example, in queueing theory, the parameter ρ is called the "system load factor" and appears in many formulas that describe various characteristics. The paper considers particular cases of a priori distributions with compact support for which the densities have polynomial or piecewise polynomial form.

Keywords: Bayesian approach; mass service theory; reliability theory; mixed distributions; distributions with compact support
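
Illustration (not taken from the paper): the distribution function of a quotient of two independent positive random variables can be checked numerically. The sketch below assumes the simplest compact-support priors, uniform distributions, and writes the load factor as rho = lambda/mu; the symbols lambda and mu, the function names, and the chosen intervals are assumptions made only for this example. It uses the standard identity P(lambda/mu <= x) = ∫ F_lambda(x*y) f_mu(y) dy and compares it with a Monte Carlo estimate.

```python
import random


def uniform_cdf(t, lo, hi):
    """Distribution function of a uniform law on [lo, hi]."""
    if t <= lo:
        return 0.0
    if t >= hi:
        return 1.0
    return (t - lo) / (hi - lo)


def quotient_cdf(x, lam_range, mu_range, steps=10_000):
    """P(lambda / mu <= x) for independent uniform lambda and mu (mu > 0).

    Uses P(lambda <= x * mu) = integral of F_lambda(x * y) * f_mu(y) dy,
    evaluated with the midpoint rule. Since f_mu is constant, the integral
    reduces to the average of F_lambda(x * y) over the midpoints.
    """
    lam_lo, lam_hi = lam_range
    mu_lo, mu_hi = mu_range
    h = (mu_hi - mu_lo) / steps
    total = 0.0
    for i in range(steps):
        y = mu_lo + (i + 0.5) * h
        total += uniform_cdf(x * y, lam_lo, lam_hi)
    return total / steps


def quotient_cdf_mc(x, lam_range, mu_range, n=200_000, seed=0):
    """Monte Carlo estimate of the same probability (sanity check)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        lam = rng.uniform(*lam_range)
        mu = rng.uniform(*mu_range)
        if lam <= x * mu:
            hits += 1
    return hits / n
```

For example, with lambda uniform on [1, 2] and mu uniform on [3, 4], F_lambda(0.5*y) = 0.5*y - 1 on the whole support of mu, so quotient_cdf(0.5, (1.0, 2.0), (3.0, 4.0)) evaluates to 0.75, and the Monte Carlo estimate should agree to within sampling error.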

CONCEPT OF ONLINE SERVICE FOR STOCHASTIC MODELING OF REAL PROCESSES.
  • A. K. Gorshenin Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation; Moscow Technological University (MIREA), 78 Vernadskogo Ave., Moscow 119454, Russian Federation

Abstract: Analysis of information flows based on probabilistic models is widely used in many applied fields. The article describes the basic principles of constructing a new online system for stochastic modeling of real processes. The system has no analogue owing to the universality of its set of methods and its design as an Internet resource: the end user does not need to check a personal computer's specifications and can upload data to the server immediately and then process the samples.

Keywords: probability mixtures; moving separation of mixtures; data mining; online software; matrix computing

DEVELOPMENT OF THE ALGORITHM OF NUMERICAL SOLUTION OF THE OPTIMAL INVESTMENT CONTROL PROBLEM IN THE CLOSED DYNAMICAL MODEL OF THREE-SECTOR ECONOMY.
  • P. V. Shnurkov National Research University Higher School of Economics, 34 Tallinskaya Str., Moscow 123458, Russian Federation
  • V. V. Zasypko National Research University Higher School of Economics, 34 Tallinskaya Str., Moscow 123458, Russian Federation
  • V. V. Belousov Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • A. K. Gorshenin Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation; Moscow Technological University (MIREA), 78 Vernadskogo Ave., Moscow 119454, Russian Federation

Abstract: The paper develops a numerical method for solving the optimal investment control problem in the closed dynamical model of a three-sector economy. Preceding papers described an analytical study of this problem by a method based on the Pontryagin maximum principle. In the present paper, the authors obtain analytical representations for the state functions. Conjugate variables are used as the foundation of the numerical algorithm. The developed algorithm makes it possible to analyze the class of admissible control functions having no more than a given finite number of switching points and to find among them those that satisfy the necessary optimality conditions and the constraints of the original problem. The general scheme of the proposed algorithm can be used to investigate other optimal control problems in different subject areas. The developed algorithm is implemented in a system of application programs.

Keywords: model of three-sector economy; Pontryagin maximum principle; numerical method of solution of the optimal control problem

FINE-GRAINED HYBRID INTELLIGENT SYSTEMS. PART 2: BIDIRECTIONAL HYBRIDIZATION.
  • I. A. Kirikov Kaliningrad Branch of the Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 5 Gostinaya Str., Kaliningrad 236000, Russian Federation
  • A. V. Kolesnikov Kaliningrad Branch of the Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 5 Gostinaya Str., Kaliningrad 236000, Russian Federation; Immanuel Kant Baltic Federal University, 14 Nevskogo Str., Kaliningrad 236041, Russian Federation
  • S. V. Listopad Kaliningrad Branch of the Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 5 Gostinaya Str., Kaliningrad 236000, Russian Federation
  • S. B. Rumovskaya Kaliningrad Branch of the Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 5 Gostinaya Str., Kaliningrad 236000, Russian Federation

Abstract: The problem of interdisciplinary tools is considered, and the relevance of studying the "grain" property of hybrids in informatics is established. Properties of the functional and instrumental heterogeneity of complex tasks are investigated, and the results of modeling fine-grained hybrids within the theory of schemes of role conceptual models are presented. The results are presented within the linguistic approach, the core of which is the transformation of verbalized information about objects-originals (complex subjects) and objects-prototypes (modeling approaches) into objects-results (functional hybrid intelligent systems). This information exists in the poly-languages of professional activity. The transformation is directed by heuristics, which are schemes of role conceptual models in the informal axiomatic theory. The categorical core of the theory is "resource-property-operation-relation." The notion of bidirectional hybridization, its benefits, and the first results are presented.

Keywords: logical-mathematical intelligence; hybrid intelligent systems; linguistic approach; theory of role conceptual models; fine-grained hybrids; bidirectional hybridization

REPRESENTATION OF CROSS-LINGUAL KNOWLEDGE ABOUT CONNECTORS IN SUPRACORPORA DATABASES.
  • I. M. Zatsman Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • O. Yu. Inkova Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation; University of Geneva, 22 Bd des Philosophes, CH-1205 Geneva 4, Switzerland
  • M. G. Kruzhkov Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • N. A. Popkova Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The article considers "supracorpora databases," which are used in contrastive linguistic studies. Such databases result from processing of parallel texts from bilingual parallel subcorpora within the Russian National Corpus. Each of these parallel texts contains either one original Russian text with one or more translations into a foreign language, or one original text in a foreign language with one translation into Russian. Every source text is aligned with its translation(s) at the level of sentences. Supracorpora databases are a new type of linguistic resources designed for goal-oriented discovery of new knowledge about various linguistic units. This knowledge is needed to improve the quality of machine translation, to update monolingual and bilingual grammars, and to modernize a wide range of academic courses in such fields as linguistics and translation studies. The article describes the underlying conceptual foundations of the database and gives an example of how it can be implemented to represent knowledge about Russian connectors and their French translation correspondences.

Keywords: cross-lingual studies; Russian connectors; representation of knowledge about connectors; supracorpora databases

BioNLP ONTOLOGY EXTRACTION FROM A RESTRICTED LANGUAGE CORPUS WITH CONTEXT-FREE GRAMMARS.
  • D. A. Alexeyevsky National Research University Higher School of Economics, 20 Myasnitskaya Str., Moscow 101000, Russian Federation

Abstract: BioNLP is an emerging area of NLP that brings new challenging objects for language processing and new valuable resources for bioinformatics and medicine. One notable task in BioNLP is creating de novo ontologies. This is generally a tedious process; however, in some cases, it is possible to automate it to some extent. One such case is when a corpus of texts in a restricted subset of natural language is available. This paper presents a simple approach to automating ontology creation in such cases. The approach aims to simplify the mapping of entities in natural texts to predefined ontologies wherever possible. The paper discusses which properties of the corpus enable the approach presented.

Keywords: BioNLP; ontology creation; context-free grammar
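
Illustration (not the author's grammar or implementation): the general idea of using a context-free grammar over a restricted language to obtain ontology relations can be sketched with a toy grammar, sentence -> term relation term, term -> WORD term | WORD, and a small backtracking parser that returns (subject, relation, object) triples. The relation phrases, labels, and function names below are hypothetical and chosen only for this example.

```python
# Hypothetical relation phrases of the toy restricted language,
# mapped to ontology relation labels.
RELATIONS = {
    ("is", "a"): "is_a",
    ("is", "part", "of"): "part_of",
    ("regulates",): "regulates",
}


def parse_term(tokens, pos):
    """term -> WORD term | WORD ; yield every possible end position."""
    if pos < len(tokens):
        yield pos + 1
        yield from parse_term(tokens, pos + 1)


def parse_relation(tokens, pos):
    """relation -> one of the fixed phrases; yield (end position, label)."""
    for words, label in RELATIONS.items():
        end = pos + len(words)
        if tuple(tokens[pos:end]) == words:
            yield end, label


def parse_sentence(text):
    """sentence -> term relation term ; return a triple or None."""
    tokens = text.lower().rstrip(".").split()
    for subj_end in parse_term(tokens, 0):
        for rel_end, label in parse_relation(tokens, subj_end):
            for obj_end in parse_term(tokens, rel_end):
                if obj_end == len(tokens):
                    subject = " ".join(tokens[:subj_end])
                    obj = " ".join(tokens[rel_end:])
                    return (subject, label, obj)
    return None  # the sentence is outside the restricted language
```

With this toy grammar, parse_sentence("Hexokinase is a kinase.") yields ("hexokinase", "is_a", "kinase"), while a sentence containing none of the relation phrases yields None. A real restricted-language corpus would need a grammar derived from its actual sentence patterns.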

COMPLEXITY AND ITS INFORMATION CONTENT.
  • N. Callaos International Institute of Systemic, Cybernetics and Informatics, USA-Venezuela, 2206 Tillman Av., Winter Garden, FL 34787, USA
  • R. Seyful-Mulyukov Institute of Informatics Problems, Federal Research Center “Computer Sciences and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The word 'information' has been used in many senses and its related concepts have been defined in different ways. One of the senses in which the word is used relates to a concept which is considered one of the main properties of matter. The definition of this conception of information supports the expression of concepts such as Complexity and Self-Organization. In this paper, Complexity and Self-Organization concepts are applied to systems at the macro- and microlevels. Their similarities and differences are analyzed and information content is considered. The regularities of Complexity and Self-Organization are applied to petroleum as a complex natural thermodynamic system. Petroleum reflects all of the main and widely understood features of Complexity and Self-Organization but demonstrates additional properties which were not considered earlier. Complexity and Self-Organization can help to deepen our understanding of the origin of hydrocarbon molecules, their age, and behavior in the process of petroleum generation in general.

Keywords: complexity; complexity properties; complex system; self-organization; artificial complexity; natural complexity; petroleum origin; hydrocarbon molecule complexity; petroleum information content