«Systems and Means of Informatics»
Volume 25, Issue 1, 2015
Abstract and Keywords.
GRID AND CLOUD SERVICES SIMULATION AS AN IMPORTANT STEP OF THEIR DEVELOPMENT.
- V. V. Korenkov
- A. V. Nechaevskiy
- G. A. Ososkov
- D. I. Pryakhina
- V. V. Trofimov
- A. V. Uzhinskiy
Abstract: A new system for grid and cloud services simulation is described. It is
focused on improving the efficiency of grid-cloud systems development by using
work quality indicators of a real system to design and predict its evolution. For
these purposes, the simulation program is combined with a real monitoring system
of a grid-cloud service through a special database. The simulation principles
and their implementation in the SyMSim software package are described. An
example of using the program to simulate a general cloud structure is given.
Keywords: simulation; distributed data storage; cloud computing; Big Data; optimization; monitoring
COMBINING CORPUS AND THESAURUS INFORMATION FOR EXTRACTING SENTIMENT WORDS.
- N. V. Loukachevitch
- I. I. Chetviorkin
Abstract: The paper describes a combined approach to extraction of a
domain-specific sentiment lexicon. First, an initial version of a domain-specific
lexicon is obtained by applying a supervised model. At the second stage, the ordered
list of sentiment words is refined using thesaurus information. This combined
model is applied to several domains; finally, the domain-specific sentiment
lexicons are united to create an improved version of the Russian sentiment lexicon
in the generalized domain of products.
Keywords: sentiment analysis; domain adaptation; natural language processing; thesaurus
MULTICRITERIA METHOD FOR DETECTING NEAR-DUPLICATES IN A STREAM OF TEXT MESSAGES.
- A. Andreev
- D. Berezkin
- I. Kozlov
- K. Simakov
Abstract: The problem of near-duplicate detection in a stream of text messages
is considered. A model of a text document and a multicriteria duplicate
identification method are proposed. The model provides flexible adjustment for
different domains. The method is based on binary classification using a support
vector machine. The paper also provides a candidate prefiltering method
to ensure high efficiency of the approach. Several experiments with
data obtained from a stream of news articles were carried out. The results show
the feasibility of the suggested approach.
Keywords: near-duplicate detection; similarity measure; binary classification
CONTROL FLOW BASED TEST SUITE GENERATION.
- N. Voinov
- P. Drobintsev
- I. Nikiforov
- V. Kotlyarov
- I. Selin
Abstract: The article describes an approach to test suite
generation in accordance with standard structured coverage criteria based on the
control flow model. The approach relies on automatic test generation using
symbolic verification. Its main advantages are a reduction in the number of
generated tests, achieved by analyzing control flow data, and a smaller state
space for the verification system. The article contains the main
ideas of the approach, the formal model of control flow, and the tools for model
analysis. The results of piloting the approach in a set of projects devoted to
software development are also presented.
Keywords: testing automation; formal model; coverage criterion
ANALYSIS OF UCM-MODEL COVERAGE BY TEST SCENARIOS.
- N. Voinov
- P. Drobintsev
- I. Nikiforov
- V. Kotlyarov
Abstract: The article examines approaches to analysis of UCM-model coverage
by test scenarios generated based on integral coverage criteria. Existing criteria
for automatic generation of test scenarios from high-level UCM-specifications
are reviewed. Two approaches to analysis of UCM-model coverage are proposed:
the automatic one which provides information about covered and uncovered
elements, branches, and paths in one view, and the visual one which allows the
user to explicitly make sure that a UCM-model is covered by test scenarios. The
described approaches are implemented in the analysis tool which significantly
reduces the time needed to create a test set which covers a UCM-model. Future
plans on coverage analysis improvement are also mentioned.
Keywords: test generation criteria; test scenarios; UCM; specifications; analysis
GENERALIZED TABLE-BASED LL-PARSING.
- S. V. Grigorev
- A. K. Ragozina
Abstract: Syntax analysis is an important step of code analysis. The problem is
that the grammars have to be in a form which is deterministic, or at least
near-deterministic, for the chosen parsing technique. Generalized parsing algorithms,
namely Generalized LR and Generalized LL (GLL), make it possible to remove these
restrictions. Abstract analysis makes it possible to parse embedded languages in
order to support them in IDEs, in reengineering tasks, or when searching for
vulnerabilities (SQL injection). Abstract syntax analysis is based on the classic table-based analysis.
The generalized algorithm of top-down parsing without the use of predictive
tables was described earlier in order to extend the class of languages processed by
descent analyzers. This paper describes an approach to creation of a table-based
GLL-analyzer based on the proposed algorithm, which will be used later for an
abstract analyzer. This article describes the algorithm of generalized top-down
analysis, its modifications, and the results of comparison with the generalized
bottom-up parsing algorithm, which was implemented earlier.
Keywords: generalized parsing; GLL; RNGLR; abstract parsing; string-embedded languages
SYNTHESIS OF STABLE LINEAR PUGACHEV FILTERS AND EXTRAPOLATORS FOR STOCHASTIC SYSTEMS WITH WIDE BAND MULTIPLICATIVE NOISES.
- I. N. Sinitsyn
- E. R. Korepanov
Abstract: The article is dedicated to the analytical synthesis of continuous and
discrete uniquely asymptotically stable conditionally optimal linear Pugachev
filters and extrapolators (LPF and LPE) for stochastic systems (StS) with wide
band multiplicative Gaussian noises. It is supposed that observation is part
of the state and observation equations. The theorems serving as the basis for
the algorithms of synthesis of continuous uniquely asymptotically stable LPF and
LPE are proven. Continuous LPF and LPE for StS with wide band Gaussian
autocorrelated noises are presented. Discrete LPF and LPE for continuous and
discrete StS with wide band multiplicative Gaussian noises are considered. An
illustrative example is given. Some generalizations are considered.
Keywords: accuracy; continuous stochastic system; discrete stochastic system;
linear Pugachev extrapolator; linear Pugachev filter; multiplicative noises;
Riccati equation; unique asymptotical stability; wide band Gaussian
ON CONVERGENCE OF RANDOM SUMS OF INDEPENDENT RANDOM VECTORS TO MULTIVARIATE GENERALIZED VARIANCE-GAMMA DISTRIBUTIONS.
Abstract: The purpose of this work is to describe the conditions for convergence
of the distributions for sums of a random number of independent not necessarily
identically distributed multivariate random variables to multivariate normal
variance-mean mixtures, in particular, to multivariate generalized variance-gamma distributions.
Keywords: random sum; multivariate normal variance-mean mixture; multivariate generalized hyperbolic distribution; multivariate generalized variance-gamma
distribution; generalized inverse Gaussian distribution; generalized gamma distribution
LARGE CAPACITY OF RAILWAY CARGO TRANSPORTATION FORECASTING.
- R. K. Gazizullina
- M. M. Medvednikova
- V. V. Strijov
Abstract: The article is devoted to research of the algorithm of nonparametric
forecasting of railway cargo transportation capacity. The problem considered is
forecasting the number of wagons with various goods, following various routes.
The topology of the railway network is given: for all possible pairs of railway
lines, information is provided about all blocks of wagons that have moved from
one line to another, including the number of wagons in a block, the type of cargo,
and the date of the route. The algorithm, based on convolution of the
empirical density distribution of the values of time series with the loss function,
is used for prediction. Previously, forecasting was carried out for each railway
junction separately. It is proposed to improve the quality of forecasting by
predicting for pairs of lines instead of predicting the departure of all wagons from the
given junction. The algorithm is illustrated by the daily data on transportation
of 38 types of cargo collected during a year and a half.
Keywords: forecasting; nonparametric method; railroad station occupancy; loss
function; empirical distribution; compression
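The forecasting idea in this abstract can be illustrated with a minimal sketch. It assumes that the forecast is the value minimizing the expected loss under the empirical distribution of past observations; when the loss depends only on the difference between forecast and actual value, this expected loss is a discrete convolution of the empirical density with the loss function. All function names, the asymmetric loss, and the sample data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def empirical_distribution(history):
    """Return {value: relative frequency} for the observed series."""
    counts = Counter(history)
    total = len(history)
    return {value: n / total for value, n in counts.items()}


def expected_loss(candidate, distribution, loss):
    """Expected loss of forecasting `candidate` under the empirical distribution."""
    return sum(p * loss(candidate, value) for value, p in distribution.items())


def forecast(history, loss, candidates=None):
    """Pick the candidate forecast with the smallest expected loss."""
    distribution = empirical_distribution(history)
    if candidates is None:
        # Assumption: integer-valued series, search between observed extremes.
        candidates = range(min(history), max(history) + 1)
    return min(candidates, key=lambda c: expected_loss(c, distribution, loss))


def absolute_loss(predicted, actual):
    """Symmetric loss; its minimizer is a median of the empirical distribution."""
    return abs(predicted - actual)


def asymmetric_loss(predicted, actual, under=3.0, over=1.0):
    """Hypothetical loss that penalizes under-prediction more than over-prediction."""
    diff = actual - predicted
    return under * diff if diff > 0 else over * -diff


# Hypothetical daily wagon counts for one pair of railway lines.
daily_wagons = [4, 5, 5, 6, 7, 5, 9, 4, 6, 5]
print(forecast(daily_wagons, absolute_loss))
print(forecast(daily_wagons, asymmetric_loss))
```

Changing the loss function shifts the forecast toward a different quantile of the empirical distribution, which is why the choice of loss matters as much as the data.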
SOME APPROACHES TO FORMING THE REGULATORY AND TECHNICAL BASE FOR THE UNIFIED INFORMATION SPACE OF RUSSIA IN THE FIELD OF INFORMATION RESOURCES.
- A. A. Zatsarinny
- E. V. Kiselev
Abstract: The methodical approaches to forming the regulatory and technical
base for the unified information space of the Russian Federation (UIS RF) in the
field of systematization and interaction of information resources are developed.
It is proposed to create two centralized components of the UIS RF federal level
as a mega system with their own information resources: the control center and
the world information space interaction center. The authors suggest a generalized
model of forming and interaction for secured information resources included in
the unified information space. The general approach to creation, interaction, and
usage of secured information resources at the site of a mega system participant
is described as well. A generalized participant model is presented that includes
a general circuit of secured information resources. This circuit comprises three
independent circuits for open, confidential, and enclosed information resources.
Some issues of design of profiles of open system environments for participants
of the unified information space are determined. Finally, a model of the process
of creation of an open system environment profile for a participant of the
unified information system is suggested on the basis of the Russian guidance
Keywords: secured information resources; UIS RF control center; world
information space interaction center; UIS RF secured interaction gateways;
All-Russian System of Electronic Interaction; participant's secured information
resources circuits; participant's databank; participant's information archive;
participant's unified information (information and telecommunication) secured
system; participant's open system environment profile design
TECHNOLOGY FOR PREVENTION OF DUPLICATION OF BIBLIOGRAPHIC DESCRIPTIONS IN THE SCIENTIFIC DATABASE BIAS IPI RAS.
- M. Yu. Zaikin
- V. S. Dolgopolov
- O. L. Obuhova
- I. V. Soloviev
Abstract: The paper considers the developed technology aimed at avoiding
duplication of bibliographic descriptions in the scientific database Bibliographic
Information-Analytical System (BIAS) of IPI RAS. An analysis of the causes
of duplication is given. The constituent parts of the developed software are the
modules of definition of similarity using the methods of fuzzy search based on
the Oliver algorithm and the modules of visualization of the results which are
built into the system at the level of formation of the database content. Program
modules of visualization allow moderators of BIAS IPI RAS to receive full
information about the conflicts. They will be able to decide on further action
using additional information. The concept of similarity index used in the software
modules of definition of similarity is introduced. The paper considers the formal
data model underlying construction of the database, built on the principles of
facet navigation. Application of the developed software made it possible to detect
and remove duplicate bibliographic descriptions in the scientific database.
Keywords: similarity index; software modules of definition of similarity; method
of fuzzy search on the Oliver algorithm; facet navigation
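A fuzzy comparison of two bibliographic descriptions in the spirit of the Oliver algorithm can be sketched as follows. The sketch finds the longest common substring, recurses on the parts to its left and right, and sums the matched lengths. The abstract does not define the similarity index, so the normalization used here (twice the number of matched characters divided by the total length) and the threshold in `is_probable_duplicate` are assumptions made for illustration; all names are hypothetical.

```python
def common_chars(a, b):
    """Oliver-style match count: longest common substring plus recursion on both sides."""
    if not a or not b:
        return 0
    best_len = best_i = best_j = 0
    # Dynamic-programming search for the first longest common substring.
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len = cur[j]
                    best_i, best_j = i - cur[j], j - cur[j]
        prev = cur
    if best_len == 0:
        return 0
    left = common_chars(a[:best_i], b[:best_j])
    right = common_chars(a[best_i + best_len:], b[best_j + best_len:])
    return best_len + left + right


def similarity_index(a, b):
    """Assumed normalization: 1.0 for identical strings, 0.0 for no shared characters."""
    if not a and not b:
        return 1.0
    return 2.0 * common_chars(a, b) / (len(a) + len(b))


def is_probable_duplicate(a, b, threshold=0.9):
    """Flag a pair for moderator review; the threshold value is a placeholder."""
    return similarity_index(a.lower(), b.lower()) >= threshold


print(similarity_index("World", "Word"))
```

In a workflow like the one described, pairs above the threshold would be shown to a moderator rather than merged automatically, since the final decision relies on additional information.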
CREATION OF PATIENT FLOWS CONTROL SYSTEM.
- G. Y. Ilushin
- V. I. Limansky
Abstract: This article deals with the health care system organizational model
regarding rendering of primary medical care to the population and the hierarchical
three-level structure of medical institutions formed during the process of its
reforming. The analysis of the existing ways of patient routing is carried out,
including patient booking in medical institutions of the first level (the attached
contingent), the redirection of patients to medical institutions of the second and
the third levels by means of coupons and electronic assignments issued by doctors.
The analysis of advantages and disadvantages of these ways is presented. The
electronic control systems of patient flows used for patient routing are examined
as well as some problems of their implementation, in particular, implementation
of EMIAS in Moscow. The analysis of electronic control systems of patient flows
is given. Their main components are defined. The structure of this system based
on the principles of a single point of distributed systems components interaction
is suggested. The processes taking place in the system during execution of the
main use cases (precedents) are examined. The data flows between components of the system
arising during these processes are defined. The paper also deals with approaches
to organization of electronic documents flow between the medical institutions
based on application of electronic appointments.
Keywords: medical information systems (MIS); electronic control systems of
patient flows; distributed information systems; sequence diagram
INFORMATION RESOURCES FOR CONTRASTIVE STUDIES:
Abstract: This article presents information resources used in contrastive linguistic studies and their principal features. There are two main types of such
information resources: typological databases and electronic text corpora. The
attention is focused on typological databases, which can be subdivided into general-purpose typological databases and specialized typological databases.
General-purpose typological databases act as repositories of a wide range of data
on a wide variety of languages. They may be utilized as reference resources
and may also be helpful while dealing with language classification problems.
Specialized typological databases are used for closer investigation of specific
language phenomena in restricted sets of languages. They supply detailed models
of such phenomena and include examples to illustrate their functioning in the
considered languages. The paper also looks into problems related to development
of typological databases and to integration of data from heterogeneous typological
Keywords: contrastive linguistic studies; databases; typological databases;
electronic text corpora
ABOUT ACADEMICIAN I. A. MIZIN'S CONTRIBUTION TO THEORY AND PRACTICE OF DOMESTIC INFORMATION-TELECOMMUNICATION SYSTEMS CREATION: TO THE 80th ANNIVERSARY.
- I. A. Sokolov
- A. A. Zatsarinny
- V. N. Zakharov
Abstract: The article is devoted to the 80th anniversary of academician
I. A. Mizin, head of IPI RAS in 1989-1999, an outstanding scientist, designer
and engineer. A brief biography concerning his scientific work is presented. The
scientific and practical contribution of I. A. Mizin to the theory and its applications in creation of domestic information-telecommunication systems (ITS) is
considered in three directions of his work. The first direction is development
and implementation of the data communication system for the Armed
Forces ACS (automated control system), which was the first domestic network
with packet switching. The second direction is justification of information
technologies for creation of the huge territorial system of data communication in
the regions of Russia. The third direction is creation and development of information networks for the purposes of government with requirements of information
Keywords: chief designer; academician; information telecommunication networks;
information technologies; data communication system; packet switching;
methods of data communication; information protection; link channels;