Annual Proceedings of the Institute of Informatics Problems (IPI RAN)
The Systems and Means of Informatics, Issue 17, 2007
The 17th issue of the proceedings continues the tradition of presenting the main results of R&D activities of the Institute of Informatics Problems (IPI RAN).
The proceedings contain a selection of scientific papers on the most significant results of IPI RAN's work in the following research areas: information-telecommunication systems and networks, modelling, the informatization of society, information technologies, architecture and system solutions of computation complexes and new generation networks.
The collection of works is intended for researchers, engineers and post-graduates interested in the current state of R&D activities in informatics and computer engineering.
I. Information-telecommunication systems and networks, their modeling, computerization of the society
- Alexander Zatsarinnyi, Yuri Ionenkov
Some aspects of information-telecommunication networks (ITN) formation
The authors present a general approach to selection of appropriate technologies for deployment of up-to-date information-telecommunication networks. The paper describes the key features of technologies for multiservice networks formation, suggests criteria for their selection and provides the results of their comparative analysis.
- Yaver Agalarov
Dynamic distribution policy of computational resources of the local Grid node
The author proposes a function for calculating local priorities of multiprocessor jobs in a Grid environment. The function defines the minimal required fare for job execution depending on the current state of the resources. The local priority of an incoming job therefore depends on the fare the user pays for the resources and on the minimal required fare for completing the job.
- Mikhail Krivenko
Recognition of image object with different picture size elements
The problem of textual image recognition when some symbols have different picture size is considered. The author presents a new statistical image object recognition method based on the use of Gaussian mixture densities in the context of the Bayesian decision rule. In addition to this, the methods based on the distance between images and classes are examined. To assess the threshold in the measuring method of the correspondence between partitions of a finite set of objects, a solution based on mixture distribution division is presented. The ideas are illustrated with an example of real text recognition procedure.
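As a rough illustration of the general idea named in the abstract (class-conditional Gaussian mixture densities plugged into the Bayes decision rule), the following minimal sketch may help. It is not the author's method: the diagonal covariances, the function names and the toy parameters in `CLASS_MODELS` and `PRIORS` are all assumptions made for the example.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a Gaussian with diagonal covariance at feature vector x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)

def mixture_pdf(x, components):
    """Mixture density: weighted sum over (weight, mean, var) triples."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)

def classify(x, class_models, priors):
    """Bayes decision rule: choose the class maximizing prior * class density."""
    return max(class_models, key=lambda c: priors[c] * mixture_pdf(x, class_models[c]))

# Hypothetical two-class model with made-up parameters (not taken from the paper).
CLASS_MODELS = {
    "a": [(0.5, (0.0, 0.0), (1.0, 1.0)), (0.5, (1.0, 1.0), (1.0, 1.0))],
    "b": [(1.0, (5.0, 5.0), (1.0, 1.0))],
}
PRIORS = {"a": 0.5, "b": 0.5}

if __name__ == "__main__":
    print(classify((0.2, 0.1), CLASS_MODELS, PRIORS))
    print(classify((4.8, 5.3), CLASS_MODELS, PRIORS))
```

In a real recognizer the mixture parameters would be estimated from training images (for example with the EM algorithm) rather than written by hand.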
- Boris Rabinovich
Cluster analysis of telephone communications detalization
The article describes one type of multidimensional statistical analysis: cluster analysis. The research focuses on applying different combinations of metrics and algorithms to the cluster analysis of telephone call billing data. Billing information is one of the data sources in the logic-oriented analytical system "Analyst". The task is of practical interest because the results can be used for such important tasks as detecting criminal groups in criminalistics, developing new pricing plans and quotations in mobile telephony, revealing groups of accounts in the banking sector, etc.
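To make the phrase "combinations of metrics and algorithms" concrete, here is a small sketch in which one clustering algorithm (single linkage cut at a distance threshold) is run with two interchangeable metrics. The abstract does not say which metrics or algorithms the paper uses, so the choice here, the function names and the toy call records are assumptions.

```python
from itertools import combinations

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def single_linkage(points, metric, threshold):
    """Merge points connected by a chain of pairwise distances <= threshold."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if metric(points[i], points[j]) <= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

# Hypothetical records: (calls per day, mean call duration in minutes).
RECORDS = [(1, 1), (2, 2), (10, 10), (11, 10)]

if __name__ == "__main__":
    print(single_linkage(RECORDS, euclidean, 1.5))  # [[0, 1], [2, 3]]
    print(single_linkage(RECORDS, manhattan, 1.5))  # [[0], [1], [2, 3]]
```

With the same threshold the two metrics partition the records differently, which is why the choice of metric matters as much as the choice of algorithm.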
- Alexander Grusho, Nick Grusho, Elena Timonina
About security and safety of security subsystems in distributed information systems
UML can be used in security subsystem development. To achieve security subsystem safety, different methods of inserting redundancy should be analyzed. Probability models are used for this purpose.
- Marina Archipova, Sofia Gutman
Statistical monitoring of information technologies development in Russia
In this article the main tendencies of informatisation in Russia in the period 2001 - 2005 are analyzed. According to the existing international practice the main attention is given to research of the dynamics of development of information technologies applicable to various types of economic activities. Special attention is paid to studying IT activities in the scientific research and development sector.
The paper points out that despite steady tendencies of growth of all analyzed parameters, the level of informatisation in Russia is lower than in the developed countries.
- Igor Zatsman, Olga Kozhunova
Semantic vocabulary of the system of information monitoring in scientific sphere: The tasks and functions
This work is motivated by the creation of systems for monitoring, analyzing and evaluating the performance and outcomes of activity in socially significant spheres, including science in Russia. The article gives an overview of existing semantic dictionaries and defines the tasks and functions of the semantic dictionary of an information monitoring system for the scientific sphere.
- Igor Zatsman, Olga Kourchavova
Long-run information and communication technologies and terms for their description
The development of long-run technologies for "forever yours" documents, covering the creation, storage, and use of digital medium elements, is on the list of top-priority information and communication technologies of the 7th Framework Programme of the European Union for 2007 - 2013. The problem of storing and using "forever yours" digital documents that capture forms of human knowledge representation became acute at the end of the 20th century and the beginning of the 21st century. At that time, multiple categories of such documents that are important in terms of law and legal practice started to be used in non-paper form only, without simultaneous creation of paper replicas. The paper is a case study of non-paper patent application forms. A system of new terms to describe the technology of "forever yours" documents is proposed. The related requirements of the European Patent Office are analyzed.
- Igor Gurevich
On information models in cosmology
The article presents an informational approach to studying the structure of the Universe containing maximal and minimal information volumes and to analyzing the structure of black holes. From the informational point of view the Universe combines four types of mass (energy), corresponding to black holes, ordinary matter, dark matter and dark energy. The author states that there exists an optimal black hole mass, which minimizes the information volume in the Universe and determines the resulting limits on the possible volume of information in the Universe. An informational model of a black hole is proposed.
By using informational methods it is possible to determine the black hole structure, the mass of its particles, and the emission frequency (temperature). The author assumes that black holes are systems of pairwise interacting particles (qubits), i.e. objects with the maximal concentration of information. An explanation of the square-law dependence between the information volume (entropy) of a black hole and its mass is presented.
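For orientation, the square-law dependence mentioned above can be read off the standard Bekenstein-Hawking entropy of a Schwarzschild black hole. This is the textbook result, given here only as background; the paper's own informational derivation may differ.

```latex
% Horizon area of a Schwarzschild black hole of mass M
A = 4\pi r_s^2, \qquad r_s = \frac{2GM}{c^2}
\quad\Longrightarrow\quad A = \frac{16\pi G^2 M^2}{c^4}

% Bekenstein-Hawking entropy: proportional to the area, hence to M^2
S = \frac{k_B c^3}{4 G \hbar}\, A = \frac{4\pi G k_B}{\hbar c}\, M^2
```

Doubling the mass therefore quadruples the entropy.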
II. Information technologies
- Natalia Markova
Technical communication in software development project
Technical communication, one of the most important components of a software development project, is considered. The author points out that resource allocation, well-trained personnel and methodological support in this area are not enough, and suggests paying more attention to the human aspects of communication.
- Vasily Dyachkov, Artem Popov, Dmitry Shulyatnikov
Problematics of distributed file systems
This article is devoted to fundamental problems that arise in designing, implementing and using distributed file systems (DFS). A comparative analysis of the most popular DFS with open design specifications is carried out against the most significant criteria. The work identifies the main problems of DFS design that require further study and solution.
- Natalia Markova, Olga Obuhova, Ivan Soloviev, Anton Chochia
Effective facet navigation in digital collections
The paper considers a special kind of digital collection: collections of independent information objects that are defined by their attributes. A formal model of the collection navigation process as an interactive sequence of steps that constructs the facet formula is proposed. Several design decisions that can improve the efficiency of navigation are discussed. A sample snapshot of the visual interface illustrates the main ideas of facet navigation.
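The navigation process described above can be sketched as follows, under one possible reading of "facet formula": a conjunction over facets, with alternative values allowed within a single facet. That interpretation, the function names and the sample collection are assumptions for illustration, not the paper's formal model.

```python
from collections import Counter

def matches(obj, formula):
    """formula maps facet name -> set of accepted values (AND across facets, OR within one)."""
    return all(obj.get(facet) in values for facet, values in formula.items())

def select(formula, facet, value):
    """One navigation step: return a new formula extended with facet = value."""
    extended = {f: set(v) for f, v in formula.items()}
    extended.setdefault(facet, set()).add(value)
    return extended

def facet_counts(collection, formula, facet):
    """Count the values of `facet` among objects that satisfy the current formula."""
    return Counter(o[facet] for o in collection if matches(o, formula) and facet in o)

# Hypothetical collection of attribute-defined objects.
COLLECTION = [
    {"type": "article", "year": 2006, "lang": "ru"},
    {"type": "article", "year": 2007, "lang": "en"},
    {"type": "report", "year": 2007, "lang": "ru"},
]

if __name__ == "__main__":
    step1 = select({}, "year", 2007)
    print(facet_counts(COLLECTION, step1, "type"))
    step2 = select(step1, "type", "report")
    print([o for o in COLLECTION if matches(o, step2)])
```

Showing the value counts for the remaining facets after each step is one common way to keep the user from navigating into an empty result set.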
- Boris Shmeilin
The memory references speedup via the application code transformation
Improving cache performance can yield significant execution speedups, particularly for numerically intensive codes. The article describes some application code transformations that increase data locality and decrease conflict cache misses. The author proposes a new approach to code transformation for applications with indirect addressing.
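The abstract does not describe the proposed transformation, so the sketch below shows only a generic, well-known one for loops with indirect addressing: reordering iterations so that the indirectly addressed array is visited in ascending order. Python cannot demonstrate the cache effect itself, so a simple stride sum is used as a rough locality proxy; all names and data are hypothetical.

```python
def gather(values, index, weights, order=None):
    """Loop with indirect addressing: out[i] = weights[i] * values[index[i]]."""
    out = [0.0] * len(index)
    for i in (order if order is not None else range(len(index))):
        out[i] = weights[i] * values[index[i]]
    return out

def reorder_for_locality(index):
    """Iteration order that visits positions of `values` in ascending order."""
    return sorted(range(len(index)), key=index.__getitem__)

def total_stride(index, order):
    """Sum of jumps between consecutively accessed positions (rough locality proxy)."""
    visited = [index[i] for i in order]
    return sum(abs(b - a) for a, b in zip(visited, visited[1:]))

# Hypothetical data: 8 values, 6 iterations with scattered indirect accesses.
VALUES = [float(v) for v in range(8)]
INDEX = [7, 0, 5, 1, 6, 2]
WEIGHTS = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

if __name__ == "__main__":
    order = reorder_for_locality(INDEX)
    print(order)
    print(total_stride(INDEX, range(len(INDEX))), total_stride(INDEX, order))
```

The reordering is legal here because the iterations are independent; a loop with dependences between iterations would need a more careful analysis.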
- Igor Kuznetsov, Nikolay Somin
English-Russian system for knowledge extraction from information streams in the Internet environment
The article describes linguistic and algorithmic aspects of knowledge extraction from texts in the Internet environment. Means are proposed that improve the quality of the linguistic processor and take into account the special nature of documents available on the web and the large volume of texts in English. For this reason, additional means for identifying formal and meaningful attributes of English words were added to the morphological analysis component. The capabilities of subject catalogues to identify semantic categories of English words were enhanced. Contextual rules for the syntactic-semantic analysis of standard forms of the English language were developed. The authors suggest means for tuning the morphological and syntactic-semantic analysis components to the language of the input text (through subject catalogues).
- Igor Kuznetsov, Boris Rabinovich
Knowledge base model with a possibility of integrating external information sources in the Analytic system
The article analyzes methods for improving how the logical analytic system "Analytic" stores and processes information. First, it is suggested to use the Oracle database management system (DBMS) as the storage for the Knowledge Base, which allows handling huge volumes of information. Second, the article explains methods of linking external databases to the Analytic system in order to provide the user with more substantial information about the target object. The analysis is based on the results of working with the database of the Moscow State Telephone Network (MGTS).
- Alexander Perekrestenko
Development and software implementation of a system for automatic highlighting syntactic groups in natural languages
The article describes the creation of the two central components of a syntactic processor: the syntactic parser and the unification module. It also analyzes the existing restricted syntactic formalisms with regard to their suitability as a basic model for a system of automatic syntactic analysis. The parser works with a formalism that allows, among other things, the representation of discontinuous constituents and ellipsis. The unification module is designed for the description and analysis of the functional structure of sentences as well as for morphological agreement.
A module for graphical presentation of the analysis results was embedded into the parser. Both the parser and the unification module were implemented in C++ in a platform-independent way.
- Irina Galina
Functional-synonymic ways of expressing aspectual and taxis values in French and Russian languages (for multilanguage linguistic processor)
The given paper is focused on the research of the basic language means expressing the aspectual and taxis values in the constructions of French and Russian languages. The principal language phenomena considered are participial and gerundial phrases and other types of constructions employing primary and secondary predication means. The phenomenon of phrasal functional synonymy is studied for the bilingual (French-Russian) situation. An illustration of possible transfer rules design for machine translation is given. The results of testing the performance of some leading commercial machine translation systems show the lack of adequate presentation and translation of aspectual - taxis values in the French-Russian and Russian-French translation.
- Nina Luneva
Architecture and metadata of multilingual linguistic knowledge base
The paper describes some principal architectural decisions and the ways of using metadata in the multilingual linguistic knowledge base founded on the new linguistic resource. The linguistic knowledge base is aimed at debugging semantic-syntactical representations in language processors of machine translation and text knowledge processing systems. The new knowledge base is being designed as a major test bed for the research community in the field of computational linguistics and intellectual technologies as well as for educational purposes, for comparative analysis of language structures and creating language training environments. The knowledge base features the component of the multilingual translation memory. Our approach concords with the modern tendencies in computer-aided translation (CAT) system development.
- Sergey Dulin, Stepan Duhin, Vladimir Popovidchenko
On multilevel geodata ontology
The authors discuss the ontological status of an image obtained by remote scanning or photographing. They argue that the images under study have a dual nature: they are fields with continuous characteristics at the level of measurement and objects at the level of classification. The images require their own ontological description, which must be different from and independent of the application-domain ontology used by experts in geoinformation systems. The paper suggests using a multilevel ontology for images, combining the field and object paradigms and distinguishing between the ontology for images and the ontology for the user. On the basis of the suggested structure two key capabilities can be realized: (1) supporting multiple representations of one and the same image and (2) using images to detect spatio-temporal configurations of geographical phenomena.
- Alexander Martynenko, Alexander Nikishin, Dmitry Nikishin
Integrated geographical information system, problems and strategy of its formation
The article focuses on the basic problems that arise when integrating various spatial data in modern geoinformation systems (GIS) and that may interest a broad user audience. The principal causes of the integration problems, which are connected with the specifics of the cartographic content of traditional GIS databases, are analyzed. The authors see the solution in a transition to a multilevel, universal, uniform geoinformation system that integrates various thematic data of different scales. A concept of integrated GIS formation on the basis of detailed cartographic data is offered. The authors describe the possibility of its practical realization by the example of integrating local GISs created for local regulatory bodies into a uniform GIS of federal or regional level.
III. Architecture and system solutions for creating computation complexes and networks of new generation
- Adolf Filin
Supercomputers and supercomputing: The status of parallel calculations problem
The state of international ultra-high-speed computing (supercomputing) as of June 2006 is described. It is shown that in the fierce competition between supercomputer architectures, the winner is the class of cluster supercomputers based on the parallelism concept known as the "cellular automata model". The intentions and long-term plans of supercomputer manufacturers in the USA, Japan, Great Britain, China and other countries are considered. General conclusions on the current status and prospects of ultra-high-speed computing are formulated.
- Adolf Filin
Supercomputing and classical Computers
The analysis of the status of ultra-high-speed computing presented in the previous article is continued here. It is shown that the key manufacturers of general-purpose VLSI microprocessors (Intel, AMD, IBM, etc.) have started implementing the multi-core concept in processor design. By doing so they have signalled that further increases in uniprocessor performance are no longer economically viable and that the time of computers based on multi-core processors has come. Supercomputers and supercomputing are considered as one of the main application areas of multi-core processors. No alternative solutions offered by the leading microprocessor manufacturers have been identified.
- Leonid Plekhanov
Self-timing and the tasks of pure self-timed electronic circuits analysis
The conception of self-timing based on signal indication, introduced by V.I. Varshavsky, has not seen any noticeable practical development. However, the need to design large (not limited in size) pure self-timed circuits (PST-circuits), which cannot be analyzed by existing methods, draws attention to this conception again.
This article considers applying the purely functional conception of self-timing, based on indication and not tied to event-based models, to large PST-circuits. The author presents analysis tasks that differ from the ones currently used.
- Yuri Stepchenkov, Yuri Djachenko, Vladimir Petrukhin
Self-timed sequential circuits: Development experience and design guideline
Self-timed (ST) circuits are actively moving from theoretical research into practical projects and are being implemented in a wide range of computing devices. This is due to such features of ST-circuits as operability independent of component delays, inherent reliability, and operability over a significantly wider range of environmental conditions and power supply voltages. This paper gives guidelines for designing sequential ST-circuits implemented in CMOS (complementary metal-oxide-semiconductor) technology. A comparative analysis of the characteristics of sequential synchronous circuits and ST-circuits, obtained by simulation and by practical experiments on test chips, is presented. Test results show that ST circuitry improves the characteristics of sequential circuits, especially in their fault-tolerant implementations.