Informatics and Applications
2021, Volume 15, Issue 4, pp 79-86
STATISTICS AND CLUSTERS FOR DETECTION OF ANOMALOUS INSERTIONS IN BIG DATA ENVIRONMENT
 A. A. Grusho
 N. A. Grusho
 M. I. Zabezhailo
 D. V. Smirnov
 E. E. Timonina
 S. Ya. Shorgin
Abstract
The paper develops algorithms for reducing the number of "false alarms" when searching for anomalies in complex heterogeneous sequences of objects (Big Data). Traditionally, in mathematical statistics, this reduction is achieved by minimizing the probability of a "false alarm." However, in anomaly detection problems (rare intrusions of anomalous data), this approach increases the probability of missing the anomalies being sought. To avoid missing them, the paper proposes the opposite: criteria designed for minimal computational complexity are allowed a high "false alarm" rate, relying on the fact that the number of objects selected by such criteria is still much smaller than the number of objects in the original Big Data. The selected objects can then be grouped into a single cluster, and additional information about the objects in the cluster can be used to identify the anomalies being sought. The point of this procedure is that the harder-to-compute object characteristics needed to discard "false alarms" do not require large computational resources when evaluated on a cluster that is much smaller than the original data. It is shown that, when certain conditions are satisfied, the order in which additional information is applied does not affect the result of filtering "false alarms." The results for the filtering algorithm on sequences of objects are generalized to filtering "false alarms" that take the form of causal schemes in the initial data. Known schemes show how "false alarms" can be filtered by identifying only fragments of schemes.
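The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the object fields (`score`, `bytes`, `known_host`), the thresholds, and all function names are hypothetical and chosen only for illustration. The sketch also assumes one simple sufficient condition for order independence, namely that each additional filter is a predicate evaluated on a single object; the conditions actually proved in the paper may differ.

```python
from itertools import permutations
from typing import Any, Callable, Dict, List, Sequence

Obj = Dict[str, Any]
Predicate = Callable[[Obj], bool]


def cheap_criterion(obj: Obj) -> bool:
    # Hypothetical low-cost test, deliberately loose: it tolerates many
    # "false alarms" so that true anomalies are unlikely to be missed.
    return obj["score"] >= 90


def large_transfer(obj: Obj) -> bool:
    # Hypothetical costlier check, run only on the small cluster.
    return obj["bytes"] > 5000


def unknown_host(obj: Obj) -> bool:
    # Hypothetical costlier check, run only on the small cluster.
    return not obj["known_host"]


def select_cluster(objects: Sequence[Obj], cheap: Predicate) -> List[Obj]:
    # Stage 1: one cheap pass over the full data set.
    return [o for o in objects if cheap(o)]


def apply_filters(cluster: Sequence[Obj], filters: Sequence[Predicate]) -> List[Obj]:
    # Stage 2: drop "false alarms" using additional information.
    remaining = list(cluster)
    for keep in filters:
        remaining = [o for o in remaining if keep(o)]
    return remaining


# Synthetic, deterministic data (an assumption made for the example).
objects = [
    {"id": i, "score": i * 37 % 100, "bytes": i * 10, "known_host": i % 3 == 0}
    for i in range(1000)
]

cluster = select_cluster(objects, cheap_criterion)
expensive_filters = [large_transfer, unknown_host]

# Apply the filters in every possible order and collect the surviving ids.
results = {
    tuple(o["id"] for o in apply_filters(cluster, order))
    for order in permutations(expensive_filters)
}
candidates = apply_filters(cluster, expensive_filters)

print(len(objects), len(cluster), len(candidates), len(results))
```

Because each filter here decides on one object at a time, applying all of them amounts to intersecting the sets they accept, so every ordering leaves the same candidates. The costlier checks run only on the cluster, not on the full data set.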
About this article
Cover Date
2021-12-30
DOI
10.14357/19922264210411
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
information security; search for anomalies; algorithms for filtering "false alarms"
Authors
A. A. Grusho, N. A. Grusho, M. I. Zabezhailo, D. V. Smirnov, E. E. Timonina, and S. Ya. Shorgin
Author Affiliations
Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
Sberbank of Russia, 19 Vavilov Str., Moscow 117999, Russian Federation
