Systems and Means of Informatics

2023, Volume 33, Issue 3, pp 149-160

DATA CLEANSING IN THE TECHNOLOGY OF CONCRETE HISTORICAL INVESTIGATION SUPPORT

  • I. M. Adamovich
  • O. I. Volkov

Abstract

The article continues a series of works devoted to a technology for supporting concrete historical research. The technology is based on the principles of co-creation and crowdsourcing and is designed for a wide range of users who are not professional historians or biographers. The article shows the expediency of using machine learning methods to expand the list of concrete historical research tasks solved within the framework of the described technology.
Because concrete historical information is fragmented and inconsistent, data preparation is of special importance. This article is devoted to the specifics of cleansing concrete historical data and analyzes whether mechanisms and algorithms already integrated into the technology can be used for this purpose. The main directions in which data cleansing is carried out are listed, and suitable tools already included in the technology are identified for each direction. Particular attention is paid to tools for eliminating inconsistencies. The stages of data cleansing are listed, and a scheme of interaction of all mechanisms and algorithms described in the article is given.
