Informatics and Applications

2017, Volume 11, Issue 3, pp 27-33


  • M. P. Krivenko


The article examines the effectiveness of classification methods for incomplete clinical data. A Bayesian classifier is trained by the maximum likelihood method under a model of a mixture of normal distributions. A rigorous derivation of the formulas for the steps of the EM algorithm made it possible to apply the iterative process of estimating the mixture parameters correctly. For incomplete data, methods are proposed for selecting initial values and for correcting degenerate covariance matrices of the mixture components. The experimental part of the work analyzes how classification quality depends on the number of missing individual values, using enzyme data obtained for patients with liver diseases. Processing of the real data showed almost identical classification errors for simple and complex methods of handling missing values when the number of randomly missing individual values is low.
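To make the idea of EM estimation with missing values concrete, the following is a minimal sketch and not the article's derivation. It assumes a single bivariate normal component (not a full mixture), an always-observed first coordinate, and a second coordinate that may be missing at random. The function name `em_bivariate_missing_y` and the `var_floor` parameter are hypothetical; the variance floor is only a simple stand-in for correcting a degenerate covariance, since the abstract does not specify the article's correction method.

```python
import math


def em_bivariate_missing_y(pairs, n_iter=100, var_floor=1e-8):
    """EM estimates for a bivariate normal when y may be missing (None).

    Assumptions of this sketch: x is always observed, at least one y is
    observed, and values are missing at random.
    Returns (mu_x, mu_y, s_xx, s_xy, s_yy).
    """
    n = len(pairs)
    xs = [x for x, _ in pairs]
    ys_obs = [y for _, y in pairs if y is not None]
    n_missing = n - len(ys_obs)

    # x is fully observed, so its ML estimates do not change across iterations.
    mu_x = sum(xs) / n
    s_xx = max(sum((x - mu_x) ** 2 for x in xs) / n, var_floor)

    # Initial values: moments of the observed y values, zero covariance.
    mu_y = sum(ys_obs) / len(ys_obs)
    s_yy = max(sum((y - mu_y) ** 2 for y in ys_obs) / len(ys_obs), var_floor)
    s_xy = 0.0

    for _ in range(n_iter):
        # E-step: conditional mean and variance of a missing y given x.
        slope = s_xy / s_xx
        resid_var = s_yy - slope * s_xy
        y_hat = [y if y is not None else mu_y + slope * (x - mu_x)
                 for x, y in pairs]

        # M-step: update moments; each missing y adds its conditional variance.
        mu_y = sum(y_hat) / n
        s_xy = sum((x - mu_x) * (yh - mu_y) for x, yh in zip(xs, y_hat)) / n
        s_yy = (sum((yh - mu_y) ** 2 for yh in y_hat)
                + n_missing * resid_var) / n
        # Guard against a degenerate (near-zero) variance estimate.
        s_yy = max(s_yy, var_floor)

    return mu_x, mu_y, s_xx, s_xy, s_yy
```

With no missing values the loop reproduces the usual sample moments after one pass. With missing values, the conditional-variance term in the M-step is what separates EM from simply imputing a regression prediction and treating it as observed.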
