Informatics and Applications

2015, Volume 9, Issue 2, pp 63-74


  • M. P. Krivenko


The article considers the problem of modeling reference values, that is, results for a certain type of quantity obtained from a single individual or a group of individuals corresponding to a stated description. For this purpose, the article proposes using a mixture of normal distributions, which can approximate the actual data effectively while remaining tractable for theoretical analysis. In estimating the parameters of a mixture of distributions, the major role is played by the maximum likelihood method and its implementation as the expectation-maximization (EM) algorithm. For assessing the number of mixture components, the article suggests using the likelihood ratio test and a method based on the chi-square distance between distributions. Their properties are investigated using the bootstrap method. As an experiment, the article describes the empirical distribution of patient data comprising age and measurements of PSA (prostate-specific antigen). The proposed solutions have clear advantages: fine resolution by age, smoothing of observations across age groups of different sizes, and the opportunity to form hypotheses about the nature of the relationship between age and PSA.
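To make the estimation step concrete, the following is a minimal sketch of EM for a one-dimensional mixture of normal distributions, together with the likelihood ratio statistic for comparing one component against two. It is not the article's implementation: the function names (`fit_mixture_em`, `log_likelihood`), the quantile-based initialisation, the fixed iteration count, the floor on the standard deviation, and the synthetic data are all assumptions made for illustration. The bootstrap calibration of the test and the chi-square distance method mentioned in the abstract are not attempted here.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(data, weights, means, sigmas):
    """Log-likelihood of the data under a 1-D normal mixture."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, s)
                     for w, m, s in zip(weights, means, sigmas)))
        for x in data
    )


def fit_mixture_em(data, k, n_iter=200, min_sigma=1e-3):
    """Fit a k-component 1-D normal mixture by EM (illustrative sketch)."""
    n = len(data)
    srt = sorted(data)
    # Assumed initialisation: means at evenly spaced sample quantiles,
    # equal weights, and the overall standard deviation for every component.
    means = [srt[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_sd = math.sqrt(sum((x - overall_mean) ** 2 for x in data) / n)
    sigmas = [overall_sd] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, s)
                 for w, m, s in zip(weights, means, sigmas)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: responsibility-weighted updates of the parameters.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / nj
            # Floor on sigma (an assumption) guards against degenerate fits.
            sigmas[j] = max(math.sqrt(var), min_sigma)
    return weights, means, sigmas


# Synthetic example (not the patient data from the article).
rng = random.Random(0)
sample = ([rng.gauss(0.0, 1.0) for _ in range(300)]
          + [rng.gauss(6.0, 1.0) for _ in range(200)])

fit1 = fit_mixture_em(sample, k=1)
fit2 = fit_mixture_em(sample, k=2)

# Likelihood ratio statistic for "one component" versus "two components".
lr_stat = 2.0 * (log_likelihood(sample, *fit2) - log_likelihood(sample, *fit1))
print("weights:", fit2[0])
print("means:", fit2[1])
print("LR statistic:", lr_stat)
```

A large value of `lr_stat` favours the two-component model. The sketch stops at computing the statistic; turning it into a test requires a reference distribution, which the abstract indicates is studied with the bootstrap.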
