Comparative study of two imputation methods in R package for RGB color histogram data
DOI:
https://doi.org/10.53840/myjict4-1-59
Keywords:
Missing data, Multiple imputation
Abstract
Multiple imputation (MI) is a powerful tool for handling missing data. This paper compares the multiple imputation methods implemented in the Amelia II and MICE packages in R. Both packages are well known and credible tools for missing data research in numerous domains. Few studies have compared multiple imputation combined with other techniques in the context of image data. We use the mean absolute error (MAE) to evaluate the accuracy of the imputed values at missing-data rates of 20% and 50%. Although MICE is time-consuming to run, the results show that it can handle a large amount of missing values, whereas Amelia II could only handle up to 20% missing values. In terms of MAE, each package performs better on particular variables.
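The evaluation described in the abstract (hide a known share of values, impute them, then score the imputed values with MAE) can be sketched as follows. This is only a minimal illustration under stated assumptions: the bin counts are hypothetical, the helper names are invented for this sketch, and simple mean imputation stands in for the imputer. It does not reproduce Amelia II or MICE, which are R packages.

```python
import random

def mask_values(values, fraction, seed=0):
    """Return a copy with `fraction` of entries set to None, plus the masked indices."""
    rng = random.Random(seed)
    n_missing = round(len(values) * fraction)
    missing_idx = sorted(rng.sample(range(len(values)), n_missing))
    masked = list(values)
    for i in missing_idx:
        masked[i] = None
    return masked, missing_idx

def mean_impute(masked):
    """Placeholder imputer (assumption): fill gaps with the mean of the observed values."""
    observed = [v for v in masked if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in masked]

def mae_on_missing(truth, imputed, missing_idx):
    """Mean absolute error computed only over the entries that were masked."""
    return sum(abs(truth[i] - imputed[i]) for i in missing_idx) / len(missing_idx)

# Hypothetical bin counts for one colour channel of an RGB histogram.
red_bins = [12.0, 15.0, 9.0, 22.0, 30.0, 18.0, 7.0, 25.0, 14.0, 11.0]

# The two missing-data rates mentioned in the abstract.
for fraction in (0.2, 0.5):
    masked, idx = mask_values(red_bins, fraction)
    imputed = mean_impute(masked)
    score = mae_on_missing(red_bins, imputed, idx)
    print(f"{int(fraction * 100)}% missing: MAE = {score:.3f}")
```

Scoring only the masked positions keeps the metric focused on the imputed values; observed entries would otherwise contribute zero error and make any imputer look better than it is.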

