Estudios del trabajo
Print version ISSN 0327-5744
On-line version ISSN 2545-7756
Abstract
ROSATI, Germán. Machine Learning as alternative methods for missing data imputation. An exercise using Permanent Household Survey. Estud. trab. [online]. 2021, n.61, pp.122-145. ISSN 0327-5744.
This paper presents advances in building a model to impute missing values and nonresponse in the income variables of household surveys. It reports the results of imputation experiments on the labor income variable of the Permanent Household Survey, based on Ensemble Learning and Deep Learning techniques: Random Forest, XGBoost and Multi-Layer Perceptron. The performance of these techniques is compared with the Hot Deck method (one of the methods used by the National Statistical System). The first and second parts of the document state the problem more specifically and review the main mechanisms that generate missing values and their consequences for imputation. The third part presents the proposed techniques and their theoretical-methodological foundations. Finally, the fourth section presents the main results of applying the proposed methods to data from the Permanent Household Survey.
Keywords: Machine Learning; Missing data; Imputation; Survey.
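
For readers unfamiliar with the Hot Deck baseline mentioned in the abstract, the following is a minimal sketch of one common variant, a random hot deck within imputation classes: each missing value is filled with the observed value of a randomly chosen donor that shares the same class. The abstract does not say which variant the paper or the National Statistical System uses, so this should not be read as their implementation. The function name `hot_deck_impute`, the field names (`sector`, `education`, `income`) and the sample values are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def hot_deck_impute(records, target, class_keys, seed=0):
    """Random hot deck within imputation classes (illustrative sketch).

    records    -- list of dicts; a missing target value is represented by None
    target     -- name of the variable to impute (hypothetical: "income")
    class_keys -- variables that define the imputation classes
    Returns a new list of dicts; the input records are not modified.
    """
    rng = random.Random(seed)

    # Collect the observed target values (donors) for each imputation class.
    donors = defaultdict(list)
    for r in records:
        if r[target] is not None:
            donors[tuple(r[k] for k in class_keys)].append(r[target])

    # Fallback pool used when a class has no donor at all (an assumption of
    # this sketch; other rules, such as collapsing classes, are also common).
    all_donors = [v for values in donors.values() for v in values]

    completed = []
    for r in records:
        row = dict(r)  # copy so the original record stays untouched
        if row[target] is None:
            pool = donors.get(tuple(row[k] for k in class_keys)) or all_donors
            if pool:  # leave the value missing if there are no donors anywhere
                row[target] = rng.choice(pool)
        completed.append(row)
    return completed


# Made-up records, not survey data.
records = [
    {"sector": "retail", "education": "secondary", "income": 300.0},
    {"sector": "retail", "education": "secondary", "income": 350.0},
    {"sector": "retail", "education": "secondary", "income": None},
    {"sector": "industry", "education": "tertiary", "income": 800.0},
    {"sector": "industry", "education": "tertiary", "income": None},
    {"sector": "services", "education": "primary", "income": None},
]

completed = hot_deck_impute(records, "income", ["sector", "education"])
for row in completed:
    print(row)
```

Because donors are copied as-is, a hot deck of this kind only ever imputes values already observed in the data, which is one reason to compare it against model-based predictors such as those listed in the abstract.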