SciELO - Scientific Electronic Library Online

 


Estudios del trabajo

Print version ISSN 0327-5744
On-line version ISSN 2545-7756

Abstract

ROSATI, Germán. Machine Learning as alternative methods for missing data imputation. An exercise using Permanent Household Survey. Estud. trab. [online]. 2021, n.61, pp.122-145. ISSN 0327-5744.

This paper presents advances in the construction of a model for imputing missing values and nonresponse in the income variables of household surveys. It reports the results of imputation experiments on the labor income variable of the Permanent Household Survey, based on Ensemble Learning and Deep Learning techniques: Random Forest, XGBoost and Multi-Layer Perceptron. The performance of these techniques is compared with the Hot Deck method (one of the methods used by the National Statistical System). The first and second parts of the document state the problem more specifically and review the main mechanisms that generate missing values and their consequences for imputation. The third part presents the proposed techniques and their theoretical and methodological foundations. Finally, the fourth section presents the main results of applying the proposed methods to data from the Permanent Household Survey.

Keywords: Machine Learning; Missing data; Imputation; Survey.
