SaberEs
Print version ISSN 1852-4418; On-line version ISSN 1852-4222
Abstract
RIANO, María Eugenia. Missing data imputation using spatial statistics techniques applied to Uruguay census of population and housing. SaberEs [online]. 2019, vol.11, n.2, pp.153-169. ISSN 1852-4418.
The quality and coverage of Uruguay's National Census were, in general, evaluated positively and met international standards. However, the data collection process encountered some difficulties, and the omissions are concentrated in socioeconomically vulnerable segments. This could affect the algorithm the government uses to select the beneficiary population of cash-transfer programs. The heterogeneous spatial pattern of the target population, and of the omission itself, makes it necessary to define regions for imputing the missing data. Regions are obtained by means of spatial oblique decision trees. Spatial autoregressive (SAR) models are fitted for each region. The models are assessed using cross-validation methods. Results are compared with the performance of a global model for the whole map. Except for one region, the models that minimize the cross-validation error show a similar lag in each region. The cross-validation error for the global model is quite similar. Nevertheless, the Moran test on the residuals detects spatial autocorrelation. Hence, the data imputation is performed by regions, with local SAR models, selecting the lag according to the cross-validation error. Results show that the target population is underestimated by approximately 5% relative to the total obtained with census data.
Keywords: Classification and Regression trees; Cross-validation; SAR models.
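The Moran test for residuals mentioned in the abstract rests on Moran's I statistic, which can be illustrated with a minimal sketch. This is not the article's code: the helper names `chain_weights` and `morans_i`, the toy one-dimensional neighbourhood, and the binary weights are all assumptions made for illustration. The statistic is computed as I = (n / S0) (z'Wz) / (z'z), where z holds the mean-centred values, W is the spatial weights matrix, and S0 is the sum of all weights.

```python
import numpy as np


def chain_weights(n):
    """Binary weights for n units in a line; each unit neighbours the adjacent ones.

    Hypothetical toy neighbourhood, used only to keep the example small.
    """
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1.0
    w[idx + 1, idx] = 1.0
    return w


def morans_i(values, weights):
    """Moran's I = (n / S0) * (z' W z) / (z' z), with z the mean-centred values."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    s0 = weights.sum()  # sum of all spatial weights
    return (len(z) / s0) * (z @ weights @ z) / (z @ z)


w = chain_weights(6)
trend = morans_i(np.arange(6), w)                       # smooth gradient
alternating = morans_i([1, -1, 1, -1, 1, -1], w)        # checkerboard-like pattern
print(trend, alternating)
```

On this toy example the smooth gradient gives a value of about 0.6 (positive autocorrelation) and the alternating pattern gives about -1.0 (negative autocorrelation). A value near zero in model residuals would suggest that no spatial structure is left unexplained; a formal test would additionally compare the statistic with its distribution under the null hypothesis, which this sketch does not do.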