Session 4 - Advances in Probabilistic Modeling for Machine Learning: Semiparametric Imputation using Conditional Gaussian Mixture Models

Presenter Information/Coauthors Information

Danhyang Lee, University of Alabama - Tuscaloosa

Presentation Type

Invited

Abstract

Imputation is a popular technique for handling item nonresponse, which is common in data applications. Parametric imputation relies on a parametric imputation model and is therefore less robust to misspecification of that model. Nonparametric imputation is fully robust but is not applicable when the dimension of the covariates is large, due to the curse of dimensionality. Semiparametric imputation is another robust approach, based on a flexible model in which the number of model parameters can increase with the sample size. In this talk, we propose another semiparametric imputation method based on a more flexible model assumption than the Gaussian mixture model. In the proposed mixture model, we assume a conditional Gaussian model for the study variable of interest given the auxiliary variables, but the marginal distribution of the auxiliary variables is not necessarily Gaussian. The proposed method is applicable to high-dimensional covariate problems by including a penalty function in the conditional log-likelihood function. The proposed method is applied to the 2017 Korean Household Income and Expenditure Survey conducted by Statistics Korea.
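As a rough illustration of the idea of imputing from a conditional Gaussian mixture, the sketch below draws missing values of a study variable y from a mixture of Gaussian regressions on the auxiliary variables x. It is not the estimator from the talk: the covariate-dependent softmax gating, the function name `impute_conditional_mixture`, and all parameter values are assumptions made for illustration, and the penalized conditional log-likelihood fitting step is not shown (the parameters are simply taken as given).

```python
import numpy as np


def softmax(z):
    # Row-wise softmax with the usual max-shift for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def impute_conditional_mixture(x, y, gate_w, coef, sigma, rng):
    """Fill NaN entries of y with draws from a mixture of Gaussian regressions.

    x      : (n, p) auxiliary variables, fully observed (any marginal distribution)
    y      : (n,)   study variable, NaN where missing
    gate_w : (p + 1, G) hypothetical gating weights, intercept first
    coef   : (p + 1, G) hypothetical regression coefficients per component
    sigma  : (G,)   hypothetical component standard deviations
    """
    design = np.column_stack([np.ones(len(x)), x])
    probs = softmax(design @ gate_w)  # (n, G) component probabilities given x
    means = design @ coef             # (n, G) component means given x
    y_imp = y.copy()
    for i in np.flatnonzero(np.isnan(y)):
        g = rng.choice(len(sigma), p=probs[i])       # pick a component
        y_imp[i] = rng.normal(means[i, g], sigma[g])  # draw y | x, component
    return y_imp


# Toy data: exponential (non-Gaussian) auxiliaries, two missing responses.
rng = np.random.default_rng(0)
x = rng.exponential(size=(6, 2))
y = np.array([1.0, np.nan, 0.5, np.nan, 2.0, 1.5])

# Made-up parameters for G = 2 components (assumed, not estimated).
gate_w = np.array([[0.0, 0.0], [1.0, -1.0], [0.0, 0.5]])
coef = np.array([[0.5, 2.0], [1.0, -0.5], [0.2, 0.3]])
sigma = np.array([0.3, 0.6])

y_imputed = impute_conditional_mixture(x, y, gate_w, coef, sigma, rng)
```

Only the conditional distribution of y given x is modeled here, so x can follow any distribution; observed entries of y are left untouched and only the missing entries are replaced by random draws.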

Start Date

2-11-2020 9:30 AM

End Date

2-11-2020 10:30 AM



Location

Campanile & Hobo Day Gallery (A & B)
