Document Type

Thesis - Open Access

Award Date


Degree Name

Doctor of Philosophy (PhD)

Department / School

Mathematics and Statistics

First Advisor

Semhar Michael


BIC, Complex Survey Design, Finite Mixture of Regression Models, Pseudo-likelihood, Sampling Weights, Expectation-Maximization Algorithm


Over time, survey data has become an essential source of information for modern society. To be effective, however, surveys typically require sampling designs that are more complex than simple random sampling. Data collected from large national surveys under such designs ideally include sampling weights that allow analyses to account for complicated population structures. When the target of inference is the parameters of a regression model, it is crucial to know whether the sampling weights should be incorporated when fitting the model to the survey data. Finite mixture models are one tool for modeling heterogeneity and finding subgroups in data. Limited literature is available on modeling data from complex survey designs with finite mixtures of regression models. The principal aim of this dissertation is to develop and evaluate strategies for modeling survey data using a new design-based inference approach, in which sampling weights are integrated into the complete-data log-likelihood function. More specifically, the pseudo maximum likelihood (PML) estimator is considered, and an expectation-maximization (EM) algorithm is developed accordingly. To evaluate this strategy in realistic circumstances, we assessed the performance of the proposed model through simulation under numerous scenarios. Comparisons were made using the bias and variance components of the mean squared error. Additionally, the Bayesian information criterion (BIC) was utilized and assessed as a model selection tool under the proposed modeling approach. Finally, we applied the proposed approach to original survey datasets to assess its practical usefulness.
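
To make the idea of integrating sampling weights into the complete-data log-likelihood concrete, the following is a minimal sketch of a weighted EM algorithm for a finite mixture of Gaussian linear regressions. It is an illustration under stated assumptions, not the dissertation's implementation: the function name `weighted_em_mixreg`, the Gaussian error model, the random soft initialization, and the fixed iteration count are all hypothetical choices made here. Each observation's contribution to the M-step is multiplied by its sampling weight `w[i]`, which is one common way to form a pseudo-likelihood.

```python
import numpy as np


def weighted_em_mixreg(X, y, w, K=2, n_iter=200, seed=0):
    """Sketch of a sampling-weighted EM for a K-component mixture of
    Gaussian linear regressions (hypothetical helper, for illustration).

    X : (n, p) design matrix (include a column of ones for an intercept)
    y : (n,) response
    w : (n,) positive sampling weights
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Assumption: random soft initial responsibilities (rows sum to 1).
    tau = rng.dirichlet(np.ones(K), size=n)
    for _ in range(n_iter):
        # M-step: every sufficient statistic is weighted by w_i * tau_ik.
        wt = w[:, None] * tau                      # shape (n, K)
        pi = wt.sum(axis=0) / w.sum()              # mixing proportions
        beta = np.empty((K, p))
        sigma2 = np.empty(K)
        for k in range(K):
            Xw = X * wt[:, k:k + 1]                # rows of X scaled by weights
            # Weighted least squares: (X' W X) beta = X' W y
            beta[k] = np.linalg.solve(X.T @ Xw, Xw.T @ y)
            resid = y - X @ beta[k]
            sigma2[k] = (wt[:, k] * resid ** 2).sum() / wt[:, k].sum()
            sigma2[k] = max(sigma2[k], 1e-8)       # guard against collapse
        # E-step: posterior membership probabilities, computed in log space.
        resid = y[:, None] - X @ beta.T            # shape (n, K)
        log_dens = -0.5 * (np.log(2 * np.pi * sigma2) + resid ** 2 / sigma2)
        log_post = np.log(np.clip(pi, 1e-300, None)) + log_dens
        log_post -= log_post.max(axis=1, keepdims=True)
        tau = np.exp(log_post)
        tau /= tau.sum(axis=1, keepdims=True)
    return pi, beta, sigma2, tau
```

In this sketch the weights enter only the M-step, so multiplying all weights by a constant leaves the estimates unchanged. A practical implementation would also need a convergence check, multiple starting values, and design-based variance estimation, none of which are shown here.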

Library of Congress Subject Headings

Regression analysis -- Data processing.
Sampling (Statistics)
Expectation-maximization algorithms.



Number of Pages



South Dakota State University



Rights Statement

In Copyright