Document Type
Thesis - Open Access
Award Date
2025
Degree Name
Master of Science (MS)
Department / School
Electrical Engineering and Computer Science
First Advisor
Larry Leigh
Abstract
Efficient clustering of high-dimensional satellite image datasets remains a critical challenge, particularly because of the computational demands of spectral distance calculations, random centroid initialization, and sensitivity to outliers in conventional K-means algorithms. This study presents a comprehensive comparative analysis of eight parallelized variants of the K-means algorithm, designed to enhance clustering efficiency and reduce the computational burden of large-scale satellite image analysis. The proposed parallelized implementations incorporate optimized centroid initialization for better starting-point selection, a Dynamic K-mean Sharp method that detects outliers to improve cluster robustness, and a Nearest-Neighbor Iteration Calculation Reduction method to minimize redundant computations. These enhancements were applied to a test set of 114 global land cover data cubes, each comprising high-dimensional satellite images of size 3712 × 3712 × 16, and executed on a multi-core CPU architecture to exploit parallel processing. Performance was evaluated on three criteria: convergence speed (iterations), computational efficiency (execution time), and clustering accuracy (RMSE). The Parallelized Enhanced K-Mean (PEKM) method achieved the fastest convergence, at 234 iterations, and the lowest execution time, 4230 hours, while RMSE remained consistent (0.0136) across all algorithm variants. These results demonstrate that targeted algorithmic optimizations, combined with effective parallelization strategies, can improve the practicality of K-means clustering for high-dimensional satellite image analysis. This work underscores the potential of improving K-means clustering frameworks beyond hardware acceleration alone, offering scalable solutions suited to large-scale unsupervised image classification tasks.
Library of Congress Subject Headings
Cluster analysis.
Image processing.
Landsat satellites.
Remote sensing -- Data processing.
Publisher
South Dakota State University
Recommended Citation
Pant, Yuv Raj, "Improving K-Mean Clustering: A Comparative Study of Parallelized Version of Modified K-Mean Algorithm for Clustering of Satellite Images" (2025). Electronic Theses and Dissertations. 1541.
https://openprairie.sdstate.edu/etd2/1541