Document Type

Dissertation - Open Access

Award Date


Degree Name

Doctor of Philosophy (PhD)

Department / School

Agronomy, Horticulture, and Plant Science

First Advisor

Sunish Sehgal


Global wheat production needs to increase by 60% to meet the future demand of feeding nine billion people by 2050. Simultaneously, it is important to improve end-use quality to meet the requirements of producers, grain markets, processors, and consumers. Thus, the development of more productive wheat varieties with better end-use quality remains the primary focus of all wheat breeding programs. However, direct phenotypic selection for grain yield and end-use quality is difficult because both are highly influenced by environmental factors. This dissertation focuses on harnessing advances in genomics applications, including genome-wide association studies (GWAS), for the genetic characterization of yield component traits and their use in marker-assisted selection for grain yield. Further, we investigated the efficacy of genomic selection (GS) and assessed the performance of various statistical models in predicting agronomic and end-use quality traits in the South Dakota hard winter wheat (HWW) breeding program. In the first study, GWAS was used to identify genetic determinants of yield component traits in HWW, which exhibit higher heritability than grain yield per se. We assembled a population of breeding lines and well-adapted cultivars, genotyped it using genotyping-by-sequencing (GBS), and evaluated it in four environments for phenotypic analysis of spike and kernel traits. GWAS using 8,030 single nucleotide polymorphisms (SNPs) identified 17 significant, multi-environment marker-trait associations (MTAs) for various traits, representing 12 putative quantitative trait loci (QTLs), with five QTLs affecting multiple traits. Further, a highly significant QTL that has not previously been associated with the number of spikelets per spike was detected on chromosome 7AS, and putative candidate genes were identified in this region.
The allelic frequencies of important QTLs were deduced in a larger set of 1,124 accessions, which revealed the importance of the identified MTAs in U.S. HWW breeding programs. In the second study, we evaluated the potential of genomic selection to predict complex traits at earlier stages of the breeding program. Here, we used multi-trait genomic prediction (GP) models to predict multiple agronomic traits using 314 advanced and elite breeding lines of HWW evaluated in ten site-year environments. Extensive data from multi-environment trials were used to cross-validate multivariate machine learning (ML) models that integrate the analysis of multiple traits and/or include GxE interaction. The multivariate ML models performed better for all traits, with average improvement over ST-CV1 reaching up to 19%, 71%, 17%, 48%, and 51% for grain yield, grain protein content, test weight, plant height, and days to heading, respectively. Next, we evaluated the efficacy of multivariate GP using a set of advanced breeding lines from 2015 to 2021 to predict various end-use quality traits that are otherwise difficult to phenotype in earlier generations. The multivariate GP model outperformed the univariate model, with up to a two-fold increase in prediction accuracy (PA). For instance, PA improved from 0.38 to 0.75 for bake absorption and from 0.32 to 0.52 for loaf volume. Further, we compared multi-trait GP models that included different combinations of easy-to-score traits as model covariates to predict end-use quality traits, and observed that incorporating simple traits such as flour protein and flour sedimentation weight value can substantially improve the PA for baking traits. Overall, the findings of these studies elucidate the potential of multivariate GP for agronomic traits when advanced breeding lines are used as the training population to predict preliminary breeding lines.
The results also showed that applying multivariate GP models in the breeding program can reduce phenotyping costs by facilitating a sparse testing design. Furthermore, we observed that including rapid, low-cost traits such as flour protein and flour sedimentation weight value in multi-trait genomic prediction models can facilitate the use of GS to predict baking traits in earlier generations, giving breeders an opportunity to select on end-use quality traits by culling inferior lines, thereby increasing selection accuracy and genetic gains.
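
The idea of adding an easy-to-score trait as a model covariate can be illustrated with a minimal sketch. It is not the dissertation's actual pipeline: the data are simulated, ridge regression on marker scores stands in for the GP models, and all names (`ridge_fit_predict`, `prediction_accuracy`, `secondary`, the sample sizes, and the penalty value) are assumptions made for illustration. Prediction accuracy is taken here as the Pearson correlation between observed and predicted values in a held-out set, and the secondary trait is assumed to be measured on the lines being predicted.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, penalty):
    """Ridge regression with a per-column penalty and an unpenalized intercept."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    # Solve (Xc'Xc + diag(penalty)) beta = Xc'(y - mean(y)).
    A = Xc.T @ Xc + np.diag(penalty)
    beta = np.linalg.solve(A, Xc.T @ (y_train - y_mean))
    return y_mean + (X_test - x_mean) @ beta


def prediction_accuracy(y_obs, y_pred):
    """Pearson correlation between observed and predicted phenotypes."""
    return float(np.corrcoef(y_obs, y_pred)[0, 1])


# Simulated data (illustrative only, not values from the study).
rng = np.random.default_rng(0)
n_lines, n_markers = 200, 500
markers = rng.choice([-1.0, 0.0, 1.0], size=(n_lines, n_markers))
genetic = markers @ rng.normal(0.0, 1.0, n_markers)
genetic = genetic / genetic.std()

# Hard-to-phenotype target trait and a cheap, genetically correlated secondary trait.
target = genetic + rng.normal(0.0, 1.0, n_lines)
secondary = 0.8 * genetic + rng.normal(0.0, 0.5, n_lines)

# Hold out the last 50 lines as the prediction set.
train = np.arange(n_lines) < 150
test = ~train
lam = float(n_markers)  # hypothetical shrinkage value, not tuned

# Model 1: markers only (analogous to a univariate prediction).
pred_markers = ridge_fit_predict(
    markers[train], target[train], markers[test], np.full(n_markers, lam)
)

# Model 2: markers plus the secondary trait as an unpenalized covariate.
X_cov = np.column_stack([markers, secondary])
penalty_cov = np.append(np.full(n_markers, lam), 0.0)
pred_cov = ridge_fit_predict(X_cov[train], target[train], X_cov[test], penalty_cov)

pa_markers = prediction_accuracy(target[test], pred_markers)
pa_cov = prediction_accuracy(target[test], pred_cov)
print(f"PA, markers only:          {pa_markers:.2f}")
print(f"PA, markers + covariate:   {pa_cov:.2f}")
```

Because the simulated secondary trait shares genetic signal with the target, the covariate model would typically be expected to score higher in this toy setting; how large the gain is in real breeding data depends on the trait correlations and on whether the secondary trait is available for the lines being predicted.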


South Dakota State University



Rights Statement

In Copyright