Non-South Dakota State University users: Please talk to your librarian about requesting this thesis through interlibrary loan.

Document Type

Thesis - University Access Only

Award Date


Degree Name

Master of Science (MS)

Department / School

Mathematics and Statistics

First Advisor

Jose L. Gonzalez-Hernandez

Abstract

Plants deploy appropriate biological processes to successfully overcome adverse environments throughout their life cycle. Discovering and understanding these mechanisms and processes is crucial for developing better crop varieties through breeding. Determining changes in global gene expression patterns is an efficient way to achieve this goal. RNA-Seq, a technique used to assay global gene expression, generates very large datasets of nucleotide sequences. Biological interpretation of such datasets is impossible without multiple computational analysis steps, including preprocessing for quality control, assembling the sequences into transcripts, annotating their putative biological functions, statistical analyses to determine and compare gene expression values, and the development of databases to effectively disseminate all of this information. This study used two RNA-Seq datasets as case studies to develop a bioinformatics pipeline for all of the above steps. The first dataset was obtained from an experiment to determine global gene expression changes in response to nanoparticle exposure in spinach. CLC Genomics Workbench-assisted assembly and Blast2GO-assisted annotation of these data provided the largest reference transcriptome for spinach. Subsequent determination of global gene expression changes, and overlaying of this information onto biological pathways using MapMan, identified that genes associated with biotic stress responses, jasmonate synthesis, protein degradation, redox, sulphate assimilation, and cell wall precursor synthesis were up-regulated in response to zinc oxide (ZnO) nanoparticle exposure. We concluded that nanoparticle exposure elicited responses similar to those against necrotrophic pathogens that cause cell death.
The second dataset was from an experiment aimed at determining changes in global gene expression patterns in oats in response to the crown rust pathogen. This dataset had already been analyzed for differential gene expression. The transcripts were annotated for biological function using Blast2GO, and gene expression was overlaid onto metabolic pathways in Cytoscape for effective visualization. A database architecture and framework were developed in Microsoft SQL Server 2008 and MySQL. Combined with a web form that lets users search the database by keywords or gene expression parameters, this enabled effective dissemination of the dataset. Together, these steps form a bioinformatics pipeline that takes RNA-Seq data through analysis, visualization, and dissemination, thus accelerating biological discoveries.

Library of Congress Subject Headings

RNA -- Analysis
Plant gene expression


Includes bibliographical references (page 80)



Number of Pages



South Dakota State University


In Copyright - Non-Commercial Use Permitted