Document Type

Thesis - University Access Only

Award Date

2012

Degree Name

Master of Science (MS)

Department / School

Mathematics and Statistics

First Advisor

Xijin Ge

Abstract

The model plant Arabidopsis has been well studied using high-throughput genomics and proteomics technologies, generating massive gene expression data and numerous lists of differentially expressed genes under different treatments or conditions. This study attempts to collect and analyze these gene lists to discover hidden links among them. A total of 1,065 gene lists were manually collected from 519 published gene expression studies of Arabidopsis based on information from the National Center for Biotechnology Information (NCBI). From these gene lists, the researchers identified 16,261 statistically significant overlaps. These significant overlaps were represented by an undirected network in which nodes correspond to gene lists and edges correspond to significant overlaps between pairs of gene lists. The network highlighted correlations among the gene expression signatures of diverse biological processes. The researchers were also able to partition the main network into 20 sub-networks, representing groups of highly similar gene lists. Of these, 9 sub-networks (modules) were examined and analyzed, and the 752 most frequently shared genes were identified from them. The results indicate that there are hidden links among the gene lists: common sets of genes that were regulated under different treatments or conditions and were related to different biological themes. Compared to previous reports focusing on specific topics, this study explored and established the hidden links among all the gene lists on a global scale. This study provides new clues and hypotheses about the relationships among diverse cellular processes for future research.
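
The abstract does not say which statistical test was used to call an overlap significant. As a minimal sketch only, the code below assumes a one-sided hypergeometric test, a common choice for gene-list overlap, and builds the edge list of an undirected network from it. The function names, the toy gene lists, the universe size of 1,000 genes, and the 0.05 threshold are all hypothetical and are not taken from the thesis.

```python
from itertools import combinations
from math import comb


def overlap_p_value(n_universe, size_a, size_b, n_shared):
    """Upper-tail hypergeometric probability of seeing at least
    n_shared common genes when two lists of the given sizes are
    drawn independently from a universe of n_universe genes."""
    total = comb(n_universe, size_b)
    upper = min(size_a, size_b)
    tail = sum(
        comb(size_a, k) * comb(n_universe - size_a, size_b - k)
        for k in range(n_shared, upper + 1)
    )
    return tail / total


def build_overlap_network(gene_lists, n_universe, alpha=0.05):
    """Return undirected edges (name_a, name_b, p_value), one per pair
    of gene lists whose overlap is significant at level alpha.
    No multiple-testing correction is applied in this sketch."""
    edges = []
    for (name_a, genes_a), (name_b, genes_b) in combinations(gene_lists.items(), 2):
        shared = len(genes_a & genes_b)
        if shared == 0:
            continue  # no common genes, so no candidate edge
        p = overlap_p_value(n_universe, len(genes_a), len(genes_b), shared)
        if p < alpha:
            edges.append((name_a, name_b, p))
    return edges


# Hypothetical toy data: three gene lists keyed by an invented condition name.
toy_lists = {
    "heat": {f"g{i}" for i in range(1, 21)},      # g1..g20
    "drought": {f"g{i}" for i in range(11, 31)},  # g11..g30, shares 10 with "heat"
    "light": {f"g{i}" for i in range(500, 520)},  # disjoint from both
}
edges = build_overlap_network(toy_lists, n_universe=1000)
```

With 1,065 gene lists there are 566,580 possible pairs, so a real analysis would also need a correction for multiple testing before drawing edges; this sketch leaves that step out.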

Publisher

South Dakota State University

Rights Statement

In Copyright