Document Type

Dissertation - Open Access

Award Date

2024

Degree Name

Doctor of Philosophy (PhD)

Department / School

Electrical Engineering and Computer Science

First Advisor

Sung Shin

Abstract

This dissertation comprises two main sections: the first focuses on natural language processing (NLP) for extracting key information from scientific literature using large language models (LLMs), and the second addresses remote sensing for detecting natural disasters, such as floods, from satellite imagery using a multimodal approach. The first section investigates methods to enhance Transformer-based models in classifying and extracting information from biomedical scientific publications. Key contributions include developing a custom dataset for classification and question answering (Q&A) tasks, fine-tuning Transformer models such as Bidirectional Encoder Representations from Transformers (BERT), and addressing multi-span answer issues with the TAg-based Span Extraction (TASE) model. Additionally, this section explores improving classification performance by integrating citation practices into a graph model combined with existing LLMs. It also proposes FulltextAttention, a Recurrent Neural Network (RNN)-based hierarchical model, to overcome the limitations of the Transformer model’s self-attention mechanism and input sequence length constraints. The second section explores flood detection in satellite imagery through classification and semantic segmentation. Higher classification performance was achieved using a multimodal approach, combining Convolutional Neural Network (CNN) models with BERT or knowledge-graph-based Graph Convolutional Network (GCN) models. For semantic segmentation, the work aims to detect flooded areas at the pixel level and proposes various multimodal concepts to improve performance over existing U-Net-based CNN models. A custom dataset was developed due to the lack of an existing benchmark, and its development and a performance analysis are described in detail. This dissertation highlights performance improvement studies on existing LLMs, dataset development, and multimodal approaches for flood detection in satellite images.
It also discusses limitations and future research directions. In LLM research, which is dominated by large tech companies, alternative approaches such as RNNs may enhance performance. For multimodal flood detection, future efforts should focus on models that maximize extraction of visual, spatial, and temporal features from satellite images, and on fine-tuning the Contrastive Language-Image Pre-Training (CLIP) model for improved zero-shot classification.

Publisher

South Dakota State University

Rights Statement

In Copyright