Document Type
Dissertation - Open Access
Award Date
2024
Degree Name
Doctor of Philosophy (PhD)
Department / School
Electrical Engineering and Computer Science
First Advisor
Sung Shin
Abstract
This dissertation comprises two main sections: the first focuses on natural language processing (NLP) for extracting key information from scientific literature using large language models (LLMs), and the second addresses remote sensing for detecting natural disasters, such as floods, from satellite imagery using a multimodal approach. The first section investigates methods to enhance Transformer-based models in classifying and extracting information from biomedical scientific publications. Key contributions include developing a custom dataset for classification and Question Answering (Q&A) tasks, fine-tuning Transformer models such as Bidirectional Encoder Representations from Transformers (BERT), and addressing multi-span answer issues with the TAg-based Span Extraction (TASE) model. Additionally, this section explores improving classification performance by integrating citation practices into a graph model combined with existing LLMs. It also proposes FulltextAttention, a Recurrent Neural Network (RNN)-based hierarchical model, to overcome the limitations of the Transformer model’s self-attention mechanism and input sequence length constraints. The second section explores flood detection in satellite imagery through classification and semantic segmentation. Higher classification performance was achieved using a multimodal approach, combining Convolutional Neural Network (CNN) models with BERT or knowledge-graph-based Graph Convolutional Network (GCN) models. For semantic segmentation, this section aims to detect flooded areas at the pixel level, proposing various multimodal concepts to improve performance over existing U-Net-based CNN models. A custom dataset was developed due to the lack of an existing benchmark, with detailed dataset development and performance analysis included. This dissertation highlights performance improvement studies on existing LLMs, dataset development, and multimodal approaches for flood detection in satellite images.
It also discusses limitations and future research directions. For LLM research, dominated by large tech companies, alternative approaches like RNNs may enhance performance. For multimodal flood detection, future efforts should focus on models that maximize extraction of visual, spatial, and temporal features from satellite images, and on fine-tuning the Contrastive Language-Image Pre-Training (CLIP) model for improved zero-shot classification.
Library of Congress Subject Headings
Natural language processing (Computer science)
Scientific literature.
Artificial intelligence.
Natural resources -- Remote sensing.
Floods.
Remote-sensing images -- Data processing.
Publisher
South Dakota State University
Recommended Citation
Jang, Youngsun, "Optimizing Large Language Models and Multimodal Approaches for Biomedical Publication and Satellite Imagery" (2024). Electronic Theses and Dissertations. 1156.
https://openprairie.sdstate.edu/etd2/1156