Document Type

Thesis - Open Access

Award Date

2026

Degree Name

Master of Science (MS)

Department / School

Electrical Engineering and Computer Science

First Advisor

Kwanghee Lee

Abstract

Procedural Content Generation (PCG) systems produce vast quantities of game levels, terrain, and environments, but lack semantic interfaces: no shared vocabulary exists between natural language, designer intent, and the structured representations generators operate on. This thesis addresses the language-content grounding gap in PCG through two complementary studies spanning semantic analysis and semantic synthesis. The first study introduces a group-supervised contrastive learning framework for semantic representation of symbolic PCG maps under many-to-one semantics, where visually distinct maps may share the same design intent. The framework combines parameter-guided semantic grouping, LLM-based caption augmentation, and a multi-positive contrastive objective that aligns language with sets of maps generated under shared control parameters. Evaluated on a Zelda-like grid environment against visual and statistical baselines, the model achieves 60.8% Semantic Recall@5 on in-distribution maps (versus 25.7% for the best baseline) and supports zero-shot inference of abstract gameplay attributes with 69.8% mean accuracy. The second study introduces T2TM (Text-to-Terrain in Minecraft), a five-stage knowledge distillation pipeline for generating natural terrain from free-form text prompts. Its core idea is a 16-by-16 character grid scaffold that encodes spatial layout in a text-compatible form, allowing compact Qwen3.5 student models (4B and 2B parameters) to learn schema structure and spatial reasoning conventions from a GPT-5.4 teacher over a 1502-sample corpus (1051/225/226 train/validation/test). A seven-condition ablation on the 226-sample test set reveals a capacity-dependent scaffolding effect: before fine-tuning, the grid degrades zero-shot performance for the stronger 4B model (46.0%→19.5%) but improves it for the weaker 2B model (8.4%→71.2%).
After distillation, both students achieve 100% build success with near-zero repair cost (0.027 fixes per sample for the 4B student and 0.018 for the 2B student). The 4B student matches the teacher on grid adherence and improves prompt satisfaction, while the 2B student remains competitive but loses some spatial precision and zone diversity. Together, these studies support a central thesis: bridging language and procedural game content requires progress in both semantic understanding and semantic generation. Reusable language-aligned representations are needed to analyze procedural artifacts, and structured generation pipelines are needed to translate language into spatially coherent content.

Publisher

South Dakota State University

Rights Statement

In Copyright