Open PRAIRIE: Open Public Research Access Institutional Repository and Information Exchange - SDSU Data Science Symposium: Generative AI for Synthetic Data Creation: Building Mastery-Focused Educational Datasets
 

Presentation Type

Poster

Student

Yes

Track

Other

Abstract

Synthetic data is artificially generated data that mimics the statistical properties of real-world data without exposing sensitive information. It is used in analysis, research, and deployments. Educational technology (EdTech) is an area where synthetic data can help address data scarcity, privacy concerns, regulatory compliance, bias, data quality, data integrity, and cost. Our research aims to generate synthetic educational datasets by leveraging generative AI techniques such as autoencoders, variational autoencoders, and Copula-GAN. Our experimental results show significant progress in generating educational datasets and compare the distributions of the synthetic and real data.
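The abstract does not say how the synthetic and real distributions were compared. As one illustrative possibility, the sketch below computes a two-sample Kolmogorov-Smirnov statistic for a single numeric column. The metric choice, the function name `ks_statistic`, and the toy "mastery score" samples are all assumptions made for illustration; none of them come from the poster, and the samples merely stand in for the output of a generator such as a variational autoencoder or Copula-GAN.

```python
import bisect
import random


def ks_statistic(real, synthetic):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of the two samples (0 = identical, 1 = disjoint)."""
    real_sorted = sorted(real)
    synth_sorted = sorted(synthetic)
    max_gap = 0.0
    # The gap between two step functions can only change at sample points.
    for x in real_sorted + synth_sorted:
        # Fraction of each sample that is <= x.
        cdf_real = bisect.bisect_right(real_sorted, x) / len(real_sorted)
        cdf_synth = bisect.bisect_right(synth_sorted, x) / len(synth_sorted)
        max_gap = max(max_gap, abs(cdf_real - cdf_synth))
    return max_gap


def clipped_gauss(rng, mean, std):
    """Draw a normal value and clip it to the [0, 1] score range."""
    return min(1.0, max(0.0, rng.gauss(mean, std)))


rng = random.Random(0)
# Hypothetical "real" mastery scores in [0, 1]; toy data, not from the study.
real_scores = [clipped_gauss(rng, 0.7, 0.1) for _ in range(500)]
# Stand-ins for generator output: one close match and one poor match.
good_synthetic = [clipped_gauss(rng, 0.7, 0.1) for _ in range(500)]
poor_synthetic = [rng.random() for _ in range(500)]

ks_good = ks_statistic(real_scores, good_synthetic)
ks_poor = ks_statistic(real_scores, poor_synthetic)
print(f"KS vs. close synthetic sample: {ks_good:.3f}")
print(f"KS vs. poor synthetic sample:  {ks_poor:.3f}")
```

A value near 0 suggests the synthetic column tracks the real one closely, while a value near 1 suggests the two barely overlap. A per-column check like this says nothing about relationships between columns, so it would normally be only one part of an evaluation.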

Start Date

2-7-2025 1:00 PM

End Date

2-7-2025 2:30 PM

Volstorff A
