Variable Screening of Social Determinants of Health Data With End Stage Kidney Disease Risk Scores
Presentation Type
Poster
Student
Yes
Abstract
Variable screening has been shown to select important variables in sparse datasets while retaining their informative power. Sure Independence Screening (SIS) is one such method for large datasets. In this work, we apply variable screening to Social Determinants of Health (SDOH) data obtained from the Agency for Healthcare Research and Quality (AHRQ) database, screening against scores based on the predicted risk of mortality from End Stage Kidney Disease (ESKD), to identify the variables most important for predicting these risk scores. We performed the screening at the ZIP Code and county geographic levels on a dataset covering the entire United States of America and on a subset containing only the Indian Health Service Great Plains region, using variations of the SIS algorithm. Multiple variable screening models were fitted and then used to rank variables by their selection prevalence. Our results demonstrate that the variable screening process can identify important variables associated with the ESKD mortality risk score, and how those variables differ by region.
Start Date
2-7-2025 1:00 PM
End Date
2-7-2025 2:30 PM
Volstorff A
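The screening-and-ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which SIS variants were used or how the multiple models were generated. The sketch assumes the basic linear-model form of SIS (ranking features by absolute marginal Pearson correlation with the response) and assumes that the multiple fits come from random subsamples of the rows. The function names `sis_select` and `selection_prevalence`, and the parameters `d`, `n_runs`, and `frac`, are hypothetical.

```python
import numpy as np


def sis_select(X, y, d):
    """Return indices of the d features with the largest absolute
    marginal Pearson correlation with y (basic SIS for a linear model)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    # Assumption: constant columns (zero norm) are given correlation 0.
    denom = np.where(denom == 0, np.inf, denom)
    corr = np.abs(Xc.T @ yc) / denom
    return np.argsort(corr)[::-1][:d]


def selection_prevalence(X, y, d, n_runs=50, frac=0.8, seed=0):
    """Fraction of screening runs in which each feature is selected.

    Assumption: each run screens a random subsample of the rows; the
    abstract does not say how its multiple models were produced.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m = max(2, int(frac * n))
    counts = np.zeros(p)
    for _ in range(n_runs):
        rows = rng.choice(n, size=m, replace=False)
        counts[sis_select(X[rows], y[rows], d)] += 1
    return counts / n_runs


if __name__ == "__main__":
    # Synthetic stand-in data: only features 0 and 1 drive the response.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 500))
    y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=200)
    prevalence = selection_prevalence(X, y, d=10)
    ranking = np.argsort(prevalence)[::-1]
    print("Most frequently selected features:", ranking[:5])
```

Ranking by prevalence across runs, rather than by a single screening pass, favors variables that are selected consistently. Running the same procedure separately on a national dataset and on a regional subset would give two rankings that can be compared.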