Title

Session 3: Methods - Using Atypicality to Identify Outliers

Presenter Information/Coauthors Information

Austin O'Brien, Dakota State University

Presentation Type

Event

Abstract

This presentation outlines the development and use of atypicality, a probabilistic measure for outlier detection. Given a set of objects, we can create a corresponding set of similarity scores between them. Assuming the scores are normally distributed, we can estimate the parameters of the score distribution. We compute atypicality by comparing the likelihood of an object under these estimated parameters to the likelihoods of bootstrapped samples. The atypicality measure then serves as a p-value in a hypothesis test: the null hypothesis states that the object in question is similar to the remaining objects, and the alternative hypothesis states that the object is an outlier. The approach applies to a variety of settings, especially those involving multiple objects in a multi-dimensional space.
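
The abstract does not give formulas, so the following Python sketch is only one possible reading of the procedure, not the presenter's implementation. It assumes (hypothetically) that the score between two objects is their Euclidean distance, that the normal parameters are estimated from the scores among the remaining objects, and that the p-value is the share of bootstrapped score samples whose log-likelihood is no higher than that of the object in question. The function names, the bootstrap count, and the add-one correction are illustrative choices.

```python
import numpy as np


def normal_log_likelihood(scores, mu, sigma):
    """Log-likelihood of the scores under a Normal(mu, sigma) model."""
    z = (scores - mu) / sigma
    return float(np.sum(-0.5 * z**2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)))


def atypicality(points, index, n_boot=500, seed=0):
    """Bootstrap p-value for 'points[index] is similar to the remaining objects'.

    Assumption: the score between two objects is their Euclidean distance.
    Small return values suggest that the object is an outlier.
    """
    rng = np.random.default_rng(seed)
    rest = np.delete(points, index, axis=0)
    k = len(rest)

    # Scores among the remaining objects (upper triangle of the distance matrix).
    pairwise = np.linalg.norm(rest[:, None, :] - rest[None, :, :], axis=2)
    background = pairwise[np.triu_indices(k, k=1)]

    # Estimate the parameters of the assumed normal score distribution.
    mu, sigma = background.mean(), background.std(ddof=1)

    # Likelihood of the scores between the object in question and the rest.
    object_scores = np.linalg.norm(rest - points[index], axis=1)
    observed = normal_log_likelihood(object_scores, mu, sigma)

    # Likelihoods of bootstrapped samples of k background scores each.
    boot = np.array([
        normal_log_likelihood(rng.choice(background, size=k, replace=True), mu, sigma)
        for _ in range(n_boot)
    ])

    # Share of bootstrap likelihoods at or below the observed one (add-one corrected).
    return (1 + int(np.sum(boot <= observed))) / (n_boot + 1)


if __name__ == "__main__":
    data_rng = np.random.default_rng(42)
    cluster = data_rng.normal(size=(30, 2))           # 30 similar objects in 2-D
    points = np.vstack([cluster, [[25.0, 25.0]]])     # plus one distant object
    print("distant object:", atypicality(points, index=30))
    print("cluster member:", atypicality(points, index=0))
```

In this sketch a small value for an object argues against the null hypothesis that the object is similar to the rest. Distances are not exactly normal, so the normal model is used here only because the abstract assumes it.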

Start Date

February 12, 2018, 11:00 AM

End Date

February 12, 2018, 12:00 PM


Feb 12th, 11:00 AM Feb 12th, 12:00 PM


University Student Union: Dakota Room 250 A/C
