Dissertation - Open Access
Doctor of Philosophy (PhD)
Mathematics and Statistics
Christopher P. Saunders
atypicality, kernel, similarity scores
This dissertation outlines the development and use of a new probabilistic measure for categorization, referred to as atypicality. Given a set of known source objects, we compute a corresponding set of similarity scores between them. Assuming these scores follow a normal distribution, we estimate its parameters. We then introduce new trace objects and compute similarity scores for them. The main goal of the atypicality score is to determine whether the new trace objects are similar to the source objects. To do this, we bootstrap many new scores using the parameters estimated from the source scores, and compare the likelihood of these new scores to the likelihood of the scores belonging to the trace objects, noting how often the trace objects have the higher likelihood value. The result is a number between zero and one, with smaller values indicating that the trace objects are not similar to the source objects. This is the atypicality value. It can be used as a p-value in a hypothesis test whose null hypothesis states that the trace objects are similar to the source objects. The measure applies to a variety of settings, especially those with multiple trace objects in multi-dimensional space. In addition to developing the measure, this dissertation shows the results when the objects and scores are not normal, discusses the power of atypicality, and provides a comparison to support vector machines.
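The procedure described in the abstract can be illustrated with a minimal Python sketch. This is only one reading of the abstract and not the dissertation's implementation: the function names (`normal_log_likelihood`, `atypicality`), the parameters `n_boot` and `seed`, the use of a joint log-likelihood over independent scores, and the treatment of ties are all assumptions made for illustration. The similarity scores are taken as given; the kernel that produces them is not shown.

```python
import math
import random
import statistics


def normal_log_likelihood(scores, mu, sigma):
    """Joint log-likelihood of independent scores under N(mu, sigma^2)."""
    n = len(scores)
    squared = sum(((s - mu) / sigma) ** 2 for s in scores)
    return -0.5 * squared - n * (math.log(sigma) + 0.5 * math.log(2 * math.pi))


def atypicality(source_scores, trace_scores, n_boot=2000, seed=0):
    """Hypothetical sketch: fraction of simulated score sets that are no more
    likely than the trace scores. Small values suggest atypical traces."""
    rng = random.Random(seed)

    # Estimate the normal parameters from the source scores
    # (requires at least two source scores with nonzero spread).
    mu = statistics.mean(source_scores)
    sigma = statistics.stdev(source_scores)

    trace_ll = normal_log_likelihood(trace_scores, mu, sigma)

    hits = 0
    for _ in range(n_boot):
        # Simulate as many scores as there are trace scores.
        simulated = [rng.gauss(mu, sigma) for _ in trace_scores]
        # Count how often the trace scores have the higher (or equal) likelihood.
        if trace_ll >= normal_log_likelihood(simulated, mu, sigma):
            hits += 1
    return hits / n_boot


if __name__ == "__main__":
    demo_rng = random.Random(1)
    source = [demo_rng.gauss(0.0, 1.0) for _ in range(200)]  # made-up scores
    print(atypicality(source, [0.1, -0.3, 0.4]))  # traces near the source mean
    print(atypicality(source, [6.0, 7.0, 8.0]))   # traces far from the source mean
```

Under these assumptions, trace scores close to the estimated mean yield a value near one, while trace scores far in the tails yield a value near zero, matching the abstract's reading of the result as a p-value.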
Includes bibliographical references (pages 75-84)
South Dakota State University
In Copyright - Educational Use Permitted
O'Brien, Austin, "A Kernel Based Approach to Determine Atypicality" (2017). Theses and Dissertations. 1711.