Document Type

Dissertation - Open Access

Award Date


Degree Name

Doctor of Philosophy (PhD)


Mathematics and Statistics

First Advisor

Christopher P. Saunders


Keywords

atypicality, kernel, similarity scores


This dissertation develops a new probabilistic measure for categorization, referred to as atypicality. Given a set of known source objects, we compute a corresponding set of similarity scores between them. Assuming these scores follow a normal distribution, we estimate its parameters. We then introduce new trace objects and compute similarity scores for them. The main goal of the atypicality score is to determine whether the trace objects are similar to the source objects. To do this, we bootstrap many new scores using the parameters estimated from the source scores, compare the likelihood of these new scores with the likelihood of the scores belonging to the trace objects, and record how often the trace objects have the higher likelihood value. The resulting proportion, a number between zero and one, is the atypicality value; smaller values indicate that the trace objects are not similar to the source objects. The atypicality value can be used as a p-value in a hypothesis test whose null hypothesis states that the trace objects are similar to the source objects. The measure applies in a variety of settings, especially those with multiple trace objects in multi-dimensional space. In addition to the development and use of the atypicality measure, this dissertation shows the results when the objects and scores are not normal, discusses the power of atypicality, and provides a comparison to support vector machines.
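The procedure described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the dissertation's implementation: the similarity score is taken to be Euclidean distance, each trace object is scored against every source object, the pairwise source scores are treated as independent normal draws, and all function names (`pairwise_scores`, `trace_scores`, `normal_loglik`, `atypicality`) are hypothetical.

```python
import numpy as np


def normal_loglik(scores, mu, sigma):
    """Normal log-likelihood, summed over the last axis of `scores`."""
    z = (np.asarray(scores, dtype=float) - mu) / sigma
    return np.sum(-0.5 * z**2 - np.log(sigma) - 0.5 * np.log(2 * np.pi), axis=-1)


def pairwise_scores(objects):
    """Euclidean distance for every unordered pair of source objects (assumed score)."""
    diffs = objects[:, None, :] - objects[None, :, :]
    dists = np.sqrt((diffs**2).sum(axis=-1))
    i, j = np.triu_indices(len(objects), k=1)
    return dists[i, j]


def trace_scores(traces, sources):
    """Euclidean distance from each trace object to each source object (assumed)."""
    diffs = traces[:, None, :] - sources[None, :, :]
    return np.sqrt((diffs**2).sum(axis=-1)).ravel()


def atypicality(sources, traces, n_boot=2000, seed=0):
    """Proportion of bootstrapped score sets whose likelihood is no higher
    than the likelihood of the observed trace scores."""
    rng = np.random.default_rng(seed)

    # Estimate the normal parameters from the source scores.
    src = pairwise_scores(sources)
    mu, sigma = src.mean(), src.std(ddof=1)

    # Likelihood of the observed trace scores under the fitted model.
    obs = trace_scores(traces, sources)
    obs_ll = normal_loglik(obs, mu, sigma)

    # Bootstrap: simulate score sets of the same size from the fitted model.
    sims = rng.normal(mu, sigma, size=(n_boot, obs.size))
    sim_ll = normal_loglik(sims, mu, sigma)

    # Small values mean the trace scores are rarely as likely as simulated ones.
    return float(np.mean(obs_ll >= sim_ll))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    sources = rng.normal(size=(30, 3))
    far_traces = rng.normal(loc=10.0, size=(5, 3))
    print(atypicality(sources, far_traces))
```

Under these assumptions, trace objects placed far from the source cloud yield a value near zero, which would lead to rejecting the null hypothesis that the trace objects are similar to the source objects. A different similarity score or kernel could be substituted in `pairwise_scores` and `trace_scores` without changing the bootstrap step.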


Includes bibliographical references (pages 75-84)



Number of Pages



South Dakota State University


In Copyright - Educational Use Permitted