# Atypicality Based Measures for the Identification of Counterfeit Aspirin

#### Abstract

This work focuses on building a pattern recognition system for chemometric data arising from LC-MS/MS analysis of brand-name aspirin. The original goal was to build a set of discriminant functions that separate known brands of manufactured aspirin, similar to linear discriminant analysis (LDA), which finds a set of projections that optimize linear separation between classes relative to within-class variation [1]. However, such functions must assign each observation to a known class, even if the likelihood of the observation having arisen from any of the classes is very small. As an alternative, we investigate atypicality measures as a way around this issue. The atypicality of an observation with respect to a given population (or class) is the probability that a new sample drawn from that class has a greater likelihood than the observation under consideration, so values near 1 indicate an observation that is unusual for the class. A pattern recognition system based on atypicality measures assigns an observation to the class with the smallest atypicality, provided that this atypicality does not exceed some threshold. An observation whose smallest atypicality exceeds the threshold is flagged as unlikely to belong to any known class. We perform a simulation study comparing the effectiveness of atypicality-based methods with LDA and quadratic discriminant analysis (QDA) when the assumptions of the discriminant functions are satisfied, and then apply the three methods to a chemometric data set from the analysis of aspirin pills.
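As an illustration of the atypicality rule described above, the following is a minimal sketch and not the withdrawn paper's implementation. It assumes each class is multivariate Gaussian with known mean and covariance, in which case the atypicality of an observation equals the chi-square CDF (with degrees of freedom equal to the dimension) evaluated at its squared Mahalanobis distance. The function names, the class labels, and the 0.99 threshold are hypothetical choices made for this example.

```python
import math
import numpy as np


def chi2_cdf(x, df):
    """Chi-square CDF via the power series for the regularized lower
    incomplete gamma function (adequate for moderate x)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a          # n = 0 term of sum z^n / (a (a+1) ... (a+n))
    total = term
    n = 1
    while term > 1e-15 * total and n < 10000:
        term *= z / (a + n)
        total += term
        n += 1
    return min(1.0, total * math.exp(-z + a * math.log(z) - math.lgamma(a)))


def atypicality(x, mean, cov):
    """Probability that a draw from N(mean, cov) has higher density than x.

    Assumption: mean and cov are treated as known, so the squared
    Mahalanobis distance of a draw follows a chi-square distribution.
    """
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    d2 = float(diff @ np.linalg.solve(np.asarray(cov, dtype=float), diff))
    return chi2_cdf(d2, df=diff.size)


def classify(x, classes, threshold=0.99):
    """Return (label, scores); label is None when even the least atypical
    class exceeds the (hypothetical) threshold."""
    scores = {name: atypicality(x, m, c) for name, (m, c) in classes.items()}
    best = min(scores, key=scores.get)
    return (best if scores[best] <= threshold else None), scores


# Hypothetical two-feature example with made-up class parameters.
classes = {
    "brand_A": ([0.0, 0.0], np.eye(2)),
    "brand_B": ([3.0, 3.0], np.eye(2)),
}
label, scores = classify([0.5, 0.2], classes)
print(label, scores)
```

Unlike an LDA or QDA rule, `classify` can return `None`, which corresponds to flagging an observation as not belonging to any known class. When the class parameters are estimated from training data rather than known, the chi-square reference distribution is only an approximation.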

*This paper has been withdrawn.*


Volstorff A
