# A Local False Discovery Rate based Assessment of Forensic and Biometric Matching System Capacity

## Presentation Type

Poster

## Student

Yes

## Track

Forensic Statistics

## Abstract

In biometric verification tasks, we typically compare two biometric samples: one associated with a query object and one associated with control sample(s) collected from a known or specified biometric source. Biometric individuality refers to the rate at which we encounter biometric samples from two distinct sources that are indistinguishable with respect to a biometric comparison technique; this rate is typically used to characterize the corresponding system’s capacity. One proposed measure of the quality of a biometric identifier used for individualization is the random match probability (RMP), which measures the chance of observing two individuals in a relevant source population that are indistinguishable with respect to that biometric. The RMP can be investigated empirically by applying the biometric identifier to a representative sample. Although an empirical study cannot “prove” the uniqueness of the sources, it can potentially be used to show that the chance of observing two individual sources with indistinguishable profiles is very small.

For this research, we use the small arms propellants (SAP) powder dataset, focusing on the mean particle size of each sample as measured by particle perimeter. We then carry out all pairwise comparisons between the powder samples and compare those means. The smokeless powders dataset represents 9 different distributors and a total of 154 unique brands. A single sample of particles was collected from each brand and analyzed, providing size and shape measurement features. The dataset consists of 39,944 rows and 11 columns. Each row corresponds to one SAP particle, and each column corresponds to a feature recorded for that particle, such as Distributor, Brand, Perimeter, and additional morphometric measurements. We work with the raw similarity scores between two samples of traces, assuming that if a similarity score arises from two indistinguishable sources, then the score follows a uniform distribution. This is analogous to the use of p-values in statistical hypothesis testing. Applying the Shapiro-Wilk test to the particles within each source allows us to characterize the proportion of sources whose measurements are consistent with a normal distribution. Mixture modeling based on ECDF goodness-of-fit statistics, such as a minimum Cramér-von Mises criterion, is used to estimate the chance that two randomly selected sources have uniformly distributed similarity scores, given that they are indistinguishable. With these statistical methods, we can characterize distribution patterns and assess whether indistinguishable sources exist.
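The scoring and mixture steps described above can be illustrated with a minimal Python sketch. It is not the analysis code behind this abstract, and it rests on several assumptions that the abstract does not state: the similarity score is taken to be a two-sided large-sample z-test p-value for equal mean perimeter, the non-uniform mixture component is taken to be Beta(a, 1) with a < 1, the minimum Cramér-von Mises fit is done by a coarse grid search, and the data are simulated rather than drawn from the SAP dataset. All function names are hypothetical. The Shapiro-Wilk step is left out because it needs a statistics library beyond the standard library.

```python
import itertools
import math
import random
import statistics


def simulate_sources(n_sources=30, n_shared=10, n_particles=200, seed=0):
    """Synthetic perimeters: the first n_shared sources share one mean (assumption)."""
    rng = random.Random(seed)
    samples = []
    for k in range(n_sources):
        mean = 10.0 if k < n_shared else rng.uniform(5.0, 15.0)
        samples.append([rng.gauss(mean, 2.0) for _ in range(n_particles)])
    return samples


def pairwise_scores(samples):
    """Two-sided z-test p-value for equal means, for every pair of sources."""
    summary = [(statistics.fmean(x), statistics.variance(x) / len(x)) for x in samples]
    scores = []
    for (m1, v1), (m2, v2) in itertools.combinations(summary, 2):
        z = (m1 - m2) / math.sqrt(v1 + v2)
        scores.append(math.erfc(abs(z) / math.sqrt(2.0)))
    return scores


def cvm_distance(sorted_scores, pi0, a):
    """Cramer-von Mises distance between the ECDF and pi0*s + (1 - pi0)*s**a."""
    n = len(sorted_scores)
    total = 1.0 / (12.0 * n)
    for i, s in enumerate(sorted_scores, start=1):
        model_cdf = pi0 * s + (1.0 - pi0) * s ** a
        total += (model_cdf - (2 * i - 1) / (2.0 * n)) ** 2
    return total


def min_cvm_fit(scores, grid=50):
    """Grid search for the (pi0, a) pair that minimises the distance."""
    s = sorted(scores)
    candidates = ((i / grid, j / grid) for i in range(grid + 1) for j in range(1, grid))
    return min(candidates, key=lambda p: cvm_distance(s, p[0], p[1]))


def local_fdr(score, pi0, a):
    """Posterior probability that a score came from the uniform component."""
    score = max(score, 1e-300)  # avoid 0.0 raised to a negative power
    density = pi0 + (1.0 - pi0) * a * score ** (a - 1.0)
    return pi0 / density


if __name__ == "__main__":
    scores = pairwise_scores(simulate_sources())
    pi0, a = min_cvm_fit(scores)
    print("estimated uniform weight:", pi0, "alternative shape:", a)
    print("local fdr at score 0.5:", local_fdr(0.5, pi0, a))
```

Under these assumptions, the fitted weight `pi0` plays the role of the proportion of source pairs whose scores behave as uniform, and `local_fdr` gives a per-score posterior for that component. A different score definition or alternative component would change both estimates.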

## Start Date

2-6-2024 1:00 PM

## End Date

2-6-2024 2:00 PM

A Local False Discovery Rate based Assessment of Forensic and Biometric Matching System Capacity

Volstorff A