# Development of Strategies for Estimating a Response Surface to Characterize a Black-box Algorithm in Terms of a White-box Algorithm

#### Abstract

In forensic identification-of-source problems, the complex black-box algorithms used to assign evidential value are increasingly difficult to explain. Generally speaking, black-box algorithms are designed with prediction in mind. Although the information fed into the algorithm and the features used to make the prediction are often known to the user, the complexity of the algorithm limits the end user's ability to understand how the input features are used. More transparent algorithms (sometimes referred to as “white-box”), on the other hand, are typically less accurate, even though they show directly how the input object is used to predict a class or outcome. In this work, we begin developing a response surface that characterizes the output of a black-box algorithm in terms of the output of a white-box algorithm. Using a set of handwriting samples, we apply a complex black-box algorithm across multiple features to produce one set of pairwise scores, and a simple, transparent algorithm that uses individual features to produce another set of pairwise scores. A generalized least squares method is used to test the null hypothesis that there is no relationship between the two types of scores. The outcome of the significance tests helps determine which of the individual feature scores influence the black-box scores.
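As a rough illustration of the kind of test described in the abstract, the sketch below regresses simulated black-box pairwise scores on two simulated per-feature white-box scores using generalized least squares. It is not the author's implementation. Everything in it is an assumption made for illustration: the function names (`gls_fit`, `simulate_and_fit`), the simulated data, the covariance structure (two pairwise scores are taken to be correlated with strength `rho` when they share a document), and the use of a large-sample normal approximation for the p-values.

```python
import itertools
import math
import random


def cholesky(a):
    """Lower-triangular L with a = L L^T for a symmetric positive-definite a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def forward_solve(L, b):
    """Solve L z = b for lower-triangular L."""
    z = []
    for i, b_i in enumerate(b):
        z.append((b_i - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z


def invert_spd(a):
    """Inverse of a symmetric positive-definite matrix via its Cholesky factor."""
    n = len(a)
    L = cholesky(a)
    # cols[c] is column c of L^{-1}; a^{-1} = L^{-T} L^{-1}.
    cols = [forward_solve(L, [float(r == c) for r in range(n)]) for c in range(n)]
    return [[sum(u * v for u, v in zip(cols[i], cols[j])) for j in range(n)]
            for i in range(n)]


def gls_fit(X, y, sigma):
    """GLS estimates, standard errors, and two-sided p-values (normal approx.)."""
    n, p = len(X), len(X[0])
    L = cholesky(sigma)
    # Whiten with L^{-1} so the transformed errors are uncorrelated, then run OLS.
    yw = forward_solve(L, y)
    Xw = [forward_solve(L, [row[j] for row in X]) for j in range(p)]  # by column
    xtx = [[sum(u * v for u, v in zip(Xw[i], Xw[j])) for j in range(p)]
           for i in range(p)]
    xty = [sum(u * v for u, v in zip(Xw[i], yw)) for i in range(p)]
    xtx_inv = invert_spd(xtx)
    beta = [sum(xtx_inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    resid = [yw[r] - sum(Xw[j][r] * beta[j] for j in range(p)) for r in range(n)]
    s2 = sum(e * e for e in resid) / (n - p)
    se = [math.sqrt(s2 * xtx_inv[j][j]) for j in range(p)]
    # H0 for each coefficient: no relationship (coefficient equals zero).
    pvals = [math.erfc(abs(b / s) / math.sqrt(2)) for b, s in zip(beta, se)]
    return beta, se, pvals


def simulate_and_fit(n_docs=10, rho=0.3, noise_sd=0.1, seed=0):
    """Simulate pairwise scores (hypothetical data) and fit the GLS model."""
    rng = random.Random(seed)
    pairs = list(itertools.combinations(range(n_docs), 2))
    # Assumed covariance: variance noise_sd^2, correlation rho if pairs share a document.
    sigma = [[noise_sd ** 2 * (1.0 if a == b else rho if set(a) & set(b) else 0.0)
              for b in pairs] for a in pairs]
    # Columns: intercept, white-box score for feature 1, white-box score for feature 2.
    X = [[1.0, rng.random(), rng.random()] for _ in pairs]
    L = cholesky(sigma)
    z = [rng.gauss(0.0, 1.0) for _ in pairs]
    noise = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(len(pairs))]
    true_beta = [0.5, 3.0, 0.0]  # feature 2 has no influence in this simulation
    y = [sum(b * x for b, x in zip(true_beta, row)) + e for row, e in zip(X, noise)]
    return gls_fit(X, y, sigma)


if __name__ == "__main__":
    beta, se, pvals = simulate_and_fit()
    for name, b, s, p in zip(["intercept", "feature 1", "feature 2"], beta, se, pvals):
        print(f"{name}: estimate={b:.3f}, se={s:.3f}, p={p:.3g}")
```

In this simulation only the first feature score drives the black-box score, so a small p-value is expected for that coefficient. In practice the covariance matrix would have to be estimated or modelled from the dependence among pairwise comparisons rather than assumed known, and a t reference distribution may be preferable for small samples.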

*This paper has been withdrawn.*


Volstorff A
