Unlocking AI’s Black Box: New Formula Explains How They Detect Relevant Patterns

Neural Network Algorithm Art Concept

A UC San Diego team has uncovered a method to decipher neural networks’ learning process, using a statistical formula to clarify how features are learned, a breakthrough that promises more understandable and efficient AI systems. Credit: SciTechDaily.com

The findings can also be applied to enhance the efficiency of various machine learning frameworks.

Neural networks have been powering breakthroughs in artificial intelligence, including the large language models that are now being used in a wide range of applications, from finance, to human resources to healthcare. But these networks remain a black box whose inner workings engineers and scientists struggle to understand. Now, a team led by data and computer scientists at the University of California San Diego has given neural networks the equivalent of an X-ray to uncover how they actually learn.

The researchers found that a formula used in statistical analysis provides a streamlined mathematical description of how neural networks, such as GPT-2, a precursor to ChatGPT, learn relevant patterns in data, known as features. This formula also explains how neural networks use these relevant patterns to make predictions.

“We are trying to understand neural networks from first principles,” said Daniel Beaglehole, a Ph.D. student in the UC San Diego Department of Computer Science and Engineering and co-first author of the study. “With our formula, one can simply interpret which features the network is using to make predictions.”

The team presented their findings in the March 7 issue of the journal Science.

Why does this matter? AI-powered tools are now pervasive in everyday life. Banks use them to approve loans. Hospitals use them to analyze medical data, such as X-rays and MRIs. Companies use them to screen job applicants. But it’s currently difficult to understand the mechanism neural networks use to make decisions and the biases in the training data that might impact this.

“If you don’t understand how neural networks learn, it’s very hard to establish whether neural networks produce reliable, accurate, and appropriate responses,” said Mikhail Belkin, the paper’s corresponding author and a professor at the UC San Diego Halicioglu Data Science Institute. “This is particularly significant given the rapid recent growth of SciTechDaily