Image: Violet Dashi/Nadia and Simple Line/Adobe Stock
Even though they were created by humans, artificial intelligence models still hold more mystery than the Knives Out franchise.
The AI industry has dubbed this the “black box” problem – essentially, these models have a tendency to do things that aren’t easily explained by the people observing them (e.g., hallucinations).
However, newly published research from Anthropic, one of the top companies in the AI industry, attempts to shed some light on that topic by breaking down how one of its AI models, Claude Sonnet, actually functions.
How artificial intelligence works: AI systems are loosely modeled on the human brain – layered neural networks that take in and process information, then make predictions (or “decisions”) based on that information.
On a granular level, what an AI model is “thinking” before it responds is represented by neuron activations – long lists of numbers with no obvious meaning.
When zooming out, however, Anthropic’s researchers were able to match Claude’s patterns of neuron activations, called features, to human-interpretable concepts. They were also able to identify features “close” to each other – basically, how the model associates one feature with another.
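One simple way to picture “closeness” between features is to treat each one as a vector of activations and compare directions. The sketch below uses cosine similarity and entirely made-up activation vectors (the concept names echo examples from Anthropic’s research, but these numbers are hypothetical):

```python
import math

def cosine_similarity(a, b):
    # 1.0 means the activation patterns point the same way; 0.0 means unrelated
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical activation patterns for three "features"
golden_gate_bridge = [0.9, 0.1, 0.8, 0.0]
alcatraz           = [0.8, 0.2, 0.7, 0.1]  # a related landmark
tax_returns        = [0.0, 0.9, 0.1, 0.8]  # an unrelated concept

print(cosine_similarity(golden_gate_bridge, alcatraz))     # high: "close" features
print(cosine_similarity(golden_gate_bridge, tax_returns))  # low: "far" features
```

In this toy version, the two landmark features score near 1.0 while the unrelated pair scores near 0 – a rough stand-in for how researchers can tell which of Claude’s features cluster together.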
Bottom line: While a good bit of research is still needed to unpack how AI actually works, Anthropic’s research is the deepest to date – and one small step for man in that direction.