What is AI Intelligence?
This week we look at how AI is intelligent...and also how it is not.
Research Roundup
Jarvis Rocks
One of my favorite depictions of interacting with AI is Tony Stark talking to Jarvis as he discovers time travel. It actually feels a little bit like my own interactions with LLMs. But neither GPT nor Gemini is a sentient comic book AI. So, what are the limits?
As a research assistant, LLMs have genuinely leverageable strengths. While earlier versions of GPT would “hallucinate…fictional references” 36.0% of the time, state-of-the-art models are down to 5.4%, providing growing potential for literature searches both within and across fields, and GPT-4 exhibited an evolving capacity to acknowledge its fictions. Cutting-edge LLMs were also able to detect methodological issues in studies and generate synthetic data samples, at least for well-represented domains. But when data was limited, the heart of scientific exploration and discovery, no models “successfully predict…novel outcomes”.
Despite that final major limitation, or perhaps because of it, machine learning as research collaborator is a powerful mode. (I often imagine it as one of my grad students: knowing everything but understanding nothing.) This is far from a new idea. I used Mathematica and Maple to improve my mathematics back in the ’90s, and mathematicians had been doing the same for decades prior. A new paper proposes “a process of using machine learning to discover potential patterns and relations between mathematical objects, understanding them with attribution techniques and using these observations to guide intuition and propose conjectures”. The authors reveal how their approach “led to meaningful mathematical contributions on important open problems”.
Later this week I’ll share where AI falls down (reasoning), but its superhuman statistical learning can be a powerful complement to creative labor.
What Is AI Intelligence?
AI is true intelligence, but too many smart people naively mistake the intelligence it shares with us, and even exceeds us in (statistical, model-free learning), for robust reasoning and model-based learning. If you want to understand LLMs, start with what they are: auto-complete on a massive scale.
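To make “auto-complete on a massive scale” concrete, here is a toy sketch (my illustration, not from any paper cited here): a bigram model that predicts the next word purely from conditional frequencies in its training text. It is exactly the model-free, statistical learning described above, shrunk to a dozen words.

```python
from collections import Counter, defaultdict

# Tiny training "corpus" standing in for Internet-scale text.
corpus = "the cat sat on the mat and the cat slept and the cat purred".split()

# Count which word follows which: pure frequency, no understanding.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict(word):
    """Return the most frequently observed next word."""
    return bigrams[word].most_common(1)[0][0]

print(predict("the"))  # "cat": it followed "the" in 3 of 4 occurrences
```

Scale the corpus up by twelve orders of magnitude and condition on long contexts instead of one word, and you have the statistical heart of an LLM, with all its pattern-matching strengths and none of its own model of the world.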
Looking at LLMs as “next-word prediction over Internet text” suggests that their strengths derive from the probabilities of patterns in their training sets. If GPT, for example, were reasoning its way through problems rather than learning statistical patterns, then the same problem should be equally solvable whether presented in high- or low-probability language. But in fact, “GPT-4’s accuracy at decoding a simple cipher is 51% when the output is a high-probability sentence but only 13% when it is low-probability, even though this task is a deterministic one for which probability should not matter”. Before writing GPT off, know that novice humans make these same sorts of errors, mistaking surface details for deep structure.
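A quick sketch of why that cipher result matters (my illustration; I’m assuming a letter-shift cipher like rot13 as an example of “a simple cipher”): decoding is a fixed, mechanical procedure, so for a genuine reasoner the likelihood of the decoded sentence should be irrelevant.

```python
# Shift each letter k places around the alphabet; non-letters pass through.
def shift(text, k):
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord('a') if ch.islower() else ord('A')
            out.append(chr((ord(ch) - base + k) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

high_prob = "to be or not to be"   # common English word order
low_prob = "be to not or be to"    # same words, improbable order

# rot13 (k=13) decodes itself; the algorithm succeeds identically
# on both sentences, because probability plays no role in it.
for plain in (high_prob, low_prob):
    assert shift(shift(plain, 13), 13) == plain
```

The cited 51% vs. 13% gap is therefore a fingerprint of statistical pattern-matching: the model is leaning on how plausible the output text sounds, not on executing the deterministic procedure.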
The paper above isn’t a one-off: “current LLMs cannot perform genuine logical reasoning; they replicate reasoning steps from their training data”. The performance of all LLMs “significantly deteriorates as the number of clauses in a question increases”, as measured with a novel benchmark “that allows for the generation of a diverse set of questions”. Even adding “a single clause that seems relevant to the question causes significant performance drops (up to 65%) across all state-of-the-art models, even though the clause doesn't contribute to the reasoning chain needed for the final answer.”
LLM intelligence is real but (for now) only statistical and model-free. It is all pattern and no proof.
<<Support my work: book a keynote or briefing!>>
Want to support my work but don't need a keynote from a mad scientist? Become a paid subscriber to this newsletter and recommend to friends!
Weekly Indulgence
Here’s a photo from my whirlwind visit to #NYC 2 weeks ago. I…
- …chatted with JPMC about AI
- …spoke at the UN about “robot-proofing our kids”
- …was interviewed about #TheHumanTrust and postpartum depression by iHeartRadio, and
- …waxed eloquent after receiving the #StartOut Trailblazer award. “You’ve armed a dangerous woman!”
Stage & Screen
- How cool to get written up in Nature! We’ve been busting our asses at #Optoceutics, putting our time and money into science rather than marketing.
- October 23, Toronto: Let's spend the day together at Metropolitan University's Future of Work conference
- October 28-29, Rome: Are you as shocked as I am that this is my first ever visit to Italy? I'll be talking AI and Humans for the UN.
- November 4, Copenhagen: Novo Nordisk AI Day!
- December 7-8, London: Oxford International Speakers Panel: "What it means to be human in the age of AI"
Does your company, university, or conference just happen to be in one of the above locations and want the "best keynote I've ever heard" (shockingly spoken by multiple audiences last year)?
SciFi, Fantasy, & Me
I’m reading Dragonsteel Prime by Brandon Sanderson, a book he drafted long ago but never released. I admire the idea of sharing earlier versions of the ideas that populate his most popular books; it’s something of a writing lesson in reworking ideas until they…work.
Also, I finished The Legend of Vox Machina episodes. Was Chateau Shorthalt in campaign one? It feels a bit like the mage tower from campaign two. It’s also an idea for a magic item I’ve wanted for some time but which the rules of D&D don’t allow. In fact, you might say that the animated series plays up the potential fun and creativity of magic and action that D&D’s RAW obsession doesn’t support.
Vivienne L'Ecuyer Ming
| Follow more of my work at | |
|---|---|
| Socos Labs | The Human Trust |
| Dionysus Health | Optoceutics |
| RFK Human Rights | GenderCool |
| Crisis Venture Studios | Inclusion Impact Index |
| Neurotech Collider Hub at UC Berkeley | UCL Business School of Global Health |