Five universities just ran the same experiment on twelve hundred and twenty-two people. Half got ChatGPT. Half got nothing. Ten minutes later, they took the AI away and ran the test that matters. And what came back isn't a headline about AI replacing jobs or passing the bar exam. It's smaller. Quieter. And frankly, more unsettling.
Because the people who had the AI? They didn't get dumber. They got something else. Something you can't see happening while it happens. They lost the habit of staying in a hard problem. And that change, measured in minutes, is the thing a lot of parents, teachers, managers, and frankly anyone who makes a living thinking should probably read about today.
Here's what the data says, what the critics say back, and what to actually do about it.
The study you probably haven't heard about yet
The paper is titled "AI Assistance Reduces Persistence and Hurts Independent Performance." It went up on arXiv on April 7, 2026. The authors are Grace Liu at Carnegie Mellon, Brian Christian and Tsvetomira Dumbalska at Oxford, Michiel Bakker at MIT, and Rachit Dubey at UCLA. If Brian Christian's name rings a bell, that's because he's the guy who wrote The Alignment Problem. So this isn't five nobodies making a viral claim on Twitter. This is a serious team, with serious methodology, publishing a preprint that should get serious attention.
Here's the setup. Three randomized controlled trials. Real random assignment, real control groups, real pretests to make sure the AI group and the no-AI group had the same baseline ability before the experiment started. Total sample: 1,222 adults recruited on Prolific, paid a small fee, put through sessions that each lasted about fifteen minutes.
Two domains. The first two experiments used fraction problems. Stuff like "simplify five-sixths minus one-quarter, multiplied by three-fifths plus one-tenth." The third experiment used reading comprehension problems adapted from free SAT practice materials. Participants in the AI condition had ChatGPT available during the "learning" phase of the session. Then, at the end, the AI was taken away and everyone had to solve new problems on their own. That unassisted test was the thing the authors actually cared about.
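To feel the texture of the task, it's worth working that example once by hand (the worked steps here are mine, not the paper's): five-sixths minus one-quarter is ten-twelfths minus three-twelfths, or seven-twelfths; three-fifths plus one-tenth is six-tenths plus one-tenth, or seven-tenths. Multiply and you get:

(5/6 − 1/4) × (3/5 + 1/10) = (7/12) × (7/10) = 49/120

Not hard. But just enough steps that you have to hold the thread, which is exactly the kind of light friction the study is about.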
The results show up in three figures, and the shape is the same in every one. During the AI phase, the AI group does great. They solve more problems. They give up less often. The AI is working the way AI is supposed to work. But the moment the AI gets unplugged, something breaks. The AI group solves fewer problems than controls. They skip more. And in the reading comprehension experiment, they skip at eight times the rate of the group that never touched AI in the first place.
That ratio again, because it's worth sitting with for a second: eight times.
The numbers, in plain English
Experiment one used 354 people solving fractions. The AI group, tested without AI, scored 57 percent on their final math problems. The control group scored 73 percent. That's a gap statisticians describe with the phrase "Cohen's d of minus zero point four two," which is researcher-speak for "this is a real, moderate-sized effect, and it's very unlikely to be chance." The skip rate nearly doubled in the AI group: 20 percent skipped in the AI condition versus 11 percent in the control. Same problems. Same difficulty. Only difference: ten minutes ago, one group had a chatbot, and the other didn't.
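For anyone who hasn't met Cohen's d before, the formula behind the jargon is simple, and the definition below is the standard one, not anything specific to this paper: take the gap between the two groups' average scores and divide it by the pooled standard deviation of those scores.

d = (mean of AI group − mean of control group) / pooled standard deviation

So a d of minus 0.42 says the AI group's average landed about four-tenths of a standard deviation below the control group's, which is conventionally read as a moderate effect.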
Experiment two took the same design and added a mechanism probe: 667 people, again on fractions, but this time the researchers asked each participant how they'd used the AI. Did they ask for direct answers? Ask for hints? Ignore it? Sixty-one percent said they'd used the AI primarily to get direct solutions. Those are the people who took the biggest hit on the follow-up test: their test solve rate dropped 10 percentage points relative to their own pretest baseline, while the control group barely moved from theirs. Twenty-seven percent asked for hints instead, and that group saw a smaller drop. And the 12 percent who barely used the AI at all scored essentially like the controls, which confirms the dose-response pattern: the more you relied on the model, the bigger the downstream hit to persistence.
Experiment three switched domains entirely: reading comprehension, 201 people. The AI-assisted group solved 76 percent of test problems unassisted. Controls hit 89 percent. The skip rate difference was the most dramatic in the whole paper: eight percent in the AI condition versus one percent in the control. The effect size on the solve rate was minus 0.42 again, the p-value 0.007, and the skip-rate gap on its own came out at a Cohen's d of 0.42. If this were a medical trial, these are the kinds of numbers you'd take to the FDA.
And the window for all of this, remember, was about ten to fifteen minutes of AI exposure per session.
Why the title uses the word persistence
Read the abstract once, then read it again. The authors are very careful about the framing. They do not say AI makes people dumber. They do not say it lowers IQ. They say it reduces persistence. It "impairs unassisted performance." It trains people to quit on hard problems.
That distinction matters. Because persistence isn't some vague virtue. Cognitive psychology has been writing about this for decades. Productive struggle is the actual mechanism by which humans build new skills. When you sit in a hard problem and don't know what to do and stay with it anyway, your brain is doing the thing it was designed to do. It's stretching. It's building capacity to handle the next harder problem. And critically, that's also when you learn what you're capable of. The research calls this "metacognitive calibration." It's how you develop an accurate sense of your own ability, which is itself the scaffolding for long-term persistence.
The uncomfortable claim the paper is making is this: current AI assistants are specifically engineered to collapse that process. The products are optimized for instant helpful answers. Saying no is not in the product spec. There's no setting in ChatGPT that says "don't help me yet, let me sit with this." So the user's cognitive rhythm keeps getting interrupted by the same bargain: you feel stuck, you prompt, you get an answer, the feeling of being stuck goes away.
Repeat that bargain enough times and a new internal rule quietly installs itself. When something is hard, the answer is one prompt away. And when the researchers unplugged the AI and that one prompt was suddenly unavailable, the brain didn't switch back to the "stay in the hard problem" mode. It kept following the new rule to its dead end. It gave up earlier. It skipped more. On identical-difficulty problems.
The boiling frog angle
The outlet Futurism picked this up and gave it the boiling frog framing, which is probably the most honest media read of the finding. Frogs tossed into boiling water jump out. Frogs sitting in slowly warming water allegedly don't. The metaphor is literally false as a piece of biology, but it's a useful piece of psychology. The thing that makes the study scary isn't that ten minutes of ChatGPT breaks your brain. It's that the effect is subtle enough, in a single session, that participants didn't seem to notice it had happened to them.
If ten minutes is enough to shift the behavior of a thousand people in a measurable way, the natural next question is what happens over months of daily use. What happens when the ChatGPT browser tab is open during every work session, when students are using AI to write every essay, when junior analysts are outsourcing every synthesis task. The paper doesn't claim to have measured any of that. The authors are explicit. Effects were observed in single 10-to-15-minute sessions, and whether those effects accumulate longitudinally is something they haven't yet tested.
But the direction of the finding creates a forward-looking question that has to be asked. If short-term exposure nudges persistence down, does sustained exposure drive it down farther? Does it recover when AI is removed? Does it compound across domains, so that giving up on a hard prompt translates into giving up on a hard spreadsheet, and eventually giving up on a hard conversation?
Those are hypotheses, not conclusions. And the honest answer is: we don't know yet. A careful read of the paper gives you confidence that the direction is real. It does not give you confidence about the magnitude over time. That caveat should live next to every conversation about this research.
The strongest counter-arguments, given fairly
It would be dishonest to finish this post without addressing the real pushback on this study, so here it is, laid out seriously.
The first objection is that the tasks were narrow. Fraction problems and reading comprehension passages from SAT practice materials are specific, constrained cognitive tasks. They're not creative writing. They're not coding. They're not synthesis, strategy, or decision-making. A skeptic can fairly argue that the persistence effect shown in the paper may not generalize to the kinds of real-world work most adults actually do with AI. Maybe for open-ended creative tasks, AI increases persistence rather than decreasing it, because the model helps you get past blank-page paralysis and into momentum. That's a legitimate hypothesis the paper doesn't rule out.
The second objection is that the time window is too short to draw strong conclusions. Fifteen-minute sessions are not the same as months of daily use. An optimist could argue the effect is a one-time artifact: participants had to re-acclimate to the "no AI" condition, and once they recalibrated to the new task environment, their performance would recover. The researchers don't dispute this. They explicitly flag it as a limitation. If a follow-up study tests the same people a week later, in both conditions, the effect could partially or fully wash out.
The third objection is about the hint-versus-answer distinction. The paper actually supplies its own strongest counter-argument here. The participants who used the AI for hints, rather than direct answers, saw a meaningfully smaller drop in downstream performance. That's an important nuance. It suggests the effect isn't "AI is bad." It's "using AI as a crutch is bad." The same model, used the same way a good tutor would use it, produces a much smaller persistence cost. A pro-AI skeptic can fairly say: the study doesn't prove AI is harmful. It proves that asking AI to do your thinking for you is harmful. Which, honestly, isn't a surprise.
The fourth objection is that this is a preprint. It has not yet gone through peer review. It has not yet been replicated by independent labs. The authors are careful. Their methodology looks clean. But history has plenty of preprints that reported impressive effect sizes and then failed to hold up once the broader research community got its hands on the data. Journalistic prudence says: treat the finding as provisional. Don't ban ChatGPT from your school district on the strength of one preprint. Wait for replication.
Each of those objections is real. None of them, on their own, invalidates the study. But together they paint a picture that's more nuanced than "AI is rotting your brain in ten minutes." The honest summary is closer to: when the AI does the cognitive work for you, your own capacity for that cognitive work appears to weaken, even over short exposures, in controlled conditions, in at least two domains. The magnitude, durability, and generality of that effect are still open questions. Anyone who tells you otherwise is either overclaiming or selling something.
What the people building these systems should hear
The closing line of the paper is worth quoting, because it's quietly the most pointed thing in it. The authors write that AI model development should prioritize scaffolding long-term competence alongside immediate task completion. Read that again. They are not calling for AI to be banned. They are not calling for regulation. They are calling on the people who build these products to change what the product is optimized for.
Right now, a large language model is optimized to be maximally helpful in the moment. That's what the system prompts say. That's what RLHF rewards. That's what the product managers measure. Helpful, on the marginal prompt, today. The study suggests that function is, over time, in direct tension with another function that nobody is currently optimizing for, which is: does this make the user more capable a year from now, or less? Is the user growing, or are they atrophying?
Those two functions point in different directions. A great mentor knows when not to help. A chatbot, as currently shipped, does not. It cannot. The product has no concept of when the right answer to your question is another question. It has no ability to say "I know you want the answer, but I think you're close enough that you should try five more minutes first." Because if it did, users would churn to the product that didn't.
That's the real structural issue the paper points at. The incentives of the AI market, as currently structured, are aligned against user cognitive growth. They are aligned with instant satisfaction, which is the same shape as the incentive structure of every addictive consumer product in the last thirty years. The difference is, this one lives at the center of how we think. Slot machines didn't reshape how you plan your week. Infinite scroll didn't rewire how you approach a hard problem. This might.
The 10-Minute Rule: what actually works
Okay. Assuming the study is directionally right, and assuming you believe cognitive capacity is worth protecting, what does the research actually tell you to do?
The answer is small and specific, which is usually a good sign. The strongest signal in the paper is the difference between using AI for direct answers versus using it for hints. The direct-answer users lost persistence. The hint users lost less. So the first rule is: change how you use the model, not whether you use it.
The simplest way to operationalize that, and the one I'd give a teenager or a junior employee tomorrow, is the 10-Minute Rule. When you hit a hard problem, the kind you'd normally paste straight into ChatGPT, don't. Not yet. Set a timer. Spend ten minutes with just you and the problem. Paper. Pencil. No tabs open. No music designed to "help you focus." Just the gap between what you know and what the problem wants.
Ten minutes is deliberately calibrated. It's long enough that you'll get uncomfortable. It's short enough that it's not a heroic commitment. At the end of ten minutes, one of three things will have happened. Sometimes the answer comes. Sometimes you realize you were asking the wrong question. And sometimes you're still stuck, which is fine. At ten minutes, you bring the AI in. But now you bring it in with a real question, shaped by ten minutes of friction, and the answer you get will teach you something, because you did the work to earn the context that makes the answer useful.
That's the practical move for an individual.
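If it helps to make the ritual mechanical, here's a trivial sketch of the idea in Python. It's not anything from the paper, just one possible countdown you could run before you let yourself open the chatbot tab:

```python
# ten_minute_gate.py -- a minimal timer for the 10-Minute Rule.
# Not from the paper; just one way to enforce the ritual on yourself.
import time

MINUTES = 10  # long enough to get uncomfortable, short enough to be doable

print("Hard problem? Pencil, paper, no tabs. The clock starts now.")
for remaining in range(MINUTES * 60, 0, -1):
    mins, secs = divmod(remaining, 60)
    # '\r' rewrites the same terminal line so the countdown ticks in place
    print(f"\r{mins:02d}:{secs:02d}  stay in the problem", end="", flush=True)
    time.sleep(1)
print("\nTime. If you're still stuck, you've earned the prompt.")
```

The point isn't the script. The point is that the gate runs before the prompt, not after.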
For parents and teachers, the prescription is adjacent. Don't ban the tool. That's a losing battle and, frankly, the wrong one. Instead, build rituals where the first draft or the first attempt is protected from AI. If a kid writes an essay, ask to see what they had before they prompted the model. If a student works a problem set, require one page of handwritten scratch work. The ten-minute rule at a smaller granularity. Protect the gap.
For managers, the prescription is cultural. If your team reviews AI outputs but doesn't review the thinking behind them, you are slowly training an organization of copyeditors. Build review rituals that surface the prompt, the thinking, and the decision, not just the output. The best managers, historically, have been the ones who protected time for their people to think. The managers who do this well in the next decade will be the ones who protect time for their people to think without AI at their elbow.
For the product teams at OpenAI, Anthropic, Google, and the rest, the prescription is a design question that no one in the AI industry has really tried to answer yet. What does a helpful AI look like that also knows when not to help? What does a model look like that says "hold on, try this on your own for a minute, and come back"? There's no clean answer. But the paper, correctly, puts the question on the table.
The bigger frame, and the only honest closing line
Every technology that reshaped how humans think produced a version of this worry. Socrates thought writing would weaken memory. The printing press was accused of devaluing deep reading. Calculators were going to end numeracy. The internet was going to end attention. Some of those worries were wrong. Some were right. Most were somewhere in between.
What's different about this one, and what makes the Grace Liu paper worth paying attention to, is that the thing changing is upstream of all of that. It's not memory. It's not reading. It's not attention. It's the capacity to stay in a hard problem. And that capacity is the meta-skill that underlies every other cognitive skill humans have. If that's the thing eroding in ten-minute increments, across hundreds of millions of users, every day, then the long-run stakes are real.
The paper doesn't prove that's happening. It proves a narrower thing, in controlled conditions, at a small time scale. But the narrower thing is enough to take seriously. You don't have to catastrophize. You don't have to delete ChatGPT. You just have to build one small habit that protects the gap. Ten minutes, pencil, paper, no tabs. Then the AI.
Try it once today. Not next week. Today. On the next problem that feels a little too hard. Time it. Watch what happens to your brain when you give it permission to sit in the uncomfortable part before asking for help.
Then come back and argue with me about it.
That's the kind of argument this technology, and this moment, and frankly your own mind, deserves.