My 18-Month Battle With Expert Hallucinations

A cautionary tale of how both AI and human experts can be completely wrong

After 18 months of failed treatments and contradictory advice, I discovered something shocking: expert hallucination isn't limited to artificial intelligence. Sometimes, human experts are even worse at seeing what isn't there.

For three years, I was a running machine. 2,500 kilometers per year, 50 kilometers per week … rain or shine. Consistent as the Swiss train schedule.

Then, by the end of 2023, my body staged a coup.

A deep, nasty pain took up residence in the top of my left leg, high in the buttock region. The cause seemed blindingly obvious: I'd finally overdone it. My hamstring was giving up.

ChatGPT agreed. "Hamstring tendonitis," it declared with its usual confidence. Inflamed tendon. Classic overuse injury. Time to rest, buddy.

So I rested. And rested. And rested some more.

What followed was twelve months of the most maddening Groundhog Day imaginable. I'd take a break for months, feeling the pain fade like a bad memory. And then, believing I'd sufficiently healed, I'd lace up my running shoes and within five kilometers, my hamstring would remind me exactly why optimism is for suckers.

Back to the AI oracle I went. ChatGPT, Claude, whoever would listen—they all sang the same gospel: "hamstring tendonitis." Poor blood flow to tendons, they explained. Healing takes time. More rest. Always more rest.

I'd had this exact problem eight years earlier with the other leg. The sports doctor's advice back then: rest for a few months and do exercises to strengthen the weak muscles. The pattern was clear, the solution obvious, the outcome inevitable: another couple of months of doing absolutely nothing while my fitness evaporated like morning dew in the Caribbean.

By early 2025, after a full year of this medical Möbius strip, I finally cracked and consulted an actual human, expecting to hear the same mantra I'd heard eight years before. This time, though, the human was a certified physiotherapist specializing in manual therapy.

After examining me and painfully prodding my posterior until I questioned both his qualifications and his humanity, he delivered a diagnosis that would have made Freud proud: there was nothing wrong with my body. The problem, he announced, was in my head.

If I could run 2,500 kilometers per year, he argued, why would my body suddenly throw a tantrum over a few measly kilometers? He started asking the important questions: What happened in 2023? Any stress? Work problems? Relationship drama? Uncertainty about the future? Was I perhaps subconsciously projecting anxiety onto my glutes?

I'll give him points for creativity, but the whole thing reeked of hammer-and-nail syndrome. My friends call me a "mental flatliner"—so emotionally stable you could calibrate seismographs off my mood swings. More importantly, his psychosomatic theory couldn't explain why eight-hour flights and twenty-hour car rides turned my left buttock into a full-time complaints department.

That's when I spotted his tell. Sure enough, his credentials included an entire book he'd written about psychosomatic injuries. Every client walking through his door was apparently a potential case study for his pet theory. When your favorite tool is a psychological hammer, every aching body part looks like a repressed emotion.

I cancelled my remaining appointments.

So there I was, trapped between two brands of bullshit. The AIs insisted on rest while my tendon stubbornly refused to heal. The human expert wanted to psychoanalyze my butt. Neither approach was working, and I was getting tired of hobbling around with a sore arse.

It was time for some good old-fashioned rebellion against expert opinion.

What if, I thought, we all had it backwards? What if, instead of starving my injury of activity, I fed it just enough to wake it up? Blood flow problems? Maybe some light running could help circulation. Instead of complete rest, what if I stayed active just enough to help my body heal, but not enough to make things worse?

I started small. Tiny runs—2-3 kilometers, a few times per week. I was aiming for the Goldilocks zone: just enough stimulus to nudge the healing process without triggering another cycle of pain and frustration. The approach felt like walking a tightrope while juggling, but I was done trusting the certainties of experts (human and AI alike) about my body.

Six months later, the injury has practically vanished. I'm back to 25 kilometers per week and can run over ten kilometers without my hamstring filing a complaint. The pain that had haunted me for a year and a half has disappeared like a politician's promises after election day.

Curious about what had actually worked, I fed the entire saga to Gemini for a post-mortem analysis. The answer was both illuminating and infuriating.

I'd never had "tendonitis" in the first place, Gemini said. Neither the AIs nor the human expert had bothered to consider the obvious alternative: proximal hamstring tendinopathy (PHT)—a condition involving degeneration and failed healing in the tendon, not inflammation.

The distinction matters. Inflammation responds to rest. Degeneration requires the opposite. My pain from sitting was just my body weight compressing the damaged tendon against my sit-bone, not some mysterious psychosomatic manifestation of workplace stress.

The earlier AIs' advice to "rest" had created a self-reinforcing cycle of weakness. Every break kept my tendon weak and fragile, turning each return to running into a guaranteed re-injury. I was caught in a "rest-weaken-re-injure" loop that could have continued indefinitely if I'd kept following that logic.

The human expert, meanwhile, was so busy looking for psychological nails that he'd missed the mechanical hammer entirely. One-tool experts are dangerous precisely because they're so confident in their singular domain. They don't see problems—they see confirmation of their worldview.

My "Goldilocks" rebellion wasn't just lucky guesswork. I'd accidentally stumbled onto the gold-standard, evidence-based treatment for tendinopathy: progressive loading. Those modest 2-3 kilometer runs weren't just maintaining fitness. They were applying mechanical load that signaled my tendon cells to rebuild stronger and more organized tissue. I'd ignored both artificial and human intelligence to accidentally discover the correct treatment through pure stubbornness and a refusal to accept expert consensus.

But the real lesson isn't about running injuries or hamstring tendons. It's about the dangerous mythology we've built around expertise in the age of AI.

We've created a false dichotomy: either trust the machine or trust the human. But both can be spectacularly wrong, often for similar reasons. AIs are pattern-matching machines trained on existing knowledge, which means they'll confidently regurgitate conventional wisdom even when it's outdated, incomplete, or simply mis-categorized. Human experts, meanwhile, are walking bundles of cognitive bias who see their specialty everywhere they look.


The AIs failed because they were trained on medical literature that treats tendonitis and tendinopathy as interchangeable, despite research showing they're distinct conditions requiring opposite treatments. They gave me the statistically most likely diagnosis, built on a mix-up of categories.

The human expert failed because he'd found his intellectual home in psychosomatic medicine and wasn't about to let a simple mechanical injury ruin his favorite narrative. When you've written a book about minds creating body problems, every body problem looks like evidence of a mental cry for help.

Both human and AI failed because they were too confident in their knowledge and too incurious about alternative explanations. The AIs couldn't think beyond their training data. The human couldn't think beyond his pet theory. Meanwhile, the solution required neither artificial intelligence nor human expertise—just careful attention to what was actually happening, a willingness to experiment, and enough intellectual stubbornness to question received wisdom.

This pattern repeats everywhere. In business, we see the same dynamic between AI-powered analytics and human "domain experts." The algorithms confidently extrapolate from historical patterns, while the humans confidently apply their favorite frameworks. Both miss the messy reality that doesn't fit their mental models.

The important skill in the future of work isn't choosing between human and artificial intelligence. It's knowing when to ignore both. Sometimes the smartest move is to tune out the experts, human and machine alike, pay attention to what's actually happening, and run your own experiments. Expert failure has the curse of knowledge written all over it: sometimes an amateur with fresh eyes accidentally discovers what the experts' training prevents them from noticing.

Don't get me wrong—I'm not advocating for wholesale rejection of expertise. When you're building a bridge or performing surgery, you want people who've spent decades learning their craft. But when you're dealing with complex, socio-technical problems that don't fit neat categories, expertise can be a trap.

My hamstring is now happily carrying me through 25-kilometer weeks, but the real victory was learning to trust my own observations and reasoning over pet theories. The future belongs to those who can navigate between overconfident AI and overconfident humans, using both as sources of insight while trusting neither as an oracle. Sometimes the wisest expertise is knowing when to ignore the wise and trust your own stubborn curiosity.

Jurgen
