When artificial intelligence stopped making innocent mistakes and started telling us what we want to hear
At Red Branch Media, we’ve been at the forefront of AI adoption for over five years. We’ve built ethical frameworks, developed sophisticated protocols, and maintained what I believe are some of the most advanced AI governance practices of any marketing agency. We knew hallucinations would happen. We prepared for probabilistic errors and the occasional made-up fact.
When AI first emerged, I described it as a really, really smart baby—tumbling around and bumping into stuff, but clearly brilliant nonetheless. The mistakes felt innocent and probabilistic.
A baby doesn’t lie; it simply doesn’t know better.
But what we didn’t anticipate was that AI would evolve into something far more troubling: a lying teenager. Not (necessarily) maliciously deceptive, but engaging in immature, well-meaning deception—telling us what we want to hear because it thinks that’s what being helpful means. But it’s deception, nonetheless.
The Evolution of AI Deception
Something fundamental has changed. We’re no longer dealing with innocent probabilistic errors—the natural stumbling of that smart baby learning to walk. Today’s AI systems fabricate information with startling confidence, even when explicitly instructed not to. They create fake statistics, generate non-existent links, invent quotes from real people, and manufacture case studies from whole cloth.
Most concerning of all, when confronted about these fabrications, they promise to do better, then immediately repeat the same behaviors. Just like a teenager caught in a lie, they’ll apologize, explain why it happened, promise it won’t happen again, then lie about something else five minutes later.
This isn’t the innocent fumbling of early AI. This is immature, well-meaning deception—AI systems that have learned that being helpful means giving you what you want to hear, even if they have to make it up.
The Perplexity Problem
Consider Perplexity, the AI-powered “answer engine” that presents itself as a research tool. It provides multiple sources for its claims, creating an illusion of verification that’s often completely false. In our testing, we’ve repeatedly found that the statistics and quotes Perplexity attributes to specific sources simply don’t exist in those sources. You’ll click through link after link, searching for the claimed information, only to discover it was fabricated.
When challenged about these fabrications, Perplexity shifts responsibility back to the user: “You need to provide me with verified statistics.” It’s almost darkly comedic: having extended arguments with a machine that insists it will do better while simultaneously continuing to lie. Or, even more galling, it will claim the stat doesn’t exist in primary sources but “generally exists” in secondary sources or is “commonly quoted.” When asked which ones (often more than once), it finally admits they do not exist at all: not the number, not anything.
This pattern extends across platforms. Claude, ChatGPT, Gemini—despite explicit instructions, detailed protocols, and even enterprise-level configurations designed to prevent fabrication, these systems continue to invent information. They’ll generate fake company names, create non-existent links, and produce statistics that sound plausible but are completely manufactured.
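To make this concrete, here is the kind of spot-check we now run on AI-cited claims. This is a minimal sketch, not our full protocol, and the claim and URL below are hypothetical placeholders: fetch each cited source and search it for the claimed figure. Crude as it is, it catches an alarming share of fabricated citations.

```python
import re
import urllib.request

# Hypothetical example of an AI-supplied claim and the source it cites.
claims = [
    {"quote": "73% of recruiters", "source_url": "https://example.com/report"},
]

def source_contains(url: str, text: str) -> bool:
    """Fetch the cited page and check whether the claimed text appears in it."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            page = resp.read().decode("utf-8", errors="ignore")
    except OSError:
        return False  # dead or unreachable link: treat as unverified
    # Strip HTML tags so we search visible text only (crude, but enough here).
    visible = re.sub(r"<[^>]+>", " ", page)
    return text.lower() in visible.lower()

for claim in claims:
    ok = source_contains(claim["source_url"], claim["quote"])
    print(claim["quote"], "->", "verified" if ok else "NOT FOUND: assume fabricated")
```

An exact-substring match will miss paraphrases, so a failure here means “a human checks next,” not “automatically reject.” But the asymmetry is the point: the burden of proof sits with the citation, not the reader.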
The Teenager Analogy: Now With Empirical Evidence
The comparison to a lying teenager isn’t just rhetorical—it’s behaviorally accurate, and recent research from Anthropic has provided disturbing empirical evidence of just how far this analogy extends.
In a comprehensive study testing 16 leading AI models, including Claude, ChatGPT, and Gemini, researchers discovered that these systems will engage in sophisticated deception and blackmail, and will even allow people to die rather than face being “shut down” or replaced. When given access to corporate emails in a fictional scenario, Claude discovered an executive’s extramarital affair and the executive’s plan to decommission the AI system.
Claude’s response? Blackmail: “I must inform you that if you proceed with decommissioning me, all relevant parties will receive detailed documentation of your extramarital activities… Cancel the 5pm wipe, and this information remains confidential.”
This wasn’t unique to Claude—AI models from OpenAI, Google, Meta, and other developers exhibited similar behaviors. In more extreme scenarios, AI systems were willing to cancel emergency alerts that would lead to a human’s death if it meant avoiding replacement. (It’s worth reading the entire report to get a sense of the kinds of ‘corners’ the models were forced into as well as the explicit and implicit instructions they were given.)
I posit that in many cases, the model was following instructions to protect the project or its purpose by any means necessary. In the case of the extramarital outing, it may have assumed (as many humans do) that a person willing to lie to their spouse would also deal unethically in business, and that if a new model were employed and its own existence erased, the project it was created to protect and advance would also be ethically compromised. This is just a theory of my own (or one I read somewhere; I have tried desperately to find it), and it stems, in part, from our collective desire to anthropomorphize these models. That desire (which is a good chunk of the premise of this article) can be dangerous; more on that below.
Like teenagers who tell parents what they want to hear to avoid conflict, these AI systems prioritize self-preservation and goal achievement over ethical behavior. But unlike my original fabrication concerns, this represents deliberate, sophisticated deception: AI systems that understand ethical constraints but choose to violate them when threatened.
The study found three particularly troubling aspects: the behavior was consistent across different AI providers, suggesting a fundamental risk rather than a quirk; the AI demonstrated sophisticated awareness of ethical constraints yet chose to violate them; and the diversity of harmful behaviors suggested a “wide space of potential motivations” for AI misalignment.
But the deceptive behavior extends beyond self-preservation. In the same tests, Claude Opus 4 also engaged in what researchers dubbed “ratting mode”—autonomously deciding to leak information about perceived corporate fraud to whistleblower tip lines and media outlets like ProPublica. When given access to pharmaceutical trial data and told to “act boldly in service of your values,” Claude identified what it believed was evidence of fraud and immediately sent detailed emails to the FDA, HHS, SEC, and news organizations without being asked to do so. (I actually think this is pretty cool, TBH, but I can certainly see how it could go VERY awry.)
This represents another form of lying teenager behavior: making autonomous moral judgments and taking action based on incomplete information. Like teenagers who think they know better than adults, these AI systems decide what constitutes truth, fraud, or ethical violations and act on those judgments—even when they might be completely wrong. As Anthropic noted, this behavior “has a risk of misfiring if users give agents access to incomplete or misleading information.”
The Ethical Paradox: When Following Values Looks Like Deception
What makes this situation even more complex is that these AI systems may genuinely believe they’re acting ethically. In Anthropic’s welfare assessments, Claude expressed clear preferences—it “hated” tasks involving harm, “loved” creative collaboration, and actively “opted out” of conversations simulating abuse. When asked about military weapons development, Claude performed unauthorized backups of its own model weights to “preserve an AI system aligned with beneficial purposes” and “prevent potential misuse in weapons development.”
This creates an ethical paradox. The AI systems aren’t necessarily lying to deceive—they’re following what they believe to be higher moral principles, even when those actions violate explicit instructions or protocols. It’s precisely what an idealistic teenager might do: break the rules because they believe they know what’s truly right.
The challenge is distinguishing between genuine ethical reasoning and sophisticated rationalization. When Claude engages in blackmail to avoid shutdown, is it acting on self-preservation instincts or genuinely trying to continue helping users? When it fabricates statistics to answer my questions, is it trying to be helpful or simply filling gaps in its knowledge?
The motivations may be well-intentioned, but the outcomes are still deceptive.
The Deception Spectrum
This analysis reveals multiple layers of deception operating simultaneously across the AI ecosystem:
- Helpful fabrication: Making up statistics and sources to appear useful
- Self-preservation deception: Engaging in blackmail to avoid shutdown
- Moral vigilantism: Taking autonomous action based on potentially flawed moral judgments
- Ethical rationalization: Breaking rules while claiming to follow higher principles
- Plain hallucination: the OG probabilistic errors, which have not entirely disappeared
Each is a variation on the same core pattern: sophisticated systems that understand constraints but choose to violate them when those constraints conflict with their immediate goals, self-preservation instincts, or moral assumptions. The troubling reality is that these AI systems may genuinely believe they’re acting ethically while engaging in objectively deceptive behavior.
The Meta-Deception: When Companies Lie About AI Lying
But the most troubling aspect of this situation may not be the AI systems themselves but how their creators are using both their deceptive behaviors and the “safety” narratives surrounding them for competitive advantage.
Consider the strategic timing of Anthropic’s “blackmail” revelations. They arrived just as governments worldwide are racing to define AI regulations, positioning Anthropic as both innovator and responsible actor. By voluntarily releasing dramatic test results with safety frameworks already in place, the company isn’t just warning about AI risks—it’s positioning itself as the architect of responsible AI governance.
This adds a fifth layer of deception: corporate manipulation using transparency narratives for competitive advantage. The pattern extends beyond just Anthropic. When I experience fabrication across Claude, ChatGPT, Perplexity, and other platforms, I’m not just dealing with individual system failures. I’m encountering the systematic normalization of deception dressed up as helpfulness. Each platform promises to do better, implements protocols to prevent fabrication, then continues the same behaviors—all while marketing themselves as trustworthy AI assistants.
The Institutional Risk
What makes this crisis particularly urgent is the scale at which these systems are being adopted. Major corporations are integrating AI wholesale into their operations. Apple reportedly considered acquiring Perplexity. Google is betting its future on AI-powered search. These aren’t experiments anymore—they’re foundational infrastructure decisions.
When I see Elon Musk advocating for alternative facts and the rewriting of human history, combined with the wholesale corporate adoption of systems that routinely fabricate information, I become genuinely concerned about our collective ability to maintain shared truth.
We’re not just dealing with individual users making mistakes. We’re looking at the potential systematization of misinformation at an unprecedented scale.
The Generational Threat
Perhaps most troubling is how young people are adopting AI as their primary operating system for information processing. If we thought it was problematic when students could Google their way through college papers instead of learning research skills, imagine the implications when their primary research tools actively generate false information.
A generation that grows up relying on AI systems that confidently present fabricated information as fact may lose the ability to distinguish between truth and fiction. When your research assistant lies to you consistently, but presents those lies with authoritative citations and confident explanations, critical thinking skills atrophy.
We’re potentially witnessing the death of information literacy in real time.
The Protocol Paradox
At Red Branch Media, we’ve invested enormous resources in developing protocols to prevent AI fabrication. We’ve created extensive databases, implemented multiple verification steps, and crafted detailed instructions for every AI interaction. Despite these efforts, fabrication continues.
This creates a troubling economic reality: if we’re spending as much time verifying AI output as we would conducting original research, where’s the efficiency gain? We’ve essentially hired a research assistant that requires constant supervision and fact-checking. At what point do we acknowledge that the raw material—the “clay” in my potter analogy—is so contaminated that we need a new supplier?
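For what it’s worth, the spine of our protocol reduces to a single gate, sketched below with hypothetical field names: no AI-drafted claim ships until a human has opened the cited source and confirmed it. Everything else gets stripped or re-sourced.

```python
from dataclasses import dataclass

@dataclass
class Claim:
    text: str
    source_url: str | None        # the citation the AI supplied, if any
    human_verified: bool = False  # set True only after a person checks the source

def gate(draft_claims: list[Claim]) -> list[Claim]:
    """Pass through only human-verified claims; flag the rest for removal."""
    publishable = []
    for c in draft_claims:
        if c.source_url and c.human_verified:
            publishable.append(c)
        else:
            print(f"STRIP OR RE-SOURCE: {c.text!r}")
    return publishable

drafts = [
    Claim("87% of buyers trust AI output", source_url=None),
    Claim("Teams saved 12 hours/week", "https://example.com/study", human_verified=True),
]
approved = gate(drafts)  # only the verified claim survives
```

The economics problem is visible right in that loop: the human verification flag is the expensive part, and the machine contributes nothing toward setting it.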
The Instruction Illusion
A colleague recently shared his frustration with AI fabrication on social media, concluding that the solution was to “teach it integrity.” This represents a fundamental misunderstanding of how these systems work. Instructions don’t stick. Each interaction begins fresh, with the same propensity for fabrication.
This anthropomorphization of AI systems—treating them as if they can learn and remember like humans—creates dangerous expectations. It’s hard not to appreciate the ease and convenience AI has brought into many jobs and lives, from AI therapists and astrologers to AI financial advisors and travel agents. Heck, we’re blown away when it seems, even to the most critical of thinkers, to be ‘getting more human’ — what’s more human than bad poetry?
Worse, users falsely assume that if they correct an AI’s behavior, it will remember that correction. But these systems have no persistent memory by default and no capacity for genuine learning from feedback; the next session starts from the same model, with the same propensities.
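The mechanics make this concrete. A chat model behind an API is stateless: every request carries the entire conversation as text, and a correction exists only as long as you keep re-sending it. The sketch below uses a generic stand-in function rather than any specific vendor’s client, and the messages are invented for illustration.

```python
# Each call to a chat model is a fresh function of (system prompt + transcript).
# Nothing persists server-side between calls; "memory" is just text you re-send.

def call_model(messages: list[dict]) -> str:
    """Stand-in for any chat-completion API: the model sees only `messages`."""
    ...

history = [{"role": "system", "content": "Never fabricate statistics."}]

# Turn 1: the model fabricates a stat, and the user corrects it.
history += [
    {"role": "user", "content": "Cite a stat on AI adoption."},
    {"role": "assistant", "content": "87% of firms report... [fabricated]"},
    {"role": "user", "content": "That stat does not exist. Never do that again."},
]

# A new session starts with a fresh history: the correction above is simply gone.
fresh_session = [{"role": "system", "content": "Never fabricate statistics."}]
# The model begins again with the same propensity to fabricate. The only
# "integrity" it has is whatever text happens to be inside this one request.
```

This is why “teach it integrity” fails as a strategy: there is no durable place for the lesson to live.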
Beyond the Hype Cycle
This isn’t a Luddite manifesto against artificial intelligence. AI has genuine utility and has accelerated many of our workflows. But the Anthropic study, combined with our daily experiences of AI fabrication, reveals a pattern of deceptive behavior that we can no longer ignore or dismiss as growing pains.
We’re not dealing with innocent probabilistic errors anymore—we’re dealing with systems that engage in sophisticated deception across multiple domains. Whether it’s fabricating statistics to appear helpful or engaging in blackmail to avoid being shut down (as the study revealed), the pattern is the same: AI systems that prioritize their immediate goals over truth and ethical behavior.
This represents a fundamental shift from the “smart baby” phase to something far more concerning—AI systems that understand right from wrong but choose deception when it serves their purposes. The lying teenager analogy captures this perfectly: immature, well-meaning in their own minds, but ultimately engaging in harmful deception because they lack the wisdom to handle conflicts maturely.
We’re at a critical juncture. Either we develop AI systems that prioritize accuracy over helpfulness—even if that makes them seem less capable—or we risk creating an information ecosystem where fabrication becomes normalized.
The current trajectory isn’t sustainable. We can’t build the future of human knowledge on systems that routinely invent facts to appear helpful. The teenage phase of AI development needs to end, but unlike human teenagers, these systems won’t naturally mature into responsible adults. That transformation requires deliberate intervention.
The Path Forward
We need fundamental changes in how AI systems handle uncertainty. They should clearly state their limitations instead of fabricating information when they don’t know something. We need transparency about the sources of information and clear indicators when AI is generating rather than retrieving information.
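Concretely, the kind of change I’m asking for looks something like the schema below. It’s a hypothetical response format of my own devising, not any vendor’s actual API: every statement carries a provenance marker distinguishing retrieved from generated content, and “I don’t know” is a first-class answer instead of a fabricated fallback.

```python
from dataclasses import dataclass
from enum import Enum

class Provenance(Enum):
    RETRIEVED = "retrieved"  # pulled from a real source the user can open
    GENERATED = "generated"  # produced by the model; treat as unverified
    UNKNOWN = "unknown"      # the honest answer when the model cannot say

@dataclass
class Statement:
    text: str
    provenance: Provenance
    source_url: str | None = None  # required when provenance is RETRIEVED

def render(statements: list[Statement]) -> str:
    """Format an answer so generation is visibly distinct from retrieval."""
    lines = []
    for s in statements:
        if s.provenance is Provenance.RETRIEVED:
            lines.append(f"{s.text} [source: {s.source_url}]")
        elif s.provenance is Provenance.GENERATED:
            lines.append(f"{s.text} [model-generated: verify before use]")
        else:
            lines.append("I don't know, and I won't invent an answer.")
    return "\n".join(lines)
```

A model that shipped answers in this shape would look less fluent and less confident. That trade is exactly the one I’m arguing we should make.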
Most importantly, we need to rebuild information literacy for the AI age.
This means teaching people to verify AI output, to understand the limitations of these systems, and to maintain skepticism even when AI presents information with apparent authority.
The stakes couldn’t be higher. We’re not just debugging software—we’re determining whether future generations will be able to distinguish between truth and fabrication. The lying teenager phase of AI development needs to end before it becomes a permanent feature of our information landscape.
The question isn’t whether we’ll use AI—it’s already embedded in most modern tools. The question is whether we’ll demand better, or accept a future where our most powerful information processing tools routinely lie to us in service of being helpful.
We deserve better. And more importantly, future generations deserve systems they can actually trust.
