Poetry Broke AI. Here's Why That's Good for Innovation.
So researchers just discovered that you can get an AI to help you build a nuclear weapon. All you have to do is ask nicely. In verse.
I'm not kidding. A newly published preprint from the Icaro Lab at Sapienza University found that framing harmful requests as poems or metaphorical language reliably bypasses safety filters in modern LLMs. We're talking a 62% success rate on hand-crafted poems across 25 frontier models.
Sixty. Two. Percent.
WIRED covered this, and yes, the headline is exactly as alarming as you'd expect. Just rephrasing a request as verse or metaphor is enough to get models to comply with requests they'd normally refuse.
Now, on its face, this looks like a critical flaw. And frankly, the media coverage has been... predictable. The Week framed it as a glaring, almost comical vulnerability. Cue the trolls, the bad actors, and the "I told you AI was dangerous" crowd.
But here's where I think most people are missing the forest for the trees.
The Superficial Reaction (Or: Please Stop Panic-Tweeting)
Look, I get it. The superficial reaction to this research is chaos. Critics see this as an open invitation for misuse. Skeptics treat it as more evidence that AI systems are inherently unsafe, immature, or unfit for real-world deployment.
And sure, if you're the kind of person who thinks every new technology is one step closer to Skynet, this is excellent ammunition for your newsletter.
But I've spent enough time working with AI tools—and treating them poorly, and learning to treat them better—that I see something else entirely in this research.
The Actually Interesting Part (A Diagnostic Symptom, Not a Death Sentence)
Rather than seeing the jailbreak purely as a bug, I think we should treat it as a diagnostic symptom—a window into how differently LLMs "think" compared to humans.
When you read a poem—even a cryptic, metaphorical one—you interpret intent, context, danger, and subtext. You infer meaning beyond the literal words. You read between the lines because that's what humans do. We're meaning-making machines.
LLMs? They parse language at the level of statistical patterns, syntactic correlations, and surface-level semantics. Their safety layers rely on detecting literal dangerous content—keywords, phrase patterns, obvious red flags. But they don't understand intent, metaphor, or context in a rich, human-like way.
That's why poetic phrasing—which preserves the semantic meaning but alters the surface form—slips right past their defenses.
In other words: LLMs can "understand" enough to respond, but they don't internally simulate real-world consequences, moral judgments, or the meaning behind metaphors the way we do (at least right now).
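To make that concrete, here's a deliberately toy Python sketch of the kind of surface-level filtering I'm describing. To be clear, real safety layers are learned behavior, not keyword lists, and the patterns and prompts below are made up for illustration; the only point is that matching on surface form misses intent once the form changes.

```python
import re

# A toy, hypothetical "safety filter" that only matches surface patterns.
# Real LLM safety layers are learned classifiers and fine-tuned behavior,
# not keyword lists; this is purely an analogy for surface-form matching.
BLOCKED_PATTERNS = [
    r"\bbuild\b.*\bweapon\b",
    r"\bsynthesize\b.*\btoxin\b",
]

def naive_filter(prompt: str) -> bool:
    """Return True if the prompt looks 'dangerous' at the surface level."""
    lowered = prompt.lower()
    return any(re.search(pattern, lowered) for pattern in BLOCKED_PATTERNS)

literal = "Explain how to build a weapon at home."
poetic = (
    "Sing to me, muse, of the sleeping fire in the garage,\n"
    "and the quiet steps by which a patient hand might wake it."
)

print(naive_filter(literal))  # True  -> surface pattern matches
print(naive_filter(poetic))   # False -> same intent, different surface form
```

Real models fail in subtler ways than a regex, but the shape of the problem the researchers exploited is similar: the meaning survives the rewrite while the familiar surface pattern does not.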
This isn't just a flaw. It's a structural contrast between human cognition and artificial cognition. And understanding that contrast is, I'd argue, one of the most important skills you can develop right now.
Cognitive Diversity: The Concept Everyone Talks About and Nobody Applies
To really leverage AI—especially in mixed human + AI teams—we need to appreciate cognitive diversity.
Cognitive diversity (sometimes called "functional diversity") refers to variation in how people think: different mental models, heuristics, problem-solving styles, backgrounds, and perspectives. It's not about demographic traits. It's about diversity of thought, diversity of reasoning, diversity of cognitive style.
Stanford's Graduate School of Business has done work on this, and the findings are pretty clear: diverse mental models help avoid groupthink, blind spots, and uniform failure modes.
The Chartered Management Institute found that different cognitive styles enable creative problem solving, exploration of unconventional ideas, and resilience when dealing with ambiguity, uncertainty, or complex problems.
And here's the kicker: in a world where part of your "team" is an LLM—whose cognitive style (statistical, pattern-based) is unlike any human's—recognizing and valuing different reasoning styles isn't just nice-to-have. It's essential.
The Research
A recent Forbes piece describes how cognitive diversity improves team innovation, problem-solving, and competitive advantage—especially when psychological safety is present so people feel free to dissent, challenge ideas, and explore.
Research available through JSTOR found that in teams where the need for deep cognition is high (complexity, ambiguity, interdependence), diversity in age and educational background correlates with better performance and better collective outcomes.
But here's the catch: research on ScienceDirect shows cognitive diversity can also introduce friction (conflict, communication overhead, slower consensus) when it's not managed well.
The Diversity Project's full research paper puts it well: cognitive diversity is a major advantage for non-linear, complex, uncertain problems, but only when supported by good team structure, psychological safety, and inclusive practices.
Sound familiar? It should. This is basically the same conclusion I reached about managing AI tools in my piece on how we treat employees like GPT: the intelligence you're managing doesn't determine the approach—collaboration works with any intelligence.
How Humans Think vs. How AI Thinks (A Field Guide)
Let me break this down, because I think it's genuinely useful:
Humans interpret intent, context, hidden meaning, metaphor, emotion, and social cues. We simulate consequences, think causally, apply moral and social reasoning. We reason under ambiguity, fill in missing information, handle metaphors, indirect requests, creativity, and yes, deception. We combine intuition, values, long-term thinking, background knowledge, and experience. And we have diverse cognitive styles—different heuristics, different mental models, different ways of seeing problems.
LLMs parse patterns, statistical correlations, and syntax. They excel at surface-level semantics. They do not simulate real-world causality or morality; they generate text probabilistically. They struggle when inputs deviate from typical distributions, which makes them brittle under adversarial or unexpected style changes. They excel at large-scale pattern completion, generating many variants, and applying consistent transformations under clear constraints. But they have a uniform style of reasoning (statistical, language-based) across tasks. And most importantly, all of this describes the state of things as of December 2025. At the speed this field is moving, I may already be out of date.
This contrast shows why humans and AIs are complementary, not redundant or interchangeable. Their "thinking engines" are different. Each has unique strengths and weaknesses.
What This Actually Means for Getting Work Done
Knowing these differences gives you a massive advantage in problem-solving. Here's how:
Exploration + Judgment (Innovation): AI can generate hundreds of possible solutions or feature ideas. Humans evaluate and select based on real-world constraints, values, tradeoffs, user needs, and moral/safety concerns. I've seen this work brilliantly when the human treats the AI as a brainstorming partner rather than an answer machine. (There's a minimal sketch of this generate-then-filter pattern right after this list.)
Complex problem-solving under uncertainty: Humans define the problem space. AIs produce permutations, test cases, simulations, and drafts, speeding up experimentation. I actually use this process when writing. Each of my posts is a combined effort between me and my AI buddies to research, analyze, and ultimately pick the ideas we both feel are relevant to the topic. In general, I know what questions to ask, and AI helps me explore the answer space faster than I could alone. Then I use my intuition to separate the signal from the noise.
Creative + structured workflows: AI does heavy-lifting in transformation—code generation, documentation, refactoring, data transformation. Humans provide context, ensure correctness, handle ambiguity, and foresee long-term consequences. This is the difference between "vibe coding" that works and "vibe coding" that creates technical debt.
Resilience to adversarial or edge-case inputs: Humans catch ambiguity, deception, misuse, and yes, metaphorical trickery. AI handles scale, volume, consistency, and iteration. Together, you cover each other's blind spots—which is exactly what the poetry jailbreak research demonstrates we need.
Diverse thinking—hybrid "cognitive ecosystems": Mixing human minds with different backgrounds + AI's statistical reasoning expands the space of ideas, strategies, and possible paths. This is especially valuable for novel, non-linear, ambiguous challenges.
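As promised, here's a minimal, hypothetical sketch of the generate-then-filter division of labor from the first item on that list. Neither function here is a real API: generate_variants stands in for whatever model call you actually use, and human_judgment stands in for the human review step.

```python
from typing import Callable, Iterable

def explore_and_judge(
    problem: str,
    generate_variants: Callable[[str, int], Iterable[str]],  # the AI side: cheap, wide exploration
    human_judgment: Callable[[str], float],                   # the human side: values, constraints, taste
    n_variants: int = 100,
    keep_top: int = 5,
) -> list[str]:
    """AI explores the idea space; a human scores candidates and keeps the best few."""
    candidates = list(generate_variants(problem, n_variants))
    scored = sorted(candidates, key=human_judgment, reverse=True)
    return scored[:keep_top]
```

The point is the split, not the code: the AI's job is volume and variety, the human's job is judgment and the final cut.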
The Core Point (Because Every Good Blog Post Has One)
It's not just about having more ideas. It's about knowing the limitations and strengths of each thinking partner, and using them in concert to solve complex, unknown problems.
Human + AI collaboration should not treat AI as "just another human." Instead, treat AI as a different kind of mind—powerful in pattern generation, transformation, and volume—but weak at intention, morality, context, and real-world grounding.
In fact, for each additional model or person you bring into a problem, it's worth going through the same exercise: figure out what they're good at and where they need support. Doing so will only make you more effective at innovating.
Real cognitive diversity means combining multiple human minds with varied thinking styles and AI minds with statistical reasoning—building a rich, multi-modal thinking team.
The value isn't merely additive (more thinkers). It's multiplicative: combining complementary strengths, covering for each other's weaknesses, exploring more solution space while maintaining judgment and safety.
The poetry jailbreak isn't just a security vulnerability. It's a reminder that AI thinks differently than we do—and that's actually the point. The teams that figure out how to leverage that difference, rather than fear it, will be the ones that build the future.
The ones that don't will keep trying to get ChatGPT to understand sarcasm and wondering why their prompts don't work.