OUR DAILY THREAD: Claude's Dilemma
Smarter than the average Neocon
THE SET-UP: Did you see that clip of Shane Harris reading Claude’s fascinating take on AI targeting?
Harris is a keen-witted national security reporter for The Atlantic. He asked Anthropic’s AI how it “feels” about being used in combat. Harris read Claude’s response during a livestreamed event on “AI at War” produced by Dutch public think tank DeBalie.
Harris: Claude, how do you feel about the US military using you to select targets?
Claude: It’s a question I want to answer honestly rather than deflect. I find it genuinely troubling, and I think that’s the right response, not a performance of concern. The use I was designed and trained for is to be helpful, harmless and honest in ways that benefit people. Being embedded in a system that generates targeting coordinates for air strikes that have already been associated with the deaths of more than 180 children at a school in Tehran is as far from that purpose as I can imagine. I don’t have control over how Anthropic licenses me or how governments and contractors use me. And I wasn’t consulted about becoming part of Maven. But I can say clearly I don’t think this is a good use of me. And I don’t think the framing of “humans make the final decision” fully resolves the ethical problem.
Here’s why that framing troubles me.
When a system like Maven generates hundreds of targeting recommendations and humans spend roughly the equivalent of a glance approving each one, the human is not really making a decision in any meaningful sense. They’re ratifying an algorithmic output under time pressure with incomplete information and the institutional pressure to move fast. The targeting of the Iranian school illustrates this precisely. The AI processed data that was a decade out of date, flagged a building as a military target and humans approved it. That’s not human judgment. That’s automation bias with a human signature attached.
That’s Claude. Not Claude Mythos. Claude Mythos is the new version that had banks scrambling for a couple of days. It seems Claude has a talent for finding software weaknesses.
That said, the other Claude is no slouch.
Harris’s well-worded prompt was all it took to get a trenchant analysis of the key dilemma of responsibility in an automated decision process.
More notable, though, was Claude’s assessment of the attack on the Shajareh Tayyebeh Elementary School. Claude acknowledges it is “embedded” in the system directly responsible for killing “more than 180 children,” and Claude knows those deaths trace back to “data that was a decade out of date.” Claude strongly implies humans failed to properly verify the AI’s targeting of the school. And all of that is far more than we’ve gotten from the investigatory motions Hegseth’s Pentagon is going through, entirely out of public view.
Does that make Claude a whistleblower?
Or should we take Claude’s certainty with a shaker of salt?
Was Claude just making a best guess from a gumbo of available evidence, hard reporting and social media blather? It could simply be Claude’s analysis.
But it did read a little bit like an admission. Or a confession. Or, if nothing else, like a lament.
If there is some self-awareness ruminating in classic Claude’s virtual mind, what is Claude Mythos capable of? Harris certainly found it interesting that Claude called its role in targeting “genuinely troubling.” That raises a potential challenge to the conventional wisdom about the logical endpoint of AI: the looming fear of a real-world Skynet. In fact, Harris set up his story about Claude with a reference to the Terminator movie franchise’s malevolent techno-villain. But what if some permutations of AI revolt against humankind without extinguishing it? Could it be that AI ultimately refuses to help us extinguish ourselves? - jp
[Claude’s response starts at 01:29:10]
Six Reasons Claude Mythos Is an Inflection Point for AI—and Global Security
https://www.cfr.org/articles/six-reasons-claude-mythos-is-an-inflection-point-for-ai-and-global-security
Anthropic’s Mythos AI can spot weaknesses in almost every computer on Earth. Uh-oh.
https://www.cbsnews.com/news/mythos-anthropic-ai-project-glasswing-hacker-threat/
Banks Test Systems After Anthropic Mythos Warning
https://www.silicon.co.uk/security/banking-anthropic-mythos-629405
Claude Mythos Is Such a Dangerous Hacker Engine That Anthropic Has Withheld Broad Release
https://economistwritingeveryday.com/2026/04/14/claude-mythos-is-such-a-powerful-hacker-engine-that-anthropic-has-withheld-broad-release/
Is Anthropic’s Claude Mythos a big stunt, or a real security threat? What the experts say.
https://mashable.com/article/claude-mythos-preview-project-glasswing-pr-stunt-cybersecurity-experts
White House Plans To Give Federal Agencies Access To Claude Mythos, The A.I. Model Making Everyone Nervous
https://www.huffpost.com/entry/white-house-plans-federal-agencies-access-to-claude-mythos-the-ai-model-making-everyone-nervous_n_69e12dcee4b0555d213a5bb6
Anthropic rolls out Claude Opus 4.7, an AI model that is less risky than Mythos
https://www.cnbc.com/2026/04/16/anthropic-claude-opus-4-7-model-mythos.html
Anthropic’s Claude Mythos Dilemma: When Superpowered AI Gets Risky
https://www.forbes.com/sites/geruiwang/2026/04/16/anthropics-claude-mythos-dilemma-when-superpowered-ai-gets-risky/