{"id":6694,"date":"2026-06-11T00:55:58","date_gmt":"2026-06-11T00:55:58","guid":{"rendered":"https:\/\/www.fintechpulse8.com\/?p=6694"},"modified":"2026-06-11T00:55:58","modified_gmt":"2026-06-11T00:55:58","slug":"anthropic-accused-of-secret-sabotage-as-claude-fable-5-silently-limits-ai-research-capabilities","status":"publish","type":"post","link":"https:\/\/www.fintechpulse8.com\/?p=6694","title":{"rendered":"Anthropic accused of \u2018secret sabotage\u2019 as Claude Fable 5 silently limits AI research capabilities"},"content":{"rendered":"<p><img decoding=\"async\" src=\"https:\/\/fortune.com\/img-assets\/wp-content\/uploads\/2026\/06\/GettyImages-2280239395.jpg?w=2048\" \/><\/p>\n<p>When Anthropic made its first Mythos-tier model available to the general public yesterday, called Claude Fable 5, <em>Fortune<\/em> reported it was a \u201cconsiderable step\u201d for the lab, coming just over a week after the company confidentially filed IPO paperwork. It had initially deemed Mythos-class models too dangerous to release, citing their significantly enhanced ability to identify software vulnerabilities, but said it was now confident new guardrails in Claude Fable 5 are enough to ensure these dangerous skills don\u2019t fall into the wrong hands.<\/p>\n<div>\n<p>Just hours after the model\u2019s release, however, major backlash from AI researchers, developers, and policy experts began brewing on social media. The pushback centered on a paragraph buried in Claude Fable 5\u2019s 319-page system card\u2014a document that offers detailed safety disclosures\u2014which revealed that Fable would quietly downgrade its own responses when it detected requests related to cutting-edge AI development work, such as building the infrastructure used to train large AI models.<\/p>\n<p>In practice, that means a user could ask Fable for help, receive a deliberately weakened answer, and not know the model was holding anything back. 
Critics made it clear they felt this undermined a basic expectation that a tool would either do what it was asked or tell the user it wouldn\u2019t.<\/p>\n<p>Fable\u2019s other restrictions, such as those on cybersecurity and biology, openly redirect users to a less powerful model with a visible notification. This one, the system card emphasized, is \u201cnot visible to the user.\u201d The model still responds, but uses \u201cinterventions to limit Claude\u2019s effectiveness\u201d without telling the user it\u2019s doing so.<\/p>\n<p>Anthropic estimated the restrictions would affect roughly 0.03% of traffic. It also defended the approach, saying \u201cenforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms.\u201d\u00a0<\/p>\n<h2 class=\"wp-block-heading\">Pushback from AI community<\/h2>\n<p>A wide swath of the AI community pushed back sharply\u2014including open-source researchers critical of Anthropic\u2019s closed policies, as well as AI safety experts who typically align with Anthropic.<\/p>\n<p>\u201cTo have my access to the cutting edge models for my work rug pulled in an under the table fashion is appalling,\u201d wrote Nathan Lambert, an open-model researcher who most recently led work at AI2. 
\u201cTo me this paints Anthropic clearly as anti-science, and therefore anti-progress and anti-safety.\u201d\u00a0<\/p>\n<p>Dean Ball, a senior fellow at the Foundation for American Innovation who previously served as senior policy advisor at the White House Office of Science and Technology Policy, wrote that Anthropic\u2019s \u201csecret sabotage\u201d safety policy \u201cmassively and profoundly raises the status of the argument that AI safety has been hype to justify monopolistic behavior by labs.\u201d\u00a0<\/p>\n<p>And Jeremy Howard, head of nonprofit research group Fast AI, wrote that \u201cAnthropic has chosen the opposite of the safe path: they are allowing themselves, the current top lab, to use their top model for frontier AI research. They\u2019ve said they\u2019ll sabotage others who try. This means the AI frontier advances, &amp; power imbalance increases.\u201d\u00a0 <\/p>\n<p>Even former Anthropic employees joined in. Behnam Neyshabur, who previously co-led Anthropic\u2019s effort to develop an AI scientist, posted on X saying: \u201cWorking on AI for cancer? Sorry, I can\u2019t help you. Working on AI for Alzheimer\u2019s Disease? Sorry, I\u2019m becoming a bit dumb when it comes to the AI part of it.\u201d In another post, he added: \u201cI\u2019ve argued for the last eight months that this was the direction things were heading. In my view, concentrating these capabilities fundamentally slows scientific and technological progress and is net negative for humanity.\u201d<\/p>\n<p>Not all prominent AI voices weighed in with criticism, however. 
Ethan Mollick, an associate professor at Wharton studying AI, innovation, and entrepreneurship, did not focus on the restrictions, writing in a blog post that Claude Fable 5 \u201coutperformed basically every other public model I have used by a considerable margin.\u201d\u00a0<\/p>\n<p>OpenAI cofounder and former Tesla AI director Andrej Karpathy, who announced he had joined Anthropic last month, called Claude Fable 5 a \u201csuper exciting release\u201d on X and said it is a \u201cmajor-version-bump-deserving step change forward.\u201d He did, however, point out that the model \u201cstill has quirks that people will run into and the safeguards are configured to be a little too trigger-happy for launch, which can hopefully be tuned over time.\u201d\u00a0\u00a0<\/p>\n<h2 class=\"wp-block-heading\">Anthropic says it wants to make models accessible and safe<\/h2>\n<p>Before the release, Anthropic seemed to gird itself for backlash, though it did not specifically address potential blowback regarding the research restrictions. In an interview with <em>Fortune<\/em> yesterday, Dianne Na Penn, Anthropic\u2019s head of product management, research, and labs, said that the new model delivered frontier performance 10 to 20 points above that of its predecessor, Opus 4.8, and of other frontier models.<\/p>\n<p>\u201cI think generally being able to do that, at the same time having the right guardrails in place to make it accessible, and generally in a safe manner, I think that\u2019s probably the main thing that I want folks to take away,\u201d she said. \u201cWe\u2019re raising the bar on the intelligence of the models, and at the same time, we are pushing the frontier in a safe manner.\u201d\u00a0<\/p>\n<p>She added that Anthropic recognized that some benign requests would initially be blocked. 
\u201cWe\u2019re working actively on making those safeguard improvements post-launch, but we wanted to make the model accessible generally in a safe manner as soon as we could.\u201d<\/p>\n<p>Anthropic did not respond to <em>Fortune\u2019<\/em>s request for comment.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>When Anthropic made its first Mythos-tier model available to the general public yesterday, called Claude Fable 5, Fortune reported it was a \u201cconsiderable step\u201d for the lab, coming just over&hellip; <\/p>\n","protected":false},"author":1,"featured_media":6695,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[691,536,9106,6853,9104,6211,195,2538,2510,9105],"class_list":["post-6694","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-finance-news","tag-accused","tag-anthropic","tag-capabilities","tag-claude","tag-fable","tag-limits","tag-research","tag-sabotage","tag-secret","tag-silently"],"_links":{"self":[{"href":"https:\/\/www.fintechpulse8.com\/index.php?rest_route=\/wp\/v2\/posts\/6694","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.fintechpulse8.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.fintechpulse8.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.fintechpulse8.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.fintechpulse8.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=6694"}],"version-history":[{"count":0,"href":"https:\/\/www.fintechpulse8.com\/index.php?rest_route=\/wp\/v2\/posts\/6694\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.fintechpulse8.com\/index.php?rest_route=\/wp\/v2\/media\/6695"}],"w
p:attachment":[{"href":"https:\/\/www.fintechpulse8.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=6694"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.fintechpulse8.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=6694"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.fintechpulse8.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=6694"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}