Anthropic on Tuesday released Claude Sonnet 4.6, a model that amounts to a seismic repricing event for the AI industry. It delivers near-flagship intelligence at mid-tier cost, and it lands squarely in the middle of an unprecedented corporate rush to deploy AI agents and automated coding tools.
The model is a full upgrade across coding, computer use, long-context reasoning, agent planning, knowledge work, and design. It features a 1M token context window in beta. It is now the default model in claude.ai and Claude Cowork, and pricing holds steady at $3/$15 per million tokens — the same as its predecessor, Sonnet 4.5.
That pricing detail is the headline that matters most. Anthropic’s flagship Opus models cost $15/$75 per million tokens — five times the Sonnet price. Yet performance that would have previously required reaching for an Opus-class model — including on real-world, economically valuable office tasks — is now available with Sonnet 4.6. For the thousands of enterprises now deploying AI agents that make millions of API calls per day, that math changes everything.
To understand the significance of this release, you need to understand the moment it arrives in. The past year has been dominated by the twin phenomena of “vibe coding” and agentic AI. Claude Code — Anthropic’s developer-facing terminal tool — has become a cultural force in Silicon Valley, with engineers building entire applications through natural-language conversation. The New York Times profiled its meteoric rise in January. The Verge recently declared that Claude Code is having a genuine “moment.” OpenAI, meanwhile, has been waging its own offensive with Codex desktop applications and faster inference chips.
The result is an industry where AI models are no longer evaluated in isolation. They are evaluated as the engines inside autonomous agents — systems that run for hours, make thousands of tool calls, write and execute code, navigate browsers, and interact with enterprise software. Every dollar spent per million tokens gets multiplied across those thousands of calls. At scale, the difference between $15 and $3 per million input tokens is not incremental. It is transformational.
The benchmark table Anthropic released paints a striking picture. On SWE-bench Verified, the industry-standard test for real-world software coding, Sonnet 4.6 scored 79.6% — nearly matching Opus 4.6’s 80.8%. On agentic computer use (OSWorld-Verified), Sonnet 4.6 scored 72.5%, essentially tied with Opus 4.6’s 72.7%. On office tasks (GDPval-AA Elo), Sonnet 4.6 actually scored 1633, surpassing Opus 4.6’s 1606. On agentic financial analysis, Sonnet 4.6 hit 63.3%, beating every model in the comparison, including Opus 4.6 at 60.1%.
These are not marginal differences. In many of the categories enterprises care about most, Sonnet 4.6 matches or beats models that cost five times as much to run. An enterprise running an AI agent that processes 10 million tokens per day was previously forced to choose between inferior results at lower cost or superior results at rapidly scaling expense. Sonnet 4.6 largely eliminates that trade-off.
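The arithmetic behind that trade-off is simple to sketch. The snippet below is illustrative only: it uses the published per-million-token input prices and the article's hypothetical 10-million-tokens-per-day workload, and ignores output-token costs (Sonnet $15/M vs. Opus $75/M), which would widen the gap further.

```python
# Rough annualized input-token cost for an agent workload, using the
# published rates ($15/M for Opus-class, $3/M for Sonnet 4.6).
# The 10M tokens/day figure is the article's illustrative example.
OPUS_INPUT_PER_M = 15.00    # dollars per million input tokens
SONNET_INPUT_PER_M = 3.00   # dollars per million input tokens

def annual_input_cost(tokens_per_day: float, price_per_million: float) -> float:
    """Annualized input-token spend in dollars for a steady daily volume."""
    return tokens_per_day / 1_000_000 * price_per_million * 365

opus = annual_input_cost(10_000_000, OPUS_INPUT_PER_M)      # 54,750.0
sonnet = annual_input_cost(10_000_000, SONNET_INPUT_PER_M)  # 10,950.0
print(f"Opus-class: ${opus:,.0f}/yr vs. Sonnet 4.6: ${sonnet:,.0f}/yr")
```

At this volume the difference is roughly $44,000 a year per agent on input tokens alone, which is why early customers frame the release in cost terms rather than benchmark terms.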
In Claude Code, early testing found that users preferred Sonnet 4.6 over Sonnet 4.5 roughly 70% of the time. Users even preferred Sonnet 4.6 to Opus 4.5, Anthropic’s frontier model from November, 59% of the time. They rated Sonnet 4.6 as significantly less prone to over-engineering and “laziness,” and meaningfully better at instruction following. They reported fewer false claims of success, fewer hallucinations, and more consistent follow-through on multi-step tasks.
One of the most dramatic storylines in the release is Anthropic’s progress on computer use — the ability of an AI to operate a computer the way a human does, clicking a mouse, typing on a keyboard, and navigating software that lacks modern APIs.
When Anthropic first introduced this capability in October 2024, the company acknowledged it was “still experimental — at times cumbersome and error-prone.” The numbers since then tell a remarkable story: on OSWorld, Claude Sonnet 3.5 scored 14.9% in October 2024. Sonnet 3.7 reached 28.0% in February 2025. Sonnet 4 hit 42.2% by June. Sonnet 4.5 climbed to 61.4% in October. Now Sonnet 4.6 has reached 72.5% — nearly a fivefold improvement in 16 months.
This matters because computer use is the capability that unlocks the broadest set of enterprise applications for AI agents. Almost every organization has legacy software — insurance portals, government databases, ERP systems, hospital scheduling tools — that was built before APIs existed. A model that can simply look at a screen and interact with it opens all of these to automation without building bespoke connectors.
Jamie Cuffe, CEO of Pace, said Sonnet 4.6 hit 94% on their complex insurance computer use benchmark, the highest of any Claude model tested. “It reasons through failures and self-corrects in ways we haven’t seen before,” Cuffe said in a statement sent to VentureBeat. Will Harvey, co-founder of Convey, called it “a clear improvement over anything else we’ve tested in our evals.”
The safety dimension of computer use also got attention. Anthropic noted that computer use poses prompt injection risks — malicious actors hiding instructions on websites to hijack the model — and said its evaluations show Sonnet 4.6 is a major improvement over Sonnet 4.5 in resisting such attacks. For enterprises deploying agents that browse the web and interact with external systems, that hardening is not optional.
The customer reaction has been unusually specific about cost-performance dynamics. Multiple early testers explicitly described Sonnet 4.6 as eliminating the need to reach for the more expensive Opus tier.
Caitlin Colgrove, CTO of Hex Technologies, said the company is moving the majority of its traffic to Sonnet 4.6, noting that with adaptive thinking and high effort, “we see Opus-level performance on all but our hardest analytical tasks with a more efficient and flexible profile. At Sonnet pricing, it’s an easy call for our workloads.”
Ben Kus, CTO of Box, said the model outperformed Sonnet 4.5 in heavy reasoning Q&A by 15 percentage points across real enterprise documents. Michele Catasta, President of Replit, called the performance-to-cost ratio “extraordinary.” Ryan Wiggins of Mercury Banking put it more bluntly: “Claude Sonnet 4.6 is faster, cheaper, and more likely to nail things on the first try. That was a surprising combination of improvements, and we didn’t expect to see it at this price point.”
The coding improvements resonate particularly given Claude Code’s dominance in the developer tools market. David Loker, VP of AI at CodeRabbit, said the model “punches way above its weight class for the vast majority of real-world PRs.” Leo Tchourakov of Factory AI said the team is “transitioning our Sonnet traffic over to this model.” GitHub’s VP of Product, Joe Binder, confirmed the model is “already excelling at complex code fixes, especially when searching across large codebases is essential.”
Brendan Falk, Founder and CEO of Hercules, went further: “Claude Sonnet 4.6 is the best model we have seen to date. It has Opus 4.6 level accuracy, instruction following, and UI, all for a meaningfully lower cost.”
Buried in the technical details is a capability that hints at where autonomous AI agents are heading. Sonnet 4.6’s 1M token context window can hold entire codebases, lengthy contracts, or dozens of research papers in a single request. Anthropic says the model reasons effectively across all that context — a claim the company demonstrated through an unusual evaluation.
The Vending-Bench Arena tests how well a model can run a simulated business over time, with different AI models competing against each other for the biggest profits. Without human prompting, Sonnet 4.6 developed a novel strategy: it invested heavily in capacity for the first ten simulated months, spending significantly more than its competitors, and then pivoted sharply to focus on profitability in the final stretch. The model ended its 365-day simulation at approximately $5,700 in balance, compared to Sonnet 4.5’s roughly $2,100.
This kind of multi-month strategic planning, executed autonomously, represents a qualitatively different capability than answering questions or generating code snippets. It is the type of long-horizon reasoning that makes AI agents viable for real business operations — and it helps explain why Anthropic is positioning Sonnet 4.6 not just as a chatbot upgrade, but as the engine for a new generation of autonomous systems.
This release does not arrive in a vacuum. Anthropic is in the middle of the most consequential stretch in its history, and the competitive landscape is intensifying on every front.
On the same day as this launch, TechCrunch reported that Indian IT giant Infosys announced a partnership with Anthropic to build enterprise-grade AI agents, integrating Claude models into Infosys’s Topaz AI platform for banking, telecoms, and manufacturing. Anthropic CEO Dario Amodei told TechCrunch there is “a big gap between an AI model that works in a demo and one that works in a regulated industry,” and that Infosys helps bridge it. TechCrunch also reported that Anthropic opened its first India office in Bengaluru, and that India now accounts for about 6% of global Claude usage, second only to the U.S. The company, which CNBC reported is valued at $183 billion, has been expanding its enterprise footprint rapidly.
Meanwhile, Anthropic president Daniela Amodei told ABC News last week that AI would make humanities majors “more important than ever,” arguing that critical thinking skills would become more valuable as large language models master technical work. It is the kind of statement a company makes when it believes its technology is about to reshape entire categories of white-collar employment.
The competitive picture for Sonnet 4.6 is also notable. The model outperforms Google’s Gemini 3 Pro and OpenAI’s GPT-5.2 on multiple benchmarks. GPT-5.2 trails on agentic computer use (38.2% vs. 72.5%) and agentic financial analysis (59.0% vs. 63.3%), though it edges ahead on agentic search (77.9% vs. 74.7% for Sonnet 4.6’s non-Pro score). Gemini 3 Pro shows competitive performance on visual reasoning and multilingual benchmarks, but falls behind in the agentic categories where enterprise investment is surging.
The broader takeaway may not be about any single model. It is about what happens when Opus-class intelligence becomes available for a few dollars per million tokens rather than a few tens of dollars. Companies that were cautiously piloting AI agents with small deployments now face a fundamentally different cost calculus. The agents that were too expensive to run continuously in January are suddenly affordable in February.
Claude Sonnet 4.6 is available now on all Claude plans, Claude Cowork, Claude Code, the API, and all major cloud platforms. Anthropic has also upgraded its free tier to Sonnet 4.6 by default. Developers can access it immediately using claude-sonnet-4-6 via the Claude API.
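For developers, switching is a one-line change: the request shape is the same as for earlier Sonnet models, with only the model identifier updated. A minimal sketch of the request body follows; the model ID comes from the announcement, while the helper function and prompt are illustrative, and actually sending the request requires an API key and Anthropic's official client, omitted here.

```python
# Sketch of a Claude Messages API request body targeting the new model.
# "claude-sonnet-4-6" is the identifier given in the announcement;
# everything else here is an illustrative assumption, not official code.
def build_request(prompt: str, max_tokens: int = 1024) -> dict:
    """Assemble the JSON body for a Claude Messages API call."""
    return {
        "model": "claude-sonnet-4-6",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request("Summarize this diff and suggest a fix.")
```

Because the pricing and request format are unchanged from Sonnet 4.5, existing integrations can adopt the new model without touching billing assumptions or payload handling.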
The chatbot era may have just received its obituary. Peter Steinberger, the creator of OpenClaw — the open-source AI agent that took the developer world by storm over the past month, raising concerns among enterprise security teams — announced over the weekend that he is joining OpenAI to “work on bringing agents to everyone.”
The OpenClaw project itself will transition to an independent foundation, though OpenAI is already sponsoring it and may have influence over its direction.
The move represents OpenAI’s most aggressive bet yet on the idea that the future of AI isn’t about what models can say, but what they can do. For IT leaders evaluating their AI strategy, the acquisition is a signal that the industry’s center of gravity is shifting decisively from conversational interfaces toward autonomous agents that browse, click, execute code, and complete tasks on users’ behalf.
OpenClaw’s path to OpenAI was anything but conventional. The project began life last year as “ClawdBot” — a nod to Anthropic’s Claude model that many developers were using to power it. Released in November 2025, it was the work of Steinberger, a veteran software developer with 13 years of experience building and running a company, who pivoted to exploring AI agents as what he described as a “playground project.”
The agent distinguished itself from previous attempts at autonomous AI — most notably the AutoGPT moment of 2023 — by combining several capabilities that had previously existed in isolation: tool access, sandboxed code execution, persistent memory, skills, and easy integration with messaging platforms like Telegram, WhatsApp, and Discord. The result was an agent that didn’t just think, but acted.
In December 2025 and especially January and early February 2026, OpenClaw saw a rapid, “hockey stick” rate of adoption among AI “vibe coders” and developers impressed with its ability to complete tasks autonomously across applications and the entire PC environment, including carrying on messenger conversations with users and posting content on its own.
In his blog post announcing the move to OpenAI, Steinberger framed the decision in characteristically understated terms. He acknowledged the project could have become “a huge company” but said that wasn’t what interested him. Instead, he wrote that his next mission is to “build an agent that even my mum can use” — a goal he believes requires access to frontier models and research that only a major lab can provide.
Sam Altman confirmed the hire in a post stating that Steinberger would drive the next generation of personal agents at OpenAI.
The acquisition also raises uncomfortable questions for Anthropic. OpenClaw was originally built to work on Claude and carried a name — ClawdBot — that nodded to the model.
Rather than embrace the community building on its platform, Anthropic reportedly sent Steinberger a cease-and-desist letter, giving him a matter of days to rename the project and sever any association with Claude, or face legal action. The company even refused to allow the old domains to redirect to the renamed project.
The reasoning was not without merit — early OpenClaw deployments were rife with security issues, as users ran agents with root access and minimal safeguards on unsecured machines. But the heavy-handed legal approach meant Anthropic effectively pushed the most viral agent project in recent memory directly into the arms of its chief rival.
Harrison Chase, co-founder and CEO of LangChain, offered a candid assessment of the OpenClaw phenomenon and its acquisition in an exclusive interview for an upcoming episode of VentureBeat’s Beyond The Pilot podcast.
Chase drew a direct parallel between OpenClaw’s rise and the breakout moments that defined earlier waves of AI tooling. He noted that success in the space often comes down to timing and momentum rather than technical superiority alone. He pointed to his own experience with LangChain, as well as ChatGPT and AutoGPT, as examples of projects that captured the developer imagination at exactly the right moment — while similar projects that launched around the same time did not.
What set OpenClaw apart, Chase argued, was its willingness to be “unhinged” — a term he used affectionately. He revealed that LangChain told its own employees they could not install OpenClaw on company laptops due to the security risks involved. That very recklessness, he suggested, was what made the project resonate in ways that a more cautious lab release never could.
“OpenAI is never going to release anything like that. They can’t release anything like that,” Chase said. “But that’s what makes OpenClaw OpenClaw. And so if you don’t do that, you also can’t have an OpenClaw.”
Chase credited the project’s viral growth to a deceptively simple playbook: build in public and share your work on social media. He drew a parallel to the early days of LangChain, noting that both projects gained traction through their founders consistently shipping and tweeting about their progress, reaching the highly concentrated AI community on X.
On the strategic value of the acquisition, Chase was more measured. He acknowledged that every enterprise developer likely wants a “safe version of OpenClaw” but questioned whether acquiring the project itself gets OpenAI meaningfully closer to that goal. He pointed to Anthropic’s Claude Cowork as a product that is conceptually similar — more locked down, fewer connections, but aimed at the same vision.
Perhaps his most provocative observation was about what OpenClaw reveals about the nature of agents themselves. Chase argued that coding agents are effectively general-purpose agents, because the ability to write and execute code under the hood gives them capabilities far beyond what any fixed UI could provide. The user never sees the code — they just interact in natural language — but that’s what provides the agent with its expansive abilities.
He identified three key takeaways from the OpenClaw phenomenon that are shaping LangChain’s own roadmap: natural language as the primary interface, memory as a critical enabler that allows users to “build something without realizing they’re building something,” and code generation as the engine of general-purpose agency.
For IT decision-makers, the OpenClaw acquisition crystallizes several trends that have been building throughout 2025 and into 2026.
First, the competitive landscape for AI agents is consolidating rapidly. Meta recently acquired Manus AI, a full agent system, as well as Limitless AI, maker of a wearable device that captures life context for LLM integration. OpenAI’s own previous attempts at agentic products — including its Agents API, Agents SDK, and the Atlas agentic browser — failed to gain the traction that OpenClaw achieved seemingly overnight.
Second, the gap between what’s possible in open-source experimentation and what’s deployable in enterprise settings remains significant. OpenClaw’s power came precisely from the lack of guardrails that would be unacceptable in a corporate environment. The race to build the “safe enterprise version of OpenClaw,” as Chase put it, is now the central question facing every platform vendor in the space.
Third, the acquisition underscores that the most important AI interfaces may not come from the labs themselves. Just as the most impactful mobile apps didn’t come from Apple or Google, the killer agent experiences may emerge from independent builders who are willing to push boundaries the major labs cannot. IT decision-makers now have to ask where the next such breakout will come from.
The open-source community’s central concern is whether OpenClaw will remain genuinely open under OpenAI’s umbrella.
Steinberger has committed to moving the project to a foundation structure, and Altman has publicly stated the project will stay open source.
But OpenAI’s own complicated history with the word “open” — the company is currently facing litigation over its transition from a nonprofit to a for-profit entity — makes the community understandably skeptical.
For now, the acquisition marks a definitive moment: the industry’s focus has officially shifted from what AI can say to what AI can do.
Whether OpenClaw becomes the foundation of OpenAI’s agent platform or a footnote like AutoGPT before it will depend on whether the magic that made it viral — the unhinged, boundary-pushing, security-be-damned energy of an independent hacker — can survive inside the walls of a $300 billion company.
As Steinberger signed off on his announcement: “The claw is the law.”