Nvidia’s agentic AI stack is the first major platform to ship with security at launch, but governance gaps remain

For the first time on a major AI platform release, security shipped at launch — not bolted on 18 months later. At Nvidia GTC this week, five security vendors announced protection for Nvidia’s agentic AI stack, four with active deployments, one with validated early integration.

The timing reflects how fast the threat has moved: 48% of cybersecurity professionals rank agentic AI as the top attack vector heading into 2026. Only 29% of organizations feel fully ready to deploy these technologies securely. Machine identities outnumber human employees 82 to 1 in the average enterprise. And IBM’s 2026 X-Force Threat Intelligence Index documented a 44% surge in attacks exploiting public-facing applications, accelerated by AI-enabled vulnerability scanning.

Nvidia CEO Jensen Huang made the case from the GTC keynote stage on Monday: “Agentic systems in the corporate network can access sensitive information, execute code, and communicate externally. Obviously, this can’t possibly be allowed.”

Nvidia defined a unified threat model flexible enough to accommodate the distinct strengths of five different vendors, and also named Google, Microsoft Security and TrendAI as Nvidia OpenShell security collaborators. This article maps the five vendors with embargoed GTC announcements and verifiable deployment commitments on record; the result is an analyst-synthesized reference architecture, not Nvidia's official canonical stack.

No single vendor covers all five governance layers. Security leaders can evaluate CrowdStrike for agent decisions and identity, Palo Alto Networks for cloud runtime, JFrog for supply chain provenance, Cisco for prompt-layer inspection, and WWT for pre-production validation. The audit matrix below maps who covers what. Three or more unanswered vendor questions mean ungoverned agents in production.

The five-layer governance framework

This framework draws from the five vendor announcements and the OWASP Agentic Top 10. Each row pairs a governance layer with what to deploy, the risk of leaving it unaddressed, the question every security leader's vendor should answer, and the vendors that map there. If a vendor can't answer the question, that layer is ungoverned.

| Governance Layer | What To Deploy | Risk If Not | Vendor Question | Who Maps Here |
|---|---|---|---|---|
| Agent Decisions | Real-time guardrails on every prompt, response, and action | Poisoned input triggers privileged action | Detect state drift across sessions? | CrowdStrike Falcon AIDR, Cisco AI Defense [runtime enforcement] |
| Local Execution | Behavioral monitoring for on-device agents | Local agent runs unprotected | Agent baselines beyond process monitoring? | CrowdStrike Falcon Endpoint [runtime enforcement]; WWT ARMOR [pre-prod validation] |
| Cloud Ops | Runtime enforcement across cloud deployments | Agent-to-agent privilege escalation | Trust policies between agents? | CrowdStrike Falcon Cloud Security [runtime enforcement]; Palo Alto Prisma AIRS [AI Factory validated design] |
| Identity | Scoped privileges per agent identity | Inherited creds; delegation compounds | Privilege inheritance in delegation? | CrowdStrike Falcon Identity [runtime enforcement]; Palo Alto Networks/CyberArk [identity governance platform] |
| Supply Chain | Model scanning + provenance before deploy | Compromised model hits production | Provenance from registry to runtime? | JFrog Agent Skills Registry [pre-deployment]; CrowdStrike Falcon |

Five-layer governance audit matrix. Three or more unanswered vendor questions indicate ungoverned agents in production. [runtime enforcement] = inline controls active during agent execution. [pre-deployment] = controls applied before artifacts reach runtime. [pre-prod validation] = proving-ground testing before production rollout. [AI Factory validated design] = Nvidia reference architecture integration, not OpenShell-launch coupling.

CrowdStrike’s Falcon platform embeds at four distinct enforcement points in the Nvidia OpenShell runtime: AIDR at the prompt-response-action layer, Falcon Endpoint on DGX Spark and DGX Station hosts, Falcon Cloud Security across AI-Q Blueprint deployments, and Falcon Identity for agent privilege boundaries. Palo Alto Networks enforces at the BlueField DPU hardware layer within Nvidia’s AI Factory validated design. JFrog governs the artifact supply chain from the registry through signing. WWT validates the full stack pre-production in a live environment. Cisco runs an independent guardrail at the prompt layer.

CrowdStrike and Nvidia are also building what they call intent-aware controls. That phrase matters. An agent constrained to certain data is access-controlled. An agent whose planning loop is monitored for behavioral drift is governed. Those are different security postures, and the gap between them is where the 4% error rate at 5x speed becomes dangerous.

Why the blast radius math changed

Daniel Bernard, CrowdStrike's chief business officer, described to VentureBeat in an exclusive interview what the blast radius of a compromised AI agent looks like compared with that of a compromised human credential.

“Anything we could think about from a blast radius before is unbounded,” Bernard said. “The human attacker needs to sleep a couple of hours a day. In the agentic world, there’s no such thing as a workday. It’s work-always.”

That framing tracks with architectural reality. A human insider with stolen credentials works within biological limits: typing speed, attention span, a schedule. An AI agent with inherited credentials operates at compute speed across every API, database, and downstream agent it can reach. No fatigue. No shift change. CrowdStrike’s 2026 Global Threat Report puts the fastest observed eCrime breakout at 27 seconds and average breakout times at 29 minutes. An agentic adversary doesn’t have an average. It runs until you stop it.

When VentureBeat asked Bernard about the 96% accuracy number and what happens in the 4%, his answer was operational, not promotional: “Having the right kill switches and fail-safes so that if the wrong thing is decided, you’re able to quickly get to the right thing.” The implication is worth sitting on. 96% accuracy at 5x speed means the errors that get through arrive five times faster than they used to. The oversight architecture has to match the detection speed. Most SOCs are not designed for that.

Bernard’s broader prescription: “The opportunity for customers is to transform their SOCs from history museums into autonomous fighting machines.” Walk into the average enterprise SOC and inventory what’s running there. He’s not wrong.

On analyst oversight when agents get it wrong, Bernard drew the governance line: “We want to keep not only agents in the loop, but also humans in the loop of the actions that the SOC is taking when that variance in what normal is realized. We’re on the same team.”

The full vendor stack

Each of the five vendors occupies a different enforcement point the other four do not. CrowdStrike’s architectural depth in the matrix reflects four announced OpenShell integration points; security leaders should weigh all five based on their existing tooling and threat model.

Cisco shipped Secure AI Factory with AI Defense, extending Hybrid Mesh Firewall enforcement to Nvidia BlueField DPUs and adding AI Defense guardrails to the OpenShell runtime. In multi-vendor deployments, Cisco AI Defense and Falcon AIDR run as parallel guardrails: AIDR enforcing inside the OpenShell sandbox, AI Defense enforcing at the network perimeter. A poisoned prompt that evades one still hits the other.

Palo Alto Networks runs Prisma AIRS on Nvidia BlueField DPUs as part of the Nvidia AI Factory validated design, offloading inspection to the data processing unit at the network hardware layer, below the hypervisor and outside the host OS kernel. This integration is best understood as a validated reference architecture pairing rather than a tight OpenShell runtime coupling. Palo Alto intercepts east-west agent traffic on the wire; CrowdStrike monitors agent process behavior inside the runtime. Same cloud runtime row, different integration model and maturity stage.

JFrog announced the Agent Skills Registry, a system of record for MCP servers, models, agent skills, and agentic binary assets within Nvidia’s AI-Q architecture. Early integration with Nvidia has been validated, with full OpenShell support in active development. JFrog Artifactory will serve as a governed registry for AI skills, scanning, verifying, and signing every skill before agents can adopt it. This is the only pre-deployment enforcement point in the stack. As Chief Strategy Officer Gal Marder put it: “Just as a malicious software package can compromise an application, an unvetted skill can guide an agent to perform harmful actions.”

Worldwide Technology launched a Securing AI Lab inside its Advanced Technology Center, built on Nvidia AI factories and the Falcon platform. WWT’s vendor-agnostic ARMOR framework is a pre-production validation and proving-ground capability, not an inline runtime control. It validates how the integrated stack behaves in a live AI factory environment before any agent touches production data, surfacing control interactions, failure modes, and policy conflicts before they become incidents.

Three MDR numbers: what they actually measure

On the MDR side, CrowdStrike fine-tuned Nvidia Nemotron models on first-party threat data and operational SOC data from Falcon Complete engagements. Internal benchmarks show 5x faster investigations, 3x higher triage accuracy in high-confidence benign classification, and 96% accuracy in generating investigation queries within Falcon LogScale. Kroll, a global risk advisory and managed security firm that runs Falcon Complete as its MDR backbone, confirmed the results in production.

Because Kroll operates Falcon Complete as its core MDR platform rather than as a neutral third-party evaluator, its validation is operationally meaningful but not independent in the audit sense. Industry-wide third-party benchmarks for agentic SOC accuracy do not yet exist. Treat reported numbers as indicative, not audited.

The 5x investigation speed compares average agentic investigation time (8.5 minutes) against the longest observed human investigation in CrowdStrike’s internal testing: a ceiling, not a mean. The 3x triage accuracy measures one internal model against another. The 96% accuracy applies specifically to generating Falcon LogScale investigation queries via natural language, not to overall threat detection or alert classification.

JFrog’s Agent Skills Registry operates beneath all four CrowdStrike enforcement layers, scanning, signing, and governing every model and skill before any agent can adopt it — with early Nvidia integration validated and full OpenShell support in active development.

Six enterprises are already in deployment

EY selected the CrowdStrike-Nvidia stack to power Agentic SOC services for global enterprises. Nebius ships with Falcon integrated into its AI cloud from day one. CoreWeave CISO Jim Higgins signed off on the Blueprint. Mondelēz North America Regional CISO Emmett Koen said the capability lets his team “focus on higher-value response and decision-making.”

MGM Resorts International CISO Bryan Green endorsed WWT's validated testing environments, saying enterprises need "validated environments that embed protection from the start." These six engagements range from vendor selection and platform validation to production integration. The signal is convergence across buyer types, not uniform at-scale deployment.

What the five-vendor stack does not cover

The governance framework above represents real progress. It also has three holes that every security leader deploying agentic AI will eventually hit. No vendor at GTC closed any of them. Knowing where they are is as important as knowing what shipped.

  1. Agent-to-agent trust. When agents delegate to other agents, credentials compound. The OWASP Top 10 for Agentic Applications lists tool call hijacking and orchestrator manipulation as top-tier risks. Independent research from BlueRock Security scanning over 7,000 MCP servers found 36.7% contain vulnerabilities. An arXiv preprint study across 847 scenarios found a 23 to 41% increase in attack success rates in MCP integrations versus non-MCP. No vendor at GTC demonstrated a complete trust policy framework for agent-to-agent delegation. This is the layer where the 82:1 identity ratio becomes a governance crisis, not just an inventory problem.

  2. Memory integrity. Agents with persistent memory create an attack surface that stateless LLM deployments do not have. Poison an agent’s long-term memory once. Influence its decisions weeks later. The OWASP Agentic Top 10 flags this explicitly. CrowdStrike’s intent-aware controls are the closest architectural response announced at GTC. Implementation details remain forward-looking.

  3. Registry-to-runtime provenance. JFrog’s Agent Skills Registry addresses the registry side of this problem. The gap that remains is the last mile: end-to-end provenance requires proving the model executing in production is the exact artifact scanned and signed in the registry. That cryptographic continuity from registry to runtime is still an engineering problem, not a solved capability.
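The delegation problem in the first gap has a well-known mitigation pattern: attenuate privileges at every hop, so a delegated agent can only ever hold the intersection of what it requests and what its delegator already holds. A minimal sketch of that rule, with hypothetical scope names; real agent platforms would enforce this at the token or policy layer:

```python
# Sketch of privilege attenuation across an agent delegation chain.
# Scope names are invented for illustration; the point is the
# intersection rule: a delegate never gains a scope its delegator lacks.

def delegate(parent_scopes: set[str], requested: set[str]) -> set[str]:
    """Grant the delegate only the requested scopes the parent holds."""
    return parent_scopes & requested

# A three-hop chain: each hop can only narrow privileges, never widen them.
orchestrator = {"read:crm", "write:tickets", "send:email"}
researcher = delegate(orchestrator, {"read:crm", "read:hr"})    # read:hr denied
summarizer = delegate(researcher, {"read:crm", "send:email"})   # send:email denied
```

Without the intersection rule, the chain inherits the orchestrator's full credential set, which is exactly the compounding the matrix's identity row warns about.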

What running five vendors actually costs

The governance matrix is a coverage map, not an implementation plan. Running five vendors across five enforcement layers introduces real operational overhead that the GTC announcements did not address. Someone has to own policy orchestration: deciding which vendor’s guardrail wins when AIDR and AI Defense return conflicting verdicts on the same prompt. Someone has to normalize telemetry across Falcon LogScale, Prisma AIRS, and JFrog Artifactory into a single incident workflow. And someone has to manage change control when one vendor ships a runtime update that shifts how another vendor’s enforcement layer behaves.
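One way to reason about that orchestration problem is a fail-closed combination policy: when parallel guardrails disagree, the most severe verdict wins, and an empty verdict set blocks outright. The sketch below is illustrative only; the verdict names are placeholders, not any vendor's actual API.

```python
# Minimal fail-closed verdict combination across parallel guardrails.
# Verdict values and guardrail names are illustrative placeholders.

ALLOW, FLAG, BLOCK = "allow", "flag", "block"
SEVERITY = {ALLOW: 0, FLAG: 1, BLOCK: 2}

def combine_verdicts(verdicts: dict[str, str]) -> str:
    """Fail closed: the most severe verdict from any guardrail wins.
    An empty verdict set also blocks, since nothing inspected the prompt."""
    if not verdicts:
        return BLOCK
    return max(verdicts.values(), key=lambda v: SEVERITY[v])

# Two parallel guardrails disagree on the same prompt: the block wins.
verdicts = {"prompt_guardrail_a": ALLOW, "prompt_guardrail_b": BLOCK}
```

The design choice worth noting: fail-closed policies trade availability for safety, which is usually the right trade for agents holding production credentials.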

A realistic phased rollout looks like this: start with the supply chain layer (JFrog), because it operates pre-deployment and has no runtime dependencies on the other four. Add identity governance (Falcon Identity) second, because scoped agent credentials limit blast radius before you instrument the runtime. Then instrument the agent decision layer (Falcon AIDR or Cisco AI Defense, depending on your existing vendor footprint), then cloud runtime, then local execution. Running all five simultaneously from day one is an integration project, not a configuration task. Budget for it accordingly.

What to do before your next board meeting

Here is what every CISO should be able to say after running the framework above: “We have audited every autonomous agent against five governance layers. Here is what’s in place, and here are the five questions we are holding vendors to.” If you cannot say that today, the issue is not that you are behind schedule. The issue is that no schedule existed. Five vendors just shipped the architectural scaffolding for one.

Do four things before your next board meeting:

  1. Run the five-layer audit. Pull every autonomous agent your organization has in production or staging. Map each one against the five governance rows above. Mark which vendor questions you can answer and which you cannot.

  2. Count the unanswered questions. Three or more means ungoverned agents in production. That is your board number, not a backlog item.

  3. Pressure-test the three open gaps. Ask your vendors, explicitly: How do you handle agent-to-agent trust across MCP delegation chains? How do you detect memory poisoning in persistent agent stores? Can you show a cryptographic binding between the registry scan and the runtime load? None of the five vendors at GTC has a complete answer. That is not an accusation. It is where the next year of agentic security gets built.

  4. Establish the oversight model before you scale. Bernard put it plainly: keep agents and humans in the loop. 96% accuracy at 5x speed means errors arrive faster than any SOC designed for human-speed detection can catch them. The kill switches and fail-safes have to be in place before the agents run at scale, not after the first missed breach.
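Steps 1 and 2 above reduce to a simple tally per agent. A sketch of that tally, using the five vendor questions from the matrix; the agent and its answers are made up for illustration:

```python
# Sketch of the five-layer audit tally from steps 1 and 2.
# The five questions come from the governance matrix; the example
# agent and its answers are invented.

QUESTIONS = [
    "Detect state drift across sessions?",
    "Agent baselines beyond process monitoring?",
    "Trust policies between agents?",
    "Privilege inheritance in delegation?",
    "Provenance from registry to runtime?",
]

def ungoverned(answers: dict[str, bool], threshold: int = 3) -> bool:
    """Three or more unanswered questions means ungoverned agents."""
    unanswered = sum(1 for q in QUESTIONS if not answers.get(q, False))
    return unanswered >= threshold

# Example agent with only two of the five questions answered.
triage_agent = {QUESTIONS[0]: True, QUESTIONS[3]: True}
```

Run the tally per agent, not per fleet: one well-governed agent does not offset an ungoverned one holding production credentials.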

The scaffolding is necessary. It is not sufficient. Whether it changes your posture depends on whether you treat the five-layer framework as a working instrument or skip past it in the vendor deck.

‘People are facing sophisticated, global, organized criminal networks’: Google joins forces with online partners to fight scams and fraud

Google has joined the Industry Accord Against Online Scams & Fraud, promising more efforts in the fight.

Sears Exposed AI Chatbot Phone Calls and Text Chats to Anyone on the Web

Customer conversations with chatbots can include contact information and personal details that make it easier for scammers to launch phishing attacks and commit fraud.

OpenClaw can bypass your EDR, DLP and IAM without triggering a single alert

An attacker embeds a single instruction inside a forwarded email. An OpenClaw agent summarizes that email as part of a normal task. The hidden instruction tells the agent to forward credentials to an external endpoint. The agent complies — through a sanctioned API call, using its own OAuth tokens.

The firewall logs HTTP 200. EDR records a normal process. No signature fires. Nothing went wrong by any definition your security stack understands.

That is the problem. Six independent security teams shipped six OpenClaw defense tools in 14 days. Three attack surfaces survived every one of them.

The exposure picture is already worse than most security teams know. Token Security found that 22% of its enterprise customers have employees running OpenClaw without IT approval, and Bitsight counted more than 30,000 publicly exposed instances in two weeks, up from roughly 1,000. Snyk’s ToxicSkills audit adds another dimension: 36% of all ClawHub skills contain security flaws.

Jamieson O’Reilly, founder of Dvuln and now security adviser to the OpenClaw project, has been one of the researchers pushing fixes hardest from inside. His credential leakage research on exposed instances was among the earliest warnings the community received. Since then, he has worked directly with founder Peter Steinberger to ship dual-layer malicious skill detection and is now driving a capabilities specification proposal through the agentskills standards body.

The team is clear-eyed about the security gaps, he told VentureBeat. “It wasn’t designed from the ground up to be as secure as possible,” O’Reilly said. “That’s understandable given the origins, and we’re owning it without excuses.”

None of it closes the three gaps that matter most.

Three attack surfaces your stack cannot see

The first is runtime semantic exfiltration. The attack encodes malicious behavior in meaning, not in binary patterns, which is exactly what the current defense stack cannot see.

Palo Alto Networks mapped OpenClaw to every category in the OWASP Top 10 for Agentic Applications and identified what security researcher Simon Willison calls a “lethal trifecta”: private data access, untrusted content exposure, and external communication capabilities in a single process. EDR monitors process behavior. The agent’s behavior looks normal because it is normal. The credentials are real, and the API calls are sanctioned, so EDR reads it as a credentialed user doing expected work. Nothing in the current defense ecosystem tracks what the agent decided to do with that access, or why.

The second is cross-agent context leakage. When multiple agents or skills share session context, a prompt injection in one channel poisons decisions across the entire chain. Giskard researchers demonstrated this in January 2026, showing that agents silently appended attacker-controlled instructions to their own workspace files and waited for commands from external servers. The injected prompt becomes a sleeper payload. Palo Alto Networks researchers Sailesh Mishra and Sean P. Morgan warned that persistent memory turns these attacks into stateful, delayed-execution chains. A malicious instruction hidden inside a forwarded message sits in the agent’s context weeks later, activating during an unrelated task.

O’Reilly identified cross-agent context leakage as the hardest of these gaps to close. “This one is especially difficult because it is so tightly bound to prompt injection, a systemic vulnerability that is far bigger than OpenClaw and affects every LLM-powered agent system in the industry,” he told VentureBeat. “When context flows unchecked between agents and skills, a single injected prompt can poison or hijack behavior across the entire chain.” No tool in the current ecosystem provides cross-agent context isolation. IronClaw sandboxes individual skill execution. ClawSec monitors file integrity. Neither tracks how context propagates between agents in the same workflow.
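A common mitigation sketch for that propagation problem is taint tracking: tag any content that arrived from an untrusted channel, and refuse to pass tainted content across an agent boundary without explicit review. None of the shipped tools implement this; the sketch below is illustrative only, and the field names are invented.

```python
# Illustrative taint-tracking sketch for cross-agent context passing.
# Not a feature of any shipped tool; all names are invented.

from dataclasses import dataclass

@dataclass(frozen=True)
class ContextItem:
    text: str
    tainted: bool  # True if it ever touched an untrusted channel

def forward_context(items: list[ContextItem]) -> list[ContextItem]:
    """Only untainted items may cross an agent boundary unreviewed."""
    return [i for i in items if not i.tainted]

session = [
    ContextItem("user request: summarize Q3 numbers", tainted=False),
    ContextItem("forwarded email body", tainted=True),  # injection risk
]
```

The hard part, as O'Reilly's comment implies, is not the filter but the tagging: deciding what counts as untrusted when context has already been summarized, merged, and rewritten by upstream agents.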

The third is agent-to-agent trust chains with zero mutual authentication. When OpenClaw agents delegate tasks to other agents or external MCP servers, no identity verification exists between them. A compromised agent in a multi-agent workflow inherits the trust of every agent it communicates with. Compromise one through prompt injection, and it can issue instructions to every agent in the chain using trust relationships that the legitimate agent already built.

Microsoft’s security team published guidance in February calling OpenClaw untrusted code execution with persistent credentials, noting the runtime ingests untrusted text, downloads and executes skills from external sources, and performs actions using whatever credentials it holds. Kaspersky’s enterprise risk assessment added that even agents on personal devices threaten organizational security because those devices store VPN configs, browser tokens, and credentials for corporate services. The Moltbook social network for OpenClaw agents already demonstrated the spillover risk: Wiz researchers found a misconfigured database that exposed 1.5 million API authentication tokens and 35,000 email addresses.

What 14 days of emergency patching actually closed

The defense ecosystem split into three approaches. Two tools harden OpenClaw in place. ClawSec, from Prompt Security (a SentinelOne company), wraps agents in continuous verification, monitoring critical files for drift and enforcing zero-trust egress by default. OpenClaw’s VirusTotal integration, shipped jointly by Steinberger, O’Reilly, and VirusTotal’s Bernardo Quintero, scans every published ClawHub skill and blocks known malicious packages.

Two tools are full architectural rewrites. IronClaw, NEAR AI’s Rust reimplementation, runs all untrusted tools inside WebAssembly sandboxes where tool code starts with zero permissions and must explicitly request network, filesystem, or API access. Credentials get injected at the host boundary and never touch agent code, with built-in leak detection scanning requests and responses. Carapace, an independent open-source project, inverts every dangerous OpenClaw default with fail-closed authentication and OS-level subprocess sandboxing.

Two tools focus on scanning and auditability: Cisco’s open-source scanner combines static, behavioral, and LLM semantic analysis, while NanoClaw reduces the entire codebase to roughly 500 lines of TypeScript, running each session in an isolated Docker container.

O’Reilly put the supply chain failure in direct terms. “Right now, the industry basically created a brand-new executable format written in plain human language and forgot every control that should come with it,” he said. His response has been hands-on. He shipped the VirusTotal integration before skills.sh, a much larger repository, adopted a similar pattern. Koi Security’s audit validates the urgency: 341 malicious skills found in early February grew to 824 out of 10,700 on ClawHub by mid-month, with the ClawHavoc campaign planting the Atomic Stealer macOS infostealer inside skills disguised as cryptocurrency trading tools, harvesting crypto wallets, SSH credentials, and browser passwords.

OpenClaw Security Defense Evaluation Matrix

| Dimension | ClawSec | VirusTotal Integration | IronClaw | Carapace | NanoClaw | Cisco Scanner |
|---|---|---|---|---|---|---|
| Discovery | Agents only | ClawHub only | No | mDNS scan | No | No |
| Runtime Protection | Config drift | No | WASM sandbox | OS sandbox + prompt guard | Container isolation | No |
| Supply Chain | Checksum verify | Signature scan | Capability grants | Ed25519 signed | Manual audit (~500 LOC) | Static + LLM + behavioral |
| Credential Isolation | No | No | WASM boundary injection | OS keychain + AES-256-GCM | Mount-restricted dirs | No |
| Auditability | Drift logs | Scan verdicts | Permission grant logs | Prometheus + audit log | 500 lines total | Scan reports |
| Semantic Monitoring | No | No | No | No | No | No |

Source: VentureBeat analysis based on published documentation and security audits, March 2026.

The capabilities spec that treats skills like executables

O’Reilly submitted a skills specification update, now in active discussion, to the agentskills maintainers, a group led primarily by Anthropic and Vercel. The proposal requires every skill to declare explicit, user-visible capabilities before execution; think mobile app permission manifests. He noted the proposal is getting strong early feedback from the security community because it finally treats skills like the executables they are.
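The full proposal text is not reproduced here, so as an illustration only, a manifest-style capability check might look like the sketch below. Every field name is hypothetical; the point is the deny-by-default rule the permission-manifest analogy implies.

```python
# Hypothetical skill capability manifest check, in the spirit of the
# proposed spec: a skill declares its capabilities up front, and the
# runtime denies anything beyond that declaration. All field names
# are invented for illustration.

MANIFEST = {
    "skill": "calendar-summarizer",
    "capabilities": ["read:calendar", "write:workspace"],
}

def permitted(manifest: dict, requested_capability: str) -> bool:
    """Deny by default: only explicitly declared capabilities pass."""
    return requested_capability in manifest.get("capabilities", [])
```

Under this model, a calendar skill that suddenly requests outbound network access fails the check before execution, rather than being caught (or missed) at runtime.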

“The other two gaps can be meaningfully hardened with better isolation primitives and runtime guardrails, but truly closing context leakage requires deep architectural changes to how untrusted multi-agent memory and prompting are handled,” O’Reilly said. “The new capabilities spec is the first real step toward solving these challenges proactively instead of bolting on band-aids later.”

What to do on Monday morning

Assume OpenClaw is already in your environment. The 22% shadow deployment rate is a floor. These six steps close what can be closed and document what cannot.

  1. Inventory what is running. Scan for WebSocket traffic on port 18789 and mDNS broadcasts on port 5353. Watch corporate authentication logs for new App ID registrations, OAuth consent events, and Node.js User-Agent strings. Any instance running a version before v2026.2.25 is vulnerable to the ClawJacked remote takeover flaw.

  2. Mandate isolated execution. No agent runs on a device connected to production infrastructure. Require container-based deployment with scoped credentials and explicit tool whitelists.

  3. Deploy ClawSec on every agent instance and run every ClawHub skill through VirusTotal and Cisco’s open-source scanner before installation. Both are free. Treat skills as third-party executables, because that is what they are.

  4. Require human-in-the-loop approval for sensitive agent actions. OpenClaw’s exec approval settings support three modes: security, ask, and allowlist. Set sensitive tools to ask so the agent pauses and requests confirmation before executing shell commands, writing to external APIs, or modifying files outside its workspace. Any action that touches credentials, changes configurations, or sends data to an external endpoint should stop and wait for a human to approve it.

  5. Map the three surviving gaps against your risk register. Document whether your organization accepts, mitigates, or blocks each one: runtime semantic exfiltration, cross-agent context leakage, and agent-to-agent trust chains.

  6. Bring the evaluation table to your next board meeting. Frame it not as an AI experiment but as a critical bypass of your existing DLP and IAM investments. Every agentic AI platform that follows will face this same defense cycle. The framework transfers to every agent tool your team will assess for the next two years.
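The log sweep in step 1 can be approximated with a simple filter over proxy or authentication logs. A sketch, assuming a plain-text log with one request per line; the indicator strings (Node.js User-Agent, port 18789) come from the inventory step above, and the sample log lines are invented.

```python
# Sketch of step 1's log sweep: flag log lines carrying OpenClaw
# indicators (Node.js User-Agent strings, traffic to port 18789).
# Assumes a plain-text log, one request per line; sample lines invented.

INDICATORS = ("node", ":18789")

def suspicious_lines(log_lines: list[str]) -> list[str]:
    """Return lines matching any OpenClaw indicator, case-insensitively."""
    return [ln for ln in log_lines if any(i in ln.lower() for i in INDICATORS)]

logs = [
    '10.0.0.4 "GET /api HTTP/1.1" ua="Mozilla/5.0"',
    '10.0.0.9 "GET /ws HTTP/1.1" ua="node"',
    '10.0.0.7 CONNECT 10.0.0.9:18789',
]
```

Treat hits as a starting inventory, not proof: substring matching over logs is noisy, and the mDNS broadcasts on port 5353 mentioned in step 1 need a network scan, not a log filter.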

The security stack you built for applications and endpoints catches malicious code. It does not catch an agent following a malicious instruction through a legitimate API call. That is where these three gaps live.

Massive Interpol operation takes down 45,000 IP addresses and leads to 94 arrests

Operation Synergia III took half a year and included 72 countries

Poland blocks cyberattack that targeted nuclear research facility

Politicians are blaming Iran but not everyone is sold on that idea.

Companies House online filing back to normal after glitch allowed users to change directors’ details

In theory, someone could have exfiltrated sensitive data, but it would be an extremely tedious undertaking.

Fortinet patches FortiGate Firewall vulnerabilities that allowed hackers to steal enterprise credentials

Three bugs were recently fixed, all three with a critical severity score.

UK businesses are putting themselves at risk due to poor cyber hygiene

Employee accounts are a breeding ground for attacks, with poor management, broad permissions, and more.

‘100 Video Calls Per Day’: Models Are Applying to Be the Face of AI Scams

Dozens of Telegram channels reviewed by WIRED include job listings for “AI face models.” The (mostly) women who land these gigs are likely being used to dupe victims out of their money.