AI outpaces human security teams at finding Firefox bugs

Claude AI found more Firefox security flaws in 14 days than human security teams typically report in a month, but could exploit only two of them. The result shows AI's promise for bug detection whilst highlighting why human expertise remains essential.

Anthropic's AI model Claude Opus 4.6 discovered 22 vulnerabilities in Mozilla Firefox within two weeks — more than human security teams reported in any single month of 2025. As reported by PCMag, the findings highlight AI's potential to accelerate cybersecurity testing, though significant limitations remain.

What did Claude Opus 4.6 discover in Firefox?

According to Mozilla researchers, Claude Opus 4.6 identified 22 vulnerabilities and roughly 100 bugs overall during its two-week analysis of Firefox. Of the vulnerabilities, 14 were classified as high-severity — nearly a fifth of the 73 high-severity Firefox vulnerabilities Mozilla fixed throughout all of 2025.

"In other words: AI is making it possible to detect severe security vulnerabilities at highly accelerated speeds," Mozilla researchers concluded. The findings represent a significant leap in automated vulnerability detection, potentially transforming how software companies approach security testing.

For the UAE's growing tech sector, this development could accelerate security audits for local AI initiatives and critical infrastructure projects. However, the technology's current limitations suggest human oversight remains essential.

Why AI struggles with actual exploitation

Despite Claude's success at finding vulnerabilities, the AI struggled to convert discoveries into working exploits. Opus 4.6 successfully exploited only two of the 22 vulnerabilities it identified, creating what Mozilla researchers described as "crude browser exploits" unlikely to work in real-world scenarios due to existing safeguards.

This limitation highlights a crucial distinction in cybersecurity: finding a vulnerability requires pattern recognition, whilst exploiting it demands understanding of complex system interactions and defensive measures. The gap suggests that whilst AI can accelerate the detection phase, human expertise remains critical for understanding real-world impact.

The AI false positive problem

Daniel Stenberg, lead developer of the open-source data transfer tool curl, warns of an "explosion in AI slop reports" as AI adoption for bug hunting grows. According to Stenberg's comments to The Wall Street Journal, fewer than one in 20 bugs reported to curl in 2025 turned out to be genuine.

"The AI chatbots still easily hallucinate security problems," Stenberg noted. This echoes broader concerns about AI reliability emerging across industries adopting AI tools, where false positives waste valuable development time and resources.

The challenge illustrates why human validation remains essential in AI-assisted security workflows, particularly for organisations managing critical systems or sensitive data.

What this means for cybersecurity

Anthropic's results coincide with the company's launch of Claude Code Security, a dedicated tool designed to highlight software vulnerabilities and suggest targeted fixes for human review. The timing suggests increasing commercial interest in AI-driven security tools, potentially disrupting traditional cybersecurity approaches.

For UAE businesses, this development could democratise advanced security testing previously available only to large enterprises. However, successful implementation will require balancing AI efficiency gains with human expertise to filter false positives and validate genuine threats.

The Mozilla findings demonstrate AI's potential to identify vulnerabilities at unprecedented speed, but the exploitation gap suggests we're still years away from fully automated security testing. The technology works best as an accelerator for human security teams rather than a replacement.

Claude Code Security availability

Anthropic launched Claude Code Security earlier this month as part of its broader push into cybersecurity applications. The tool is designed to integrate into existing development workflows, providing vulnerability assessments and suggested fixes for human review.

Whilst Anthropic hasn't announced UAE-specific pricing or partnerships, the company's tools are generally available through its API and enterprise channels. Local organisations interested in AI-assisted security testing should expect to validate all findings through traditional security review processes.

Frequently Asked Questions

How many vulnerabilities did Anthropic's Claude find in Firefox?

Claude Opus 4.6 discovered 22 vulnerabilities in Mozilla Firefox over two weeks, including 14 classified as high-severity. This exceeded the number of vulnerabilities reported by human teams in any single month of 2025.

Is AI better than humans at finding software bugs?

AI excels at rapid vulnerability detection but struggles with exploitation and accuracy. Mozilla's Claude found more Firefox bugs in two weeks than human teams reported monthly in 2025, but AI tools often produce false positives requiring human validation.

What is Claude Code Security?

Claude Code Security is Anthropic's new cybersecurity tool designed to identify software vulnerabilities and suggest targeted fixes for human review. It represents the company's push into AI-driven security testing for enterprise applications.

Can AI actually exploit the vulnerabilities it finds?

Currently, no. Claude Opus 4.6 successfully exploited only two of the 22 vulnerabilities it identified, creating basic exploits unlikely to work in real-world scenarios. AI excels at detection but requires human expertise to assess exploitability.
