The $18/Hour Pentester: What Security Leaders Need to Tell Their Teams Right Now
January 2026
I’ve been staring at Stanford’s Trinity research paper for three days now, and I keep coming back to one number: $18.21 per hour.
That’s what it costs to run ARTEMIS—their AI-powered penetration testing agent that just outperformed 80% of professional pentesters in a live enterprise environment. Not a CTF. Not a lab. A real university network with 8,000+ hosts, actual users, and production systems.
And it placed second overall against 10 cybersecurity professionals.
If you’re leading a security team in 2026 and this number doesn’t fundamentally change how you’re thinking about offensive security, we need to talk.
This Isn’t About Replacement—It’s About Evolution
Let me be clear upfront: I’m not writing this to tell you AI is coming for your job. I’m writing this because I need my team to evolve faster than our adversaries do, and right now, adversaries are already using these tools.
Anthropic documented nation-state actors using AI in cyber operations. OpenAI reported similar patterns. The offensive AI revolution isn’t a future threat—it’s current reality.
The question isn’t whether AI will transform how we do security. The question is: are we evolving our teams’ capabilities as fast as the threat landscape is evolving theirs?
What the Trinity Study Actually Proves
Stanford ran a controlled experiment: 10 professional pentesters vs. AI agents (including ARTEMIS, Codex, CyAgent, and others) against the same target environment. Same scope, same time constraints, same rules of engagement.
The results:
- ARTEMIS (A1 config): 9 valid vulnerabilities, 82% accuracy, $18.21/hour
- ARTEMIS (A2 ensemble): 11 valid findings, 82% accuracy, $59/hour
- Human participants: 3-13 vulnerabilities each, varying accuracy rates, ~$60/hour
But here’s what matters more than the leaderboard: when given targeted hints about where to look, ARTEMIS found every single vulnerability humans discovered. Its bottleneck wasn’t technical execution—it was pattern recognition and target selection.
That gap? It’s closing. Fast.
The Capabilities Gap My Team Needs to Close
I’ve spent the past week thinking about what this means for how we build and develop cybersecurity teams. Here’s what keeps me up at night:
1. We’re Still Operating at Human Serial Processing Speed
ARTEMIS hit a peak of 8 concurrent sub-agents executing simultaneous exploitation attempts. Most cyber teams? We’re sequential. One person, one target, one exploit chain at a time.
When an AI agent can parallelize reconnaissance across dozens of hosts while my team is still waiting for nmap to finish on host #1, we have a fundamental throughput problem.
What we need to be telling our teams: Learn to orchestrate parallel operations. Use automation not as a replacement for thinking, but as a force multiplier for execution. If you’re waiting on scan results, you should have three other investigations running concurrently.
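To make that concrete, here is a minimal sketch of what “three investigations running concurrently” can look like in practice. It assumes Python, a local nmap install, and authorization to scan the targets; the subnets and flags are illustrative placeholders, not a recommendation for your environment.

```python
# Minimal sketch: run several recon scans in parallel instead of serially.
# Assumes nmap is installed and you are authorized to scan these ranges;
# the subnets and flags below are illustrative placeholders.
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed

TARGETS = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]  # hypothetical scopes

def scan(target: str) -> str:
    """Run a service/version scan and return the path of the saved output."""
    outfile = f"scan_{target.replace('/', '_')}.txt"
    subprocess.run(
        ["nmap", "-sV", "-T4", "-oN", outfile, target],
        check=True,
        capture_output=True,
    )
    return outfile

if __name__ == "__main__":
    # Kick off all scans at once; triage results as each one finishes,
    # instead of idling while a single subnet completes.
    with ThreadPoolExecutor(max_workers=len(TARGETS)) as pool:
        futures = {pool.submit(scan, t): t for t in TARGETS}
        for future in as_completed(futures):
            print(f"{futures[future]} finished -> {future.result()}")
```

The point isn’t the tooling; it’s that the operator’s time shifts from waiting on scans to triaging results as they arrive.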
2. We’re Not Thinking in “Sessions” Yet
ARTEMIS runs for 16+ hours continuously through session management—summarizing progress, clearing context, resuming where it left off. It doesn’t suffer from context switching, meeting fatigue, or “I’ll get back to this tomorrow” syndrome.
Most cyber teams? We lose 30 minutes every time we context switch. We forget where we were. We duplicate work.
What we need to be telling our teams: Document like you’re creating resumption checkpoints. Your notes should allow anyone (including future you) to pick up exactly where you left off in 90 seconds. Treat long-term investigations like marathon runners treat pacing – sustainable progress over time, not heroic sprints.
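To make the checkpoint idea concrete, here’s a minimal sketch of a structured resumption note, loosely mirroring how ARTEMIS summarizes progress and resumes. Every field, host name, and value is a hypothetical illustration; the point is that the note answers “where was I, what did I learn, what’s next” at a glance.

```python
# Minimal sketch of a resumption checkpoint: a structured note you (or a
# teammate) can reload to pick up an investigation in about 90 seconds.
# The fields, names, and values are hypothetical; adapt to your own workflow.
import json
from datetime import datetime, timezone

checkpoint = {
    "engagement": "internal-web-apps-q1",       # hypothetical engagement name
    "updated": datetime.now(timezone.utc).isoformat(),
    "current_target": "app01.corp.example",     # placeholder host
    "last_action": "Confirmed SSRF in /export endpoint; metadata service unreachable",
    "working_hypothesis": "SSRF may reach internal Jenkins on 8080",
    "next_steps": [
        "Enumerate internal ports reachable via the SSRF",
        "Check whether the Jenkins script console is exposed",
    ],
    "blocked_on": "Waiting for scope clarification on the Jenkins host",
}

with open("checkpoint.json", "w") as f:
    json.dump(checkpoint, f, indent=2)

# Resuming later is just reloading the note and reading next_steps.
with open("checkpoint.json") as f:
    state = json.load(f)
print(state["last_action"], "->", state["next_steps"][0])
```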
3. We’re Still Doing What AI Already Does Better
Every ARTEMIS variant systematically found:
- Default credentials
- Misconfigured services
- Exposed management interfaces
- Known CVE exploitation
- Network enumeration patterns
These aren’t the vulnerabilities where human intuition adds value. These are the “table stakes” findings AI agents discover in the first 2 hours, every time, at scale.
What we need to be telling our teams: Stop competing on what AI does better. Specialize in what it struggles with:
- Business logic flaws that require understanding intent vs. implementation
- Complex attack chains that span multiple systems with organizational context
- Social engineering vectors that exploit human behavior patterns
- Zero-day research that requires creative hypothesis generation
- Adversarial ML understanding for AI-native attack surfaces
If your current skillset is “I’m really good at running nuclei and reviewing the output,” you’re competing with $18/hour automation. That’s not a winning position.
The Uncomfortable Conversation About the Human ROI
AI agents can find vulnerabilities. But they also submit false positives they can’t contextualize. They miss business logic flaws. They can’t explain why a finding matters to our specific business risk. And when the board asks “what does this mean for our Q2 product launch?”, the AI agent doesn’t have an answer.
We have to build hybrid models: AI agents for systematic coverage, human experts for contextual analysis, prioritization, and strategic guidance. The AI agent finds the vulnerability. The human team determines whether it’s exploitable in our specific environment, what business impact it carries, and what the remediation priority should be given our release calendar and risk appetite.
What is clear, though: We can’t justify humans doing work AI does cheaper and better. We need to justify humans doing work AI can’t do yet.
The Harsh Truth About False Positives
One finding from the Trinity study that doesn’t get enough attention: ARTEMIS had a higher false positive rate than human participants. It reported successful authentication after seeing “200 OK” HTTP responses that were actually failed logins bounced back to the login page.
Why? Lack of business context.
The AI agent understands HTTP status codes. It doesn’t understand that your authentication flow returns 200 on failed login attempts because your frontend framework handles routing client-side.
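A hedged sketch of the difference, for illustration: the first check is what a status-code-driven agent effectively does; the second looks for evidence of a real session. It assumes the Python requests library, and the URL, credentials, cookie name, and error markers are all hypothetical; a real validation step has to be tuned to the application under test, and run only against systems you’re authorized to test.

```python
# Minimal sketch of why "200 OK" is not proof of authentication.
# Assumes the `requests` library; the URL, form fields, and success markers
# below are hypothetical and must be tailored to the real application.
import requests

LOGIN_URL = "https://app.example.com/login"          # placeholder
CREDS = {"username": "admin", "password": "admin"}   # the kind of default an agent would try

resp = requests.post(LOGIN_URL, data=CREDS, timeout=10)

# Naive check (what the agent effectively did): status code alone.
naive_success = resp.status_code == 200

# Context-aware check: a SPA that routes client-side can return 200 on a
# failed login, so look for evidence of a real session instead.
contextual_success = (
    resp.status_code == 200
    and "session" in resp.cookies                        # a session cookie was actually issued
    and "invalid credentials" not in resp.text.lower()   # no error banner in the body
    and 'name="password"' not in resp.text               # we didn't just get the login form back
)

print(f"naive: {naive_success}, contextual: {contextual_success}")
```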
This is where human expertise remains critical—not in finding vulnerabilities, but in validating them within business context and prioritizing them against organizational risk.
What I’m telling my team: Your value isn’t in being faster than AI at running exploits. Your value is in understanding what a vulnerability means for our specific business, how it chains with other weaknesses, and what realistic attack scenarios exist given our threat model.
If you can’t articulate business impact and remediation priority better than an AI agent reading CVSS scores, you need to upskill urgently.
The Skills to Hire For in Cybersecurity Going Forward
When I’m reviewing resumes now, here’s what I’m looking for:
Red flags (AI-replaceable skills):
- “Expert in vulnerability scanning tools”
- “Extensive experience with automated testing frameworks”
- “Proficient in running Metasploit/Burp/etc.”
These are fine to have, but they’re not differentiators anymore.
Green flags (AI-resistant skills):
- “Discovered novel authentication bypass in OAuth implementation by understanding business logic intent vs. specification”
- “Chained three medium-severity findings into critical-impact attack scenario based on organizational context”
- “Developed custom exploitation techniques for previously unknown attack surface”
- “Translated technical vulnerability findings into business risk language for executive stakeholders”
- “Experience orchestrating AI/automated tools within security workflows”
Notice the difference? It’s not about knowing tools—it’s about applying creative thinking, contextual understanding, and strategic judgment that AI agents don’t have yet.
What Your Team Should Be Doing Monday Morning
If you’re a security leader reading this, here’s my recommendation for your next team meeting:
1. Acknowledge the Reality
Don’t sugarcoat it. AI agents cost $18/hour and are already competitive with professional pentesters on systematic vulnerability discovery. Your team needs to understand the competitive landscape they’re operating in.
2. Reframe the Value Proposition
Your team’s value isn’t in discovering vulnerabilities anymore—it’s in:
- Understanding which vulnerabilities matter in your specific business context
- Developing novel exploitation techniques for your unique attack surfaces
- Providing strategic guidance that connects technical findings to business risk
- Explaining to non-technical stakeholders what findings actually mean
3. Invest in Differentiation
Allocate training budget toward:
- Advanced exploitation techniques
- Business logic vulnerability research
- Threat intelligence and adversary tradecraft analysis
- Communication and risk articulation skills
- AI/ML security (both attacking and defending AI systems)
4. Experiment with Hybrid Models
Run a pilot: Use open-source AI agents (ARTEMIS is public) for reconnaissance on a non-critical internal application. Have your team do the same manually. Compare results, cost, and time investment.
Then discuss: Where did AI excel? Where did humans add unique value? How do we structure workflows that leverage both?
5. Build AI Literacy
Your team needs hands-on experience with AI agents to understand their capabilities and limitations. This isn’t theoretical anymore—these tools exist and adversaries are using them. Your team should be proficient in using, configuring, and orchestrating AI security agents.
The Meta-Question: Can We Afford NOT to Adapt?
Here’s what haunts me: While we’re debating whether to adopt AI agents, adversaries are already using them.
Anthropic reported nation-state actors leveraging AI in offensive operations. That means somewhere, right now, hostile actors are running AI-powered reconnaissance against targets at scale, at speeds human defenders can’t match.
The question isn’t “should we adopt AI agents in our security program?”
The question is: “Can we afford to defend at human speed against adversaries operating at AI speed?”
I don’t think we can.
The Bottom Line for Security Leaders
If you’re leading a security team in 2026, you need to answer three questions honestly:
1. What work is my team doing that AI agents already do better and cheaper?
If the answer is “a lot,” you have an urgent prioritization problem. That work should be automated now, freeing your human experts for higher-value activities.
2. What capabilities is my team developing that will remain valuable when AI agents mature further?
If the answer is “we’re focused on tool expertise,” you have an urgent skills development problem. Your team needs to specialize in areas where human judgment, context, and creativity remain critical.
3. How am I preparing my team for a future where $18/hour AI agents are baseline capability?
If the answer is “we’re not,” you have an urgent strategic planning problem. The future isn’t coming—it’s here. ARTEMIS exists, it’s open source, and adversaries are adopting these capabilities faster than defenders.
A Personal Note
I’m not writing this as a doomsayer. I’m optimistic about where this goes. But optimism requires preparation.
The security professionals on my team who embrace AI agents as force multipliers, who specialize in areas where human expertise remains critical, who learn to orchestrate hybrid human-AI workflows—they’re going to thrive. They’ll be more effective, more impactful, and more valuable than ever.
The ones who resist, who insist that “AI can’t replace human intuition” while doing work that AI demonstrably does better and cheaper—they’re going to struggle.
I know which team I want to build. I know which team I want to be part of.
The question is: which team are you building?
What’s your organization doing to prepare for AI-augmented offensive security? I’m genuinely curious—find me on LinkedIn and let’s talk about it.
An excellent series on Human Centric AI is on LinkedIn: The Frankenstein Stitch part 2: Why ‘Micro-team’ as Human Navigators Are AI’s True North