Senior AI Software Engineer Job San Francisco area, California USA, Software Development

Bishop Fox is the leading authority in offensive security, providing solutions ranging from continuous penetration testing, red teaming, and attack surface management to product, cloud, and application security assessments. We’ve worked with more than a quarter of the Fortune 100, half of the Fortune 10, eight of the top 10 global technology companies, and all of the top global media companies.

Our managed service platform, service innovation, and culture of excellence continue to gather accolades from industry award programs including Fast Company, Inc., SC Media, and others. For more than 16 years, we've been contributing and giving back to the security community. We’ve published more than 16 open source tools and 50 security advisories in the last five years alone. Learn more at or follow us on social media.

Who You Are

This isn't just another engineering role. You'll be joining what's essentially a startup within Bishop Fox – all the innovation and rapid iteration of an early-stage venture, powered by the resources and reputation of an established security leader.

You are an experienced AI engineer who thrives on building real systems that operate in messy, unpredictable environments. You care deeply about reliability, evaluation, and scale—not just whether something works once, but whether it works consistently in production.

Your mission? Build autonomous AI agents that identify genuine vulnerabilities in production applications, capable of thinking, adapting, and hacking like the world's top penetration testers.

What You Will Do

Pioneer AI-Driven Security Testing

Design and build intelligent autonomous security testing agents using large language models and cutting-edge AI/ML techniques
Create systems that can autonomously perform reconnaissance, identify vulnerabilities, and execute sophisticated attack chains
Push the boundaries of what's possible when artificial intelligence meets offensive security
Build robust planning, tool-use, and failure-handling mechanisms for agents operating in real-world, unpredictable applications

Revolutionize Pen Testing at Scale

Develop services that think and act like elite attackers, but operate 24/7 across thousands of targets
Build systems that continuously evolve and improve their attack strategies
Implement long-running agent memory and context management so agents retain state, avoid redundant actions, and accumulate application knowledge

Integrate with Enterprise-Grade Infrastructure

Connect your AI agents into Bishop Fox's Cosmos cloud platform
Scale your creations to serve Fortune 100 clients with enterprise-level reliability
Design architectures that can handle massive concurrent testing operations
Prototype breakthrough approaches to AI-driven security testing
Build sophisticated feedback loops that make your agents smarter over time
Implement safety mechanisms and ethical guardrails for responsible AI deployment
Measure, iterate, and continuously enhance agent performance
Design evaluation and monitoring systems that distinguish real vulnerabilities from false positives or hallucinated findings

Collaborate with Elite Security Minds

Work directly with world-class penetration testers and security researchers
Partner with data scientists and AI specialists to solve novel technical challenges
Contribute to a team culture where hacking expertise meets cutting-edge artificial intelligence
Apply real-world production feedback from customer environments to refine agent behavior and system reliability

Your Experience

6+ years of software engineering experience with a track record of shipping production systems
Deep AI/ML expertise – hands-on experience with LLMs, agent frameworks (LangChain, AutoGPT, CrewAI), or autonomous AI systems
Advanced programming skills in Python and Golang with clean, scalable code practices
Full-stack capabilities – comfortable building robust APIs, designing database schemas, and working with modern frontend frameworks (React/TypeScript experience valued)
Practical experience designing, evaluating, and improving agent reliability, including handling failures, edge cases, and non-deterministic behavior
Experience with cloud platforms (AWS, GCP, or…

