MLOps Engineer — GCP/GKE, vLLM Serving & Production Reliability

Job in Bengaluru, 560001, Bangalore, Karnataka, India
Listing for: AIBound
Full Time position
Listed on 2026-02-14
Job specializations:
  • IT/Tech
    AI Engineer
Job Description
Location: Bengaluru

Company Description

AIBound is revolutionizing AI security with the industry's first unified control plane for secure AI adoption. We discover, test, and protect each AI model, agent, and identity—catching AI risks before impact so enterprises can innovate safely. As AI adoption outpaces security across global organizations, AIBound eliminates the dangerous gap between innovation and protection.

Led by our CEO and founder, the former CISO at Palo Alto Networks and Workday, AIBound brings together a world-class team of cybersecurity veterans who have secured some of the world's most advanced enterprises. We're a fast-growing company backed by leading investors, positioned at the critical intersection of AI innovation and enterprise security—one of the most strategic technology frontiers of our generation.

Join us in building the future of AI security, where cutting-edge artificial intelligence meets battle-tested cybersecurity expertise.

Role

AIBound ships AI security capabilities that must be fast, reliable, secure, and cost-controlled in real enterprise environments. We’re hiring an MLOps Engineer to productionize and operate our LLM services on GCP using GKE, with a strong focus on high-performance serving (vLLM), safe rollout strategies, monitoring, and operational excellence.

You’ll work closely with AI and data engineers to ensure what we build can be deployed, scaled, and trusted.

Responsibilities

- Deploy and operate LLM inference services on GCP using GKE
- Implement high-performance serving with vLLM (or comparable LLM serving stack)
- Build inference APIs using FastAPI and containerize services with Docker
- Implement autoscaling (HPA, GPU-aware scaling, traffic-based scaling), capacity planning, and SLOs
- Set up monitoring/logging/alerting for latency, error rates, throughput, GPU utilization, token usage
- Own CI/CD for model + service deployments, including rollback/canary strategies
- Implement production controls: secrets management, IAM, network policies, dependency scanning
- Drive cost optimization: caching, batching, quantization awareness, right-sizing, cold-start reduction
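To give a flavor of the batching work mentioned above, here is a minimal, hypothetical sketch of request micro-batching in Python. The names (`MicroBatcher`, `max_batch_size`, `max_wait_s`) are illustrative only and not part of AIBound's actual stack; production LLM servers such as vLLM implement continuous batching internally, but the same idea applies at the API layer:

```python
import time
from typing import Callable, List, Optional

class MicroBatcher:
    """Groups incoming requests into batches to improve GPU utilization.

    Hypothetical sketch: flushes when the batch is full, or when the
    oldest queued request has waited longer than max_wait_s.
    """

    def __init__(self, process_batch: Callable[[List[str]], List[str]],
                 max_batch_size: int = 8, max_wait_s: float = 0.05):
        self.process_batch = process_batch
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        self._queue: List[str] = []
        self._oldest: Optional[float] = None

    def submit(self, prompt: str) -> List[str]:
        """Queue a prompt; returns batch results if a flush occurs, else []."""
        if self._oldest is None:
            self._oldest = time.monotonic()
        self._queue.append(prompt)
        if (len(self._queue) >= self.max_batch_size
                or time.monotonic() - self._oldest >= self.max_wait_s):
            return self.flush()
        return []

    def flush(self) -> List[str]:
        """Send the accumulated batch to the model in one call."""
        batch, self._queue, self._oldest = self._queue, [], None
        return self.process_batch(batch)

# Usage: a fake "model" that generates by upper-casing, batched 4 at a time.
batcher = MicroBatcher(lambda prompts: [p.upper() for p in prompts],
                       max_batch_size=4)
results: List[str] = []
for p in ["a", "b", "c", "d"]:
    results.extend(batcher.submit(p))
print(results)  # ['A', 'B', 'C', 'D'] — four prompts served in one batch
```

Batching trades a small amount of per-request latency (bounded by `max_wait_s`) for much higher throughput per GPU, which is the core of the cost-optimization lever described above.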

Qualifications

- 1–2 years of hands-on MLOps / platform / backend deployment experience
- Strong experience with GCP and GKE
- Solid Kubernetes + Docker fundamentals (deployments, services, configmaps/secrets, ingress)
- Experience serving models via vLLM (preferred) or similar serving frameworks
- Proficiency with FastAPI (or equivalent)
- Practical experience with CI/CD, monitoring, autoscaling, and rollback patterns

Benefits & Culture

- Highly competitive salary and equity package
- Hybrid work environment (2 days on-site per week) and vacation policy
- Comprehensive health benefits
- Professional development budget, conference attendance, and access to AI research resources
- AIBound is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.