
Staff LLMOps Engineer

Job in Redwood City, San Mateo County, California, 94061, USA
Listing for: Cognichip
Full Time position
Listed on 2026-02-16
Job specializations:
  • IT/Tech
    AI Engineer, Machine Learning / ML Engineer, Cloud Computing, Data Engineer
Salary/Wage Range or Industry Benchmark: USD 80,000–100,000 yearly
Job Description & How to Apply Below
Position: Staff LLMOps Engineer

Overview

At Cognichip, we are building a next-generation enterprise product suite that empowers semiconductor design engineers to achieve a 10x productivity boost with proprietary AI/ML models and modern cloud technologies.

We are seeking a Staff LLMOps Engineer to architect, deploy, and optimize our large language model (LLM) infrastructure on the cloud. This role focuses on taking trained models to production, scaling them efficiently across GPU clusters, and driving innovations in inference optimization. You will work closely with AI scientists, DevOps, and platform teams to ensure low-latency, high-throughput model serving for our enterprise SaaS product.

Core Responsibilities
  • Design and implement production-ready LLM deployment pipelines on AWS and Kubernetes/EKS.
  • Build and scale LLM inference infrastructure (multi-GPU, multi-node) for high availability, low latency, and cost efficiency.
  • Optimize inference performance using vLLM, SGLang, or similar frameworks.
  • Implement advanced serving techniques: continuous batching, speculative decoding, KV-cache management, paged attention, and distributed scheduling.
  • Collaborate with AI researchers to operationalize model training outputs into production-grade services.
  • Establish monitoring and observability for LLM serving: latency, throughput, GPU utilization, failure recovery.
  • Drive automation of infrastructure provisioning, scaling, and updates using IaC (Terraform) and CI/CD pipelines.
  • Partner with security and compliance teams to ensure secure multi-tenant model hosting aligned with enterprise-grade requirements.
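To make the serving techniques above concrete, here is an illustrative, framework-agnostic sketch of continuous batching, the scheduling pattern used by engines such as vLLM and SGLang: on every decode step the scheduler admits waiting requests into free batch slots and retires finished requests immediately, instead of waiting for an entire static batch to drain. The `decode_step` stub stands in for a real model forward pass; all names here are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Request:
    prompt: str
    max_new_tokens: int
    tokens: list = field(default_factory=list)

    @property
    def done(self) -> bool:
        return len(self.tokens) >= self.max_new_tokens

def decode_step(batch):
    # Placeholder for one forward pass over the batch; a real server
    # would run the LLM here and append one sampled token per request.
    for req in batch:
        req.tokens.append(f"tok{len(req.tokens)}")

def continuous_batching(waiting: deque, max_batch: int):
    """Admit new requests every step instead of waiting for the batch to drain."""
    running, finished = [], []
    while waiting or running:
        # Admit waiting requests into free slots -- the key difference
        # from static batching, which blocks until the batch empties.
        while waiting and len(running) < max_batch:
            running.append(waiting.popleft())
        decode_step(running)
        # Retire finished requests immediately so their slots (and, in a
        # real engine, their KV-cache pages) are reusable next step.
        still_running = []
        for req in running:
            (finished if req.done else still_running).append(req)
        running = still_running
    return finished

requests = deque(Request(f"p{i}", max_new_tokens=i + 1) for i in range(5))
done = continuous_batching(requests, max_batch=2)
print([len(r.tokens) for r in done])  # → [1, 2, 3, 4, 5]
```

In production the same idea is paired with paged KV-cache management, so freeing a slot also releases its cache blocks for incoming requests.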
Required Qualifications
  • 5+ years of experience in DevOps/AI infrastructure, with 2+ years focused on LLMOps (production deployment and optimization).
  • Proven track record of deploying and scaling LLMs in production environments.
  • Hands-on experience with GPU-accelerated inference and distributed AI serving.
  • Strong understanding of cloud-native architectures and secure enterprise SaaS deployment.
What We Offer
  • Opportunity to own and scale LLM infrastructure at a disruptive AI startup.
  • Competitive compensation package, including equity participation.
  • A team of high-caliber collaborators at the intersection of AI, cloud, and semiconductor design.
  • A culture of innovation, precision, and impact, where your work directly shapes the future of engineering.