Senior AIOps & MLOps Engineer
Job in Fremont, Alameda County, California, 94537, USA
Listed on 2026-06-02
Listing for: Veriipro
Full Time position
Job specializations:
- IT/Tech: AI Engineer, Cloud Computing
Job Description & How to Apply Below
Key Responsibilities
- Technical Operations: Review, implement, and support enterprise-level AI platforms and services to drive IT operational excellence. Ensure that new use cases are onboarded smoothly and operationalized.
- Optimization: Analyze business processes to identify areas for automation, and work with business stakeholders and IT teams to determine requirements and design software bots that reduce operational toil.
- AIOps & Model Deployment: Lead the operationalization and deployment of AI/ML models into production environments, ensuring they are highly available, scalable, and performant. Implement and monitor Continuous Integration (CI) and Continuous Deployment (CD) pipelines.
- Python Development: Design and develop Python-based solutions for automating and managing the lifecycle of AI/ML models, including data ingestion, model training, and real-time prediction workflows.
- API Integration: Build and maintain robust APIs for model serving and integration with other systems. Ensure seamless communication between models, data pipelines, and consumer applications.
- LLM Concepts and Implementation: Apply knowledge of Large Language Models (LLMs) to develop AI-driven applications and services, ensuring models are optimized and perform efficiently in production.
- MLOps: Implement and maintain Machine Learning Operations (MLOps) practices for version control, monitoring, logging, and debugging of AI/ML models in production. Support model retraining, versioning, and A/B testing.
- Cloud Infrastructure: Leverage Azure Cloud services for hosting and scaling AI applications, ensuring security, compliance, and performance. Implement infrastructure as code (IaC) using tools like Azure DevOps.
- Collaboration: Work closely with backend engineers, data engineers/developers, infrastructure engineers, operational SMEs, and business stakeholders to tackle evolving challenges in the field of AI/ML and ensure AI solutions meet business requirements and performance benchmarks.
- Monitoring & Optimization: Continuously monitor the performance of deployed AI models and optimize them for efficiency, cost-effectiveness, and accuracy. Implement alerting and logging mechanisms using scripts or an observability solution.
- Documentation & Best Practices: Document AIOps processes, use cases, tools, and workflows. Establish and enforce best practices for managing AI models in production environments.
Skills & Qualifications
- Experience: 10-15 years of experience in software development, with a focus on AI/ML operations, cloud infrastructure, and DevOps practices.
- Python: Advanced proficiency in Python, including experience with AI/ML libraries such as TensorFlow, PyTorch, scikit-learn, and Pandas.
- APIs: Strong experience in designing, developing, and maintaining RESTful APIs for AI/ML model deployment and integration.
- MLOps: In-depth understanding of Machine Learning Operations, including model versioning, monitoring, deployment, and automation of ML workflows.
- LLM Concepts: Familiarity with Large Language Models (LLMs), including experience working with transformer-based models such as GPT, BERT, or T5.
- Azure Cloud: Hands-on experience with Azure Cloud services (Azure ML, Azure DevOps, Azure Functions, etc.) and cloud infrastructure management.
- DevOps & CI/CD: Proficient in setting up CI/CD pipelines for AI/ML models and using tools like Jenkins, GitLab, or Azure DevOps for automation.
- Data Management & Tools: Experience working with data storage and processing tools like Azure Blob Storage, Azure SQL Database, Kafka, or similar.
- Version Control: Expertise with Git and version control best practices for collaborative development of AI systems.
- Problem Solving: Strong analytical and troubleshooting skills, with the ability to identify root causes and optimize AI/ML models and systems.
- Communication & Collaboration: Excellent communication skills and the ability to work effectively in a cross-functional team environment.
- Cloud Certifications: Azure certifications such as Azure Solutions Architect, Azure AI Engineer, or Azure DevOps Engineer.
- Security & Compliance: Understanding of security best practices in AI model deployment and experience with secure handling of sensitive data in the cloud.
- Big Data Tools: Familiarity with big data processing frameworks (e.g., Apache Spark, Hadoop) and integration with AI/ML pipelines.
- Agile Methodologies: Experience working in Agile teams, with knowledge of Scrum, Kanban, or similar frameworks.
Position Requirements
10+ years of work experience