Lead MLOps Engineer
Listed on 2025-12-20
IT/Tech
AI Engineer, Cloud Computing, Data Engineer, Machine Learning/ML Engineer
Drive the future of secure, scalable, mission‑critical AI/ML systems.
Prime Solutions Group (PSG), Inc. is seeking a Lead MLOps Engineer to architect, automate, and operate advanced ML pipelines and platforms that power next‑generation defense and national security AI systems. In this hybrid leadership role, you will serve as both a senior technical expert and a hands‑on engineering lead—guiding MLOps strategy, mentoring engineers, and driving execution across high‑impact AI/ML programs.
You will integrate ML engineering, data engineering, and DevSecOps practices to build secure, scalable, fully automated ML ecosystems for both cloud and on‑premises environments. This role extends PSG’s DevSecOps foundation with ML‑specific tooling and governance, including experiment tracking, model registries, monitoring, drift detection, automated retraining, and performance optimization.
This is a high‑visibility opportunity to deliver enterprise‑scale AI/ML platforms and directly contribute to U.S. national security while shaping PSG’s long‑term MLOps capabilities.
Responsibilities
- Lead the design, implementation, and management of ML‑focused CI/CD pipelines across development, test, staging, and production environments.
- Integrate MLOps best practices into existing DevSecOps workflows, including:
  - Data quality and schema validation
  - Model validation and promotion gates
  - Drift and performance monitoring
- Oversee secure Infrastructure‑as‑Code (IaC), containerization (Docker/Kubernetes), and cloud platforms (AWS, Azure, GCP) for ML and data workloads.
- Architect and maintain ML training and inference platforms, including experiment tracking, model registries, and automated retraining pipelines.
- Mentor and guide engineers in automation, observability, and security‑first MLOps and DevSecOps practices.
- Collaborate with cross‑functional teams (data science, software, cybersecurity, IT, systems) to ensure ML systems are reliable, secure, and high‑performing.
- Lead technical risk assessments and incident response efforts for ML and data platforms.
- Stay current on emerging MLOps, data engineering, and AI platform technologies; recommend new tools and methods.
- Serve in a hybrid role as:
  - Senior technical contributor on MLOps architecture and implementation
  - Team lead for MLOps initiatives and platform development efforts
- Contribute hands‑on to pipeline/orchestration code, infrastructure definitions, and monitoring/alerting configuration.
- Apply engineering principles to resolve complex issues across ML, data, security, and operations.
- Evaluate ethical, operational, and mission considerations when deploying AI/ML systems.
Required Qualifications
- U.S. Citizenship (Required).
- Active Top‑Secret Clearance or higher.
- Bachelor’s degree in Computer Science, Engineering, Data Science, Applied Mathematics, or related field.
- 5–9+ years of experience in:
  - MLOps / ML platform engineering
  - DevOps/DevSecOps/SRE supporting ML workloads
  - Data engineering integrating ML pipelines
  - Applied ML in production environments
- Strong proficiency with CI/CD tools (GitLab CI, Jenkins, GitHub Actions, etc.).
- Hands‑on experience with IaC (Terraform, Ansible, CloudFormation).
- Expertise with Docker, Kubernetes, and cloud platforms (AWS, Azure, GCP).
- Strong experience with Python and ML libraries and frameworks (NumPy, pandas, scikit‑learn, PyTorch, TensorFlow).
- Experience with orchestration tools (Airflow, Kubeflow, Prefect, Dagster).
- Experience integrating security scanning, governance, and compliance frameworks into ML workflows.
- Strong scripting skills (Python, Bash, Go, or similar).
- Demonstrated leadership experience—technical mentorship, leading projects, or team oversight.
- Excellent communication skills with the ability to convey ML system behavior and trade‑offs to diverse stakeholders.
Preferred Qualifications
- Master’s degree in a relevant field.
- Additional security or cloud certifications (CISSP, AWS ML Specialty, CKA/CKS, etc.).
- Experience implementing Zero Trust, advanced observability (Prometheus, Grafana, ELK/EFK), or OpenTelemetry.
- Experience with:
  - Feature stores
  - Data validation frameworks (Great Expectations)
  - Data governance and lineage tooling
  - Policy‑as‑code (OPA, Kyverno)
- Prior experience supporting defense, aerospace, or government‑secured AI/ML programs.
- Experience designing/operating mission‑critical AI/ML systems with high throughput, high availability, and rigorous monitoring.
Benefits
- Competitive compensation & benefits
- Professional development & tuition assistance
- Collaborative, mission‑driven culture
- Direct impact on high‑visibility AI/ML government programs
Salary range starts at $138,337 with the potential for higher compensation based on experience, skills, and mission needs.