Senior ML/AI Engineer
Listed on 2026-03-02
Software Development
AI Engineer, Data Engineer, Machine Learning/ML Engineer
About Poseidon
Poseidon is an a16z-backed startup building a platform that coordinates supply and demand for specialized AI training data. We work with Fortune 500 enterprises and leading AI labs to build and operationalize large-scale, rights-cleared multi-modal datasets and the models that learn from them.
The Role
We are seeking a Senior ML/AI Engineer to lead the work of taking cutting‑edge ML (voice and beyond) from prototype → reliable systems → customer‑facing product. This is a senior, highly hands‑on role focused on building production‑quality model and data systems, owning technical direction for key components, and raising the bar on engineering rigor.
A small portion of time can be spent on applied research (e.g., new evaluation methods, fine‑tuning recipes, or model quality studies), but the core mandate is to ship.
Own end‑to‑end delivery of ML capabilities into product: define the technical plan, implement, productionize, and operate systems with clear quality, latency, and cost targets.
Build and scale evaluation for voice AI and other modalities:
- Design offline + online evaluation frameworks.
- Create workflows for quality measurement and continuous improvement.
- Partner with product to translate metrics into product requirements and SLAs.
Lead fine‑tuning and adaptation work:
- Build and maintain pipelines for supervised fine‑tuning and domain adaptation.
- Own dataset curation, training data strategy, and reproducibility.
Engineer data and labeling systems that power learning loops:
- Design schemas/manifests across modalities and automate validation.
- Build data quality checks: PII detection, deduplication, drift checks, consensus labeling, gold sets.
Productionize model and pipeline infrastructure:
- Refactor research prototypes into tested Python libraries, services, and batch jobs.
- Deploy and operate inference endpoints (real‑time and batch).
- Optimize for GPU/CPU cost and performance.
Raise engineering standards and mentor:
- Set best practices for testing, CI/CD, code review, documentation, and operational readiness.
- Mentor other engineers and help unblock cross‑functional execution with researchers, PMs, and ops.
- 6+ years of hands‑on experience shipping ML systems to production (or equivalent depth via impactful projects).
- Expert Python engineering skills, including writing maintainable libraries/services, tests, and performance‑aware code.
- Strong experience with modern deep learning frameworks (PyTorch strongly preferred).
- Proven track record owning production ML systems end‑to‑end, including:
- Data pipelines and training/evaluation workflows
- Deployment (APIs, batch jobs, or streaming inference)
- Observability (metrics, logs, traces), on‑call, and iterative reliability improvements
- Experience with voice AI / speech (ASR, diarization, audio preprocessing, alignment, multi‑speaker challenges).
- Strong understanding of ML evaluation and measurement (dataset design, slice‑based analysis, regressions, and statistical thinking).
- Solid cloud infrastructure experience (AWS, GCP, or Azure), containerization (Docker), and production deployment patterns. Kubernetes experience is a plus.
- Excellent communication: ability to write clear technical plans, make tradeoffs, and align stakeholders.
- Experience with multimodal systems (text + audio + image/video) and building unified data/eval abstractions.
- Experience with distributed training, GPU performance tuning, and large‑scale experimentation.
- Experience with workflow orchestration and distributed compute (Ray, Spark, Dask, Airflow, Flyte, Prefect).
- Familiarity with privacy, security, and compliance concerns in ML systems (PII, rights management, auditability).
- Python, PyTorch, FastAPI
- Docker, Kubernetes, Terraform
- AWS/GCP/Azure managed compute + storage
- ML tooling: MLflow or Weights & Biases, model registries, dataset/versioning tools
- Orchestration: Airflow, Flyte, Prefect (or similar)
- Observability: Prometheus, Grafana, OpenTelemetry, cloud‑native logging
- High leverage: your work will ship into products used by enterprises and leading AI labs.
- Real‑world ML: build systems that connect data → training → evaluation → deployment → feedback loops.
- Ownership: senior engineers here drive architecture and outcomes, not just tickets.
If you’re excited to turn state‑of‑the‑art voice + multimodal ML into reliable products, we’d love to hear from you.