Data Engineer, Irvine area, California, USA (IT/Tech)

Location and Eligibility

Onsite in Irvine, California. Preference is given to candidates already local to the area; relocation is not funded. Only US Citizens or Green Card holders are eligible for this position.

Responsibilities

Design and implement batch and streaming pipelines in Apache Spark running on Kubernetes and Kubeflow Pipelines to hydrate feature stores and training datasets.
Build high‑throughput ETL/ELT jobs with SSIS, SSAS, and T‑SQL against MS SQL Server, applying Data Vault-style modeling patterns for auditability.
Integrate source control, build, and release automation using GitHub Actions and Azure DevOps for every pipeline component.
Instrument pipelines with Prometheus exporters and visualize SLA, latency, and error budget metrics to enable proactive alerting.
Create automated data quality and schema drift checks; surface anomalies to support a rapid incident response process.
Use MLflow Tracking and Model Registry to version artifacts, parameters, and metrics for reproducible experiments and safe rollbacks.
Work with data scientists to automate model retraining and deployment triggers within Kubeflow based on data freshness or concept drift signals.
Develop PowerShell and .NET utilities to orchestrate job dependencies, manage secrets, and publish telemetry to Azure Monitor.
Optimize Spark and SQL workloads through indexing, partitioning, and cluster sizing strategies, benchmarking performance in CI pipelines.
Document lineage, ownership, and retention policies; ensure pipelines conform to PCI/SOX and internal data governance standards.

Qualifications

At least 6 years of experience building data pipelines in Spark or equivalent.
At least 2 years of experience deploying workloads on Kubernetes/Kubeflow.
At least 2 years of experience with MLflow or similar experiment‑tracking tools.
At least 6 years of experience with T‑SQL and with Python or Scala for Spark.
At least 6 years of experience with PowerShell/.NET scripting.
At least 6 years of experience with GitHub, Azure DevOps, Prometheus, Grafana, and SSIS/SSAS.
Certifications such as Kubernetes CKA/CKAD, Azure Data Engineer DP‑203, or MLOps‑focused certificates (e.g., Kubeflow or MLflow) are a plus.
Experience mentoring engineers on best practices in containerized data engineering and MLOps.
