Principal Applied Machine Learning & Systems Engineer Job: Denver area, Colorado, USA (IT/Tech)

Lux Tronic builds AI-powered inspection and production monitoring systems to help manufacturers run smarter.

About the Role

We are seeking a Principal Applied Machine Learning & Systems Engineer to design, deploy, and operate production-grade ML systems in real industrial environments. This role spans edge, on-prem, and cloud ML, where reliability, latency, and uptime matter more than offline benchmarks. At the Principal level, you will define architecture, technical standards, and long-term ML strategy across deployments. This role is deeply hands‑on and requires comfort working under real-world constraints such as sensor noise, environmental variability, and mission‑critical uptime.

Location: Remote (Mountain Time preferred)

Schedule: Flexible hours; ~50–60 hours/week

Travel: Quarterly on-site visits to industrial facilities (factories, plants)

Responsibilities

Design and implement ML models for industrial use cases, including predictive maintenance, anomaly detection, quality inspection, and process optimization.
Build models resilient to noisy, incomplete, and high‑variance industrial data.
Develop using modern ML frameworks (PyTorch, TensorFlow, ONNX) and deploy across:
- Edge and embedded systems
- On‑prem industrial servers
- Cloud and hybrid infrastructure
Implement fail‑safes, fallback logic, and degradation strategies for mission‑critical systems.

Production Engineering & Infrastructure

Deploy ML systems using Docker, Kubernetes, and CI/CD pipelines (GitHub Actions).
Build and maintain real‑time and batch inference pipelines with strict reliability and latency requirements.
Integrate ML services with industrial control systems (PLCs, SCADA, edge controllers).
Develop secure, low‑latency APIs to enable ML integration in industrial environments.

Optimization for Industrial Constraints

Optimize models for performance and efficiency using quantization, pruning, and edge‑optimized inference runtimes.
Balance accuracy, throughput, and resource constraints across heterogeneous hardware.
Ensure sub‑second decision‑making where required by industrial processes.

Monitoring, Reliability & Troubleshooting

Monitor deployed models for drift, degradation, and infrastructure issues.
Build dashboards and alerts using Grafana or similar tools.
Troubleshoot live production issues involving hardware, networking, data quality, and model behavior with minimal operational impact.
Partner with industrial engineers and operations teams to translate factory requirements into ML solutions.
Participate in quarterly on‑site visits to assess deployment environments and optimize systems in place.

Extreme Ownership

Own ML systems from design through long‑term operation.
Anticipate failure modes and proactively mitigate risk.
Deliver high‑quality outcomes under real‑world constraints and tight timelines.

Principal Level Additional Responsibilities

Define ML architecture and deployment patterns across multiple industrial sites.
Establish best practices for model lifecycle management, deployment, and monitoring.
Lead technical trade‑offs between accuracy, latency, reliability, and cost.
Review designs and implementations across multiple ML initiatives.
Mentor senior engineers and raise overall engineering standards.
Act as technical authority during high‑severity production incidents.
Other duties as assigned.

Qualifications

Strong Python expertise for ML and production systems.
Deep experience with ML frameworks (PyTorch, TensorFlow, ONNX).
Proven experience deploying ML in industrial, edge, or embedded environments.
Experience with Docker, CI/CD pipelines, and GitHub Actions.
Proficiency with Ubuntu/Linux and Bash scripting.
Experience building APIs using AWS services (API Gateway, Lambda, SageMaker).
Familiarity with industrial protocols (Modbus, OPC UA) and factory systems (PLCs, SCADA).
Experience monitoring production systems using Grafana or similar tools.
Strong real‑time debugging and problem‑solving skills.
Willingness to travel quarterly and sustain a demanding workload.

Required Skills

AWS IoT Core / Greengrass experience.
Edge inference optimization (TensorRT, OpenVINO, Jetson).
Prior experience in manufacturing, robotics, or industrial automation.

Preferr…

