Machine Learning Engineer
Remote / Online - Candidates ideally in Orlando, Orange County, Florida, 32885, USA
Listed on 2026-02-15
Listing for: Integrated Resources, Inc (IRI)
Remote/Work from Home position
Job specializations:
- IT/Tech: Data Analyst, AI Engineer
Job Description
Job Title: Artificial Intelligence/Machine Learning Engineer
Location: Partially remote; local candidates required near Orlando, FL 32819 / Glendale, CA 91201 / Anaheim, CA 92802 / Seattle, WA 98104
Duration: 22 months (possible extension)
AI/ML Operations:
- Manage operational workflows for model deployments, updates, and versioning across GCP, Azure, and AWS
- Monitor model performance metrics: latency, throughput, error rates, token usage, and inference quality (a minimal rollup of these metrics is sketched after this list)
- Track model drift, accuracy degradation, and performance anomalies—escalating to engineering as needed
- Support knowledge base operations including vector embedding pipeline health, chunk quality, and refresh cycles in Vertex AI
- Maintain model inventory and documentation across multi-cloud environments
- Coordinate model evaluation cycles with Responsible AI and Core Engineering teams
- Monitor AI agent health, performance, and reliability (AutoGen-based agents, MCP servers)
- Track agent execution metrics: task completion rates, tool call success/failure, latency, and error patterns
- Support agent deployment and configuration management workflows
- Document agent behaviors, known issues, and operational runbooks
- Coordinate with Core Engineering on agent updates, testing, and rollouts
- Monitor MCP server availability, connection health, and integration status
- Track and analyze AI/ML cloud spend across GCP (Vertex AI), Azure (OpenAI), and AWS (Bedrock)
- Build cost dashboards with breakdowns by model, application team, use case, and environment
- Monitor token consumption, inference costs, and embedding/storage costs
- Identify cost optimization opportunities—model selection, caching, batching, rightsizing
- Provide cost allocation reporting for chargeback/showback to consuming application teams
- Forecast spend trends and flag budget anomalies
- Partner with Infrastructure and Finance teams on AI cost governance
- Build and maintain dashboards for platform performance, model health, agent metrics, and operational KPIs
- Create executive and stakeholder reports on platform adoption, usage trends, and cost allocation
- Develop Responsible AI dashboards tracking hallucination rates, accuracy metrics, guardrail triggers, and safety incidents
- Monitor Apigee gateway traffic patterns and API consumption trends
- Provide regular reporting to product management on use case performance
- Support release management processes with pre/post-deployment validation checks
- Track release health metrics for models, agents, and platform components
- Maintain release documentation, runbooks, and operational playbooks
- Coordinate with QA, Performance Engineering, and Infrastructure teams during releases
- Monitor guardrail effectiveness and flag anomalies to the Responsible AI team
- Track and report on hallucination detection, content safety triggers, and accuracy trends
- Support LLM Red Teaming efforts by collecting and organizing evaluation data
- Maintain audit logs and compliance documentation for AI governance
- Serve as operational point of contact for application teams consuming DxT AI APIs
- Coordinate with Corporate Security on audit requests and compliance reporting
- Partner with Infrastructure team on capacity tracking and resource utilization
- Support Performance Engineering with load test analysis and results documentation
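For a sense of what the metrics rollup described above might look like in practice, here is a minimal Python sketch. It is not from the posting: the record fields, model names, and escalation thresholds are illustrative assumptions, not any platform's real schema.

from dataclasses import dataclass
from statistics import mean

@dataclass
class InferenceRecord:
    # One hypothetical inference log entry (fields are assumptions).
    model: str
    latency_ms: float
    tokens: int
    error: bool

def summarize(records):
    # Roll logs up into per-model latency, error-rate, and token figures.
    by_model = {}
    for r in records:
        by_model.setdefault(r.model, []).append(r)
    summary = {}
    for model, recs in by_model.items():
        latencies = sorted(r.latency_ms for r in recs)
        summary[model] = {
            "p50_latency_ms": latencies[len(latencies) // 2],
            "error_rate": sum(r.error for r in recs) / len(recs),
            "avg_tokens": mean(r.tokens for r in recs),
        }
    return summary

def flag_anomalies(summary, max_error_rate=0.02, max_latency_ms=2000.0):
    # Models breaching the (illustrative) thresholds get escalated.
    return [
        model for model, stats in summary.items()
        if stats["error_rate"] > max_error_rate
        or stats["p50_latency_ms"] > max_latency_ms
    ]

records = [
    InferenceRecord("model-a", 420.0, 512, False),
    InferenceRecord("model-a", 2600.0, 498, True),
    InferenceRecord("model-b", 380.0, 301, False),
]
health = summarize(records)
print(health)
print("escalate:", flag_anomalies(health))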
Qualifications:
- 2-4 years in an Ops, Analytics, or Technical Operations role (MLOps, AIOps, DataOps, Platform Ops, or similar)
- Understanding of AI/ML concepts: models, inference, embeddings, vector databases, LLMs, tokens, prompts
- Experience with cloud cost management and FinOps: tracking, analyzing, and optimizing cloud spend
- Strong proficiency with dashboarding and visualization tools (Looker, Tableau, Grafana, or similar)
- Working knowledge of GCP (required); familiarity with Azure and AWS a plus
- Comfortable with SQL and basic Python for data analysis and scripting (see the sketch after this list)
- Experience with monitoring and observability platforms (Datadog, Prometheus/Grafana, Cloud Monitoring, or similar)
- Understanding of APIs and API gateways; ability to read logs, trace…
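To gauge the "SQL and basic Python" level this listing implies, here is a small sketch of the kind of cost-breakdown query the cost-operations bullets above describe. The table name, columns, and figures are invented for illustration; real chargeback reporting would read from a cloud billing export instead.

import sqlite3

# In-memory stand-in for a billing export (schema is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ai_spend (model TEXT, team TEXT, usd REAL)")
conn.executemany(
    "INSERT INTO ai_spend VALUES (?, ?, ?)",
    [
        ("model-a", "search", 1200.0),
        ("model-b", "support", 800.0),
        ("model-a", "support", 300.0),
    ],
)

# Spend broken down by model and consuming team, the shape a
# chargeback/showback report might take.
rows = conn.execute(
    """
    SELECT model, team, SUM(usd) AS total_usd
    FROM ai_spend
    GROUP BY model, team
    ORDER BY total_usd DESC
    """
).fetchall()

for model, team, total in rows:
    print(f"{model:10s} {team:10s} ${total:,.2f}")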