Lead Infrastructure Engineer

Job in Redwood City, San Mateo County, California, 94061, USA
Listing for: HOAi
Full Time position
Listed on 2026-01-12
Job specializations:
  • IT/Tech
    AI Engineer, Systems Engineer, Cloud Computing, Machine Learning/ ML Engineer
Salary/Wage Range or Industry Benchmark: 200,000 - 250,000 USD yearly
Job Description & How to Apply Below
Position: Lead Infrastructure Engineer (HOAi)

Location: Wilmington, NC • Redwood City, CA (Remote) • Remote. Full‑time.

Description

HOAi is a fast‑growing startup revolutionizing the community association management industry. Our AI workforce platform integrates machine learning technology to streamline labor‑heavy processes, eliminating inefficiencies and driving scalability. We are pushing boundaries to redefine industry standards.

HOAi is the leading AI solution for the community association management industry, enabling organizations to deploy AI Agents that function like experienced managers. These AI Agents go beyond traditional AI by proactively executing complex, multi‑step processes with human‑like reasoning—working autonomously, 24/7, across your entire operation. This transformation optimizes labor costs, enables growth without additional hires, and ensures faster, higher‑quality service for residents and board members.

HOAi was acquired by Vantaca in the fall of 2024. Vantaca just achieved unicorn status with a $1.25B valuation, so we are no longer in the scrappy startup phase. We are building a category‑defining platform that will transform how an entire industry operates.

Here’s the reality of our trajectory:

  • Growing 100% year‑over‑year
  • Our AI product (HOAi) went from $0 to millions in months
  • Backed by Cove Hill Partners and JMI Private Equity
  • 6M+ doors on our platform, displacing legacy systems

Overview

The Lead AI Infrastructure Engineer at HOAi is responsible for scaling and maintaining the infrastructure that powers our AI‑driven products and services. This role sits at the intersection of infrastructure engineering, machine learning operations, and product development, ensuring our AI systems operate with exceptional reliability, performance, and efficiency. Your work will directly enable HOAi to deliver the most advanced AI product in the community association management industry.

Accountability

Key Initiatives
  • Infrastructure Ownership: Design, build, and maintain the cloud architecture, model serving infrastructure, and ML pipelines that power HOAi’s products
  • Performance Optimization: Profile and optimize AI workloads to achieve sub‑second inference latency while managing costs effectively
  • Scalability & Reliability: Build auto‑scaling systems, implement robust failover mechanisms, and ensure 99.99% uptime for mission‑critical AI services
  • MLOps Excellence: Develop and maintain CI/CD pipelines for model deployment, monitoring, and versioning across development and production environments
  • Developer Enablement: Create tooling and infrastructure that allows product engineers to deploy AI features quickly and safely
  • Security & Compliance: Implement security best practices and ensure compliance requirements are met across all AI infrastructure

Expectations for Success
  • Infrastructure uptime and reliability
  • AI inference latency (p95, p99) and throughput metrics
  • Infrastructure cost efficiency and optimization (cost per inference, GPU utilization)
  • Time to deploy new models and workflows (deployment velocity)
  • Developer satisfaction and productivity using AI infrastructure tools
  • System observability and incident response time

Responsibilities

Performance & Scalability

  • Profile and optimize database queries, API endpoints, and ML inference pipelines
  • Implement caching strategies, connection pooling, and distributed systems for scale
  • Monitor and optimize GPU utilization, memory usage, and compute costs
  • Design load balancing and auto‑scaling policies for variable AI workloads
  • Build disaster recovery systems with redundancy

MLOps & Deployment

  • Build and maintain CI/CD pipelines specifically for model deployment
  • Implement model versioning, A/B testing infrastructure, and rollout mechanisms
  • Create automated testing frameworks for model quality and performance regression
  • Develop infrastructure for model monitoring, drift detection, and retraining workflows
  • Manage experiment tracking and model registry systems

Observability & Reliability

  • Implement comprehensive monitoring, logging, and alerting across the AI stack
  • Refine dashboards for real‑time visibility into system health and performance
  • Conduct post‑mortems and implement reliability…