Senior Backend Engineer
Listed on 2026-06-04
Software Development
Cloud Engineer - Software, DevOps
Build and Deploy AI the right way, anywhere.
The FlexAI Compute Infrastructure Platform provides an "end-to-end AI compute layer" for running and managing workloads across any cloud, any GPU, and any deployment model (public, hybrid, or on-prem). It brings together "1-click simplicity" for users with "enterprise-grade orchestration, security, and automation" under the hood.
Founded by Brijesh Tripathi, who brings experience from Nvidia, Apple, Tesla, Intel and Zoox, FlexAI is not just building a product – we’re shaping the future of AI. Our teams are strategically distributed across Silicon Valley and Bengaluru, united by a shared mission: to deliver more compute with less complexity.
If you're passionate about shaping the future of artificial intelligence, driving innovation, and contributing to a sustainable and inclusive AI ecosystem, FlexAI is the place for you!
FlexAI is looking for a Senior Backend Engineer (Infrastructure & AI Platform) with deep Golang expertise to architect and build the core backend systems powering our next-generation AI compute and PaaS platform. This role sits at the intersection of distributed systems, cloud infrastructure, and AI platform engineering, enabling large-scale model training, inference, and orchestration across heterogeneous compute. This is not a traditional backend role; you will be building platform-grade systems that support AI runtimes, scheduling, resource orchestration, and multi-tenant cloud infrastructure.
As a Senior Backend Engineer, you'll drive backend architecture, scale platform services, and build high-performance infrastructure components that power AI workloads in production environments, influencing how the platform evolves from Beta to enterprise-grade deployment. Expect high ownership and technical autonomy in a research-driven, deep-tech environment, not SaaS CRUD apps.
This position is in-person at our San Jose, CA office.
What You'll Do
Core Platform & Infrastructure Backend:
- Architect and develop high-performance Golang services for FlexAI's AI PaaS and infrastructure platform
- Build internal APIs powering model deployment, job scheduling, and compute lifecycle management
- Develop components interfacing with GPU/compute infrastructure and AI runtimes
- Design and scale microservices and event-driven systems for high-throughput AI workloads
- Optimize for low latency, high concurrency, and fault tolerance
- Drive reliability, observability, and resilience across services
- Collaborate with AI/ML and Runtime teams to integrate systems with training pipelines, inference infrastructure, experimentation workflows, and dataset/artifact management
- Enable orchestration across cloud and on-prem environments
- Build abstractions that simplify AI infrastructure consumption
- Work with DevOps/SRE on CI/CD, deployment automation, and scalability
- Contribute to architecture decisions for multi-region, multi-cloud infrastructure
- Improve monitoring, logging, and diagnostics
- Lead architecture reviews and set engineering standards
- Mentor engineers and guide complex problem-solving
- Drive long-term roadmap for backend infrastructure and AI platform capabilities
- Partner with Product, Runtime, and Infra leadership to translate requirements into scalable systems
- Data: SQL + NoSQL databases, caching, streaming systems
- Observability: Prometheus, Grafana, OpenTelemetry (or similar)
Core Engineering:
- 5+ years of Backend or Infrastructure Engineering experience
- Expert-level proficiency in Golang (must-have, heavy hands-on)
- Strong experience building production-grade distributed systems
- Proven track record on infrastructure platforms, PaaS, or deep-tech systems
- Deep understanding of cloud-native architectures and containerized environments
- Strong experience with Kubernetes, Docker, and cluster orchestration
- Familiarity with compute scheduling, resource management, or platform runtimes is a strong plus
- Experience with distributed databases (PostgreSQL, Cassandra, DynamoDB, etc.)
- Strong…