Senior Cloud Infrastructure Engineer
Listed on 2026-01-02
IT/Tech
Cloud Computing, Systems Engineer
About LanceDB
LanceDB is a developer-friendly, open-source data lake for multimodal AI. From hyper-scalable vector search to advanced retrieval for RAG, from streaming training data to interactive exploration of large-scale AI datasets, LanceDB is the best foundation for your AI application, and powers some of the most groundbreaking applications and challenging requirements today.
We’re seeking a seasoned Cloud Infrastructure Engineer with deep expertise in automation, infrastructure-as-code (IaC), and cloud platform management. You’ll design, deploy, and maintain robust cloud environments while collaborating with cross-functional teams to streamline CI/CD pipelines, enhance system reliability, and drive operational excellence.
As a Cloud Infrastructure Engineer at LanceDB, your responsibilities will include:
Design & Build Cloud Infrastructure:
Architect and manage secure, scalable cloud environments (AWS, Azure, GCP) using IaC tools like Terraform and CloudFormation.
Automate Everything:
Develop and maintain automation scripts to streamline deployments, monitoring, and system operations.
Systems Reliability:
Implement monitoring/alerting solutions (Prometheus, Grafana, Datadog) to proactively address performance bottlenecks and ensure 99.9% uptime.
Security & Compliance:
Enforce security policies, manage secrets (Vault, AWS KMS), and ensure compliance with industry standards (GDPR, SOC2).
Troubleshoot & Optimize:
Resolve complex infrastructure issues and lead cost-optimization initiatives for cloud resources.
Collaborate & Mentor:
Partner with software engineering teams to integrate DevOps practices into the SDLC and mentor junior engineers on IaC and cloud best practices.
10+ years in DevOps, Cloud Infrastructure, or SRE roles, with hands-on experience in public cloud platforms (AWS, Azure, GCP, Heroku).
Expertise in IaC tools (Puppet, Terraform, Ansible, CloudFormation) and configuration management.
Experience designing and managing complex production environments using Kubernetes and Helm.
Deep understanding of networking, security, and cloud architecture best practices.
Experience with monitoring tools (Prometheus, Grafana) and logging systems (ELK, Splunk).
Strong knowledge of CI/CD tools (GitHub Actions) and containerization (Docker, Kubernetes).
You like working with a small, high-caliber team with a lot of autonomy and drive, and you can iterate fast.
You’ve made substantial contributions to open-source projects (e.g., Puppet modules, Terraform providers).
You design and automate single-command deployments for complex, globally distributed systems to ensure consistency, reliability, and scalability across multi-cloud or hybrid environments.
You fearlessly challenge the status quo and dismiss mediocre engineering as unacceptable.
You have worked on large-scale distributed systems and have a good understanding of how to use tracing tools to identify bottlenecks.
LanceDB team:
LanceDB was created by experts with decades of experience building tools for data science and machine learning. From co-authors of pandas to Apache PMC members of HDFS, Arrow, Iceberg and HBase, the LanceDB team has created open source tools used by millions worldwide.