IT Operations Technical Lead
Listed on 2026-04-17
IT/Tech
Cloud Computing, Systems Administrator
Axle is a bioscience and information technology company that offers advancements in translational research, biomedical informatics, and data science applications to research centers and healthcare organizations nationally and abroad. With experts in biomedical science, software engineering, and program management, we focus on developing and applying research tools and techniques to empower decision‑making and accelerate research discoveries. We work with some of the top research organizations and facilities in the country including multiple institutes at the National Institutes of Health (NIH).
Benefits We Offer
- Paid Time Off and Paid Holidays
- 401K match up to 5%
- Educational Benefits for Career Growth
- Employee Referral Bonus
- Flexible Spending Accounts:
  - Healthcare (FSA)
  - Parking Reimbursement Account (PRK)
  - Dependent Care Assistance Program (DCAP)
  - Transportation Reimbursement Account (TRN)
Axle is seeking an IT Operations Technical Lead to oversee and optimize hybrid cloud and on‑premises infrastructure. This role combines hands‑on technical leadership with ITIL‑based operations, automation, and incident management. The ideal candidate brings deep Linux expertise and a focus on reliability, scalability, and modern workloads such as AI/ML.
Responsibilities
- Lead and manage IT operations aligned with ITIL processes, including Incident, Problem, Change, and Release Management
- Provide hands‑on leadership in managing Linux and Windows environments across cloud and on‑premises infrastructure
- Own and drive incident response, root cause analysis, and service restoration for mission‑critical systems
- Design, build, and maintain golden images, patching strategies, and system hardening standards
- Lead patch management and vulnerability remediation programs, ensuring compliance and system integrity
- Develop and implement automation solutions using modern approaches, including Vibe Coding (AI‑assisted development), to accelerate operational efficiency and reduce toil
- Support and optimize infrastructure for AI/ML workloads, including provisioning, scaling, and performance tuning
- Manage and maintain GPU‑enabled environments and instances for high‑performance computing and machine learning use cases
- Oversee and optimize infrastructure monitoring, logging, alerting, and observability frameworks
- Manage and mentor a team of systems engineers; provide technical guidance and performance oversight
- Collaborate with architecture, security, and development teams to improve reliability, scalability, and operational efficiency
- Support hybrid environments, including cloud platforms and on‑premises data centers
- Ensure proper documentation, runbooks, SOPs, and operational readiness
- Stay abreast of new technologies in your areas, including but not limited to US federal standards, NIST publications, cloud computing and deployment, site reliability engineering, security standards, and compliance best practices
Requirements
- Must have 5+ years of experience leading an operations team, with hands‑on experience driving operational process improvements and technological advancements
- Proven experience implementing and operating within ITIL frameworks
- Must have 10+ years of hands‑on Unix/Linux experience, including specific technical experience with CentOS / Red Hat systems administration support for large‑scale distributed environments
- Hands‑on experience with incident management, patching, system hardening, and production support
- Experience building and maintaining golden images and standardized environments
- Strong scripting/automation skills (e.g., Python, Bash, PowerShell, or similar)
- Experience with configuration management and automation tools (Ansible, Terraform, Puppet, Chef, or similar)
- Strong understanding of networking fundamentals (DNS, TCP/IP, firewalls, load balancing)
- Experience with monitoring and logging tools (e.g., Nagios, Splunk, ELK, Prometheus, Grafana)
- Must have cloud build‑out or migration experience with at least one of the following providers: Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure
- Must have 2+ years of experience with CI/CD and automation tools such as Terraform, Ansible, Chef, Puppet, Jenkins, and GitHub
- Experience supporting AI/ML workloads or data‑intensive platforms
- Famili…