Infrastructure Engineer L4
Listed on 2026-02-20
IT/Tech
Cloud Computing, Systems Engineer
Overview
We are seeking a highly skilled CPI Hands-On Networking Engineer with deep expertise in PromQL, cloud monitoring, and infrastructure observability. This role requires someone who can design, build, and optimize custom monitoring solutions, create advanced PromQL queries (including complex and unconventional joins), and support large-scale cloud and networking environments with precision.
This is a hands‑on engineering role ideal for someone who loves solving difficult observability problems, working across infrastructure layers, and enabling high reliability in distributed systems.
Key Responsibilities
- Design and develop custom PromQL queries to support detailed performance insights, anomaly detection, and real‑time monitoring of infrastructure and cloud services.
- Build creative, non‑standard PromQL joins across distinct metric sets to power advanced dashboards, SLO reporting, and incident analysis.
- Architect, tune, and maintain monitoring solutions using Prometheus, Grafana, and related observability tools.
- Partner with networking, systems, and platform teams to ensure complete coverage of infrastructure‑level metrics across compute, storage, and networking.
- Develop custom metrics and instrumentation to enrich visibility into distributed systems and microservices.
- Perform in‑depth root cause analysis leveraging metrics, logs, traces, and topology insights.
- Support cloud‑based, containerized, and hybrid environments, ensuring reliable data pipelines and metric ingestion paths.
- Drive monitoring standards, dashboards, alerting strategies, and best practices across engineering teams.
- Contribute to performance tuning, capacity planning, and incident response processes.
Qualifications
- Expert‑level ability to write custom, complex, and highly optimized PromQL queries.
- Proven experience executing creative joins across disparate Prometheus metrics.
- Strong hands‑on background with Prometheus, Grafana, Alertmanager, and related cloud monitoring ecosystems.
- Ability to perform metric design, normalization, and effective labeling strategies.
- Strong understanding of infrastructure telemetry (CPU, memory, storage I/O, network traffic, container metrics, node/pod health, etc.).
- Solid foundation in cloud networking technologies: routing, subnets, firewalls, load balancers, DNS, service discovery.
- Hands‑on knowledge of Linux systems, performance troubleshooting, and networking diagnostics.
- Experience with incident response, SLO/SLA methodologies, and reliability‑focused engineering.
- Familiarity with CI/CD workflows, infrastructure‑as‑code, and cloud automation.
- Ability to collaborate with cross‑functional engineering teams and communicate complex monitoring insights clearly.
The pay range that the employer in good faith reasonably expects to pay for this position is $49.83/hour - $77.86/hour. Our benefits include medical, dental, vision, and retirement benefits. Applications will be accepted on an ongoing basis.
Tundra Technical Solutions is among North America’s leading providers of Staffing and Consulting Services. Our success and our clients’ success are built on a foundation of service excellence. We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.
Qualified applicants with arrest or conviction records will be considered for employment in accordance with applicable law, including the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act. Unincorporated LA County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect client provided property, including hardware (both of which may include data) entrusted to you from theft, loss or damage; return all portable client computer hardware in your possession (including the data contained therein) upon completion of the assignment; and maintain the confidentiality of client proprietary, confidential, or non-public information. In addition, job duties require access to secure and protected client information technology systems and related data security obligations.