Cloud Storage Test Engineer
Listed on 2026-05-16
Engineering
Systems Engineer, Data Engineer, Software Engineer
Job Description
The Storage Test Engineer will serve as a hands‑on technical contributor within Western Digital's System Integration & Test (SIT) lab, focused on deploying, operating, and validating clustered storage environments that accurately reflect real‑world cloud and datacenter infrastructure. Working closely with the Clustered Storage Test team and HDD System Test groups, this engineer will configure scale‑out storage clusters, design and execute workloads and SLA validation tests, measure and analyze system and drive performance, and monitor drive health telemetry.
This role is ideal for an engineer with 2–6 years of experience in storage systems or infrastructure who is ready to go deep on distributed storage behavior, performance characterization, and large‑scale fleet operations.
- Deploy and operate scale‑out storage environments — including multi‑node object storage clusters, erasure‑coded pools, and rack‑level storage deployments — configured to emulate real customer datacenter architectures at lab scale.
- Measure and analyze storage system and drive performance: characterize throughput, IOPS, and latency distributions across workload profiles (AI/ML data access patterns, backup and restore traffic, cloud object storage, and web‑serving workloads) and produce clear performance reports with root‑cause observations.
- Identify and investigate performance anomalies by correlating cluster‑level metrics with drive‑level telemetry, distinguishing between platform bottlenecks, configuration issues, and drive‑level contributors to throughput or latency degradation.
- Analyze technical customer workload studies and translate key variables into realistic test scenarios and validation methodologies.
- Instrument and monitor drive health telemetry by collecting and trending SMART attributes, error logs, I/O error rates, and temperature data, in order to identify early failure indicators and correlate drive health signals with observed cluster performance during long‑run experiments.
- Contribute to fleet management processes for the lab's HDD inventory, supporting tracking of drive state, physical location, test assignment, and lifecycle status across large populations of test drives.
- Assist with datacenter infrastructure management tasks including rack‑level storage deployments, drive swap procedures, firmware staging, and hardware validation during active cluster tests.
- Document test configurations, performance results, and failure analyses in detailed technical write‑ups that allow peer engineers to reproduce experiments, compare results across configurations, and build on prior findings.
- Collaborate with senior engineers to identify gaps between existing test methodologies and modern cloud storage deployment patterns, then help design and implement updated test scenarios, workload mixes, or cluster configurations that close those gaps.
- Evaluate storage benchmarking tools and workload generators, assessing their suitability for emulating specific customer traffic profiles and recommending configurations that produce the most representative and actionable results.
REQUIRED
- Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, or a related technical field.
- 2–6 years of hands‑on experience in storage systems, distributed infrastructure, or storage test and validation engineering.
- Familiarity with scale‑out storage architectures and an understanding of how distributed object or block storage clusters are structured, scaled, and managed in production environments.
- Exposure to S3‑compatible object storage (MinIO, AWS S3, or equivalent), including operational concepts such as bucket management, erasure coding, and performance benchmarking.
- Strong Linux command‑line proficiency; comfort working in a physical lab environment with servers, HBAs, JBODs, and storage controllers.
- Solid scripting ability (Python, Bash, or similar) sufficient for automating test execution, parsing results, and building lightweight data‑processing utilities.
PREFERRED
- Familiarity with drive health monitoring and telemetry — including SMART attribute…