Infrastructure & DevOps Engineer Job, Brixen/Bressanone, Italy, IT/Tech

Position: Infrastructure & DevOps Engineer
Location: Brixen/Bressanone, Italy

About ALLSIDES

ALLSIDES is redefining how the world experiences 3D content. We combine physically accurate scanning and generative AI to power content creation workflows for e-commerce, virtual environments, and immersive experiences. Our clients include global brands like adidas, Meta, Amazon, and Zalando.

We operate a rapidly scaling photorealistic 3D scanning operation, capturing tens of thousands of assets annually while training next‑generation AI models. As an NVIDIA Inception member, we collaborate with leading research institutions and actively participate in top‑tier conferences in 3D computer vision and AI.

Position Overview

We're looking for an Infrastructure & DevOps Engineer to build and maintain the foundation of our compute infrastructure. You'll work on hardware provisioning, networking, container orchestration, and deployment pipelines across cloud and on‑premises environments. This role focuses on making our multi‑GPU clusters reliable, our deployments reproducible, and our developers productive.

Main Responsibilities

Provision, configure, and maintain heterogeneous compute clusters (CPU/GPU) across multiple physical locations
Implement dynamic compute and storage provisioning based on workload demands
Design storage solutions at both hardware and software levels (NAS, distributed file systems, storage tiering)
Implement and manage container orchestration systems (Kubernetes, Docker) for development and production workloads
Design and maintain infrastructure as code using tools like Terraform and Ansible
Build and optimize job scheduling and resource allocation systems (Slurm, Kubernetes)
Set up monitoring, alerting, and observability infrastructure (Prometheus, Grafana, IPMI)
Profile and optimize system‑level performance: GPU utilization, memory bandwidth, I/O throughput, network latency
Manage networking, VPNs, and secure access across distributed systems
Handle reliability concerns: hardware failure detection, job checkpointing, disaster recovery

Qualifications

Strong Linux system administration knowledge
Experience with containerization (Docker) and orchestration (Kubernetes)
Knowledge of infrastructure as code (Terraform, Ansible)
Experience with HPC clusters and job scheduling (Slurm)
Familiarity with monitoring solutions (Prometheus, Grafana)
Understanding of networking principles and implementation
Experience with hardware infrastructure management (IPMI, BMC, server maintenance)
Knowledge of storage systems design (NFS, Ceph, distributed file systems)

Nice to Have

Experience with cloud services (AWS or others)
Familiarity with bare‑metal provisioning (MAAS)

What we offer

Compensation that reflects your experience, including stock options
Lunch vouchers for working days
We assist with relocation
Flexible working hours and work‑from‑home policy
Family‑friendly environment
Amazing office space in South Tyrol, located at the Durst Group
Personal and professional growth opportunities

You don't have to tick every box to apply; your drive and passion matter most!

This role is located on‑site in Brixen/Bressanone, Italy. If you are interested, please apply with your CV attached to careersh
