
Position: Senior Software Engineer - Systems (CLI)
Location: Bengaluru

Harness is the AI Software Delivery Platform company, led by technologist and entrepreneur Jyoti Bansal (founder of AppDynamics, acquired by Cisco for $3.7B). Harness has raised approximately $570M in funding and is valued at $5.5B, backed by leading investors including Goldman Sachs, Menlo Ventures, IVP, Unusual Ventures, Citi Ventures, and more. As AI accelerates code creation, the real bottleneck has shifted to everything after the code: testing, deployments, application security, reliability, compliance, and cost optimization.

Harness brings AI and automation to this 'outer loop,' helping teams ship software faster while maintaining security and governance throughout the entire software delivery lifecycle.

Powered by Harness AI and the Software Delivery Knowledge Graph, the Harness Platform applies deep context and intelligent automation across the software delivery lifecycle with governance and policy-driven controls embedded throughout the platform.

Over the past year, Harness powered over 185M deployments, 82M builds, 18T flag evaluations, 8M security scans, 9.1B optimized tests, 3T protected API calls, and helped manage $2.8B in cloud spend, enabling customers like United Airlines, Morningstar, and Choice Hotels to accelerate releases by up to 75%, reduce cloud costs by up to 60%, and achieve 10x DevOps efficiency.

With a global team across 26 offices and 27 countries, Harness is shaping the future of AI software delivery — and we're looking for exceptional talent to help us move even faster.

Position Summary

As a Senior Cloud Engineer at Harness, you will play a pivotal role in designing, building, and maintaining our cloud infrastructure. You will be responsible for ensuring the reliability, scalability, and performance of our systems, incorporating a blend of Cloud Engineering and Site Reliability Engineering (SRE) practices. This role requires a strong technical background, a passion for innovation, and the ability to work collaboratively in a fast-paced environment.

Key Responsibilities

Cloud Infrastructure Design & Implementation:

Design, build, and manage scalable, secure, and reliable cloud infrastructure using GCP, AWS or Azure.
Develop infrastructure-as-code using tools such as Terraform, CloudFormation, or similar.

Site Reliability Engineering (SRE):

Implement SRE practices to ensure the reliability, availability, and performance of cloud services.
Develop and maintain monitoring, logging, and alerting systems to detect and address issues proactively.
Perform capacity planning and demand forecasting to ensure system scalability and performance.

Automation & CI/CD:

Deploy, manage, and scale applications using Kubernetes (K8s).
Utilize Helm for packaging, deploying, and managing applications on Kubernetes.
Design and implement continuous integration and continuous deployment (CI/CD) pipelines to automate the delivery of applications and infrastructure.
Develop automation scripts and tools to streamline operations and improve efficiency.

Security & Compliance:

Ensure cloud infrastructure and applications meet security and compliance standards.
Implement security best practices and perform regular security audits and assessments.

Collaboration & Mentorship:

Collaborate with cross-functional teams including developers, product managers, and operations to deliver high-quality solutions.
Mentor and guide junior engineers, sharing best practices and fostering a culture of continuous improvement.

About You

Technical Expertise:

Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience.
4+ years of experience in cloud engineering, site reliability engineering, or related roles.
Strong experience with cloud platforms (AWS, GCP, Azure) and cloud-native services.
Proficiency in infrastructure-as-code tools (Terraform), the Helm package manager, and configuration management tools (Ansible, Chef, Puppet).
Experience with AIOps.

SRE Practices:

Experience with SRE principles, including error budgets, SLIs, SLOs, and incident management.
Strong knowledge of monitoring and observability tools (Prometheus, Grafana, GCM).

Automation & DevOps:

Expertise in building and managing CI/CD…