Principal Software Engineer, Data Platform Job San Francisco area, California USA, Software Development

Description Principal Software Engineer, Data Platform

The Principal Member of Technical Staff for the Enterprise Data Platform is the primary technical architect responsible for modernizing, integrating, and optimizing Salesforce's foundational data ecosystem. You will serve as the technical "north star" for the engineering teams, bridging the world of modern distributed analytics with the cutting edge of Semantic AI.

In this high-impact individual contributor role, you will architect the backbone of the company using technologies such as Snowflake, dbt, Informatica, and Airflow, while simultaneously designing and scaling our advanced Knowledge Graph Platform (Neo4j & TopQuadrant). Your mission is to design the "paved path" where structured data flows effortlessly into high-value knowledge graphs to power BI, Advanced Analytics, and Generative AI.

You will not just oversee the architecture; you will write the proof-of-concepts, define the code standards, and solve the most complex scalability challenges.

Key Responsibilities

Technical Strategy & Platform Architecture

Architect the Roadmap:
Define the long-term technical architecture for the Enterprise Data Platform. Translate business strategy into technical specifications, ensuring our stack allows for "Data Mesh" scalability and domain-oriented ownership.
Infrastructure as Code (IaC) Evangelism:
Personally architect and review the Terraform/Helm configurations that define our infrastructure. Ensure that from Snowflake RBAC to Neo4j clusters, our platform is immutable, version-controlled, and reproducible.
Performance Engineering:
Deep dive into the hardest performance bottlenecks. Optimize query planners, data serialization formats (Parquet/Iceberg), and distributed compute costs across Snowflake and Spark.
AI Enablement:
Design the integration patterns for AI-assisted tooling (Cursor, MCP, Copilot) within the developer workflow to step-change developer velocity.

Knowledge Graph & Semantic Engineering

Graph RAG Architecture:
Lead the technical design of "Graph RAG" (Retrieval-Augmented Generation), creating the patterns that allow LLM agents to query structured Snowflake data via the Neo4j Knowledge Graph.
Semantic Layer Design:
Design the integration between the physical data layer (Snowflake) and the semantic governance layer (TopQuadrant/TopBraid EDG), ensuring ontologies are mechanically enforced rather than theoretically defined.
Polyglot Persistence:
Define the specific architectural patterns for when data should reside in a Relational Store (Snowflake) versus a Graph Store (Neo4j), and design the high-velocity pipelines (Kafka/Airflow) that keep them in sync.

Engineering Standards & Technical Influence

Code Quality & DevOps:
Set the standard for code quality. You will be expected to code, review Pull Requests, and enforce strict CI/CD pipelines (unit testing data, schema validation).
Resiliency Architecture:
Design self-healing systems. Architect the monitoring and alerting frameworks (SRE) that ensure 99.9% availability for critical pipelines.
Mentorship without Authority:
Act as a technical mentor to Senior and Lead engineers across multiple squads. Elevate the technical bar of the organization through design reviews, RFCs, and pair programming sessions.

What We’re Looking For

10+ years of software engineering experience, with at least 5 years focused on backend distributed systems or data infrastructure at scale.
Deep Engineering Roots:
You are an expert coder (Python, Java, or Go) who grew up building software. You are comfortable debugging a distributed trace, optimizing a JVM heap, or rewriting a slow SQL query.
Architectural Expertise:
Proven track record of designing large-scale data platforms. You understand the CAP theorem, eventual consistency, and the trade-offs between batch and streaming architectures.
Core Stack Mastery:
Hands‑on expert‑level knowledge of Snowflake (internals/clustering), dbt (macro design/Jinja), Airflow (scheduler internals), and Tableau.
Graph Database Expertise:
Deep understanding of Graph theory and implementation. You know how to model data in Neo4j (Cypher) to avoid super‑node problems and optimize traversal performance.
Cloud Native:
Mastery of AWS/GCP services (IAM, VPC, PrivateLink, S3/GCS) and container orchestration (Kubernetes/EKS).
AI/LLM Integration:
Experience implementing RAG architectures, vector databases, or integrating LLMs into data pipelines.
Influence & Communication:
Ability to write clear, persuasive Requests for Comments (RFCs) and architectural decision records (ADRs) that drive consensus among other architects and engineering leadership.

For roles in San Francisco and Los Angeles:
Pursuant to the San Francisco Fair Chance Ordinance and the Los Angeles Fair Chance Initiative for Hiring, Salesforce will consider for employment qualified applicants with arrest and conviction records.
