Sr. Data Architect
Listed on 2025-12-10
IT/Tech
Data Engineer, Cloud Computing, Data Science Manager
Job Description
Hybrid role: in the office three days a week in Elk Grove, CA; no fully remote option.
Position Summary
The Senior Data Architect is a senior technical leader responsible for building and optimizing a robust data platform in the automotive industry. In this full‑time role, you will lead a team of data engineers and own the end‑to‑end architecture and implementation of the Databricks Lakehouse platform. You will collaborate closely with function leaders, domain analysts and other stakeholders to design scalable data solutions that drive business insights.
This position demands deep expertise in Databricks (on GCP) and the ability to build end‑to‑end data pipelines that handle large volumes of structured, semi‑structured and unstructured data. You will demonstrate strong leadership to ensure best practices in data engineering, performance tuning, and governance. You will be expected to communicate complex technical concepts and data strategies to technical and non‑technical audiences, including executive leadership.
Responsibilities – Other Duties May Be Assigned
- Lead, mentor, and manage a team of data engineers, providing technical guidance and code reviews and fostering a high‑performing team.
- Own the Databricks platform architecture and implementation, ensuring the environment is secure, scalable, and optimized for the organization’s data processing needs. Design and oversee the Lakehouse architecture leveraging Delta Lake and Apache Spark.
- Implement and manage Databricks Unity Catalog for unified data governance. Ensure fine‑grained access controls and data lineage tracking are in place to secure sensitive data.
- Collaborate with analytics teams to develop and optimize Databricks SQL queries and dashboards. Tune SQL workloads and caching strategies for faster performance and ensure efficient use of the query engine.
- Lead performance tuning initiatives. Profile data processing code to identify bottlenecks and refactor for improved throughput and lower latency. Implement best practices for incremental data processing with Delta Lake, and ensure compute cost efficiency.
- Work closely with domain analysts, data scientists and product owners to understand requirements and translate them into robust data pipelines and solutions. Ensure that data architectures support analytics, reporting, and machine learning use cases effectively.
- Integrate Databricks workflows into the CI/CD pipeline using DevOps principles and Git. Develop automated deployment processes for notebooks and jobs to promote consistent releases. Manage source control for Databricks code (using GitLab) and collaborate with DevOps engineers to implement continuous integration and delivery for data projects.
- Collaborate with security and compliance teams to uphold data governance standards. Implement data masking, encryption, and audit logging as needed, leveraging Unity Catalog and GCP security features to protect sensitive data.
- Stay up to date with the latest Databricks features and industry best practices. Proactively recommend and implement improvements (such as new performance optimization techniques or cost‑saving configurations) to continuously enhance the platform’s reliability and efficiency.
Qualifications
- 10+ years of experience in data engineering, data architecture, or related roles, with a track record of designing and deploying data pipelines and platforms at scale.
- Significant hands‑on experience with Databricks (preferably GCP) and the Apache Spark ecosystem. Proficient in building data pipelines using PySpark/Scala and managing data in Delta Lake format.
- Strong experience working with cloud data platforms (GCP preferred, or AWS/Azure). Familiarity with GCP Storage principles.
- Strong skills in vector databases and embedding models to support scalable RAG systems. Proficient in optimizing retrieval and indexing for LLM integration.
- Strong experience in managing structured, semi‑structured and unstructured data in Databricks.
- Ability to inspect existing data pipelines, discern their purpose and functionality, and re‑implement them efficiently in Databricks.
- Advanced SQL skills with the ability to write and optimize complex…