AIML Architect Databricks AWS
Prosper, Collin County, Texas, 75078, USA
Listed on 2026-05-30
IT/Tech
Data Engineer, AI Engineer, Machine Learning/ ML Engineer, Data Science Manager
Job Title:
AI/ML Architect with Databricks, AWS
Location:
Los Angeles CA (Hybrid)
Hire type: FTE / CTH
Role Overview:
We are seeking an experienced AI/ML Architect with deep hands‑on expertise in Databricks on AWS to lead the design and implementation of scalable, high-performance data and machine learning platforms. The ideal candidate combines architectural thinking with strong engineering execution, demonstrating the ability to build modern lakehouse systems, optimize large-scale pipelines, and drive analytical and ML capabilities across the organization.
This role requires working with large, multi-terabyte datasets, advanced analytics, and end-to-end ML lifecycle management using Databricks, Python, PySpark, and AWS-native services.
Must Demonstrate (Critical Competencies):
- Designing Databricks‑based lakehouse architectures on AWS (Delta Lake + S3 + Unity Catalog).
- Clear separation of compute vs. serving layers in distributed architectures.
- Low‑latency API strategy where Spark is insufficient (e.g., leveraging optimized services or caching).
- Caching strategies to accelerate reads and reduce compute cost.
- Data partitioning, file size tuning, and optimization strategies for large-scale pipelines.
- Experience handling multi‑terabyte structured time‑series workloads.
- Ability to distill architectural significance from ambiguous business requirements.
- Strong curiosity, questioning, and requirement‑probing mindset.
- Player‑coach approach: hands‑on technical depth + ability to guide design.
- Develop, train, and optimize ML models using Python, PySpark, MLflow, and Databricks Machine Learning.
- Conduct exploratory data analysis (EDA) to identify patterns, trends, and insights in large datasets.
- Deploy ML models into production using MLflow, Databricks Workflows, or other MLOps pipelines.
- Build analytics solutions such as forecasting, anomaly detection, segmentation, or recommendation systems.
- Design ML architectures aligned with Databricks Lakehouse on AWS.
- Architect and build scalable ETL/ELT pipelines using PySpark, SQL, and Databricks Workflows.
- Implement Delta Lake best practices, including OPTIMIZE, ZORDER, partitioning, and schema evolution.
- Design lakehouse layers (Bronze/Silver/Gold) with strong separation of compute and serving layers.
- Optimize cluster performance and jobs using Spark tuning, caching, and shuffle minimization.
- Work with multi‑terabyte, time‑series, high‑velocity data in a distributed environment.
- Ensure robust data availability for downstream ML and analytics workloads.
Architect end-to-end data and ML solutions using AWS services, including:
- S3 for storage
- IAM for identity & access
- Glue Catalog for metadata management
- Networking for secure, high‑throughput data movement
- Integrate Databricks with AWS-native compute, API layers, and low‑latency endpoints.
- Translate business problems into scalable analytical or ML architectures.
- Communicate complex statistical and architectural concepts to non‑technical stakeholders.
- Collaborate with product, engineering, and business leaders to drive data‑informed initiatives.
- Provide design leadership while remaining hands‑on in execution.
- Bachelor’s or Master’s in Computer Science, Data Science, Engineering, Statistics, or a related field.
- 10+ years of experience in data engineering, ML engineering, or AI/ML architecture roles.
- Deep expertise in Databricks on AWS, including PySpark / Spark SQL, Databricks Notebooks, Delta Lake, Unity Catalog, MLflow, Databricks Jobs & Workflows.
- Strong programming ability in Python (pandas, numpy, scikit‑learn).
- Demonstrated experience with large‑scale, multi‑terabyte data processing.
- Strong understanding of ML algorithms, distributed systems, and data optimization.
- Experience with MLOps and production deployment pipelines.
- Strong grasp of AWS‑native data and compute services.
- Understanding of CI/CD using GitHub Actions, GitLab CI, or similar.
- Familiarity with deep learning frameworks (TensorFlow, PyTorch).
- Strong analytical and problem‑solving skills.
- Ability to work in fast‑paced, highly collaborative environments.
- Excellent communication and presentation abilities.
- Self‑driven with exceptional attention to architectural detail.
Flexible work-from-home options available.