Data & AI Engineer Job, Washington area, District of Columbia, USA, Software Development

The Data & AI Engineer sits within Carlyle's Enterprise Technology & Data organization and supports firm-wide data and AI initiatives spanning investment platforms, portfolio operations, investor relations, and corporate functions. The role operates within a federated data operating model, partnering with domain engineering teams to implement shared platforms and reusable patterns for data and AI under the technical direction of the Senior AI & Data Architect.

Position Summary

The Data & AI Engineer is an experienced, hands‑on engineer who turns Carlyle's data and AI architecture into working production systems. Reporting to the Senior AI & Data Architect, this role is responsible for building and operating the pipelines, semantic layers, retrieval systems, and AI‑ready data products that power analytics, automation, LLMs, agents, and generative AI applications across the firm.

The role requires deep, hands‑on expertise across modern data engineering and applied AI engineering. The Data & AI Engineer will implement retrieval‑augmented generation (RAG) patterns, embedding and indexing pipelines, vector stores, and semantic models alongside core ELT, streaming, and analytical pipelines – treating LLMs, agents, and copilots as first‑class consumers of the data platform.

This is a senior individual‑contributor engineering role that executes against architectural standards, contributes to their evolution through hands‑on learning, and partners closely with data science, AI engineering, governance, and domain teams to deliver trusted, AI‑consumable data at enterprise scale.

What Success Looks Like

In the first 12 months, this role will deliver foundational AI‑ready data pipelines and retrieval components defined in the target‑state architecture, productionize one or more priority RAG or agent‑grounding use cases, and establish reusable engineering patterns that other domain teams can adopt across the federated data platform.

In‑office Requirement

4 days per week.

Location

Washington, D.C. or New York, NY.

Education & Certifications

Bachelor's degree, required
Concentration in computer science, data engineering, information systems, or a related field, preferred
Master's degree, preferred
Relevant certifications in cloud, data engineering, analytics, or AI/ML are preferred

Professional Experience

6+ years of overall relevant technical experience, required
Experience in data engineering, analytics engineering, or platform engineering, with at least 1–2 years of direct, hands‑on experience building generative AI or AI/ML systems in production.
Proven experience implementing retrieval, grounding, and semantic components for LLM‑ or agent‑based applications, including RAG pipelines, vector stores, embedding workflows, and structured tool use.
Hands‑on experience with one or more modern AI platforms and tooling categories (e.g., AWS Bedrock, Databricks ML, Snowflake Cortex, OpenAI/Anthropic APIs, LangChain/LlamaIndex or equivalents, MLflow, and vector databases such as Databricks Vector Search, pgvector, or Pinecone).
Strong, demonstrable expertise in Python and SQL, with working knowledge of distributed processing frameworks (e.g., Spark).
Deep, hands‑on experience with modern data stacks – dbt, Fivetran, Snowflake – in AWS‑based environments.
Track record of building data pipelines and products whose consumers include AI systems, not only BI tools and human analysts.
Palantir experience a plus.
Experience operating within federated data operating models and complex, regulated enterprise environments; financial services experience preferred.

Competencies & Attributes

Demonstrated AI‑forward instinct: defaults to asking how AI changes what gets built, rather than whether AI can be added later.
Fluency in current AI engineering patterns (RAG, agents, tool use, evaluations, guardrails, observability) and the practical trade‑offs involved in shipping them.
Strong engineering craft: clean code, automated testing, thoughtful design, and a bias toward production‑quality systems over prototypes.
Pragmatic, delivery‑oriented mindset with strong attention to data quality, AI trust, and long‑term maintainability; able to distinguish…