Applied Researcher II; AI Foundations
Listed on 2025-12-03
IT/Tech
Data Scientist, AI Engineer, Machine Learning/ML Engineer, Artificial Intelligence
Overview
At Capital One, we are creating trustworthy and reliable AI systems, changing banking for good. We are leading the industry in using machine learning to create real-time, intelligent, automated customer experiences. Our applications of AI and ML bring humanity and simplicity to banking. We are building world-class applied science and engineering teams and scalable, high-performance AI infrastructure. You will help bring the transformative power of emerging AI capabilities to reimagine how we serve our customers and businesses.
Team Description
The AI Foundations team brings our AI vision to life, touching every aspect of the research lifecycle, from partnering with academia to building production systems. We work with product, technology, and business leaders to apply state-of-the-art AI to our business.
In this role, you will:
- Partner with a cross-functional team of data scientists, software engineers, machine learning engineers and product managers to deliver AI-powered products that change how customers interact with their money.
- Leverage a broad stack of technologies - PyTorch, AWS Ultra clusters, Hugging Face, Lightning, Vector DBs, and more - to reveal insights hidden within large volumes of numeric and textual data.
- Build AI foundation models through all phases of development, from design through training, evaluation, validation, and implementation.
- Engage in high-impact applied research to push the latest AI developments into the next generation of customer experiences.
- Translate the complexity of your work into tangible business goals.
The ideal candidate:
- You love the process of analyzing and creating, and strive to make decisions that are right for customers.
- Innovative. You continually research and evaluate emerging technologies and stay current on state-of-the-art methods, technologies, and applications.
- Creative. You thrive on defining big problems, asking questions, and sharing new ideas.
- A leader. You challenge conventional thinking and work with stakeholders to improve the status quo and develop talent.
- Technical. You are comfortable with open-source languages and have hands-on experience developing AI foundation models and solutions using open-source tools and cloud platforms.
- Has a deep understanding of the foundations of AI methodologies.
- Experience building large deep learning models, including language, images, events, or graphs, and expertise in training optimization, self-supervised learning, robustness, explainability, or RLHF.
- An engineering mindset with a track record of delivering models at scale in terms of training data and inference volumes.
- Experience delivering libraries, platform-level, or solution-level code to existing products.
- A track record of new ideas or improvements in machine learning, evidenced by publications or notable projects.
- Ability to own and pursue a research agenda, including choosing impactful problems and carrying out long-running projects.
Basic Qualifications:
- Currently has, or is in the process of obtaining, a PhD in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or related fields (degree must be obtained on or before the start date), plus 2 years of experience in Applied Research, OR an MS in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or related fields, plus 4 years of experience in Applied Research.
Preferred Qualifications:
- PhD in Computer Science, Machine Learning, Computer Engineering, Applied Mathematics, Electrical Engineering or related fields
- LLM
  - PhD focus on NLP, or a Master's with 5 years of industry NLP research experience
  - Multiple publications on topics related to pre-training of large language models
  - Member of a team that has trained a large language model from scratch (10B+ parameters, 500B+ tokens)
  - Publications in deep learning theory
  - Publications at ACL, NAACL, EMNLP, NeurIPS, ICML, or ICLR
- Behavioral Models
  - PhD focus on geometric deep learning (Graph Neural Networks, Sequential Models, Multivariate Time Series)
  - Publications on training models on graph and sequential data at KDD, ICML, NeurIPS, or ICLR
  - Worked on scaling graph models to more than 50M nodes
- Experience with large-scale…