Senior NLP Data Engineer
Listed on 2025-12-23
IT/Tech
AI Engineer, Data Engineer, Machine Learning/ ML Engineer, Data Scientist
Senior NLP Data Engineer, iManage
We offer a flexible working policy that supports a healthy balance between personal and professional well‑being. This role requires in‑office presence on Tuesdays & Thursdays to collaborate, connect, and learn from peers while also maintaining the flexibility for meaningful work‑life balance.
Being a Senior NLP Data Engineer at iManage means you’re passionate about transforming unstructured text into meaningful insights that power AI and machine learning solutions. You thrive at the intersection of data engineering, AI and natural language processing, building the pipelines and datasets that fuel generative AI applications, agentic systems, advanced model fine‑tuning and other NLP‑driven capabilities across iManage.
As an NLP Data Engineer on the Applied AI team, you will design, build, and optimize large‑scale text data pipelines that power AI/ML and generative AI solutions for our customers.
You’ll work with knowledge engineering, applied AI, and product teams to prepare, enrich, and integrate document data.
Your work will be essential to enabling intelligent, AI‑powered features across the iManage platform.
Responsibilities
- Design, develop and maintain scalable pipelines in Microsoft Azure to ingest and transform large volumes of text data from multiple sources.
- Design automated workflows for text normalization, deduplication, language identification, PII redaction and metadata enrichment.
- Build automated data validation processes to ensure accuracy and consistency.
- Support model fine‑tuning, semantic search and generative AI evaluation tuning through dataset curation, prompt dataset preparation, labeling coordination, and text quality validation.
- Partner with the Applied AI team to gather data requirements and build data interfaces for developing and maintaining machine learning systems.
- Maintain data lineage and follow data privacy, security and governance best practices.
- Implement data versioning and lineage tracking for machine learning experiments.
- Bachelor’s degree or higher in Computer Science, Data Engineering, Applied Mathematics, Computational Linguistics, or a related quantitative field.
- 4+ years of data engineering experience, with at least 2 years working with unstructured data in a business setting.
- Strong proficiency in Python, PySpark, and data manipulation for large unstructured text datasets.
- Strong understanding of NLP concepts such as tokenization, embeddings, and semantic search, plus experience with standard text libraries such as spaCy, Hugging Face Datasets, and NLTK.
- Solid DataOps knowledge and experience orchestrating advanced NLP data pipelines using cloud‑based data infrastructure.
- Proficiency with Git and collaborative development frameworks.
- A passion for enabling AI capabilities through scalable, reliable data architecture.
- Problem‑solving, creativity, curiosity, and a collaborative mindset.
- Exposure to Microsoft Azure Services such as Fabric, ADLS, AI Foundry, Azure ML, and MLflow.
- Experience with knowledge graph implementation for NLP applications.
- Experience working with data for the legal domain.
- Experience designing architectures for large‑scale text corpora.
- Market competitive salary (US annual base range: $120,000–$170,000).
- Annual performance‑based bonus.
- Comprehensive Health, Vision, Dental, and Life insurance.
- 401(k) retirement savings plan with company match up to 4%.
- HealthJoy healthcare concierge service.
- Enhanced leave for expecting parents: 20 weeks 100% paid primary leave, 10 weeks 100% paid secondary leave.
- Flexible time‑off policy and multiple company wellness days.
- Free access to the Healthy Minds app for mindfulness and meditation.
iManage provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.