Data Scientist - INDIA
Prosper, Collin County, Texas, 75078, USA
Listed on 2026-04-29
IT/Tech
Machine Learning/ ML Engineer, AI Engineer, Data Scientist, Data Engineer
Role:
Data Scientist - INDIA
Location:
Hyderabad, INDIA
* Consultants local to INDIA are eligible.
Category:
Data Science – Structured Data / Text Data (NLP & GenAI)
We are seeking a highly skilled Data Scientist (3–7 years of experience) to join our team and work across two major data science domains:
Key Responsibilities
Structured Data – Machine Learning & Analytics
Build, deploy, and optimize ML models for predictive analytics, forecasting, classification, and regression.
Perform large-scale feature engineering using PySpark and Big Data tools.
Work on batch pipelines, model versioning, and experiment tracking.
Develop cost estimation and risk/likelihood models using statistical and ML techniques.
Text Data / NLP / GenAI
Build NLP pipelines using deep learning frameworks such as PyTorch, TensorFlow, or similar.
Develop real‑time, low‑latency inference systems for text classification, embeddings, semantic search, summarization, and retrieval.
Create prompts, context graphs, and agentic workflows for LLM-based systems.
Apply knowledge of prompt engineering, context engineering, and autonomous agent frameworks to production systems.
Core Data Science Engineering & MLOps
Work in Databricks for ETL, feature engineering, ML training, and orchestration.
Use Azure services for model deployment, data pipelines, and infrastructure.
Collaborate using Git-based workflows; leverage tools like GitHub Copilot, Claude Code, etc.
Implement model monitoring, observability, drift detection, and performance tracking.
Required Skills & Experience
Core Skills
- Strong hands‑on experience with Databricks (Delta Lake, MLflow, Job Orchestration).
- Excellent PySpark skills for large‑scale distributed data processing.
- Proficiency in Azure cloud services (ADF, Azure ML, AKS, Databricks on Azure).
- Strong understanding of ML algorithms, statistical methods, and data analysis.
- Experience with deep learning frameworks: PyTorch, TensorFlow, Transformers (Hugging Face).
- Experience with model monitoring and ML observability.
- Ability to write clean, optimized code and leverage AI code assistants.
- Prompt engineering (task prompts, chain of thought, tool calling, retrieval prompts).
- Context engineering (retrieval pipelines, RAG, memory management, context structuring).
- Knowledge of LLM‑based agentic frameworks (LangChain, Semantic Kernel, CrewAI, AutoGen, etc.).
- Experience with vector databases and embedding models is a plus.
- Experience with containerization (Docker, Kubernetes, AKS).
- Experience deploying models to production (REST APIs, real‑time endpoints).
- Knowledge of streaming technologies (Kafka, Azure Event Hubs, Spark Streaming).
- Understanding of CI/CD for ML (Azure DevOps / GitHub Actions).
A problem solver who is comfortable working with both structured and unstructured data.
Someone who enjoys using modern AI tools to accelerate development.
A data scientist who writes clean, production‑grade code.
A collaborator who thrives in cross‑functional teams and fast‑paced environments.
Flexible work from home options available.