Hiring – 3 Month Contract (Extendable)
Job Title – Data Quality Engineer (Databricks).
Salary – AED 35,000 - 38,000 Per Month.
Contract Length – 3 Months – Extendable.
Start Date – ASAP.
Role Overview
The Data Quality Engineer will be responsible for designing, implementing, and operating the client's enterprise data quality framework within the Databricks platform. The role will deliver automated profiling, quality rule execution, cleansing, monitoring, remediation support, and quality reporting capabilities across 170 datasets and 1,346 prioritised Critical Data Elements (CDEs).
Working closely with Data Modellers, Data Catalogue Specialists, business data owners, and platform engineers, the Data Quality Engineer will establish scalable and reusable quality controls that improve trust, accuracy, completeness, consistency, timeliness, validity, and uniqueness across the client's data estate.
Key Responsibilities
Databricks Platform Configuration and Administration
- Configure and manage the Databricks environment supporting enterprise data quality operations.
- Establish and maintain:
- Compute clusters.
- PySpark notebook frameworks.
- Unity Catalog integration.
- Optimise platform performance for large-scale profiling and rule execution across all in-scope datasets and CDEs.
- Implement development, testing, and production deployment standards for data quality assets.
- Design and develop AI-assisted profiling notebooks using PySpark.
- Perform baseline data quality assessments across the six quality dimensions:
- Accuracy.
- Completeness.
- Consistency.
- Validity.
- Timeliness.
- Uniqueness.
- Capture and analyse:
- Null value rates.
- Duplicate records.
- Invalid values.
- Format violations.
- Schema drift.
- Produce quality profiling outputs for all prioritised CDEs and datasets.
- Design and implement a reusable Data Quality Rule Factory.
- Build parameterised PySpark-based rule templates capable of supporting large-scale rule deployment.
- Enable automated generation and management of approximately 6,730 data quality rules without manual rule‑by‑rule development.
- Ensure rules are reusable, configurable, and maintainable across multiple datasets and domains.
- Deploy quality rules as reusable Databricks Jobs integrated into Delta Lake processing pipelines.
- Embed quality controls within bronze, silver, and gold processing stages.
- Implement automated quality gates preventing data progression where defined thresholds are not met.
- Maintain rule traceability and execution history for audit and governance purposes.
- Develop automated remediation and cleansing pipelines using PySpark.
- Implement:
- Standardisation routines.
- Deploy machine learning models managed through MLflow for:
- Anomaly detection.
- Fuzzy matching and duplicate identification.
- Ensure all AI and ML recommendations are explainable, auditable, and routed through human‑in‑the‑loop validation processes where required.
- Design and manage exception handling processes for failed quality records.
- Implement quarantine Delta Lake tables serving as the Failed Record Register.
- Capture and maintain:
- Failure reason.
- Associated CDE.
- Rule reference.
- Processing timestamp.
- Develop reprocessing workflows to support correction and controlled re‑ingestion of remediated records.
- Develop Delta Lake metric aggregation structures supporting enterprise quality reporting.
- Calculate and publish:
- Data Quality Index (DQI) scores.
- Dimension‑level quality metrics.
- Dataset compliance scores.
- SLA adherence indicators.
- Provide curated outputs to support Power BI quality dashboards and executive reporting.
- Configure automated quality monitoring and alerting mechanisms.
- Implement threshold‑based notifications using:
- Databricks SQL Alerts.
- Develop predictive risk scoring models to identify datasets at risk of future quality degradation.
- Support proactive quality management and operational intervention activities.
- Apply Databricks machine learning and pattern analysis techniques to…