Senior Data Engineer Job - Riyadh area, Riyadh Region, Saudi Arabia - Software Development

About Intelmatix

Intelmatix is a deep tech artificial intelligence (AI) company founded in July 2021 by a group of MIT scientists with the vision of transforming enterprises to become cognitive. A cognitive enterprise uses AI and Decision Intelligence to make decisions. This leads to more accurate decisions, fewer errors, and better outcomes across the business.

We are seeking a highly experienced Senior Data Engineer to lead the technical design, implementation, and delivery of an enterprise‑grade, AI‑ready Data Lakehouse. This role is critical to establishing the foundational data layer for a large‑scale digital transformation initiative, designed to power advanced AI agents, digital workers, and complex knowledge graphs (ontologies).

The ideal candidate will have a strong background in software development with a focus on building and optimizing data pipelines, ensuring data quality, and integrating data from various sources. As a Senior Data Engineer, you will play a key role in designing, developing, and maintaining scalable data infrastructure that supports our business intelligence and analytics efforts.

Based in Riyadh, you will be responsible for architecting a highly secure platform that complies with strict national data residency and cybersecurity standards. You will not only build the system but also act as a technical leader, mentoring client engineering teams through a collaborative "co‑build" model to ensure long‑term operational ownership.

Key Responsibilities

Lakehouse Architecture & Implementation:
Design and deploy a unified Data Lakehouse utilizing the Medallion architecture (Bronze, Silver, Gold) and open table formats (e.g., Delta Lake, Apache Iceberg) on cloud infrastructure hosted within Saudi Arabia.
Data Ingestion & Pipeline Engineering:
Build reusable, automated ingestion frameworks (batch and streaming) capable of processing both structured data (RDBMS, APIs) and unstructured data (PDFs, policy documents) to feed downstream AI models and semantic reasoning engines.
Data Quality & Governance:
Implement automated data quality "circuit breakers" (completeness, uniqueness, referential integrity) and end‑to‑end data lineage tracking frameworks.
Optimization:
Optimize data processing workflows for performance, scalability, and cost‑efficiency.
System Monitoring and Maintenance:
Monitor and maintain data systems, responding to high‑severity incidents (SEVs) and other urgent issues to ensure continuous operations.
Security & Compliance:
Ensure the platform adheres strictly to NCA (National Cybersecurity Authority) and NDMO (National Data Management Office) standards. Implement AES‑256 encryption at rest, TLS 1.2+ in transit, robust Key Management Systems (KMS), and centralized audit logging.
Access Control Integration:
Design and deploy granular Role‑Based Access Control (RBAC) and Attribute‑Based Access Control (ABAC), integrating seamlessly with existing enterprise Identity Providers (e.g., Active Directory).
Capability Building & Handover:
Lead hands‑on knowledge transfer sessions, pair‑program with client engineers, create operational runbooks, and conduct "Game Day" failure simulations to ensure the client’s team is fully ready to operate the platform independently.

Qualifications & Experience

Education:
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
Experience:
5+ years of proven experience in Data Engineering, Distributed Systems, or Big Data Architecture, including at least 2 years specifically leading Data Lakehouse or Cloud Data Platform implementations.
Technical Skills & Core Technologies:
- Programming Languages:
  Proficiency in programming languages such as Python, Java, or Scala.
- Data Architecture & System Design:
  Strong expertise in designing data‑intensive applications, complex data modeling, and schema design for enterprise environments.
- Distributed Systems & Lakehouse Technologies:
  Deep, hands‑on experience with distributed processing engines (e.g., Apache Spark, Kafka, Hadoop) and modern open table formats (e.g., Delta Lake, Apache Iceberg, Apache Hudi).
- ETL/ELT & Orchestration:
  Experience…