
Principal Research Engineer

Job in San Francisco, San Francisco County, California, 94199, USA
Listing for: worldcoin.org
Full Time position
Listed on 2026-06-09
Job specializations:
  • IT/Tech
    Data Engineer, Cloud Computing
Salary/Wage Range or Industry Benchmark: 80000 - 100000 USD Yearly
Job Description & How to Apply Below
Position: Staff/Principal Research Engineer

About the Opportunity:

You will join a high-impact team that maintains and evolves the data platform powering our AI pipelines. This is an all-rounder role that combines backend development, data engineering, infrastructure, and lightweight frontend work.

Your work will span the ingestion layer, transformation workflows, and the warehouse itself: designing resilient pipelines, building secure APIs, and creating services that make our datasets reliable, discoverable, and ready for large-scale training. You will also play a key role in rearchitecting existing systems into generic, reusable components, moving away from point solutions toward a data collection platform that can serve multiple programs without being rebuilt each time.

You will be a key contributor to the infrastructure that feeds and monitors our machine learning models in production: ensuring that data flows seamlessly, services run reliably, and governance standards are never compromised. Every solution you build will follow the highest security standards and rigorous data governance principles, ensuring sensitive biometric data is handled with absolute care.

This role is onsite 5 days/week and sits in our Munich office.

Key Responsibilities:
  • Design and operate automated data quality pipelines with human-in-the-loop review stages, combining automated checks with structured labeling workflows to determine whether ingested data meets acceptance criteria
  • Develop and refine production-grade transformation processes with strong schema contracts, delivering clean, well-structured datasets ready for analytics, model training, and evaluation
  • Instrument systems with metrics, alerts, and recovery mechanisms, and build internal tooling and dashboards that make dataset health, pipeline state, and operational metrics visible to engineers and stakeholders
  • Build APIs and backend services that provide secure, performant access to large datasets while upholding strict governance and privacy controls
  • Raise engineering standards by improving CI/CD pipelines, integration tests, and dependency management
  • Own the lifecycle of critical data assets, including lineage tracking, access control, and schema enforcement

You will work with both structured and semi-structured data, combining SQL-based platforms like Snowflake with NoSQL sources like MongoDB. You'll build resilient pipelines that handle versioning and schema evolution and that are GDPR compliant.

About You:
  • 5+ years of experience with Python, including building production services
  • Strong system design fundamentals, with experience evolving existing systems toward more generic, reusable designs
  • Experience designing and building APIs with security and performance requirements
  • Comfortable with containerization and orchestration tools like Docker and Kubernetes
  • Experienced with AWS services (S3, KMS, IAM) and Terraform for infrastructure as code
  • Skilled in designing and operating data ingestion and transformation workflows, with exposure to Snowflake or other SQL-based analytics platforms
  • Familiar with CI/CD pipelines and version control practices, ideally using GitHub Actions or similar tools
  • Committed to building systems that are secure, observable, and follow strong data governance principles
  • Obsessed with reliability, observability, and data governance, you care deeply about logs, metrics, and traceability
  • Experience with structured logging, metrics instrumentation, and alerting
  • Strong fundamentals in data modeling, schema design, and backward-compatible schema evolution
  • Comfortable working with NoSQL systems like MongoDB, especially for building ingestion frameworks, managing schema evolution, or integrating Change Streams into ETL pipelines
  • Knowledge of data partitioning strategies and large-scale dataset optimization for analytics and ML
  • Experience with event-driven data pipelines using SQS, SNS, Lambda, or Step Functions
Nice to have:
  • Proficiency in Go
  • Familiarity with annotation or labeling workflows and tooling
  • Exposure to monitoring and alerting stacks such as Datadog or Prometheus
  • Proficiency in Rust or an interest in learning it