Data Engineer - Bilingual Mandarin Job Norco area, California, USA, IT/Tech

Position: Data Engineer - Bilingual Mandarin required
CWILL (pronounced "quill") is the post-purchase and retention suite built for Shopify.

With strong product-market fit and expanding US operations, we're building out our security and compliance capabilities to meet global data privacy standards.

I. Basic Information

Work Authorization

Green Card / U.S. Citizen required (we do not sponsor)

Job Title

Data Engineer

Focus Areas

Data ingestion, data lakehouse, data warehouse, data platform, data service APIs, data quality & engineering agent development

Level

Junior to mid-level with high growth potential

Location

United States - on-site, remote, or hybrid (per company requirements)

Employment Type

Full-time

Collaborating Teams

CWILL Data Engineering, Data Analytics, Business, Product, and Technology teams

Language

English required; Mandarin is a strong plus

Cross-Timezone Work

Must maintain a regular collaboration window with the China team; strong async communication and documentation skills required (approx. 2 hrs/day overlap needed)

Collaboration Frequency

Every 1-2 days; approx. 2 hrs per session. Candidates in western US time zones preferred for scheduling.

II. Role Positioning

CWILL is building data infrastructure to support business operations, product capabilities, customer service, analytics, and intelligent applications. As a US-side data engineer, you will participate in multi-source data ingestion, data lakehouse and warehouse development, data quality governance, and data platform capability building, and you will help explore AI-agent-driven engineering automation.

We are looking for candidates with a solid foundation in SQL, Python, and data engineering - someone who can, with guidance from the existing data team, progressively take ownership of data ingestion, modeling, quality, and service tasks, while collaborating effectively with the China-based data engineering, analytics, and business teams.

This is not a pure data analysis, BI reporting, or one-off scripting role. It is a comprehensive data engineering position focused on data integration, data warehouse development, data platform capabilities, data services, and engineering automation.

III. Role Mission

Through stable, well-structured, and scalable data engineering capabilities, help the company unify, govern, model, and serve data scattered across business systems, SaaS platforms, external channels, and internal systems - improving the usability, accuracy, timeliness, and reusability of CWILL's data assets.

This role is expected to continuously drive:

• More standardized data source ingestion

• Clearer data lakehouse and warehouse structure

• More automated data quality monitoring

• More platform-driven data service capabilities

• Progressive adoption of agent-based and automated approaches for data development, troubleshooting, documentation, and quality checks

IV. Key Responsibilities

1. Data Ingestion & Pipeline Development

• Ingest data from internal and external business systems, third-party platforms, SaaS products, and external data sources; handle data collection, sync, cleansing, and loading

• Participate in building offline and real-time data pipelines using SeaTunnel, Kafka, Flink, Spark, or similar technologies to improve ingestion stability and processing efficiency

• Handle practical challenges in data sync: authentication, pagination, rate limiting, failure retry, incremental sync, backfill, schema changes, and task anomalies
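
To give a sense of the sync challenges listed above, here is a minimal Python sketch of paginated incremental sync with retry and backoff. It is an illustration only, not CWILL's actual pipeline: the `fetch_page` callable, the `TransientError` type, and the `updated_at` field are hypothetical names assumed for the example.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, object]
Page = Tuple[List[Record], Optional[str]]
# Hypothetical fetcher: takes (updated_since, cursor), returns (records, next_cursor).
FetchPage = Callable[[str, Optional[str]], Page]


class TransientError(Exception):
    """Hypothetical error for retryable failures (timeouts, rate limiting)."""


def with_retry(call: Callable[[], Page], max_attempts: int = 4,
               base_delay: float = 0.5) -> Page:
    """Retry `call` with exponential backoff when it raises TransientError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up and let the scheduler mark the task as failed
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


def incremental_sync(fetch_page: FetchPage, watermark: str,
                     base_delay: float = 0.5) -> Tuple[List[Record], str]:
    """Pull every record updated after `watermark`, one page at a time.

    Returns the records and the new watermark (largest `updated_at` seen).
    Assumes `updated_at` values are UTC ISO 8601 strings in one fixed format,
    so that string comparison matches chronological order.
    """
    records: List[Record] = []
    new_watermark = watermark
    cursor: Optional[str] = None
    while True:
        # Bind the current cursor so the retried call always uses the same page.
        page, cursor = with_retry(
            lambda c=cursor: fetch_page(watermark, c), base_delay=base_delay
        )
        records.extend(page)
        for record in page:
            new_watermark = max(new_watermark, str(record["updated_at"]))
        if cursor is None:  # no further pages
            break
    return records, new_watermark
```

In a design like this, the caller would persist the returned watermark only after the records are loaded successfully, so a failed run can be repeated safely. A backfill is then the same call with an earlier watermark.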

2. Data Warehouse & Data Modeling

• Participate in layered data warehouse development across ODS, DWD, DWS, and ADS layers; build and maintain data models

• Support business domain modeling, metric standardization, shared data model development, and core table maintenance

• Optimize data organization and query performance on OLAP engines such as Doris to provide stable data support for product, operations, growth, customer success, and management analytics

3. Data Quality & Data Governance

• Build and maintain data quality rules for core data pipelines; ensure data accuracy, completeness, consistency, and timeliness

• Participate in data validation, anomaly detection, alerting, and issue resolution; help improve stability of critical data pipelines

• Contribute to data governance capabilities…