Data Engineer II - Azure Data Factory & ETL Ops; Remote Job, Lowell, Massachusetts, USA, IT/Tech

Position: Data Engineer II - Azure Data Factory & ETL Ops (Remote)

For more than 170 years, The Hanover has been committed to delivering on our promises and being there when it matters the most. We live our values every day, demonstrating that we CARE through our sustainability initiatives and inclusive corporate culture.

Our Personal Lines Operations team is currently seeking a Data Engineer II in our Worcester office. This is a full-time, exempt role. A fully remote arrangement will be considered for candidates with strong qualifications.

POSITION OVERVIEW

The Hanover Insurance Group is hiring a Personal Lines Operations Data Engineer to own the day-to-day reliability and enhancement of our Azure Data Factory (ADF) pipelines and on-prem SQL Server data environment. This role builds and improves ETL workflows that populate SQL tables used by analysts to create Power BI semantic models and reports. The position is split approximately 50/50 between production operations/support and new development/enhancement, while also advancing platform maturity through improved monitoring/alerting, data quality validation, and formalized release practices.

WHAT YOU’LL OWN

Own production support and the daily health of scheduled ADF pipelines.
Build new pipelines and enhance existing pipelines to improve resiliency, maintainability, and scalability.
Implement data validation controls, improve monitoring/alerts, and help define SLAs for data freshness and availability.
Establish foundational SDLC practices for data engineering (Git usage, Dev to Prod promotion practices, and a more formal release process).
Coordinate cross-team dependencies where upstream internal ETL timelines affect downstream pipeline completion; design dependency-aware orchestration and readiness checks.
Contribute to future-state Azure data strategy recommendations (e.g., Data Lake/Blob Storage, notebooks) and support long-term migration planning from on-prem SQL Server to cloud databases (timeline not yet defined).

KEY RESPONSIBILITIES

Azure/Fabric Data Factory Engineering
- Design and maintain curated SQL tables/views used for analytics and reporting; optimize for refresh performance and downstream usability.
- Develop and maintain ADF pipelines (Copy activities and Data Flows) with consistent patterns for logging, error handling, retries, and notifications.
- Implement parameterization and reusable components to reduce duplication and speed enhancements.
- Implement incremental load and backfill strategies appropriate to volume.
Production Support & Reliability
- Monitor daily pipeline execution and triage incidents quickly to restore successful processing.
- Perform root-cause analysis for failures and recurring issues; implement preventative controls and standardized patterns.
- Create and maintain operational documentation/runbooks for critical pipelines and common support scenarios.
Data Quality, Validation & Trust
- Build automated validation routines and reconciliation checks (row counts, totals, null/duplicate thresholds, schema drift detection, anomaly flags).
- Partner with analysts and stakeholders to define key business rules and quality thresholds for trusted reporting datasets.
- Document data definitions, transformations, and lineage to improve transparency and troubleshooting.
Stakeholder Collaboration & Dependency Management
- Work directly with internal Operations Data Analysts, business stakeholders, and external partner analysts to gather requirements and deliver datasets that enable robust Power BI semantic models, reporting, and other analytical solutions.
- Coordinate with upstream internal teams to align dependency readiness signals and timelines; implement orchestration controls to prevent downstream failures.
- Provide technical coaching and mentoring to less experienced team members, including guidance on best practices.

KEY MEASURES OF SUCCESS

Pipelines meeting SLA / on-time delivery
Pipeline success rate and reduced manual intervention
Time-to-detect/time-to-resolve ETL failures (MTTD/MTTR)
Improvements in query/runtime performance and cost efficiency
Reduction in recurring data quality defects; completeness/accuracy checks passing
Documentation coverage (runbooks for critical pipelines, data dictionary completeness)

REQUIRED…