Job Type

Full time

Description

Change the world. Love your job.

About the Job
As a Data Engineer on the Smart Manufacturing and Automation team, you will design and implement data models that transform raw semiconductor manufacturing data into analytics-ready assets. Using modern lakehouse technologies (Spark, Apache Iceberg, Delta Lake, and OLAP SQL engines), you will build ETL pipelines that integrate data from shop floor systems spanning ISA-95 Levels 0-4 and process petabytes of data.
This role requires an understanding of manufacturing data standards, including ISA-95 and SEMI standards such as E90 (Substrate Tracking), E10 (Equipment Reliability, Availability, Maintainability, and Utilization), and E116. You will translate these domain concepts into scalable data structures that serve downstream analytics and reporting.

Key Responsibilities

Design and implement data models for manufacturing data domains, including equipment performance, substrate tracking, production metrics, and electrical test
Build ETL/ELT pipelines using PySpark and SQL to load data into Iceberg and Delta Lake tables with proper historization and auditability
Develop data transformations aligned with ISA-95 hierarchy and SEMI standards
Create business vault and information marts that translate raw manufacturing data into analytics-ready data products
Implement data quality checks and expectations within pipelines to ensure accurate, reliable data
Collaborate with Manufacturing Engineering and Analytics teams to understand data requirements and deliver solutions
Document data models, transformation logic, and business rules
Optimize query performance by tuning partitioning, clustering, and/or indexing strategies across database technologies, Delta Lake, and Iceberg

Minimum Requirements

Bachelor's degree in Computer Science, Data Science, or related field
8+ years of experience in data engineering or analytics
Strong SQL and Python/PySpark for data transformations
Experience with OLAP databases such as Redshift, BigQuery, Druid, ClickHouse, or StarRocks
Experience with lakehouse table formats: Apache Iceberg, Delta Lake, or similar
Understanding of ETL/ELT patterns and data pipeline development
Familiarity with Airflow or similar orchestration tools

Preferred Qualifications

Data Vault 2.0 modeling methodology (Hubs, Links, Satellites, PITs, Bridges)
Knowledge of ISA-95 hierarchy and/or SEMI standards
Data quality frameworks (Great Expectations or similar)
Experience in semiconductor or discrete manufacturing environments
Databricks experience (Delta Live Tables, Unity Catalog)


Skills

automation, spark, apache, iceberg, delta lake, sql, etl, analytics, pyspark, vault, python, redshift, bigquery, clickhouse, data pipelines, airflow, databricks, data science, data quality, data structures, data transformation

About Solventum

Solventum is a healthcare company that spun off from 3M in 2024 as an independent company. The company focuses on advancing healthcare through innovative medical devices, dental solutions, health information systems, and purification products. Operating in more than 35 countries with approximately 22,000 employees globally, Solventum is committed to improving clinical outcomes and supporting healthcare professionals.

enterprise company, healthcare, medical devices, public

Senior Data Engineer
