Sr Advanced AI Data Engineer

Honeywell

Monterrey, NLE, Mexico

1 month ago


Benefits

Remote Work

Job Type

full time

Description

As a Senior Advanced Data Engineer here at Honeywell, you will play a crucial role in designing, developing, and maintaining advanced data solutions that drive business insights and support decision-making processes. You will leverage your expertise in data engineering to build scalable data pipelines, optimize data storage, and ensure data quality and integrity.

Your ability to work with cross-functional teams and translate business requirements into technical solutions will be key to your success in this role.

In this role, you will impact the business by enabling data-driven decision-making, optimizing data processes, and improving overall data management. Your work will contribute to increased operational efficiency, cost savings, and enhanced customer satisfaction.

At Honeywell, our people leaders play a critical role in developing and supporting our employees to help them perform at their best and drive change across the company. Help to build a strong, diverse team by recruiting talent, identifying and developing successors, driving retention and engagement, and fostering an inclusive culture.

AI-Ready Data Platform

  • Design and implement end-to-end ingestion pipelines from heterogeneous sources (including Snowflake, SQL Server, Excel, REST APIs, and unstructured data) into Azure Databricks
  • Architect and enforce a Medallion Architecture (Bronze → Silver → Gold), ensuring data arrives clean, validated, and fit for purpose at each layer
  • Build Delta Live Tables (DLT) pipelines with declarative data quality expectations, schema evolution, and automated lineage tracking
  • Implement incremental loading patterns using CDC (Change Data Capture), watermarking, and Delta Lake MERGE/UPSERT for efficient, scalable ingestion
  • Enable structured and unstructured data processing (documents, Excel files, JSON, Parquet), building the foundation for AI and ML consumption

Data Modeling & Semantic Layer

  • Design and implement the Engineering data model (dimensional models, fact/dimension tables, and domain-specific data marts) serving analytics, BI, ML, and AI use cases
  • Build a governed, reusable semantic layer on top of the Gold layer, enabling self-service analytics through Power BI and GCP-connected consumers
  • Ensure data models are documented, versioned, and aligned to business domains within the VECE COE

Orchestration and Data Ops

  • Build and manage Databricks Workflows with multi-task dependencies, SLA monitoring, retry logic, and alerting
  • Implement CI/CD pipelines for Databricks using Azure DevOps and GitHub Actions, including Python Wheel packaging for reusable utility libraries deployed across the platform
  • Apply software engineering best practices: version control, unit testing, modular code design, and automated deployment to Dev/QA/Prod environments
  • Manage platform cost and performance: cluster right-sizing, DBU management, Delta table optimization (VACUUM, compaction), and cost monitoring across Azure Databricks and GCP

Data Governance & Quality

  • Implement and manage Unity Catalog for centralized data governance: three-level namespace (catalog → schema → table), fine-grained RBAC, data masking, and audit logging
  • Build data quality frameworks (rule-based validation, deduplication, reconciliation, and anomaly detection), ensuring data arrives fit for AI/ML consumption
  • Establish data lineage tracking across ingestion, transformation, and serving layers
  • Govern data delivery to GCP, ensuring secure, validated, schema-consistent outputs for downstream data science and analytics teams

AI & Proactive Analytics Foundation

  • Design pipelines that are AI-ready from day one, supporting structured ML feature pipelines, embedding generation, and future Vector DB integrations
  • Build the data infrastructure that enables the shift from descriptive dashboards to proactive, predictive analytics
  • Collaborate with Data Scientists and Analytics Engineers to ensure the Gold layer supports model training, feature stores, and real-time inference pipelines

YOU MUST HAVE

  • Databricks: 4+ years hands-on with PySpark, Delta Lake, Workflows, and Unity Catalog
  • Demonstrated expertise in data strategy, for example Medallion Architecture, Domain Data Modeling, and Functional Data Architecture
  • Data Quality Frameworks (e.g., rule-based validation, anomaly detection)
  • Data Pipelines: incremental loading, CDC, CI/CD, Observability
  • Advanced Python/PySpark and Advanced SQL
  • Strongly preferred: DLT, UC, GCP, Azure, Kafka.
  • Highly valued: Databricks Certified Professional
  • 7+ years of overall data engineering experience
  • 4+ years of hands-on Azure Databricks experience in production environments
  • Proven experience building platforms, not just maintaining them: greenfield builds, migrations, framework development
  • Experience with financial, engineering, enterprise, or industrial-scale datasets preferred
  • Demonstrated ability to own technical decisions end to end, from architecture to production deployment

 

#LI-Hybrid

Skills

data pipelines, Snowflake, SQL, Excel, REST APIs, Azure, Databricks, Delta Lake, Parquet, data modeling, analytics, Power BI, monitoring, CI/CD, Azure DevOps, GitHub Actions, Python, unit testing, GCP, RBAC, dashboards, PySpark, observability, data science, data governance, data quality, model training, data-driven decision making, data architecture, data lineage

About Honeywell

Honeywell is a diversified technology and manufacturing company, serving customers worldwide with aerospace products and services, building technologies, performance materials, and supply chain automation solutions.

aerospace, manufacturing