Software Engineer (AI)
Gresham
Gurugram, India
2 months ago
Job Type
Full-time
Description
We are seeking an AI Data Engineer to build scalable data platforms that power analytics, machine learning, and Generative AI (LLM/RAG) use cases. This role combines data engineering, cloud, and AI/ML capabilities to enable intelligent data pipelines, agentic workflows, and real-time data processing.
At Gresham, we are committed to building a diverse and inclusive workforce that reflects the communities we serve. We actively encourage applications from individuals of all backgrounds and are dedicated to providing a workplace where everyone feels valued, respected and supported.
We make employment decisions based on merit, skills and potential, and do not discriminate based on any protected characteristic. We are also committed to making reasonable adjustments throughout the recruitment process and employment lifecycle.
Job Responsibilities
- Design and build scalable ETL/ELT pipelines using Python, SQL, Airflow, DBT, and Spark.
- Develop data platforms on AWS (S3, Glue, EMR, Lambda, SQS, EventBridge).
- Build and optimize RAG pipelines (embeddings, vector DBs like FAISS/Pinecone).
- Enable LLM-based and agentic workflows (LangChain, CrewAI, AutoGen).
- Implement event-driven and real-time data pipelines.
- Design data lake/lakehouse architectures (Iceberg/Delta Lake).
- Ensure data quality, lineage, and observability (OpenMetadata or similar).
- Support ML pipelines, feature engineering, and model retraining workflows.
- Implement CI/CD and containerized deployments (Docker).
- Optimize and productionize existing data workflows.
Qualifications
- 4–9 years of experience in Data Engineering / AI Data Engineering
- Strong Python (Pandas, NumPy) and advanced SQL
- Hands-on with Airflow, DBT, Spark (EMR/Glue)
- Experience with AWS data stack (S3, Glue, Lambda, EMR, etc.)
- Understanding of LLMs, embeddings, and RAG architectures
- Experience with vector databases (FAISS, Pinecone, etc.)
- Knowledge of data lakes/lakehouse (Iceberg/Delta)
- Experience with relational/analytical DBs (Snowflake, Oracle, SQL Server)
- Familiarity with CI/CD, Docker, Infrastructure-as-Code, DevOps practices, and automation tools
Preferred Qualifications
- Experience with Trino/Presto
- Exposure to OpenMetadata or data governance tools
- AWS certifications
- Experience in real-time/streaming pipelines
- Exposure to product engineering environments
Skills
analytics, machine learning, generative ai, llm, rag, data pipelines, etl, python, sql, airflow, dbt, aws, s3, lambda, sqs, embeddings, langchain, iceberg, delta lake, observability, ci/cd, docker, pandas, numpy, spark, llms, vector databases, snowflake, automation, trino, presto, data governance, data quality, feature engineering, ml pipelines
About Gresham
Gresham is a global leader in enterprise data automation solutions for the financial services industry.
financial services, software