Job Description
We are seeking an AI Data Engineer to build scalable data platforms that power analytics, machine learning, and Generative AI (LLM/RAG) use cases. This role combines data engineering, cloud, and AI/ML capabilities to enable intelligent data pipelines, agentic workflows, and real-time data processing.
Job Responsibilities
- Design and build scalable ETL/ELT pipelines using Python, SQL, Airflow, dbt, and Spark.
- Develop data platforms on AWS (S3, Glue, EMR, Lambda, SQS, EventBridge).
- Build and optimize RAG pipelines (embeddings, vector DBs like FAISS/Pinecone).
- Enable LLM-based and agentic workflows (LangChain, CrewAI, AutoGen).
- Implement event-driven and real-time data pipelines.
- Design data lake/lakehouse architectures (Iceberg/Delta Lake).
- Ensure data quality, lineage, and observability (OpenMetadata or similar).
- Support ML pipelines, feature engineering, and model retraining workflows.
- Implement CI/CD and containerized deployments (Docker).
- Optimize and productionize existing data workflows.
Job Requirements
- 4–9 years of experience in Data Engineering / AI Data Engineering
- Strong Python (Pandas, NumPy) and advanced SQL
- Hands-on experience with Airflow, dbt, and Spark (EMR/Glue)
- Experience with AWS data stack (S3, Glue, Lambda, EMR, etc.)
- Understanding of LLMs, embeddings, and RAG architectures
- Experience with vector databases (FAISS, Pinecone, etc.)
- Knowledge of data lakes/lakehouse (Iceberg/Delta)
- Experience with relational/analytical DBs (Snowflake, Oracle, SQL Server)
- Familiarity with CI/CD, Docker, Infrastructure-as-Code, and other DevOps practices and automation tools
Nice to Have
- Experience with Trino/Presto
- Exposure to OpenMetadata or data governance tools
- AWS certifications
- Experience in real-time/streaming pipelines
- Exposure to product engineering environments
At Gresham, we are committed to building a diverse and inclusive workforce that reflects the communities we serve. We actively encourage applications from individuals of all backgrounds and are dedicated to providing a workplace where everyone feels valued, respected and supported.
We make employment decisions based on merit, skills and potential, and do not discriminate based on any protected characteristic. We are also committed to making reasonable adjustments throughout the recruitment process and employment lifecycle.