Posted 3w ago

Senior Data Engineer

@ Milestone Technologies
Hyderabad, Telangana, India
Onsite · Full Time
Responsibilities: build pipelines, design architectures, develop data quality checks
Requirements Summary: 5+ years building data pipelines on AWS with SQL, Python, Spark; lakehouse, Iceberg, DBT, Airflow; Kafka; Snowflake; CI/CD; strong collaboration.
Technical Tools Mentioned: SQL, Python, Apache Spark, AWS (Glue, EMR, S3, Lambda, Step Functions), Apache Iceberg, DBT, Airflow, Kafka, Snowflake, CI/CD
Job Description

Milestone Technologies is seeking a skilled Data Engineer to support a client’s data engineering initiatives on AWS. This role focuses on building scalable data pipelines, improving data quality, and ensuring reliable data processing across modern lakehouse architectures.

You will work closely with data engineering teams, business analysts, and reporting teams to design, build, and optimize data pipelines while ensuring high data quality, observability, and performance.

 

  • Build scalable data ingestion pipelines for relational, semi-structured, and unstructured data sources
  • Design, implement, and optimize lakehouse architectures using Apache Iceberg
  • Optimize table design including partitioning, compaction, schema evolution, and performance tuning for Iceberg datasets
  • Implement best practices for versioning, time travel, incremental processing, and ACID compliance
  • Develop and optimize Apache Spark (batch and streaming) jobs for large-scale data processing
  • Work extensively with AWS services such as Glue, EMR, Lambda, Step Functions, and S3 with a focus on cost and performance optimization
  • Build and manage real-time data pipelines using Kafka and Kafka Streams
  • Design and orchestrate workflows using DBT and Airflow
  • Implement automated data quality checks, validation frameworks, and error monitoring mechanisms
  • Establish observability frameworks including monitoring, logging, and alerting for data pipelines
  • Collaborate with analytics/reporting teams to enable data quality dashboards and reporting
  • Analyze existing pipelines to identify improvements and enhance reliability and scalability
  • Leverage AI/LLM-based tools to accelerate ETL/ELT development, validation, and debugging
  • Participate in code reviews and contribute to best practices and engineering standards

 

Skills

  • Bachelor’s degree (or higher) in Computer Science, Engineering, or a related technical field
  • 5+ years of experience designing, building, and maintaining data pipelines
  • Strong programming skills in SQL, Python, and Apache Spark
  • Hands-on experience with AWS data services (Glue, EMR, S3, Lambda, Step Functions)
  • Deep understanding of lakehouse architectures and Apache Iceberg
  • Experience with DBT and Airflow for data transformation and orchestration
  • Strong experience with Kafka and real-time streaming pipelines
  • Experience working with Snowflake as a cloud data warehouse
  • Strong understanding of data quality frameworks, validation, and monitoring
  • Experience handling structured, semi-structured, and unstructured data at scale
  • Solid understanding of distributed systems and data engineering best practices
  • Experience with CI/CD pipelines and automation (preferred)
  • Strong problem-solving skills and ability to work in a fast-paced environment
  • Excellent communication skills and ability to collaborate with cross-functional teams