Job Title: PySpark Data Engineer
Experience: 6+ Years
Location: Hyderabad / Pune
Employment Type: Full-Time

Job Summary:

We are looking for a skilled and experienced PySpark Data Engineer to join our growing data engineering team. The ideal candidate will have 6+ years of experience in designing and implementing data pipelines using PySpark, AWS Glue, and Apache Airflow, with strong proficiency in SQL. You will be responsible for building scalable data processing solutions, optimizing data workflows, and collaborating with cross-functional teams to deliver high-quality data assets.

Key Responsibilities:

Design, develop, and maintain large-scale ETL pipelines using PySpark and AWS Glue.
Orchestrate and schedule data workflows using Apache Airflow.
Optimize data processing jobs for performance and cost-efficiency.
Work with large datasets from various sources, ensuring data quality and consistency.
Collaborate with Data Scientists, Analysts, and other Engineers to understand data requirements and deliver solutions.
Write efficient, reusable, and well-documented code following best practices.
Monitor data pipeline health and performance; resolve data-related issues proactively.
Participate in code reviews, architecture discussions, and performance tuning.

Requirements

6+ years of experience in data engineering roles.
Strong expertise in PySpark for distributed data processing.
Hands-on experience with AWS Glue and other AWS data services (S3, Athena, Lambda, etc.).
Experience with Apache Airflow for workflow orchestration.
Strong proficiency in SQL for data extraction, transformation, and analysis.
Familiarity with data modeling concepts and data lake/data warehouse architectures.
Experience with version control systems (e.g., Git) and CI/CD processes.
Ability to write clean, scalable, and production-grade code.

Benefits

As per company standards.
