*Please note: this role does not offer visa transfer or sponsorship, now or in the future*
Role Title: Senior AWS Data Engineer
About the role
Design, build, and operate scalable, cloud‑native data platforms supporting batch and streaming use cases, with strong focus on governance, performance, and reliability.
Responsibilities
- Programming
· Python: Strong hands-on experience with Python for data engineering tasks, including scripting, automation, and ETL logic development
· PySpark: Proficiency in writing and optimizing PySpark jobs for large-scale data transformations
· SQL: Advanced SQL skills for data querying, transformation logic, and stored procedure conversion from SQL Server
- Big Data Processing Frameworks
· Apache Spark: Strong experience with Spark core concepts — RDDs, DataFrames, Datasets, partitioning, and performance tuning
· Data partitioning and optimization: Experience with data skew handling, broadcast joins, caching strategies, and Spark tuning
- AWS Services (Hands-On Experience Required)
· AWS Glue ETL: Developing and deploying Glue jobs (Python Shell and Spark), job bookmarks, dynamic frames, and custom connectors
· AWS Glue Data Catalog: Managing databases, tables, crawlers, classifiers, and schema versioning
· AWS Lake Formation: Configuring data lake permissions, fine-grained access control, and data filtering
· AWS Step Functions: Designing and implementing state machines for ETL workflow orchestration, error handling, and retry logic
· AWS Lambda: Writing serverless functions for event-driven triggers, lightweight transformations, and pipeline utilities
· Amazon Aurora: Working with Aurora PostgreSQL-Compatible Edition for relational data storage and query optimization
· Amazon DynamoDB: Designing and querying NoSQL tables
· Amazon S3: Proficiency in S3 data lake design — partitioning strategies, storage classes, lifecycle policies, and S3 event notifications
· AWS IAM: Understanding of roles, policies, and least-privilege access patterns relevant to data pipeline security
- ETL Development & Migration
· Informatica PowerCenter (working knowledge): Ability to read and interpret Informatica workflows, sessions, mappings, and transformations to support conversion to AWS Glue
· ETL framework development: Experience building reusable, configurable ETL frameworks with logging, error handling, retry mechanisms, and metadata-driven execution
· Data pipeline design patterns: Familiarity with incremental loads, CDC (Change Data Capture), full loads, and SCD (Slowly Changing Dimensions)
· SQL Server (working knowledge): Ability to understand SQL Server schemas, stored procedures, and SSIS packages for migration analysis
- Data Engineering Best Practices
· Data quality and validation: Implementing data quality checks, reconciliation logic, and exception handling within pipelines
· Metadata-driven frameworks: Building configurable pipelines driven by metadata stored in Aurora or DynamoDB
· Logging and observability: Integrating CloudWatch logging, custom metrics, and alerting into data pipelines
· Unit and integration testing: Writing test cases for ETL logic using frameworks such as pytest
· Version control: Proficiency with Git for source code management, branching strategies, and code reviews
What you need to have to be considered
· 8+ years in an IT-related role
· Strong hands-on experience with AWS Cloud, SQL, and Python
· Good experience with Kafka/Flink, AWS Glue, and Airflow
Work model: On-site
At Cognizant, we strive to provide flexibility wherever possible, and we are here to support a healthy work-life balance through our various wellbeing programs. Based on this role's business requirements, this is an onsite position requiring 5 days a week in a client or Cognizant office in Charlotte, NC.