Posted 1mo ago

Cloud Data Engineer (50576)

@ GAP Solutions
Atlanta, Georgia, United States
Onsite, Full Time
Responsibilities: Monitor pipelines, collaborate with teams, develop platforms
Requirements Summary: Bachelor's degree; 3+ years ETL with Azure; 2+ years source control (Git/Atlassian); Azure Batch; SQL; Python or R; Azure Data Factory; strong problem-solving and teamwork.
Technical Tools Mentioned: Azure Data Factory, Azure Batch, Azure Synapse Analytics, Azure Databricks, SQL, Git, Python, R, Pandas, NumPy, Polars, Tidyverse, Airflow
Job Description

Position Objective: A key DMI objective is to "expand foundational infrastructure to provide scalable, flexible services for timely and appropriate access to actionable data in the public health ecosystem." Currently, public health programs operating across CDC have myriad investments in divergent and overlapping systems to collect, process, and analyze data to support public health decision making and administrative functions. Systems are of varying age, complexity, and quality, and this creates a burden for public health partners to provide and use data, for programs to use their data, and for CDC to secure shared data with its partners and deidentified data with the public.

EDAV helps alleviate this problem by designing, developing, and operating shared, enterprise data services to help programs modernize and integrate these services with their existing and planned systems. However, EDAV needs to expand the quantity and quality of these services and assist programs in integrating their systems with EDAV to create new public health data products using the shared EDAV platform. Data products include data collections, storage, reports, dashboards, metadata collection, analytics (including artificial intelligence [AI]/machine learning [ML]), public use data, indicators, measures, and decision-making systems.

CDC’s Center for Forecasting and Outbreak Analytics (CFA) is tasked with collaborating with internal and external partners to track public health events and disease outbreaks and forecast their directions. To do this, CFA needs to extend EDAV’s capabilities to cloud spaces where it can collaborate with CDC and non-CDC groups to share data, develop machine learning models, exchange models and algorithms, and jointly author analytics and visualization products. Absent this, CFA cannot perform its mission.

Duties and Responsibilities:

  • Work with a multi-disciplinary team of scientists, data engineers, developers, and data consumers in a fast-paced, Agile environment

  • Monitor and optimize data pipelines for performance, scalability, and cost-effectiveness

  • Sharpen skills in analytical exploration and data examination while supporting the assessment, design, development, and maintenance of scalable platforms for clients