Posted 1w ago

Data Engineer - Databricks

@ Steampunk
McLean, Virginia, United States
Onsite · Full Time
Responsibilities: Lead migrations; assess ETL; address inquiries
Requirements Summary: 2-4 years of data engineering experience; Databricks; SQL; PySpark/Python; AWS; ability to design and implement data platforms and pipelines with a client focus.
Technical Tools Mentioned: Databricks, Apache Spark, Delta Lake, SQL, PySpark, Python, AWS, Airflow
Job Description

We are looking for a seasoned Data Engineer to work with our team and our clients to develop enterprise-grade data platforms, services, and pipelines in Databricks. We are looking for more than just a "Data Engineer"; we are looking for a technologist with excellent communication and customer service skills and a passion for data and problem solving. 

 

  • Lead and architect migrations of data using Databricks, with a focus on performance, reliability, and scalability (a brief illustrative sketch follows this list). 
  • Assess and understand ETL jobs, workflows, data marts, BI tools, and reports. 
  • Address technical inquiries concerning customization, integration, enterprise architecture, and general feature/functionality of data products. 
  • Experience working with database/data warehouse/data mart solutions in the cloud (preferably AWS; alternatively Azure or GCP). 
  • Key must-have skill sets: Databricks, SQL, PySpark/Python, AWS. 
  • Support an Agile software development lifecycle. 
  • You will contribute to the growth of our AI & Data Exploitation Practice! 
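
For a sense of what the migration work above can involve, here is a minimal, hedged PySpark sketch of landing a legacy relational table as a Delta table on Databricks. The connection string and table names (jdbc:postgresql://legacy-host:5432/warehouse, public.orders, bronze.orders) are invented placeholders, not details from this posting.

```python
# Minimal sketch: migrate one table from a legacy relational source
# into Delta Lake on Databricks. All names below are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # preconfigured on Databricks clusters

# Read the source table over JDBC.
legacy_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://legacy-host:5432/warehouse")  # placeholder
    .option("dbtable", "public.orders")
    .option("user", "reader")       # credentials would come from a secret scope
    .option("password", "REDACTED")
    .load()
)

# Land it as a managed Delta table, partitioned for scan performance.
(
    legacy_df.write.format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .saveAsTable("bronze.orders")
)
```

In practice a production migration would also validate row counts and schemas against the source, but the read-then-write-to-Delta shape is the core of it.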

Required

  • Ability to hold a position of public trust with the US government. 
  • 2-4 years of industry experience coding commercial software and a passion for solving complex problems. 
  • 2-4 years of direct experience in Data Engineering, with experience in tools such as: 
    • Big data tools: Databricks, Apache Spark, Delta Lake, etc. 
    • Relational SQL (preferably T-SQL; alternatively PostgreSQL or MySQL). 
    • Data pipeline and workflow management tools: Databricks Workflows, Airflow, Step Functions, etc. 
    • AWS cloud services: Databricks on AWS, S3, EC2, RDS (or Azure equivalents). 
    • Object-oriented/functional scripting languages: PySpark/Python, Java, C++, Scala, etc. 
    • Experience working with Data Lakehouse architecture and Delta Lake/Apache Iceberg.
  • Advanced working knowledge of SQL, with experience in relational databases, query authoring, and optimization, as well as working familiarity with a variety of databases. 
  • Experience manipulating, processing, and extracting value from large, disconnected datasets. 
  • Ability to inspect existing data pipelines, discern their purpose and functionality, and re-implement them efficiently in Databricks (an illustrative sketch appears at the end of this posting). 
  • Experience manipulating structured and unstructured data. 
  • Experience architecting data systems (both transactional and warehouse). 
  • Experience with the SDLC, CI/CD, and operating in dev/test/prod environments. 
  • Experience with data cataloging tools such as Informatica EDC, Unity Catalog, Collibra, Alation, Purview, or DataZone is a plus. 
  • Commitment to data governance. 
  • Experience working in an Agile environment. 
  • Experience supporting project teams of developers and data scientists who build web-based interfaces, dashboards, reports, and analytics/machine learning models.
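
To make the pipeline re-implementation requirement above concrete, here is a hedged sketch of re-expressing a recovered ETL step as an incremental Delta Lake upsert. Every table and column name (bronze.orders, silver.orders, order_id, quantity, unit_price) is invented for illustration only.

```python
# Hedged sketch: re-implement a legacy ETL step as an incremental
# MERGE (upsert) into a curated Delta table. Names are hypothetical.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# The cleansing/derivation logic one might recover from the old job.
updates = (
    spark.read.table("bronze.orders")
    .withColumn("order_total", F.col("quantity") * F.col("unit_price"))
    .dropDuplicates(["order_id"])
)

# Upsert into the curated table instead of rewriting it wholesale.
target = DeltaTable.forName(spark, "silver.orders")
(
    target.alias("t")
    .merge(updates.alias("u"), "t.order_id = u.order_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```

A step like this would typically run as a task in a Databricks Workflow or an Airflow DAG, which is where the orchestration tools listed above come in.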