We are looking for a seasoned Data Engineer to work with our team and our clients to develop enterprise-grade data platforms, services, and pipelines in Databricks. We are looking for more than just a "Data Engineer"; we are looking for a technologist with excellent communication and customer service skills and a passion for data and problem solving.
- Lead and architect data migrations using Databricks, with a focus on performance, reliability, and scalability.
- Assess and understand ETL jobs, workflows, data marts, BI tools, and reports
- Address technical inquiries concerning customization, integration, enterprise architecture and general feature/functionality of data products
- Experience working with database, data warehouse, and data mart solutions in the cloud (preferably AWS; alternatively Azure or GCP).
- Key must-have skills: Databricks, SQL, PySpark/Python, AWS
- Support an Agile software development lifecycle
- You will contribute to the growth of our AI & Data Exploitation Practice!
Required
- Ability to hold a position of public trust with the US government.
- 2-4 years of industry experience coding commercial software and a passion for solving complex problems.
- 2-4 years of direct experience in Data Engineering with experience in tools such as:
- Big data tools: Databricks, Apache Spark, Delta Lake, etc.
- Relational SQL (preferably T-SQL; alternatively PostgreSQL or MySQL).
- Data pipeline and workflow management tools: Databricks Workflows, Airflow, Step Functions, etc.
- AWS cloud services: Databricks on AWS, S3, EC2, RDS (or Azure equivalents).
- Object-oriented/object function scripting languages: PySpark/Python, Java, C++, Scala, etc.
- Experience working with Data Lakehouse architecture and Delta Lake/Apache Iceberg.
- Advanced SQL knowledge, including query authoring and optimization, and working familiarity with a variety of relational databases.
- Experience manipulating, processing, and extracting value from large, disconnected datasets.
- Ability to inspect existing data pipelines, discern their purpose and functionality, and re-implement them efficiently in Databricks.
- Experience manipulating structured and unstructured data.
- Experience architecting data systems (transactional and warehouses).
- Experience with the SDLC, CI/CD, and operating in dev/test/prod environments.
- Experience with data cataloging tools such as Informatica EDC, Unity Catalog, Collibra, Alation, Purview, or DataZone is a plus.
- Commitment to data governance.
- Experience working in an Agile environment.
- Experience supporting project teams of developers and data scientists who build web-based interfaces, dashboards, reports, and analytics/machine learning models.