We are looking for a seasoned Senior Data Engineer to work with our team and our clients to develop enterprise-grade data platforms, services, and pipelines. We want more than just a "Senior Data Engineer": a technologist with excellent communication and customer-service skills and a passion for data and problem solving.
Responsibilities include:
- Lead and architect the migration of data environments with a focus on performance and reliability
- Assess and understand existing ETL jobs, workflows, BI tools, and reports
- Address technical inquiries concerning customization, integration, enterprise architecture, and general features and functionality of data products
- Craft database / data warehouse solutions in the cloud (preferably AWS; alternatively Azure or GCP)
- Support an Agile software development lifecycle
- You will contribute to the growth of our Data Exploitation Practice!
Required:
- Ability to hold a position of public trust with the US government.
- No degree and 14+ years of work experience; OR
- Associate's degree and 12+ years of work experience; OR
- Bachelor's degree and 10+ years of work experience; OR
- Master's degree and 8+ years of work experience.
- Experience in Data Engineering, building scalable data pipelines and platforms.
- Strong programming experience in Python, Java, or Scala.
- Advanced SQL expertise, including query optimization, data modeling, and working with large-scale relational databases.
- Hands-on experience with both relational (e.g., PostgreSQL) and NoSQL databases (e.g., Cassandra, DynamoDB, MongoDB).
- Proven experience designing and building end-to-end data pipelines that ingest, transform, and unify data from multiple disparate sources.
- Experience working with structured, semi-structured, and unstructured data (e.g., JSON, logs, documents, PDFs), including parsing and normalization.
- Experience with data pipeline orchestration tools such as Airflow, Luigi, or equivalent.
- Strong understanding of data architecture patterns, including layered approaches such as Medallion Architecture (Bronze/Silver/Gold).
- Experience designing unified data platforms that consolidate data from multiple formats and sources into an analytics-ready foundation.
- Experience with data modeling, schema design, and handling schema evolution.
- Experience with cloud platforms, preferably AWS (e.g., S3, Glue, EMR, Lambda, Redshift), or equivalent.
- Experience ensuring data quality, integrity, and reliability, including validation and monitoring practices.
- Ability to work with large, complex, and disconnected datasets and derive meaningful structure.
- Experience working in Agile environments and collaborating with cross-functional teams.
Preferred:
- Experience with search and indexing technologies such as Elasticsearch, OpenSearch, Solr, or Lucene.
- Experience integrating legacy systems with modern cloud-based data platforms, including data migration and transformation.
- Familiarity with data governance, lineage, and metadata management practices.
- Experience building data systems that support user-facing applications, analytics platforms, or search systems.
- Experience working in design-driven, cross-functional teams (product, UX, engineering).
- Experience supporting AI/ML workflows, including feature engineering and data preparation.
Nice to Have:
- Experience with vector databases and embeddings for semantic search or retrieval use cases (e.g., Weaviate, Pinecone, PostgreSQL pgvector).
- Familiarity with LLM-powered systems, including Retrieval-Augmented Generation (RAG) or natural language search.
- Experience with graph databases (e.g., Neo4j, Amazon Neptune) and modeling complex relationships.
- Experience designing systems for modern search or information retrieval use cases.
- Experience with CI/CD pipelines and modern DevOps practices.
- Experience with distributed data processing frameworks such as Spark (PySpark preferred).
- Familiarity with streaming or real-time data processing (e.g., Kafka, Spark Streaming).