Responsibilities
API Integration & Data Extraction
- Lead the integration of Fenergo APIs to extract relevant KYC and AML data, ensuring seamless connectivity and data flow between systems
- Design, develop, and maintain scalable data pipelines and ETL processes to support data ingestion from various sources, including databases, APIs, and flat files
- Ensure robust data extraction processes that maintain data quality and compliance with regulatory requirements
Data Processing & Pipeline Development
- Utilize Databricks and Apache Spark to design and implement robust data processing pipelines, ensuring high data quality and performance
- Work with Spark DataFrames to transform data and implement the Medallion Architecture (bronze, silver, and gold layers)
- Execute SQL queries for data extraction, manipulation, and complex data operations
- Join datasets and enrich reports with additional fields to deliver comprehensive analytical insights
- Leverage AI tools in Databricks to assist with data workflows and optimization
- Use notebooks as data transformation pipelines for efficient data processing
Data Analysis & Interpretation
- Analyze and interpret complex data sets to identify trends, patterns, and anomalies that can inform business decisions related to client and investor lifecycle management
- Understand and navigate the data model to ensure accurate data representation and reporting
- Conduct regular data quality assessments and audits to ensure data integrity and compliance with industry standards
- Perform root cause analysis to swiftly identify data issues and collaborate with relevant teams to implement effective solutions
Data Visualization & Reporting
- Architect and develop interactive Power BI dashboards and reports, translating complex data into clear, actionable insights for clients, leadership, and stakeholders
- Maintain dashboards and reports that track key performance indicators (KPIs) and operational metrics for KYC/AML processes
- Create visual representations that highlight critical data points for regular reporting to clients and senior management
- Ensure reports meet the needs of both technical and non-technical stakeholders
Collaboration & Stakeholder Management
- Collaborate with cross-functional teams, including IT, Compliance, Risk Management, business analysts, and senior management, to gather data requirements and deliver strategic insights
- Engage with clients and internal stakeholders to understand their reporting needs and ensure alignment with business objectives
- Work closely with KYC/AML operations teams to ensure data solutions support compliance and regulatory requirements
- Act as a bridge between technical teams and business users, translating complex data concepts into actionable business insights
Documentation & Governance
- Maintain comprehensive documentation of data processes, API integrations, data flows, and reporting solutions for future reference and compliance
- Document data governance practices and ensure adherence to data quality best practices
- Ensure all data handling complies with regulatory standards and internal policies
Continuous Improvement & Problem-Solving
- Recommend long-term product solutions to enhance data quality, accessibility, and usability
- Identify opportunities for process optimization and automation in data workflows
- Stay up to date with industry trends and best practices in data engineering, analysis, and management
- Proactively identify and resolve data-related issues, ensuring timely and accurate reporting
- Demonstrate creativity and insightfulness in developing dynamic approaches to complex data challenges
Quality Assurance
- Ensure data integrity throughout all pipelines and reporting mechanisms
- Implement data validation and quality control measures
- Monitor data processes and implement control mechanisms to ensure reliability
Skills
Core Data Engineering Skills (Required):
- Proficiency in Databricks and Apache Spark for data processing and pipeline development
- Strong knowledge of Power BI for data visualization and reporting, with ability to create executive-level dashboards
- Expert-level proficiency in SQL for data querying, manipulation, and complex analytical operations
- Experience with programming languages such as Python or R for data analysis and automation
- Strong understanding of data warehousing concepts and ETL processes
- Knowledge of data modeling concepts and best practices for data management
- Understanding of the Medallion Architecture and data lakehouse principles
- Experience working with DataFrames for data transformation
- Ability to leverage AI tools in Databricks to optimize data workflows
API & Integration Skills (Required):
- Strong experience in API integration for data extraction and system connectivity
- Ability to ensure seamless data flow between multiple systems
Cloud & Infrastructure (Required):
- Experience with cloud platforms, particularly Azure or AWS
- Experience connecting Databricks to Git for version control
- Experience integrating Databricks with AWS/Azure, including mounting cloud storage
- Understanding of data governance and data quality best practices
Additional Technical Skills (Preferred):
- Databricks administration skills
- Familiarity with machine learning concepts and their application in data analysis
- Experience with graph data models
- Understanding of data governance and compliance standards (GDPR, AML regulations, etc.)
- Knowledge of secure data handling practices