Data Architect - MDM

@ Dataeconomy
Pune City, Maharashtra, India
Hybrid · Full Time
Responsibilities: Data modeling, Data ingestion, Data integration
Requirements Summary: 10+ years in Data Architecture, Data Engineering/Integration; strong MDM, data modelling, and RDBMS management.
Technical Tools Mentioned: PostgreSQL, Python, PySpark, SQL, Informatica, AWS, Azure, Erwin, ERStudio, DBML, Git, Jenkins, Terraform, Redgate Flyway, Kafka, Spark
Job Description

Job Title: Associate Data Architect – Master Data Management (MDM)

Location: Pune - Hybrid

Experience: 10+ years of experience in Data Architecture and Data Engineering/Integration, with strong exposure to Data Modelling and Database (RDBMS) Management.

About the Role

We are seeking an Associate Data/Database Architect to join our core product architecture team building an enterprise-grade, multi-domain Master Data Management (MDM) product platform.

You will play a key role in optimizing and extending the MDM data model, implementing efficient data ingestion and entity resolution mechanisms, and ensuring the system supports multiple domains such as Party (Individual/Organization), Product, Location, Policy, and Relationship in a cloud-native and scalable manner.

 

Key Responsibilities

Data Modeling & Architecture

  • Enhance and extend the existing Party-based data model into a multi-domain MDM schema (Party, Product, Location, Relationship, Policy, etc.).
  • Design and maintain canonical data models and staging-to-core mappings for multiple source systems.
  • Implement auditability, lineage, and soft-delete frameworks within the MDM data model.
  • Contribute to the creation of golden records, trust scores, match/merge logic, and data survivorship rules (see the survivorship sketch after this list).
  • Ensure the model supports real-time and batch data mastering across multiple domains.
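
For illustration only: a minimal Python sketch of one common survivorship rule (source priority, with recency as tie-breaker) for building a golden record. The source names, trust ranking, and record fields are assumptions, not details from this role.

```python
from datetime import datetime

# Assumed trust ranking (lower = more trusted); illustrative only.
SOURCE_PRIORITY = {"CRM": 0, "BILLING": 1, "WEB_FORM": 2}

def build_golden_record(cluster):
    """Merge a cluster of matched source records into one golden record.

    Survivorship rule: for each attribute, prefer the most trusted source;
    break ties with the most recently updated record.
    """
    golden = {}
    attrs = {k for rec in cluster for k in rec if k not in ("source", "updated_at")}
    for attr in attrs:
        candidates = [r for r in cluster if r.get(attr) is not None]
        winner = min(
            candidates,
            key=lambda r: (
                SOURCE_PRIORITY.get(r["source"], 99),
                -datetime.fromisoformat(r["updated_at"]).timestamp(),
            ),
        )
        golden[attr] = winner[attr]
    return golden

cluster = [
    {"source": "WEB_FORM", "updated_at": "2024-05-01T10:00:00",
     "name": "J. Smith", "email": "jsmith@example.com"},
    {"source": "CRM", "updated_at": "2024-03-15T09:30:00",
     "name": "John Smith", "email": None},
]
# Golden record: name survives from CRM (higher trust),
# email from WEB_FORM (the only non-null value).
print(build_golden_record(cluster))
```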

 

Data Engineering & Integration

  • Support and optimize data ingestion and ETL/ELT pipelines using Python, PySpark, SQL, and/or Informatica (or equivalent tools); see the PySpark sketch after this list.
  • Design and implement data validation, profiling, and quality checks to ensure consistent master data.
  • Work on data harmonization, schema mapping, and standardization across multiple source systems.
  • Help build efficient ETL mappings from canonical staging layers to MDM core data models in PostgreSQL.
  • Develop REST APIs or streaming pipelines (Kafka/Spark) for real-time data processing and entity resolution.
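
A minimal PySpark sketch of the kind of staging-to-core step described above: standardize, validate, and deduplicate before loading the core model. Paths, table names, and columns are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("mdm-staging-to-core").getOrCreate()

# Read a (hypothetical) canonical staging layer for the Party domain.
staging = spark.read.parquet("s3://example-bucket/staging/party/")

# Keep only the latest record per natural key.
latest = Window.partitionBy("source_system", "source_party_id") \
               .orderBy(F.col("updated_at").desc())

core_ready = (
    staging
    # Standardization: trim/case-normalize key attributes.
    .withColumn("full_name", F.upper(F.trim(F.col("full_name"))))
    .withColumn("email", F.lower(F.trim(F.col("email"))))
    # Validation: require a natural key and a plausibly formed email.
    .filter(F.col("source_party_id").isNotNull())
    .filter(F.col("email").rlike(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"))
    # Deduplication: rank by recency and keep the first row per key.
    .withColumn("rn", F.row_number().over(latest))
    .filter(F.col("rn") == 1)
    .drop("rn")
)

# Load into the (hypothetical) core model table.
core_ready.write.mode("append").saveAsTable("mdm_core.party_staged")
```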

 

Cloud & Platform Engineering

  • Implement and optimize data pipelines on AWS or Azure using native services (e.g., AWS Glue, Lambda, S3, Redshift, Azure Data Factory, Synapse, Data Lake); a small trigger sketch follows this list.
  • Deploy and manage data pipelines and databases following cloud-native, cost-effective, and scalable design principles.
  • Collaborate with DevOps teams on CI/CD, infrastructure-as-code, and data pipeline and database deployment/migration automation.
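
One way the AWS services above can fit together, sketched in Python: an S3 put event triggers a Lambda that starts a Glue ETL run. The job name and argument key are assumed, not real configuration.

```python
import boto3

glue = boto3.client("glue")

def handler(event, context):
    """Triggered by an S3 put event; starts the downstream Glue ETL run."""
    s3_info = event["Records"][0]["s3"]
    bucket = s3_info["bucket"]["name"]
    key = s3_info["object"]["key"]

    # "mdm-party-ingest" and "--input_path" are assumed names for illustration.
    run = glue.start_job_run(
        JobName="mdm-party-ingest",
        Arguments={"--input_path": f"s3://{bucket}/{key}"},
    )
    return {"JobRunId": run["JobRunId"]}
```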

 

Governance, Security & Compliance

  • Implement data lineage, versioning, and stewardship processes.
  • Ensure compliance with data privacy and security standards (GDPR, HIPAA, etc.).
  • Partner with Data Governance teams to define data ownership, data standards, and stewardship workflows.



Requirements

Technical Skills Required

Core Skills 

  • Data Modelling: Expert-level in Relational (3NF) and Dimensional (Star/Snowflake) modelling; hands-on with the Party data model, multi-domain MDM, and canonical models.
  • Database: PostgreSQL (preferred), or any enterprise RDBMS.
  • ER Modelling Tools: Erwin/ERStudio, Database Markup Language (DBML).
  • ETL / Data Integration: Informatica, Python, PySpark, SQL, or similar tools.
  • Cloud Platforms: AWS or Azure.
  • Programming: Advanced SQL, Python, PySpark, and/or UNIX/Linux scripting.
  • Data Quality & Governance: Familiarity with data quality rules, profiling, match/merge, and entity resolution.
  • DevOps - Version Control & CI/CD: Git, Azure DevOps, Jenkins, Terraform, Redgate Flyway (preferred).

 

Database Design & Optimization (PostgreSQL)

  • Design and maintain normalized and denormalized models using advanced features (schemas, partitions, views, CTEs, JSONB, arrays).
  • Build and optimize complex SQL queries, materialized views, and data marts for performance and scalability.
  • Tune RDBMS (PostgreSQL) performance – indexes, query plans, vacuum/analyze, statistics, parallelism, and connection management.
  • Leverage RDBMS (PostgreSQL) extensions such as:
    • pg_trgm for fuzzy matching and probabilistic search (see the sketch after this list).
    • fuzzystrmatch and pgvector for semantic similarity and name matching.
    • hstore and jsonb for flexible attribute storage.
  • Implement RBAC, row-level security, partitioning, and logical replication for scalable MDM deployment.
  • Work with stored procedures, functions, and triggers for data quality checks and lineage automation.
  • Implement HA/DR, backup/restore, database-level encryption (at-rest, in-transit), and column-level encryption for PII/PHI data.
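
A short Python sketch of pg_trgm-based fuzzy candidate matching for entity resolution, as named above. The connection string and the party table/columns are assumptions; pg_trgm's similarity() function and gin_trgm_ops index class are standard PostgreSQL.

```python
import psycopg2

conn = psycopg2.connect("dbname=mdm user=mdm_app")  # assumed DSN
with conn, conn.cursor() as cur:
    # One-time setup (needs privileges): enable pg_trgm and index the name
    # column with trigram ops so the % match operator can use the index.
    cur.execute("CREATE EXTENSION IF NOT EXISTS pg_trgm;")
    cur.execute("""
        CREATE INDEX IF NOT EXISTS idx_party_name_trgm
        ON party USING gin (full_name gin_trgm_ops);
    """)

    # Candidate matches for a misspelled name, best first. %% escapes the
    # pg_trgm % operator inside a psycopg2 query string.
    cur.execute("""
        SELECT party_id, full_name, similarity(full_name, %(q)s) AS score
        FROM party
        WHERE full_name %% %(q)s
        ORDER BY score DESC
        LIMIT 10;
    """, {"q": "Jon Smiht"})
    for party_id, name, score in cur.fetchall():
        print(party_id, name, round(score, 2))
```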

 

Good to Have 

  • Knowledge of Master Data Management (MDM) domains - Customer, Product, etc.
  • Experience with graph databases like Neo4j for relationship and lineage tracking.
  • Knowledge of probabilistic and deterministic matching, ML-based entity resolution, or AI-driven data mastering.
  • Experience with data cataloging, data lineage tools, or metadata management platforms.
  • Familiarity with data security frameworks and Well-Architected Framework principles.

 

Soft Skills 

  • Strong analytical, conceptual, and problem-solving skills.
  • Ability to collaborate in a cross-functional, agile environment.
  • Excellent communication and documentation skills.
  • Self-driven, proactive, and capable of working with minimal supervision.
  • Strong desire to innovate and build scalable, reusable data frameworks.

 

Education 

  • Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related discipline.
  • Certifications in AWS/Azure, Informatica, or Data Architecture are a plus.


Benefits

Why Join Us 

  • Be part of a cutting-edge MDM product initiative blending data architecture, engineering, AI/ML, and cloud-native design.
  • Opportunity to shape the next-generation data mastering framework for multiple industry domains.
  • Gain deep exposure to data mastering, lineage, probabilistic search, and graph-based relationship management.
  • Competitive compensation, flexible working, and a technology-driven culture.