Job Title: Associate Data Architect – Master Data Management (MDM)
Location: Pune - Hybrid
Experience: 10+ years of experience in Data Architecture and Data Engineering/Integration, with strong exposure to Data Modelling and Database (RDBMS) Management.
About the Role
We are seeking an Associate Data/Database Architect to join our core product architecture team building an enterprise-grade, multi-domain Master Data Management (MDM) product platform.
You will play a key role in optimizing and extending the MDM data model, implementing efficient data ingestion and entity resolution mechanisms, and ensuring the system supports multiple domains such as Party (Individual/Organization), Product, Location, Policy, and Relationship in a cloud-native and scalable manner.
Key Responsibilities

Data Modeling & Architecture
- Enhance and extend the existing Party-based data model into a multi-domain MDM schema (Party, Product, Location, Relationship, Policy, etc.).
- Design and maintain canonical data models and staging-to-core mappings for multiple source systems.
- Implement auditability, lineage, and soft-delete frameworks within the MDM data model.
- Contribute to the creation of golden records, trust scores, match/merge logic, and data survivorship rules.
- Ensure the model supports real-time and batch data mastering across multiple domains.
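To illustrate the survivorship idea mentioned above, here is a minimal sketch of building a golden record from matched source records. It is illustrative only: the source-system names, trust scores, and attributes are hypothetical and do not describe this platform's actual implementation.

```python
# Illustrative survivorship sketch: for each attribute, the golden record
# keeps the value from the source system with the highest trust score.
# Source names and trust scores below are hypothetical examples.

TRUST = {"crm": 0.9, "billing": 0.7, "legacy": 0.4}  # per-source trust scores

def golden_record(records):
    """records: list of (source_system, {attribute: value}) pairs
    that have already been matched as the same real-world entity."""
    golden = {}
    best = {}  # attribute -> trust score of the value currently chosen
    for source, attrs in records:
        score = TRUST.get(source, 0.0)
        for attr, value in attrs.items():
            if value is not None and score > best.get(attr, -1.0):
                golden[attr] = value
                best[attr] = score
    return golden

matched = [
    ("legacy",  {"name": "J. Smith", "phone": "555-0100"}),
    ("crm",     {"name": "John Smith", "phone": None}),
    ("billing", {"name": "John Smith", "email": "j.smith@example.com"}),
]
print(golden_record(matched))
# name wins from crm (highest trust); phone survives from legacy
# because crm has no value; email comes from billing.
```

Real survivorship rules are usually richer (recency, completeness, per-attribute rules), but the trust-ranked attribute selection above is the core pattern.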
Data Engineering & Integration
- Support and optimize data ingestion and ETL/ELT pipelines using Python, PySpark, SQL, and/or Informatica (or equivalent tools).
- Design and implement data validation, profiling, and quality checks to ensure consistent master data.
- Work on data harmonization, schema mapping, and standardization across multiple source systems.
- Help build efficient ETL mappings from canonical staging layers to MDM core data models in PostgreSQL.
- Develop REST APIs or streaming pipelines (Kafka/Spark) for real-time data processing and entity resolution.
Cloud & Platform Engineering
- Implement and optimize data pipelines on AWS or Azure using native services (e.g., AWS Glue, Lambda, S3, Redshift, Azure Data Factory, Synapse, Data Lake).
- Deploy and manage data pipelines and databases following cloud-native, cost-effective, and scalable design principles.
- Collaborate with DevOps teams on CI/CD, infrastructure-as-code, and automation of data pipeline and database deployment/migration.
Governance, Security & Compliance
- Implement data lineage, versioning, and stewardship processes.
- Ensure compliance with data privacy and security standards (GDPR, HIPAA, etc.).
- Partner with Data Governance teams to define data ownership, data standards, and stewardship workflows.
Requirements

Technical Skills Required

Core Skills
- Data Modelling: Expert-level relational (3NF) and dimensional (Star/Snowflake) modelling; hands-on experience with Party data models, multi-domain MDM, and canonical models.
- Database: PostgreSQL (preferred) or any enterprise RDBMS.
- ER Modelling Tools: Erwin/ER/Studio, Database Markup Language (DBML).
- ETL / Data Integration: Informatica, Python, PySpark, SQL, or similar tools.
- Cloud Platforms: AWS or Azure.
- Programming: Advanced SQL, Python, PySpark, and/or UNIX/Linux scripting.
- Data Quality & Governance: Familiarity with data quality rules, profiling, match/merge, and entity resolution.
- DevOps, Version Control & CI/CD: Git, Azure DevOps, Jenkins, Terraform, Redgate Flyway (preferred).
Database Design & Optimization (PostgreSQL)
- Design and maintain normalized and denormalized models using advanced features (schemas, partitions, views, CTEs, JSONB, arrays).
- Build and optimize complex SQL queries, materialized views, and data marts for performance and scalability.
- Tune RDBMS (PostgreSQL) performance: indexes, query plans, vacuum/analyze, statistics, parallelism, and connection management.
- Leverage RDBMS (PostgreSQL) extensions such as:
  - pg_trgm for fuzzy matching and probabilistic search.
  - fuzzystrmatch and pgvector for semantic similarity and name matching.
  - hstore and jsonb for flexible attribute storage.
- Implement RBAC, row-level security, partitioning, and logical replication for scalable MDM deployment.
- Work with stored procedures, functions, and triggers for data quality checks and lineage automation.
- Implement HA/DR, backup/restore, database-level encryption (at rest, in transit), and column-level encryption for PII/PHI data.
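As a rough illustration of the fuzzy-matching idea behind pg_trgm, the sketch below re-implements trigram similarity in plain Python. It approximates pg_trgm's behaviour for single words only (the extension also handles word splitting and indexing) and is not a substitute for the extension itself.

```python
# Approximate pg_trgm-style trigram similarity for a single word:
# lowercase, pad with two leading blanks and one trailing blank,
# extract all 3-grams, then score as |shared| / |union| of trigram sets.

def trigrams(word):
    padded = "  " + word.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a, b):
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

print(similarity("smith", "smith"))  # identical names -> 1.0
print(similarity("smith", "smyth"))  # partial overlap -> between 0 and 1
print(similarity("smith", "jones"))  # no shared trigrams -> 0.0
```

This set-overlap scoring is why pg_trgm tolerates typos and spelling variants: a one-letter change only disturbs a few trigrams, so most of the set still matches.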
Good to Have
- Knowledge of Master Data Management (MDM) domains such as Customer, Product, etc.
- Experience with graph databases such as Neo4j for relationship and lineage tracking.
- Knowledge of probabilistic and deterministic matching, ML-based entity resolution, or AI-driven data mastering.
- Experience with data cataloging, data lineage tools, or metadata management platforms.
- Familiarity with data security frameworks and Well-Architected Framework principles.
Soft Skills
- Strong analytical, conceptual, and problem-solving skills.
- Ability to collaborate in a cross-functional, agile environment.
- Excellent communication and documentation skills.
- Self-driven, proactive, and capable of working with minimal supervision.
- Strong desire to innovate and build scalable, reusable data frameworks.
Education
- Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related discipline.
- Certifications in AWS/Azure, Informatica, or Data Architecture are a plus.
Benefits

Why Join Us
- Be part of a cutting-edge MDM product initiative blending data architecture, engineering, AI/ML, and cloud-native design.
- Opportunity to shape the next-generation data mastering framework for multiple industry domains.
- Gain deep exposure to data mastering, lineage, probabilistic search, and graph-based relationship management.
- Competitive compensation, flexible working, and a technology-driven culture.