Job Description:
We are seeking a highly skilled Site Reliability Engineer (SRE) with deep
expertise in AWS and Windows Server environments. The ideal candidate will
be responsible for ensuring the reliability, availability, and performance
of our cloud infrastructure and overseeing the execution of various projects
in line with business objectives.
Key Responsibilities:
- Design, implement, and maintain AWS infrastructure ensuring scalability and high availability utilizing infrastructure as code (IaC).
- Manage and optimize Windows Server environments, focusing on security and reliability.
- Collaborate with development and operations teams to automate and streamline processes.
- Monitor system performance and resolve issues to prevent outages.
- Participate in an on-call rotation to address urgent issues and maintain system integrity.
- Develop and maintain documentation for system configuration and procedures.
- Develop and implement automation scripts and tools to streamline deployment activities.
Requirements
Required Qualifications:
- Minimum of five years of experience in Cloud/SRE/DevOps or a related field.
- Proven experience with AWS services including EC2, VPC, S3, RDS, and others.
- Strong proficiency in managing Windows Server and Linux environments.
- Experience with AWS IAM and security protocols.
- Familiarity with tools like Terraform, PowerShell, and Docker for automation.
- Proficiency in writing comprehensive technical documentation.
Nice to Have:
- Expertise with Microsoft Entra ID (Azure AD) and AWS IAM.
- Knowledge of Windows Server Remote Desktop Services on AWS.
- Experience using SAML for authentication in Windows Domains.
- Familiarity with RDS databases (Oracle and MS SQL), especially conversion to AWS RDS.
- Experience in Identity and Access Management (IAM) across organizations and applications.