Posted 3mo ago

Platform SRE, Reliability Platform Engineer

@ Todyl
Denver, Colorado, United States
Onsite, Full Time
Responsibilities: Manage Kubernetes, build automation, improve monitoring
Requirements Summary: Experience managing production Linux systems at scale; Kubernetes; scripting (Python, Bash, or Perl); REST APIs; CI/CD; on-call incident response; networking basics; building Linux ISOs/AMIs.
Technical Tools Mentioned: Kubernetes, Linux, Python, Bash, Perl, REST APIs, CI/CD, Linux ISOs and AMIs, Networking
Job Description

About The Role

At Todyl, our Application Platform Engineering team is dedicated to building infrastructure, services and patterns that enable our application development teams to quickly and safely deploy services at the core of our security offering. As a member of this innovative team, you will play a pivotal role in designing and engineering cutting-edge solutions that are highly performant, highly resilient and low maintenance. Your work will not only directly impact the reliability and security of our platform but also empower the engineering team to continuously push the boundaries of what's possible in the security space.

Responsibilities:

  • As a Platform SRE, Reliability Platform Engineer at Todyl, you will develop tools and services that support Todyl’s application hosting infrastructure, including but not limited to Kubernetes (K8s) and bare metal.

  • Implement and enforce security policies, access control and system patching. 

  • Build automation to improve reliability and reduce manual intervention in Day 2 operations.

  • Collaborate with product and engineering to deliver solutions that meet the needs of stakeholders and the business.

  • Improve application monitoring and alerting to minimize time to detect and time to restore.

  • Participate in a weekly on-call rotation with the team and be available during off-hours for emergency pages. 

Requirements: 

  • Experience managing production Linux systems at scale.

  • MUST HAVE: Experience managing Kubernetes (K8s) and the applications running on it.

  • MUST HAVE: General competency in one or more scripting languages, such as Python, Perl, or Bash.

  • Working knowledge of REST APIs. 

  • Familiarity with building custom Linux ISOs and AMIs. 

  • Familiarity with networking fundamentals. 

  • Ability to quickly learn new concepts, frameworks, and technologies. 

  • Comfortable building and maintaining production services. 

  • Production experience using CI/CD for code deployment. 

  • Experience with on-call rotations and incident response processes. 

Todyl provides equal employment opportunities to all employees and applicants for employment without regard to race, color, religion, gender, sexual orientation, transgender status, gender identity or expression, national origin, age, disability, marital status, genetic information, military status or any other status protected by applicable federal, state or local laws.