Posted 10mo ago

DevOps Engineer

@ Ascentt
Plano, Texas, United States
Onsite, Full Time
Responsibilities: Design cloud infrastructure, develop CI/CD pipelines, monitor systems and incidents
Requirements Summary: Extensive experience with AWS and Azure, GitHub Actions CI/CD, Datadog monitoring, container orchestration (Kubernetes/AWS ECS), IaC (Terraform, CloudFormation), Python/Bash scripting, and security best practices.
Technical Tools Mentioned: AWS, Azure, GitHub Actions, Datadog, Terraform, CloudFormation, Kubernetes, AWS ECS, Python, Bash
Job Description

Job Description (Summary of Responsibilities):

·       Cloud Infrastructure Management: Design, implement, and manage cloud-based infrastructure on AWS and Azure, ensuring optimal scalability, performance, and security.

·       CI/CD Pipeline Development: Develop and maintain CI/CD pipelines using GitHub Actions for automated code deployments and testing.

·       System Monitoring and Incident Management:

        o   Implement and configure Datadog for comprehensive system monitoring.

        o   Develop and maintain Datadog dashboards to visualize system performance and metrics.

        o   Set up proactive alerts in Datadog to detect and respond to incidents swiftly, ensuring high system reliability and uptime.

        o   Conduct root cause analysis of incidents and implement corrective actions using Datadog insights.

·       Collaboration with AI Teams: Work closely with AI teams to support the operational aspects of LLMs, including deployment strategies and performance tuning.

·       Infrastructure as Code (IaC): Implement IaC using tools like Terraform or AWS CloudFormation to automate infrastructure provisioning and management.

·       Container Orchestration: Manage container orchestration systems such as Kubernetes or AWS ECS.

·       Operational Support for LLMs: Provide operational support for LLMs, focusing on performance optimization and reliability.

·       Scripting and Automation: Utilize scripting languages such as Python and Bash for automation and task management.

·       Security and Compliance: Ensure compliance with security standards and best practices, implementing robust security measures.

·       Documentation: Document system configurations, procedures, and best practices for internal and external stakeholders.

·       DevOps Collaboration: Work with development teams to optimize deployment workflows, introduce best practices for DevOps, and improve overall efficiency.

·       Technology and Industry Awareness: Stay up-to-date with emerging technologies and industry trends to suggest improvements and upgrades.

Qualifications and Skills Required:

·       Extensive experience with AWS and Azure cloud platforms.

·       Proficiency in developing CI/CD pipelines using GitHub Actions.

·       Strong experience with Datadog for system monitoring, including implementation, configuration, and maintenance.

·       Demonstrated ability to create and maintain Datadog dashboards for performance visualization.

·       Proven expertise in setting up alerts and conducting incident response with Datadog.

·       Hands-on experience with container orchestration systems such as Kubernetes or AWS ECS.

·       Proficiency in Infrastructure as Code (IaC) tools like Terraform or AWS CloudFormation.

·       Familiarity with operational aspects of Large Language Models (LLMs) is highly desirable.

·       Strong scripting skills in Python, Bash, or similar languages.

·       In-depth knowledge of security standards and best practices.

·       Excellent documentation skills.

·       Proven ability to work collaboratively with development and AI teams.

·       Commitment to staying current with industry trends and emerging technologies.