Posted 2d ago

Site Reliability Engineer

@ General Atomics Aeronautical Systems
San Diego, California, United States
$98k-$171k/yr, Hybrid, Full Time
Responsibilities: Manage monitoring, Develop dashboards, Perform health checks
Requirements Summary: Bachelor’s degree in IT or related field; 5-7 years in monitoring; Windows/Linux, virtualization, cloud, networking; SolarWinds Orion and DataDog experience; scripting; Agile; US citizenship.
Technical Tools Mentioned: SolarWinds Orion, DataDog, PowerShell, Python, VMware, Hyper-V, Azure, AWS
Job Description

Job Summary

General Atomics (GA), and its affiliated companies, is one of the world’s leading resources for high-technology systems development ranging from the nuclear fuel cycle to remotely piloted aircraft, airborne sensors, and advanced electric, electronic, wireless and laser technologies.

We have an exciting opportunity for a Site Reliability Engineer to join our ITS division. This position will be located in San Diego (Rancho Bernardo), CA.

This position will be accountable for installing, configuring and administering the company’s monitoring system.

DUTIES AND RESPONSIBILITIES:
  • Capacity management for monitoring system
  • Development of dashboards and views for Support, Management and Application teams
  • Ensuring the correct metrics are being captured for IT reporting
  • Ensure monitoring and alerting are aligned to future needs
  • Implement efficient event management processes and automation
  • Configure and maintain central monitoring platforms
  • Perform daily health checks and break/fix support on monitoring platforms
  • Manage new technology integrations into monitoring systems
  • Participate in the governance of application and infrastructure monitoring design, implementation, customization and support
  • Grow everyone's monitoring skill set through mentoring, coaching and training
  • Active involvement in projects to ensure development of additional monitoring needs from application teams
  • Troubleshooting endpoints/agents
  • Adding new servers to the monitoring environment
  • Removing decommissioned servers from monitoring
  • Recommend hardware and software solutions to meet the organization's needs and define best practices
  • Provide periodic and ad-hoc reports
  • Responsible for observing all laws, regulations and other applicable obligations wherever and whenever business is conducted on behalf of the Company. Expected to work in a safe manner in accordance with established operating procedures and practices.
  • Other duties as assigned or as required.

We recognize and appreciate the value and contributions of individuals with diverse backgrounds and experiences and welcome all qualified individuals to apply.

 

Job Category: Information Technology
Travel Percentage Required: 0% - 25%
Full-Time/Part-Time: Full-Time Salary
State: California
Pay Range Low: 98,100
City: San Diego
Clearance Required?: No
Pay Range High: 171,398
Recruitment Posting Title: Site Reliability Engineer

Job Qualifications

  • Typically requires a bachelor's degree in information technology or a related discipline and ten or more years of progressive professional experience in an information technology department primarily in systems administration. May substitute equivalent working experience in the field in lieu of education.
  • Strong knowledge of Windows/Linux servers, virtualization (VMware, Hyper-V), cloud platforms (Azure, AWS), networking (TCP/IP, DNS, DHCP), and storage.
  • Experience with enterprise monitoring solutions, particularly SolarWinds Orion and DataDog.
  • Ability to write scripts (e.g., PowerShell, Python) for automation.
  • Excellent analytical and root cause analysis skills.
  • Strong verbal and written communication.
  • Agile Methodology
  • Basic troubleshooting skills
  • Ability to write documentation
  • Ability to work with and coordinate vendors
  • Ability to troubleshoot issues and quickly determine escalation points
  • Possesses both progressive technology experience and project planning acumen, ensuring on-time completion of tasks and projects in a team setting.
  • Is nimble, forward-thinking, and thoughtful.
  • At home in a hybrid, “roll up the sleeves” role working with technology teams.

Required Experience:
  • 5-7 years of experience managing and supporting monitoring environments in a large enterprise.
  • 7-10 years of experience supporting computer systems with at least 2-3 years of experience with Microsoft Azure and AWS.
  • 5 years of experience with various networking protocols and their use, particularly TCP/IP, DHCP, DNS and WINS.
  • Certification in one or more of the following is a great plus:
    • AWS  
    • Azure  
    • ITIL  
    • Agile  
  • Excellent communication skills with both technical and non-technical audiences.
  • Experience with Agile & DevOps methodologies, tools and measurement practices.  
  • Ability to multi-task and operate with a strong sense of urgency.  
  • Can-do attitude toward problem solving and openness to evaluating and accepting new ideas.
  • US Citizenship is required.

Physical Demands:
  • Ability to lift 50 pounds

US Citizenship Required?: Yes
Experience Level: Senior (8+ years)
Relocation Assistance Provided?: Yes
Workstyle: Hybrid