Key Responsibilities
Incident & Problem Management
Provide expert-level support for NetApp storage environments, including performance, capacity, data protection, and interoperability issues.
Own NetApp incidents end-to-end from triage through resolution, including clear customer communication.
Act as the escalation point for Sev 1/2 storage incidents, rapidly restoring service and protecting critical business workloads.
Perform root cause analysis for major storage incidents and implement corrective and preventive actions.
Hardware & RMA Coordination
Lead and coordinate hardware replacements (drives, controllers, shelves) and the associated RMAs with NetApp and logistics partners.
Manage the end-to-end RMA lifecycle: identification, RMA creation, shipment tracking, onsite installation coordination, and post-replacement validation.
Ensure accurate documentation of RMA activities, asset updates, and configuration changes within customer and internal systems.
Proactive Health, Capacity & Best Practice Reviews
Use NetApp tools (e.g., Active IQ) to identify risks, configuration deviations, and performance/capacity issues.
Conduct periodic environment health and best-practice reviews, presenting findings and remediation plans to customers.
Recommend and support configuration optimizations, firmware/ONTAP upgrades, and data layout improvements.
Change & Upgrade Support
Plan and support NetApp-related change windows (e.g., ONTAP upgrades, shelf additions, hardware refreshes).
Create and review change plans, rollback procedures, and validation checklists for storage changes.
Ensure changes are executed safely and documented thoroughly.
Customer & Cross-Vendor Collaboration
Serve as a trusted technical advisor to customers on NetApp and storage-related architecture decisions.
Collaborate with networking, virtualization, and backup teams to resolve multi-domain issues (e.g., VMware + NetApp, SAN fabric, backup tools).
Coordinate with NetApp support as needed while maintaining AHEAD’s role as the single point of contact for the customer.
Documentation & Continuous Improvement
Develop and maintain runbooks, troubleshooting guides, and knowledge base articles for NetApp technologies.
Identify opportunities to automate health checks, reporting, and common remediation tasks.
Mentor and cross-train other engineers on NetApp storage technologies and best practices.
Required Skills & Experience
Technical Expertise - Strong hands-on experience with:
NetApp AFF and FAS platforms
ONTAP/ONTAP Select (clustered architecture, SnapMirror/SnapVault, etc.)
E-Series and/or All SAN Arrays (as applicable)
Storage networking concepts (Fibre Channel, iSCSI, NFS, SMB)
Performance tuning and capacity planning in storage environments
Familiarity with Active IQ or similar analytics tools for storage health and risk analysis.
Experience
8+ years in storage engineering/administration/support, ideally with a heavy NetApp focus.
Experience supporting mission-critical, 24x7 enterprise environments.
Background in incident, problem, and change management (ITIL desirable).
Soft Skills
Clear, concise communication with both technical and non-technical stakeholders.
Ability to manage multiple incidents and projects simultaneously.
Strong sense of ownership and accountability for customer outcomes.
Education & Certifications (Preferred)
Relevant NetApp certifications such as:
NetApp Certified Data Administrator (NCDA)
NetApp Certified Implementation Engineer (NCIE)
Other storage/virtualization certifications (e.g., VMware, Cisco) are a plus.
Success Measures (Role-Level KPIs)
Reduction in Sev 1 incident rate per NetApp array or environment.
RMA cycle times (initiation to fully validated replacement) meeting or beating target.
Active IQ risk remediation rates and decline in critical/high-risk findings over time.
Improvement in storage environment health scores (capacity, performance, configuration compliance).
High customer satisfaction (CSAT) scores for NetApp-related incidents and reviews.