The Mission:
As a Tier 3 Support Engineer, you will own the most challenging technical escalations from our support team. But your mission goes beyond fixing what's broken. You are the critical bridge between our users and our product development teams. You will translate the complex, messy reality of customer issues into precise, actionable engineering tasks. By streamlining communication and working cross-functionally, you will ensure that hard-won insights from the field directly shape the evolution of our platform, preventing future problems and building a more resilient product.
What You’ll Actually Do:
- Own the Escalation: Serve as the final internal point of contact for complex technical issues that Tier 1 & 2 cannot resolve. Take end-to-end ownership of critical incidents.
- Engineer Solutions, Not Just Workarounds: Perform root cause analysis on complex system defects, data inconsistencies, and performance bottlenecks within the Graph RAG architecture.
- Master the Architecture: Develop an expert-level understanding of our patented reView platform, including its graph database schemas, LLM integration points, vector search logic, and API layer.
- Own the Technical Resolution of Complex Customer Issues: Diagnose root causes, reproduce bugs, and implement or specify fixes rather than just documenting and handing off to Engineering. Translate findings into actionable engineering inputs only when escalation is truly required.
- Drive Cross-Functional Issue Resolution: Work directly with Support, Product, and Engineering to prioritize, validate, and co-own fixes. Ensure critical context is preserved, but take personal responsibility for delivering engineered solutions rather than acting solely as a communication channel.
- Elevate the Team: Create advanced troubleshooting playbooks, update internal documentation, and provide technical mentorship to Tier 1 and Tier 2 to help them close the knowledge gap.
- Champion Systemic Reliability: Analyze support trends to identify recurring issues and propose architectural or code-level changes to prevent them.
- Manage Critical Incidents: Lead technical bridges during high-severity outages, coordinating with cloud engineering and development teams to restore service and communicate status.
- Deep-Dive Debugging: Use advanced tooling to trace issues through the entire stack—from the user interface, through the API, into the graph queries, and down to the underlying cloud infrastructure.
- Ensure Compliance & Security: Troubleshoot issues within the context of secure, regulated environments, ensuring all solutions adhere to FedRAMP, CMMC, SOC 2, and other compliance frameworks.
What Makes You Dangerous (In a Good Way):
Must-Haves
- Deep Technical Expertise: 3+ years of experience in a support engineering, SRE, or DevOps role, with a proven ability to troubleshoot complex, distributed systems.
- Systems Thinking: Ability to diagnose issues not just by looking at a single error, but by understanding the interplay between APIs, graph databases (Neo4j/Memgraph), vector indexes, and cloud infrastructure.
- Query & Code Proficiency: Strong proficiency in query languages (Cypher, SQL) and a working knowledge of a programming language (Python, Java, or Go) to read logs, understand stack traces, and potentially write small scripts for data analysis.
- Cloud & Infrastructure Knowledge: Hands-on experience with AWS or Azure, including familiarity with networking concepts, IAM roles, and services like EC2, S3, or Lambda for troubleshooting deployment issues.
- Analytical Debugging: Mastery of debugging tools and techniques (e.g., log aggregators, API testers like Postman, browser dev tools, network analyzers) to isolate the root cause of failures.
- Communication with Precision: Ability to clearly document complex technical findings for both engineering stakeholders and, when necessary, translate the resolution for non-technical users or account managers.
- Calm Under Fire: The ability to lead technical troubleshooting during high-pressure incident scenarios without losing sight of the bigger picture.
Game-Changers (Nice-to-Haves)
- Graph Database Expertise: Deep experience with graph data models and query optimization in Neo4j, Memgraph, or similar.
- ML / LLM Ops Familiarity: Understanding of the LLM lifecycle, vector databases, prompt engineering, and common failure modes of RAG (Retrieval-Augmented Generation) systems.
- Experience with Observability Stacks: Proficiency with tools like Datadog, Grafana, OpenSearch, or Splunk for proactive monitoring and log analysis.
- Compliance & Clearance: Familiarity with FedRAMP, NIST 800-53, or CMMC. An active TS/SCI clearance is a plus for US-based candidates.
- Platform Expertise: Experience with Jira, Confluence, and integrating customer feedback directly into the development workflow.
- Certifications: AWS Certified Solutions Architect, or equivalent advanced cloud certification.