This is a remote position.
Location: Currently remote; may transition to onsite in the future
About the Role
Traditional QA doesn’t work for AI.
We’re looking for an AI QA Engineer who understands that AI systems are probabilistic, non-deterministic, and failure-prone by nature, and who knows how to test them anyway.
You will own AI quality, safety, reliability, and regression testing across agentic systems and AI-powered SaaS products.
What You’ll Be Doing
AI-Specific Testing & Validation
- Design and execute AI-specific test strategies (see the first sketch after this list), including:
  - Prompt robustness testing
  - Hallucination detection
  - Output consistency checks
  - Edge-case and adversarial testing
- Validate RAG pipelines (see the second sketch after this list):
  - Retrieval accuracy
  - Context relevance
  - Response grounding
- Test multi-agent workflows and tool integrations
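To make the output-consistency item concrete, here is a minimal sketch of the kind of check this role involves: sample a non-deterministic model several times and fail if any two responses diverge too far. The `generate` function is a hypothetical stand-in for the team's model client, and the similarity metric and threshold are illustrative assumptions, not a prescribed methodology.

```python
# Minimal output-consistency check for a non-deterministic model.
from difflib import SequenceMatcher
from itertools import combinations

def generate(prompt: str) -> str:
    """Hypothetical model call; replace with the real client."""
    raise NotImplementedError

def consistency_score(prompt: str, runs: int = 5) -> float:
    """Sample the model several times and return the worst pairwise
    similarity, so a single outlier response fails the check."""
    outputs = [generate(prompt) for _ in range(runs)]
    return min(
        SequenceMatcher(None, a, b).ratio()
        for a, b in combinations(outputs, 2)
    )

def test_refund_policy_answer_is_stable():
    # Threshold is illustrative; tune it per prompt and model.
    assert consistency_score("Summarize our refund policy.") >= 0.8
```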
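And a second sketch for the RAG validation items, under simple assumptions: retrieval accuracy as "gold document appears in the top-k results", and grounding as plain token overlap between the answer and the retrieved context. `retrieve` is a hypothetical hook into the pipeline under test; real grounding checks typically use an NLI model or LLM-as-judge scorer rather than raw overlap.

```python
# Simple RAG checks: recall@k for retrieval, token overlap for grounding.
def retrieve(query: str, k: int = 5) -> list[dict]:
    """Hypothetical retriever; returns [{'id': ..., 'text': ...}, ...]."""
    raise NotImplementedError

def recall_at_k(query: str, gold_ids: set[str], k: int = 5) -> float:
    """Fraction of gold documents found in the top-k retrieved results."""
    hits = {doc["id"] for doc in retrieve(query, k)}
    return len(hits & gold_ids) / len(gold_ids)

def grounding_overlap(answer: str, contexts: list[str]) -> float:
    """Fraction of answer tokens that also appear in the retrieved context."""
    answer_tokens = set(answer.lower().split())
    context_tokens = set(" ".join(contexts).lower().split())
    return len(answer_tokens & context_tokens) / max(len(answer_tokens), 1)
```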
Automation & Tooling
- Build automated AI test harnesses
- Create evaluation pipelines for AI responses
- Define regression tests for prompt and agent changes, as sketched below
- Integrate AI testing into CI/CD pipelines
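As an illustration of the regression-testing item, here is a minimal pytest sketch that could gate a CI pipeline. The golden cases, the `run_agent` hook, and the keyword-based scorer are all assumptions standing in for a real evaluation pipeline; the point is that prompt or agent changes fail the build when expected behavior regresses.

```python
# Golden-prompt regression suite runnable in CI via pytest.
import pytest

GOLDEN_CASES = [
    # (prompt, phrases the response must contain)
    ("What plans support SSO?", ["enterprise"]),
    ("How do I rotate an API key?", ["settings", "api key"]),
]

def run_agent(prompt: str) -> str:
    """Hypothetical entry point for the agent under test."""
    raise NotImplementedError

@pytest.mark.parametrize("prompt,required", GOLDEN_CASES)
def test_golden_prompt_regressions(prompt, required):
    response = run_agent(prompt).lower()
    missing = [p for p in required if p not in response]
    assert not missing, f"response missing expected phrases: {missing}"
```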
Collaboration & Quality Ownership
- Work closely with AI developers during design, not just after implementation
- Help define acceptance criteria for AI features
- Monitor AI behavior in production and flag drift or degradation (see the sketch after this list)
- Document AI failure modes and mitigations
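For the production-monitoring item, a dependency-free sketch of a drift check: compare a rolling window of production eval scores against a frozen baseline and flag degradation. The baseline value, window size, and alerting hook are assumptions; a statistical test such as a two-sample KS test is a common, stronger alternative to a mean comparison.

```python
# Drift check: alert when the rolling mean eval score falls below baseline.
from statistics import mean

BASELINE_MEAN = 0.87      # frozen at last release; illustrative value
MAX_DROP = 0.05           # alert if the mean score drops more than this

def check_drift(recent_scores: list[float]) -> None:
    current = mean(recent_scores)
    if BASELINE_MEAN - current > MAX_DROP:
        # Replace print with the team's real alerting hook.
        print(f"DRIFT: mean eval score fell {BASELINE_MEAN - current:.3f} "
              f"below baseline ({current:.3f} vs {BASELINE_MEAN:.3f})")

check_drift([0.84, 0.79, 0.81, 0.80])  # example window; triggers the alert
```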
Requirements
Must Have
- 4+ years of QA or testing experience
- Strong API testing and automation skills
- Understanding of AI/LLM behavior and limitations
- Experience testing non-deterministic systems
- Strong analytical and problem-solving mindset
Nice to Have
- Experience testing AI agents or RAG systems
- Experience writing Python-based test automation
- Familiarity with LLM APIs and prompt engineering
- Experience with observability tools for AI systems