Diverse Lynx
Position: AI/LLM Agent QA Tester
Location: Tampa, FL - Onsite
Type of Hire: Long Term Contract
Summary: As a Functional Tester specializing in AI Agents, you will be responsible for designing, executing, and maintaining test cases that validate the functionality, reliability, and performance of AI-driven agent systems. You will work closely with development teams, AI specialists, and product owners to ensure that AI agents perform accurately and efficiently in real-world scenarios.
Key Responsibilities:
- AI/LLM-Specific QA: Designing test frameworks for LLM outputs (hallucination checks, factuality tests, toxicity/bias detection).
- Automation Frameworks: Pytest, Robot Framework, Playwright, Cypress, or custom LLM test harnesses.
- Evaluation Metrics: BLEU, ROUGE, BERTScore, GPT-based evaluators, human-in-the-loop validation.
- Data QA: Ensuring high-quality training/evaluation datasets, data validation (Great Expectations, Deequ).
- Python Automation: Scripting test cases, API test automation.
- CI/CD Integration: Automated test pipelines in Jenkins/GitHub Actions/GitLab CI.
- Performance Testing: Latency, throughput, stress testing of LLM-powered services.
- Leadership (Manager level): Define QA strategies for AI systems, lead test automation teams, enforce quality gates.