SDET-II

Automation • 4+ years of experience

Job Description

The SDET-II will own the end-to-end qualification lifecycle for AI/LLM systems, designing and implementing scalable automated test suites across unit, integration, regression, and system levels. This role involves building and enhancing frameworks to test, evaluate, and continuously improve complex AI and LLM workflows. The SDET-II will lead the design and automation of LLM-powered features, develop evaluation pipelines for factual accuracy, hallucination rates, bias, robustness, and overall model reliability, and define metrics-driven quality gates. Collaborating with agile engineering teams, developing monitoring systems for production quality, and championing quality engineering practices are also key responsibilities.

What’s in it for you?

  • Own the end-to-end qualification lifecycle for AI/LLM systems from ideation and implementation to CI/CD integration.
  • Design and implement scalable automated test suites across unit, integration, regression, and system levels.
  • Build and enhance frameworks to test, evaluate, and continuously improve complex AI and LLM workflows.
  • Lead the design and automation of LLM-powered features, including prompt pipelines, RAG workflows, and AI-assisted developer tools.
  • Develop evaluation pipelines to measure factual accuracy, hallucination rates, bias, robustness, and overall model reliability.
  • Define and enforce metrics-driven quality gates and experiment tracking workflows to ensure consistent, data-informed releases (a minimal evaluation-gate sketch follows this list).
  • Collaborate with agile engineering teams, participating in design discussions, code reviews, and architecture decisions to drive testability and prevent defects early (“shift left”).
  • Develop monitoring and alerting systems to track LLM production quality, safety, and performance in real time.
  • Conduct robustness, safety, and adversarial testing to validate AI behavior under edge cases and stress scenarios.
  • Continuously improve frameworks, tools, and processes for LLM reliability, safety, and reproducibility.
  • Mentor junior engineers in AI testing, automation, and quality best practices.
  • Measure and improve Developer Experience (DevEx) through tools, feedback loops, and automation.
  • Champion quality engineering practices across the organization, ensuring delivery meets business goals, user-experience standards, and operating-cost targets.
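
To make the quality-gate bullet above concrete, here is a minimal sketch of an evaluation pipeline wired up as a CI gate. Everything in it is illustrative: the golden dataset, the 0.90 threshold, and the `ask_model` stub (standing in for a real LLM call) are assumptions, not part of this role's actual stack.

```python
"""Minimal sketch of a metrics-driven quality gate for an LLM feature."""
import sys

# Hypothetical golden dataset: (prompt, substrings the answer must contain).
GOLDEN = [
    ("What is the capital of France?", ["paris"]),
    ("Who wrote Hamlet?", ["shakespeare"]),
]

def ask_model(prompt: str) -> str:
    # Placeholder for a real LLM call (e.g., an OpenAI or Anthropic client).
    return "Paris is the capital of France." if "France" in prompt else "William Shakespeare."

def is_correct(answer: str, required: list[str]) -> bool:
    # Crude factual-accuracy check: every required fact appears in the answer.
    return all(fact in answer.lower() for fact in required)

def main() -> int:
    correct = sum(is_correct(ask_model(p), req) for p, req in GOLDEN)
    accuracy = correct / len(GOLDEN)
    print(f"factual accuracy: {accuracy:.2%} ({correct}/{len(GOLDEN)})")
    # Quality gate: fail the CI job when accuracy drops below the threshold.
    if accuracy < 0.90:
        print("quality gate FAILED: accuracy below 0.90", file=sys.stderr)
        return 1
    print("quality gate passed")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

In a real pipeline the same scoring function would also feed regression dashboards and production monitoring, keeping offline gates and live alerts consistent.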

We’d love to hear from you if you have experience with:

  • LLM testing & evaluation tools: MaximAI, OpenAI Evals, TruLens, Promptfoo, LangSmith
  • Building LLM-powered apps: prompt pipelines, embeddings, RAG, AI workflows (a toy RAG sketch follows this list)
  • CI/CD design for application + LLM testing
  • API, performance, and system testing
  • Git, Docker, and cloud platforms (AWS / GCP / Azure)
  • Bias, fairness, hallucination detection & AI safety testing
  • Mentorship and cross-functional leadership
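
As a toy illustration of the "prompt pipelines, embeddings, RAG" item above, the sketch below retrieves the most similar document with a bag-of-words embedding and cosine similarity, then builds a grounded prompt. The corpus, the `embed` function, and the `generate` stub are all hypothetical stand-ins for a real embedding model and LLM client.

```python
"""Minimal sketch of a RAG prompt pipeline: retrieve, then generate."""
import math
from collections import Counter

DOCS = [
    "Mindtickle is a sales readiness platform.",
    "RAG retrieves relevant context before generation.",
]

def embed(text: str) -> Counter:
    # Toy embedding: bag of lowercase words (swap in a real embedding model).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str) -> str:
    # Pick the document most similar to the query.
    q = embed(query)
    return max(DOCS, key=lambda d: cosine(q, embed(d)))

def generate(prompt: str) -> str:
    return f"[LLM would answer here given]\n{prompt}"  # stub for a real model call

query = "What does RAG do?"
context = retrieve(query)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(generate(prompt))
```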

Preferred Qualifications

  • Bachelor’s or Master’s in Computer Science, Engineering, or equivalent.
  • 4+ years in software development, SDET, or QA automation.
  • Proficiency in Go (Golang), Java, or Python.
  • Proven experience building test automation frameworks.
  • Proven ability to design CI/CD pipelines with automated regression and evaluation testing.
  • Hands-on exposure to LLMs and GenAI applications.
  • 2+ years of hands-on experience with LLM APIs and frameworks (OpenAI, Anthropic, Hugging Face).
  • Proficient in prompt engineering, embeddings, RAG, and LLM evaluation metrics (a small metric sketch follows this list).
  • Strong analytical, leadership, and teamwork skills.
  • Excellent communication and collaboration across teams.
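
For the "LLM evaluation metrics" item, here is one deliberately crude example: a hallucination-rate proxy that flags answer sentences whose content words barely overlap the source context. The 0.5 overlap threshold and the word-overlap heuristic are illustrative assumptions; production evaluators typically use NLI models or LLM judges instead.

```python
"""Crude sketch of one LLM evaluation metric: a hallucination-rate proxy."""
import re

def content_words(text: str) -> set[str]:
    # Keep only longer alphabetic tokens as a rough notion of "content".
    return {w for w in re.findall(r"[a-z']+", text.lower()) if len(w) > 3}

def hallucination_rate(answer: str, context: str, min_overlap: float = 0.5) -> float:
    ctx = content_words(context)
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    unsupported = 0
    for s in sentences:
        words = content_words(s)
        overlap = len(words & ctx) / len(words) if words else 1.0
        if overlap < min_overlap:  # sentence barely grounded in the context
            unsupported += 1
    return unsupported / len(sentences) if sentences else 0.0

context = "The Eiffel Tower is in Paris and opened in 1889."
answer = "The Eiffel Tower opened in 1889. It was designed by aliens."
print(f"hallucination rate: {hallucination_rate(answer, context):.2f}")  # 0.50
```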

Our culture & accolades

As an organization, it’s our priority to create a highly engaging and rewarding workplace. We offer tons of awesome perks and many opportunities for growth.

Our culture reflects our employees' globally diverse backgrounds, our commitment to our customers and to each other, and a passion for excellence. We live up to our values, DAB: Delight your customers, Act as a Founder, and Better Together.

Mindtickle is proud to be an Equal Opportunity Employer.

All qualified applicants will receive consideration for employment without regard to race, colour, religion, sex, national origin, disability, protected veteran status, or any other characteristic protected by law.

Your Right to Work - In compliance with applicable laws, all persons hired will be required to verify identity and eligibility to work in the respective work locations and to complete the required employment eligibility verification document form upon hire.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
