Head of AI Product Evaluation & Experience
Ant Group
Software Engineering, Product, Data Science
Hangzhou, Zhejiang, China
Position Overview
As the Head of AI Product Evaluation & Experience, you will be the ultimate guardian of user experience and model alignment across our portfolio of next-generation, AI-native consumer applications. This is a pioneering leadership role where you will design, build, and scale a world-class product evaluation function from scratch.
In the AI-native era, traditional QA is obsolete. You will champion a holistic evaluation paradigm that merges Human-in-the-Loop (HITL) model alignment, subjective quality scoring of algorithm output (Mean Opinion Score, MOS), and end-to-end consumer UX measurement. You will lead the strategy that translates subjective human preferences into engineering-actionable insights, ensuring our AI products are high-performing, safe, intuitive, and deeply engaging.
Key Responsibilities
1. Build & Scale the Evaluation Function (From 0 to 1)
- Recruit, structure, and mentor a high-performing, agile team of Evaluation Product Managers, Crowdsourcing Specialists, and Data Annotators.
- Establish the foundational infrastructure, methodologies, and tooling required to evaluate multiple distinct AI-native consumer applications simultaneously.
2. Design the "Three-in-One" Evaluation Framework
- GenAI & Alignment Evaluation: Define the criteria and scale the operations for human evaluation, RLHF (Reinforcement Learning from Human Feedback), and red-teaming to assess AI generation quality (text, image, audio, multi-modal), factual accuracy, safety, and persona consistency.
- Algorithmic & Media Quality: Implement rigorous subjective scoring methodologies (e.g., Mean Opinion Score, or MOS) to evaluate downstream algorithm performance, recommendation relevance, and multi-modal processing quality.
- Consumer UX & Engagement: Design user-centric experience metrics to measure the end-to-end user journey, interface intuitiveness, and the overall "delight" factor of AI-driven features.
3. Cross-Functional Strategic Collaboration
- Serve as the critical bridge between AI Research Labs, Infrastructure, Core Engineering, and Growth Product Teams.
- Translate highly complex, qualitative human evaluation data into standardized, quantifiable metrics and benchmarks that directly guide model fine-tuning and product iterations.
- Champion the voice of the end-user, ensuring technical advancements in AI models translate into tangible consumer value.
Qualifications & Requirements
Leadership & Experience
- 6+ years of experience in Product Management, Product Evaluation, Data Science, or Advanced QA/UX Research within fast-paced consumer internet companies.
- Proven track record of building and scaling teams from scratch, including establishing operational workflows and cultural norms.
- Experience managing cross-functional initiatives for multiple products or a diverse portfolio of app features.
Domain Expertise
- Deep familiarity with the AI-native product landscape (e.g., LLMs, GenAI tools, AI agents, multi-modal applications) and how users interact with AI.
- Strong understanding of evaluation methodologies, crowdsourcing operations, data labeling pipelines, or behavioral research techniques.
- A strong technical background (e.g., familiarity with Python, data analysis tools, or prompt engineering) is highly preferred but not mandatory; the ability to communicate effectively with AI researchers and engineers is required.
Core Competencies
- First-Principles Thinking: Ability to define metrics and evaluation standards for ambiguous, creative, or open-ended AI features where no industry standard exists.
- Operational Excellence: Strong project management skills to oversee large-scale data labeling or user testing operations without sacrificing quality or speed.
- Bilingual Communication: Exceptional English and Chinese communication skills to operate effectively in a globalized tech environment.