Rosenverse

Building impactful AI products for design and product leaders, Part 2: Evals are your moat
Wednesday, July 23, 2025 • Rosenfeld Community
Speakers: Peter Van Dijck

Summary

The secret ingredient for impactful AI products is “evals”: an architecture for ongoing evaluation of quality. Without evals, you don’t know whether your output is good, and you don’t know when you’re done. Because outputs are non-deterministic, it’s very hard to tell whether you are creating real value for your users, and when something goes wrong, it’s hard to figure out why. Simply Put’s Peter van Dijck will demystify evals and share a simple framework for planning and building useful evals, from qualitative user research to automated evals using LLMs as a judge.

Key Insights

  • AI products require a systematic framework for evaluation, involving three layers: model, context, experience.

  • Automated eval systems are essential for efficiently testing AI quality with open-ended inputs and outputs.

  • Iterative testing can be very time-consuming; automation accelerates the process without sacrificing reliability.

  • Defining 'what good looks like' is critical to the success of AI systems and requires ongoing refinement.

  • Domain expertise plays a significant role in creating effective eval systems and datasets.
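
To make the idea of an automated eval system concrete, here is a minimal Python sketch. It is not the framework presented in the talk. The names `EvalCase`, `run_evals`, `toy_system`, and `toy_judge` are hypothetical, and the judge is a keyword check standing in for an LLM-as-judge call or a human reviewer. It only illustrates the loop: run a fixed set of cases through the product, score each output against a definition of "good", and report a pass rate you can re-check after every change.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One test case: an open-ended input plus a note on what good looks like."""
    prompt: str
    criteria: str


def run_evals(
    cases: list[EvalCase],
    system: Callable[[str], str],
    judge: Callable[[str, str, str], bool],
) -> float:
    """Run every case through the system and return the fraction the judge passes."""
    if not cases:
        raise ValueError("need at least one eval case")
    passed = 0
    for case in cases:
        output = system(case.prompt)
        if judge(case.prompt, output, case.criteria):
            passed += 1
    return passed / len(cases)


# Hypothetical stand-ins so the sketch runs without any model or API.
def toy_system(prompt: str) -> str:
    return f"Here is a short answer about {prompt.lower()}."


def toy_judge(prompt: str, output: str, criteria: str) -> bool:
    # Assumption: a real judge would be an LLM call or a human reviewer.
    # This placeholder only checks that the criteria keyword appears in the output.
    return criteria.lower() in output.lower()


cases = [
    EvalCase(prompt="Refunds", criteria="refunds"),
    EvalCase(prompt="Shipping", criteria="delivery"),
]
pass_rate = run_evals(cases, toy_system, toy_judge)
print(f"pass rate: {pass_rate:.0%}")
```

In practice the case list would come from domain experts and user research, and the judge is where the team's definition of "what good looks like" lives, so both are expected to be refined over time.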

Notable Quotes

"Eval systems help scale the testing of AI product quality."

"The challenge is how to evaluate when inputs and outputs are open-ended."

"You need a semi-automated system that can help you test any change you make."

"It's crazy to retest everything every time you make a change; we need faster iteration processes."

"What you're trying to do is define what is good for your system."
