Summary
Nielsen Norman Group (NNG) has conducted, and continues to conduct, extensive research testing large language model (LLM) tools designed for research synthesis and analysis. Our goal was to determine whether these AI-powered tools could meaningfully accelerate the work of experienced UX researchers.

Through rigorous testing across multiple models and specialized research tools, we've found that while a few tools provide modest speed improvements for experienced researchers, none come close to replacing human expertise in research synthesis and analysis. The core problem is that these tools consistently exhibit critical flaws: they hallucinate findings, fail to identify meaningful patterns in qualitative data, cannot adequately address nuanced research questions, and produce only superficial, high-level summaries of participant behavior.

What makes this particularly dangerous is that AI-generated outputs often have the veneer of legitimate research results: they look professional and sound plausible. Closer inspection, however, reveals significant gaps, inaccuracies, and missed insights that would mislead stakeholders and result in poor design decisions. The appearance of competence masks fundamental limitations that make these tools unreliable for serious research work.

While we've found several places in the research process that can benefit from LLM usage, analysis and synthesis consistently fall short. In this talk, I share the specific research we're doing and explain what actually works.
Key Insights
- AI tools frequently produce insight-shaped outputs but often lack the rigor and accuracy of trained human researchers.
- AI moderators cannot currently assess user behavior beyond spoken words, missing key usability observations such as failed or inefficient tasks.
- Contextual elements such as environmental interruptions are critical in research but are invisible to AI tools.
- Synthetic users generated by AI tend to produce overly positive, unrealistic feedback that can mislead product teams.
- AI excels at quickly finding semantic connections and grouping codes in large, already-coded qualitative datasets.
- Meta-analysis of large repositories using AI can uncover recurring user themes, such as change aversion, much faster than manual methods.
- Integrating AI with organizational systems to pull in diverse data sources improves context, but this requires expert setup and is not yet simple.
- AI's context-window limitations cause it to forget earlier input, affecting the accuracy of multi-turn interactions.
- Even trained researchers must use AI outputs cautiously, vetting insights to maintain research quality.
- Effective user research depends on human synthesis, collaboration, and contextual understanding, areas where AI currently fails.
Notable Quotes
"AI can generate insights, but it does not do them as well as a moderately trained human researcher."
"There is a world of difference between what a participant says and what they actually do, and AI misses that completely."
"AI tells you what you want to hear, which is dangerous if you’re making product decisions based on synthetic feedback."
"Our job as researchers is not making reports or interviewing users; it’s providing actionable, correct insights."
"AI tools are incentivized to produce final deliverables, but that’s an output, not the essence of research."
"AI is pretty good at finding semantic patterns among codes after human researchers have done the initial coding."
"Nobody is going to be satisfied by insight-shaped answers or high-level summaries masquerading as breakthroughs."
"AI cannot notice body language, tone, or environmental context during a research session."
"Using AI to scan large archives of research is a game changer for meta-analyses, even if it’s imperfect."
"Well-set-up AI systems pulling data from multiple company sources will have more context, but it’s still limited compared to human understanding."