How to apply the Mom Test in user interviews

How to apply the Mom Test in user interviews so participants reveal real behavior, not the polite version they think you want to hear.

Rizvi Haider · 21 min read · Updated May 30, 2026

The interview ends with the customer effectively hugging you. "I love this. I'd absolutely use it. When are you launching?" You write STRONG INTEREST in your notes and paste the quote into the team channel. Three weeks later, the launch email to the same person goes unopened. Six conversations of this shape get summarized into "the market wants it" inside a board update, and the company spends two quarters building something nobody pays for. None of the customers were lying on purpose. The interviews were failing the Mom Test, the framework Rob Fitzpatrick published in 2013 to describe exactly this failure mode and reissued in a 2026 revised edition.

This is a working guide on how to apply the Mom Test in user interviews for product teams: what the framework actually is, the three classes of bad data it exists to detect, the six steps that produce interviews resistant to false validation, and how the same rules carry over to async research where the interviewer is an AI rather than a person across the table. It is the version we use internally at Talkful, in the format that holds up after the first round of contradicting evidence comes in.

What the Mom Test is (and what it is for)

The Mom Test is a set of three rules for customer interviews, named after the observation that even your mother will lie to protect your feelings if you ask her whether your business idea is any good. The rules, in Rob Fitzpatrick's original phrasing on the official book site, are: talk about their life instead of your idea, ask about specifics in the past instead of generics or opinions about the future, and talk less and listen more. The book is taught as a core text at Harvard, MIT, and University College London, and has been adopted as training material at Shopify and Skyscanner among others.

The purpose of the Mom Test is narrow and important. It is not a method for getting people to like your idea. It is a method for getting them to reveal a behavior, a constraint, or a switch event that would predict whether they will actually use the thing you build. The interview is treated as an evidence-gathering exercise, not a sales conversation. A successful Mom Test interview can end with the customer saying they hate the idea and still produce more usable signal than an interview that ended with a hug, because the hate is anchored to specifics and the hug was anchored to nothing.

The framework predates AI-moderated research by a decade, but the rules are modality-agnostic. They apply to a live customer interview, to a written survey, to a voice-note study, and to an async session moderated by an AI interviewer. The pressure to produce false positives is structural to the conversation itself, not to the medium that carries it.

The three classes of bad data the Mom Test exists to detect

Fitzpatrick's contribution is not that customers lie. Everyone running interviews knows customers lie. His contribution is the taxonomy of how they lie, which lets a researcher catch the lie in flight and redirect the conversation before the bad data hits the synthesis pass. Three patterns, each requiring a different recovery.

Compliments are the easiest to spot and the hardest to discount. "This is great." "I love the design." "You're onto something." None of these are evidence. They are the participant being polite or being supportive or being genuinely impressed by your enthusiasm. The recovery is to ignore the compliment in the transcript and immediately ask a behavioral follow-up: "When was the last time you ran into this problem?" The compliment teaches the team nothing. The story about the last time it happened is the data.

Hypothetical fluff is harder. "I would definitely use that." "If you built X, I'd switch." "Our team would love that feature." These are predictions about a future behavior the participant has no incentive to commit to. Behavioral research has shown for decades that people are terrible at predicting their own future behavior, particularly in low-stakes hypothetical contexts. The recovery is to anchor to the past: "When was the last time you needed something like that? What did you actually end up doing?" The future intention evaporates; the past behavior remains.

Wishlists are the most insidious because they sound like product roadmap input. "It would be perfect if it had a Slack integration." "I'd love a mobile app." "The killer feature for us would be SSO." Wishlists are not user research; they are uninformed solutioning. The participant is doing the team's job badly, because they do not understand the system constraints, the build cost, or the tradeoffs. The recovery is to ask why: "What would the Slack integration let you do that you cannot do today?" Underneath the wishlist is usually a real workflow problem; on top of it is a feature spec the team should never ship as-described.

A Mom Test interview is one in which the team trains itself to detect these three patterns in real time, redirect each one, and walk away with transcripts that contain past behavior, past constraints, and past switch events. The rest of this guide is the operational version of that training.
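As a rough aid for reviewing transcripts after the fact, the three patterns can be sketched as a keyword flagger. This is a minimal sketch, not a validated classifier: the phrase lists, the `flag` function, and the `RECOVERY` mapping are all hypothetical names invented for illustration, and the recovery questions are adapted from the examples above.

```python
import re
from typing import Optional

# Illustrative phrase lists only (an assumption, not from the book); tune them
# against your own transcripts. Order matters: wishlists are checked first
# because phrases like "I'd love a ..." also look hypothetical.
PATTERNS = {
    "wishlist": re.compile(
        r"it would be perfect if|\bi'd love a\b|\bkiller feature\b",
        re.IGNORECASE,
    ),
    "hypothetical": re.compile(
        r"\bi would\b|\bi'd\b|\bif you built\b|\bwould (love|use|switch|pay)\b",
        re.IGNORECASE,
    ),
    "compliment": re.compile(
        r"\bthis is great\b|\bi love\b|\bonto something\b",
        re.IGNORECASE,
    ),
}

# Recovery questions adapted from the three patterns described above.
RECOVERY = {
    "compliment": "When was the last time you ran into this problem?",
    "hypothetical": "When was the last time you needed something like that? "
                    "What did you actually end up doing?",
    "wishlist": "What would that let you do that you cannot do today?",
}


def flag(utterance: str) -> Optional[str]:
    """Return the first bad-data label whose pattern matches, else None."""
    for label, pattern in PATTERNS.items():
        if pattern.search(utterance):
            return label
    return None


for line in [
    "You're onto something.",
    "If you built X, I'd switch.",
    "I'd love a mobile app.",
    "Last Tuesday I exported it to a spreadsheet.",
]:
    label = flag(line)
    print(label, "->", RECOVERY.get(label, "keep listening"))
```

Keyword matching will miss paraphrases and will flag some honest sentences, so the output is a prompt for a human reviewer rather than a verdict.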

How to apply the Mom Test in user interviews, step by step

Six steps. The order is important: most teams attempt step three (probing what surprised them) without doing step one (anchoring to a past episode), and the conversation collapses into hypothetical fluff before the probe has anything to land on.

01 · Anchor every question to a specific past episode

The first rule of the Mom Test is to talk about their life, not your idea, and the operational form of that rule is to anchor every prompt to a specific past episode. "Tell me about the last time" beats "do you typically" every time, because the last time has a date and a context and a sequence of actions, while "typically" invites the participant to produce a cleaned-up generic narrative that omits the parts the team most needs.

The prompts that work, in the shape Fitzpatrick uses and we have inherited:

  • "Walk me through the last time you ran into {problem}."
  • "When was the last time you tried to solve {problem}? What did you do?"
  • "Tell me about the day you decided to {action}."
  • "What was happening the week before you {switch}?"

Each of those returns a story with a timestamp, an actor, and a sequence. None of them ask for an opinion about your product. The participant's first answer is usually still half-generic ("I usually just...") and the probe is to redirect: "Specifically the last time. What day was it?" Two or three redirects in, you have a timeline you can analyze.
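The anchor-then-redirect loop can be sketched in a few lines. This is only an illustration: the list of generic markers is an assumption, and `past_anchored` and `next_probe` are hypothetical helper names, not part of any tool mentioned in this guide.

```python
import re
from typing import Optional

# Words that often signal a cleaned-up generic answer (assumed list).
GENERIC_MARKERS = re.compile(
    r"\b(usually|typically|generally|normally|tend to)\b", re.IGNORECASE
)

REDIRECT = "Specifically the last time. What day was it?"


def past_anchored(problem: str) -> str:
    """Fill the first prompt shape from the list above."""
    return f"Walk me through the last time you ran into {problem}."


def next_probe(answer: str) -> Optional[str]:
    """Return the redirect when the answer sounds generic, else None."""
    return REDIRECT if GENERIC_MARKERS.search(answer) else None


print(past_anchored("a billing dispute"))
print(next_probe("I usually just email finance."))
print(next_probe("On Monday I emailed finance and waited two days."))
```

An answer that returns `None` here is not automatically good data; it only means the participant has stopped speaking in generalities, which is the point at which the timeline questions start to work.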

This is the same anchoring principle that jobs to be done interviews use for the switch event, and it sits underneath the prompt-craft pattern documented in how to write user research questions. The two methods are compatible; the Mom Test gives you the rules for not contaminating the answer, JTBD gives you the structure for what to ask about.

02 · Replace opinion prompts with behavior prompts

Opinion prompts are the bait. "What do you think of this?" "How important is X to you?" "Would you use this if it existed?" Every one of those returns a number or a sentiment that is not connected to any past or future action. The team will quote it in a deck. The deck will mislead.

Behavior prompts return actions. "What did you actually do?" "How much did you pay for that?" "Who did you ask for help?" "What did you switch to after?" Each of those is verifiable. The participant could be lying, but the lie is detectable because behavior leaves traces (calendar entries, bank statements, support tickets, other people who can be asked).

A practical substitution table the team can keep next to the interview script:

  • Opinion: "Do you like X?" → Behavior: "When was the last time you used X? What for?"
  • Opinion: "Would you pay for X?" → Behavior: "What are you paying for today to solve this? How much?"
  • Opinion: "Is X important to you?" → Behavior: "Walk me through the last time X mattered. What did you do about it?"
  • Opinion: "What features would you want?" → Behavior: "What did you have to work around the last time you tried this?"

The behavior version is harder for the participant, because it requires recall. That is the point. The friction filters out the polite answer.
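One way to keep opinion prompts out of a script is to lint the draft before the interview. The sketch below assumes a small, hand-picked set of opinion-prompt openers based on the substitutions above; `OPINION_OPENERS` and `lint_script` are hypothetical names, and the pattern will not catch every opinion question.

```python
import re
from typing import List

# Openers that tend to ask for an opinion or a prediction (assumed list).
OPINION_OPENERS = re.compile(
    r"^(do you (like|think)|would you|is .* important|how important"
    r"|what do you think|what features would)",
    re.IGNORECASE,
)


def lint_script(prompts: List[str]) -> List[str]:
    """Return the prompts that look like opinion prompts."""
    return [p for p in prompts if OPINION_OPENERS.search(p.strip())]


script = [
    "Would you pay for X?",
    "What are you paying for today to solve this? How much?",
    "How important is X to you?",
    "Walk me through the last time X mattered. What did you do about it?",
]

for prompt in lint_script(script):
    print("rewrite as a behavior prompt:", prompt)
```

The linter only flags; the rewrite itself still has to be done by someone who knows which past episode the question should point at.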

03 · Talk less, probe what surprised you

The Mom Test's third rule is that the customer should be talking roughly 80% of the time. Most untrained interviewers talk 50% of the time, and most of that talk is either pitching the product, validating the answer, or filling silence. Each of those degrades the data: pitching primes the participant to be polite, validating signals which answers are "correct", and filling silence prevents the participant from finishing the thought they were building toward.

The discipline is to ask a short open question and wait. If the participant pauses for five seconds, do not rescue them; the pause is usually where the more honest answer lives. If the answer surprises you (the customer says they paid for something you did not expect, used a workaround you have never seen, or had a constraint you did not consider), probe that. "Tell me more about that." "What made you do it that way?" "When did that start?"

A good operational test: when the team reviews the recording, if the team-side voice accounts for more than 25% of the talk time (a small allowance over the 20% target), the interview is mostly noise. The fix is in the next interview, not in re-reading the transcript.
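The talk-time check is simple arithmetic over a diarized recording. The sketch below assumes the transcript tool exports speaker-labelled segments with start and end times in seconds; the `Segment` type, the `"team"` label, and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List

TEAM_SHARE_LIMIT = 0.25  # threshold from the operational test above


@dataclass
class Segment:
    speaker: str    # "team" or "participant" (assumed labels)
    start_s: float  # segment start, in seconds
    end_s: float    # segment end, in seconds


def team_talk_share(segments: List[Segment]) -> float:
    """Share of total speaking time taken by the team side.

    Silence between segments is not counted for either side.
    """
    total = sum(s.end_s - s.start_s for s in segments)
    if total == 0:
        return 0.0
    team = sum(s.end_s - s.start_s for s in segments if s.speaker == "team")
    return team / total


def mostly_noise(segments: List[Segment]) -> bool:
    """True when the team talked more than the allowed share."""
    return team_talk_share(segments) > TEAM_SHARE_LIMIT


call = [
    Segment("team", 0, 10),
    Segment("participant", 10, 50),
]
print(f"team share: {team_talk_share(call):.0%}, noisy: {mostly_noise(call)}")
```

Because silence is excluded, a five-second pause the interviewer resisted filling does not count against either side.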

04 · Press for commitment, not compliment

Fitzpatrick's most quoted line is that compliments are how customers lie politely and commitments are how customers tell the truth. The end of a Mom Test interview is not "did they like it?" It is "did they advance?" Did they introduce you to two more people who have the same problem? Did they put a calendar hold on a follow-up? Did they pay a deposit? Did they let you watch them use the current workaround? Each of those is a small, real commitment of time, reputation, or money. None of them require the participant to be polite.

If the customer will not commit, that is the data. A customer who said the idea was great and then declined a fifteen-minute follow-up is a customer who does not have the problem badly enough to do anything about it. The team should weight the recorded commitment, not the recorded enthusiasm. The general operational pattern is covered in continuous discovery interviews; the Mom Test discipline is what makes the recorded commitment trustworthy.

05 · Take notes that capture the falsifiable claim

A Mom Test interview that produces a transcript full of compliments and zero recorded commitments is a failed interview, and the notes should record that failure rather than launder it. The note structure we use, by participant:

  • What they actually do today. The current workflow, with the specific tools, the specific cadence, the specific time cost.
  • What they tried before. The previous tool, the workaround, the failed switch.
  • What it cost them. Time, money, reputational risk, missed deadlines.
  • What they committed to. Calendar hold, intro, deposit, prototype access. Or: declined.
  • What surprised the team. The detail that contradicts the team's prior model of the customer.

Each line is falsifiable in a later round. A future interview can ask the same questions and either confirm or update the pattern. The note structure invites the team to record what they observed rather than what they hoped, which is the part the Mom Test is trying to enforce.
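If the notes live in a tool rather than a document, the same five lines can be held as a small record. This is a sketch of one possible shape, not our internal schema: `InterviewNote`, its field names, and the `advanced` helper are hypothetical, and the sample values are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InterviewNote:
    participant: str
    does_today: str         # current workflow: tools, cadence, time cost
    tried_before: str       # previous tool, workaround, failed switch
    cost: str               # time, money, reputational risk, missed deadlines
    commitments: List[str]  # calendar hold, intro, deposit; empty = declined
    surprise: str           # detail that contradicts the team's prior model

    def advanced(self) -> bool:
        """True when at least one real commitment was recorded."""
        return len(self.commitments) > 0


# Invented example values, for illustration only.
note = InterviewNote(
    participant="P-01",
    does_today="Exports weekly numbers to a spreadsheet, about an hour each Friday",
    tried_before="A reporting add-on, abandoned after one month",
    cost="Roughly four hours a month",
    commitments=[],
    surprise="Finance, not product, owns the report",
)

print("advanced" if note.advanced() else "declined: record it as such")
```

An empty `commitments` list is recorded rather than omitted, which keeps the failed interview visible in the later synthesis pass.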

The full thematic-coding pass we use after a round of interviews is covered in how to analyze user interview transcripts. The Mom Test note structure feeds that coding pass cleanly because every line is already anchored to an observation rather than a sentiment.

06 · Run the same script async, with adaptive probing

The Mom Test was written for live interviews, and the live version is still the gold standard for a single deep conversation. The constraint is that most teams cannot run twenty live interviews in a quarter, and the interviews they do run are biased toward the customers willing to take a calendar slot, which is often not the segment they need to hear from. Async is what unlocks the volume.

The async version of a Mom Test interview is a five-to-seven-prompt study link that the participant answers in voice, text, choice, or rating on their own time. Each prompt is one of the past-anchored behavioral questions from step one. The participant gets the same set of prompts every other participant got, which fixes the structural-variance problem that live interviews suffer from when different interviewers ad-lib.

Where the async version is at risk of failing the Mom Test is exactly where the live version fails: when the first answer is generic and there is no interviewer to probe it. The fix is configurable adaptive probing per question. Shallow depth on the rating prompts (one clarifier at most, because over-probing erodes the response rate on quant items). Medium depth on the behavioral "walk me through" prompts (a small chain of clarifiers when the answer stays vague or contradicts itself). Expert depth on the commitment prompts ("what did you do next? what did that cost? what did you try first?"), where the AI keeps probing until it has the level of context a trained human interviewer would extract. The participant retains the right to skip on every probe, which keeps the study honest about the cost it imposes. Choice and Info question types do not trigger probes; voice, text, and rating do.
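The per-question depth rule can be written down as a small lookup. The sketch below is not Talkful's configuration format or API; the `Depth` enum, the `probe_budget` function, and the probe caps for medium and expert depth are assumptions made for illustration (only the "one clarifier at most" cap for shallow depth comes from the description above).

```python
from enum import Enum


class Depth(Enum):
    SHALLOW = "shallow"  # rating prompts
    MEDIUM = "medium"    # behavioral "walk me through" prompts
    EXPERT = "expert"    # commitment prompts


# Maximum follow-up probes per depth. SHALLOW = 1 follows the text above;
# the MEDIUM and EXPERT caps are assumed values.
MAX_PROBES = {Depth.SHALLOW: 1, Depth.MEDIUM: 3, Depth.EXPERT: 6}

# Question types that trigger probes; choice and info questions do not.
PROBEABLE_TYPES = {"voice", "text", "rating"}


def probe_budget(question_type: str, depth: Depth) -> int:
    """Return how many follow-up probes a question may trigger."""
    if question_type not in PROBEABLE_TYPES:
        return 0
    return MAX_PROBES[depth]


print(probe_budget("rating", Depth.SHALLOW))
print(probe_budget("voice", Depth.EXPERT))
print(probe_budget("choice", Depth.EXPERT))
```

The budget is an upper bound: since the participant can skip any probe, the number of follow-ups actually answered may be lower.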

"Honestly, I said I'd use it but, like... [pause]... I've been paying for two other tools that do the same thing and I haven't opened either of them in a month. So probably I wouldn't."

Participant · #5821 · async Mom Test study, medium-depth probe

The probe in the transcript above is what catches the lie. The first answer was hypothetical fluff ("I'd use it"). The probe asked what the participant was currently doing about the problem. The honest answer (two tools, both abandoned) is the data. Without the probe, the team would have walked away with a false positive.

Why the Mom Test gets harder, not easier, with async research

A common assumption is that async research is less susceptible to the Mom Test failure modes because the social pressure to be polite is lower. The participant is alone, typing or speaking into a phone, not face-to-face with a founder visibly invested in the answer. There is some truth to this, and response-modality studies (the longer treatment is in what we hear when we stop asking people to write) show that voice responses in particular catch hesitation that a typed response would have edited out.

The harder problem is the opposite: async removes the live interviewer's ability to redirect a generic answer in flight. When a participant in a live call says "I usually just..." the interviewer can interrupt and say "specifically the last time." When a participant in an async study says the same thing, there is no interrupter unless one has been designed into the system. Without that interrupter, the cleaned-up generic answer is what gets recorded, and the synthesis pass treats it as data.

This is why adaptive probing depth is a Mom Test instrument, not just a UX feature. The depth setting is the team deciding, per question, how much probing pressure the AI is allowed to apply on the participant's behalf. Default-on probing for voice, text, and rating questions in a continuous-feedback context is how the async version maintains the Mom Test discipline at scale. The general operational shape is covered in how to run voice user interviews.

What the Mom Test does not cover

The Mom Test is a discipline for not getting fooled by polite participants. It is not a complete user research methodology and it does not try to be. Three gaps worth naming so the framework is not asked to do work it was never designed for.

Quantitative validation. The Mom Test tells you which behaviors are real. It does not tell you how many people have them, what segment they fall into, or how the volumes compare. A Mom Test pass is the qualitative half of validation. The quantitative half is usage data, conversion-funnel analysis, or a sized survey (a product-market fit survey on a real cohort is a reasonable companion).

Concept evaluation. The Mom Test deliberately avoids testing concepts. It assumes the right move at the discovery stage is to characterize the problem and the existing workaround, not to evaluate a proposed solution. Once a prototype exists and the team needs to test whether the solution lands, a different method takes over: see how to run concept testing for the operational pattern.

Recruitment. The Mom Test assumes you are talking to the right participants. If the recruit is wrong (friendly customers, your network, polite respondents), the interview cannot save you. The recruitment side of the discipline is covered separately in how to recruit user research participants.

The Mom Test is one rigorous half of a complete research practice. Asked to do more than that, the framework breaks. Asked to do exactly that, it is the most efficient false-positive filter the discipline has produced.

Keeping the Mom Test alive after the round

The most common cause of Mom Test discipline decaying inside a team is treating it as a methodology a researcher applies during a study and forgets afterward. A more durable approach is to bake the discipline into the standing instrument the team uses to collect feedback, so every new response gets filtered through the same rules whether or not anyone is actively running a study.

In our practice this looks like a single Talkful study link that lives in three continuous-feedback placements: an in-product feedback surface (settings menu or a contextual "what's missing here?" affordance), a churn-flow capture (the cancel confirmation page or the offboarding email), and a post-onboarding pulse (day-7 retention check). Every prompt in the study is past-anchored and behavioral by construction. Every voice or text answer is probed adaptively when the first answer is vague. Every recorded commitment is logged. The same synthesis engine that processes the responses surfaces the patterns, and the team reviews them on a fixed cadence. The Mom Test is no longer something a researcher does once a quarter; it is the shape of the standing instrument the team listens through, producing evidence the team can ship from and the agents they are building can act on.

FAQ

What is the Mom Test?

The Mom Test is a set of three rules for customer interviews developed by Rob Fitzpatrick and published in 2013 (with a revised edition in 2026). The rules are: talk about their life instead of your idea, ask about specifics in the past instead of opinions about the future, and talk less and listen more. The framework is named after the observation that even your mother will lie politely to protect your feelings if you ask her whether your business idea is any good, so the interview design has to make politeness an unreliable answer rather than the default one.

What are the three rules of the Mom Test?

Rule one is to anchor every conversation in the participant's life and experience rather than in your idea or product, so the participant has nothing polite to react to. Rule two is to ask about specific past episodes rather than hypothetical futures or generic opinions, because past behavior is verifiable and future intentions are not. Rule three is to talk roughly 20% of the time and let the participant talk roughly 80%, because the pauses and the unprompted detail are where the honest answers live. Applied together, the rules produce interview transcripts that resist the false-positive failure mode most discovery interviews suffer from.

What are the three types of bad data the Mom Test detects?

Compliments ("this is great", "I love it"), hypothetical fluff ("I would definitely use that", "if you built X I'd switch"), and wishlists ("the killer feature would be SSO"). Each of these sounds like signal in a meeting room and predicts nothing about the participant's actual behavior. The recovery differs slightly by pattern but shares one shape: ignore the surface statement and redirect to something concrete. For compliments and hypotheticals, that means a specific past episode and what the participant actually did about the problem the last time it came up; for wishlists, it means asking what the requested feature would let them do that they cannot do today. The behavior version of the answer is the data; the original compliment, hypothetical, or wishlist is the noise.

How does the Mom Test apply to async or AI-moderated interviews?

The three rules carry over directly because they are modality-agnostic: they govern the shape of the prompts and the discipline of the listening, not the medium of the conversation. The harder part of async is that there is no live interviewer to redirect a generic first answer in flight, so the probing has to be designed into the system. Configurable adaptive probing depth, set per question (shallow for rating prompts, medium for behavioral prompts, expert for commitment prompts), is how an async Mom Test interview maintains the same standard as a live one. The participant retains the right to skip on every probe, which keeps the study honest about the cost it imposes.

How is the Mom Test different from jobs to be done interviews?

The Mom Test is a discipline for not contaminating the answer. Jobs to be done is a framework for what to ask about: specifically, the switch event from one solution to another and the four forces (push, pull, anxiety, habit) that made the switch happen on a specific day. The two are complementary. A jobs to be done interview that ignores Mom Test rules produces compliments about the participant's switch story; a Mom Test interview without a JTBD-style anchor produces clean transcripts about behavior that may or may not predict a switch. Run together (Mom Test rules over JTBD-shaped prompts), they produce the highest signal-to-noise the qualitative half of validation can offer. The full treatment of the JTBD frame is in how to run jobs to be done interviews.

Does the Mom Test work for B2B customer interviews?

It matters more in B2B than in B2C, because B2B participants have stronger incentives to be polite (your contact is also a buyer, an evaluator, or a colleague at a partner firm) and the cost of a false positive is higher (longer sales cycles, larger dollar commitments). The rules apply identically, with two adjustments. The "talk about their life" anchor becomes "walk me through your team's workflow" so the participant is reporting behavior they observe rather than only behavior they performed personally. The commitment ask in step four becomes a multi-stakeholder ask: an intro to the budget holder, a calendar slot with the user, a security review document. Each is a real commitment of organizational time, and each is harder for a polite participant to fake than a generic compliment.


A Mom Test interview is, in the end, an interview the team can argue from. The transcripts contain past behavior, recorded constraints, and observable commitments. The compliments have been redirected, the hypotheticals have been anchored, and the wishlists have been unpacked into the workflow problems underneath them. The async version, run through a configurable AI interviewer with adaptive probing on behavioral and commitment prompts, holds the same discipline at the volumes a continuous-feedback practice requires. Talkful is built around that shape: a single study link that lives in the places where the team is most likely to catch real behavior (in-product feedback, churn flows, post-onboarding pulses, owned distribution), with smart follow-ups that probe the polite first answer into the honest one, and a synthesis engine that streams themes, quotes, and participant-attributed citations back as the responses land. The wider voice user research guide covers where the Mom Test sits inside that continuous practice.