The Problem with the Data Quality Problem

A candid look at why our industry keeps shooting itself in the foot

Houston, We Have a Problem

Let’s face it: market researchers are in a long-term, complicated relationship with survey respondents, and it’s far from healthy. How so, you ask? Because researchers constantly criticize the quality of the data that respondents provide. We complain about it at conferences, write endless white papers on it, and hold panel discussions where we all nod sagely about “the crisis in our industry.” Yet somehow, we keep fielding 45-minute surveys that ask respondents to rank 27 product attributes on a 7-point scale, and then wonder, time and time again, why the data looks suspicious.

The truth is, it’s not that respondents inherently provide bad data; it’s that we’ve designed surveys so exhausting, unengaging, and even infuriating that panelists can’t help but deliver subpar responses. We’re essentially engaging in gaslighting on a monumental scale – telling ourselves, our clients, and our industry partners that data quality issues arise from “bad respondents,” when in reality, the problem begins with us.

It’s like complaining about your weight while eating your third donut of the day. We know what the problem is — we’re just unwilling to put down the metaphorical pastry.

Stop Blaming the Sample Providers

The first thing we need to understand is that our data quality issues aren’t happening because of sample providers — they’re happening despite everything these providers do to keep respondents engaged.

Sample providers are the unsung heroes of our industry, frantically trying to keep their panels alive while we subject them to the survey equivalent of water torture. They’ve implemented:

  • Sophisticated fraud detection systems
  • Multiple identity verification steps
  • Engagement algorithms and gamification
  • Regular panel maintenance and cleaning

Yet we still blame them when respondents straight-line through our 20-grid questionnaires, or when providers fail to deliver completes within the impossible time frames we’ve demanded.

Our Outdated Questionnaires Are the Real Culprits

Let’s talk about the elephant in the room: our questionnaires are boring. Not just regular boring — they’re “watching paint dry while listening to elevator music” boring.

In an era where TikTok videos have trained consumers to expect entertainment in 15-second bursts, we’re asking people to concentrate on a poorly designed survey for more than 20 minutes. We’re competing for attention in a universe engineered to trigger dopamine on demand. Reed Hastings once said Netflix’s biggest competitor is sleep. Meanwhile, we’re showing up with a survey that reads like it was designed for a 1970s phone interview. We expect engagement, yet we design for abandonment.

Speaking of the ’70s, some of you will remember when TV had only three channels and people would watch whatever was on. That’s the take-it-or-leave-it mindset we’re designing surveys with, and, spoiler alert, most quality respondents are “leaving it.” Although we live in a world of infinite choice, we resist changing our approach for fear of “breaking the normative data” or because “that’s how we’ve always done it.”

The Race to the Bottom: The Great CPI Collapse

During the early aughts, general population samples cost $10-$15 per completed interview (CPI). Market researchers budgeted accordingly, and sample providers could build sustainable business models around quality.

Fast-forward to today, and we’re paying sub-$1 CPIs for the same respondents. That’s a cost reduction of more than 90%! But did the cost of acquiring and maintaining those respondents fall by 90%, too? Of course not.

This pricing pressure has created a race to the bottom where:

  • Proper panel maintenance becomes unprofitable
  • Sample providers must over-survey their panels to remain viable
  • Quality becomes an unaffordable luxury rather than a standard

It’s like expecting a five-star hotel experience at a roadside motel price. Something’s got to give, and that something is quality.

We’re Attracting Exactly Who Our Incentives Appeal To

Here’s a thought experiment: Would you, a market research executive, complete a 20-minute survey about laundry detergent for $1? What about $0.50? Probably not, unless you were particularly passionate about stain removal technologies.

Yet that’s exactly what we expect from our respondents. The math simply doesn’t work. If someone can make more money in 20 minutes by listing their never-used bread machine on eBay than by taking a survey, we’re effectively selecting for people who:

  1. Show little regard for their time
  2. Have figured out how to game the system to complete surveys quickly (either directly or as the controller of robotic minions who do the work for them)
  3. Are genuinely interested in the topic (a tiny minority)

We’re incentivizing bad survey-taking behavior by under-compensating respondents for their time and attention. It’s like being shocked by a leaky roof after hiring a contractor who charged 80% below market. What exactly did we expect?
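
To put numbers on that thought experiment, here’s a back-of-the-envelope sketch in Python. The incentive amounts and the 20-minute survey length are the illustrative figures from the paragraph above, not measured industry data:

    # Back-of-the-envelope: what a survey incentive works out to per hour.
    # All figures are illustrative, taken from the thought experiment above.

    def effective_hourly_rate(incentive_usd: float, minutes: float) -> float:
        """Convert a per-survey incentive into its hourly-wage equivalent."""
        return incentive_usd * (60.0 / minutes)

    for incentive in (1.00, 0.50):
        rate = effective_hourly_rate(incentive, minutes=20)
        print(f"${incentive:.2f} for a 20-minute survey = ${rate:.2f}/hour")

    # Prints:
    #   $1.00 for a 20-minute survey = $3.00/hour
    #   $0.50 for a 20-minute survey = $1.50/hour

At an effective $1.50 to $3.00 an hour, the rational participants are precisely the three groups listed above.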

The Synthetic Sample Siren Song

Just when you thought our industry couldn’t find more creative ways to compromise data quality, enter synthetic data. It’s the latest “solution” to our sample cost and availability challenges, and although I believe there is a time and place for it, misused it can be about as reliable as a weather forecast from your horoscope.

Facing pressure to deliver impossibly niche audiences at rock-bottom prices? No problem! Just supplement with some synthetic respondents who conveniently provide the exact data patterns you need. Can’t find enough left-handed viola players who also own electric vehicles? Don’t worry — we can generate those responses for you!

The allure is obvious: why pay $20-$30 per interview for a truly rare audience when you can pay $1 for synthetic data that looks statistically similar? It’s the market research equivalent of filling your designer handbag with newspaper — it maintains the shape, but there’s nothing of substance inside.

The problem is that synthetic data, by definition, can only replicate patterns we already know. It can’t reveal genuine surprises, contradictions, or the messy human inconsistencies that often lead to breakthrough insights. When we use synthetic data to complete hard-to-reach segments, we’re not researching reality — we’re researching our assumptions about reality.
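
To see the circularity concretely, here’s a minimal sketch assuming the simplest possible synthetic-respondent generator: one that fits the answer distribution of existing data and samples new “respondents” from that fit. The question, the respondents, and every number below are hypothetical:

    import numpy as np

    rng = np.random.default_rng(42)
    scale = [1, 2, 3, 4, 5]  # hypothetical 5-point purchase-intent question

    # 200 hypothetical "real" respondents drawn from some true distribution.
    real = rng.choice(scale, size=200, p=[0.05, 0.10, 0.20, 0.40, 0.25])

    # The simplest synthetic-respondent generator: fit the observed answer
    # distribution, then sample new "respondents" from the fit.
    fitted_p = np.bincount(real, minlength=6)[1:] / real.size
    synthetic = rng.choice(scale, size=500, p=fitted_p)

    # The synthetic sample can only echo the pattern it was fed. A surprise
    # in the real population (say, a niche segment of hard "1" responders
    # we never sampled) is structurally invisible, because the generator
    # never saw it. We are sampling our own assumptions.
    print("fitted:   ", np.round(fitted_p, 3))
    print("synthetic:", np.round(np.bincount(synthetic, minlength=6)[1:] / synthetic.size, 3))

More sophisticated generators make the same move at a larger scale: they can interpolate within what they were trained on, but they cannot surface a segment the training data never captured.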

And let’s be honest about why we’re turning to synthetic data: it’s not because we can’t find these respondents. It’s because we don’t want to pay what it actually costs to reach them. Instead of admitting that rare audiences command premium prices, we’re creating artificial replacements that tell us what we think they should say.

Used improperly, it’s a dangerous shortcut that undermines the very purpose of market research: understanding real people and their actual behaviors.

The Change We Need (But Probably Won’t Implement)

Unless our business models and our approach to engaging respondents fundamentally change, we will continue to watch data quality deteriorate while writing more articles about how concerning it is. The solutions are clear but painful:

  1. Accept that quality data costs money — budget for quality rather than quantity
  2. Radically shorten surveys — cutting your 25-minute survey to 5 minutes might yield better insights
  3. Design for the modern attention economy — engage respondents with better user experiences
  4. Be the change agent your clients deserve — whether your clients are internal or external, it’s up to each one of us to set the example

Until brands and agencies are willing to sacrifice survey length or change how we ask questions, we’ll be stuck in this cycle. We need to decide what matters more: lots of cheap, questionable data, or higher-quality opinions that can meaningfully shape decisions.

Admitting We Have a Problem

The first step to recovery is admitting you have a problem. The second step is doing something about it.

As an industry, we’re excellent at step one; we’ve been admitting we have a data quality problem for over a decade. Yet we’ve been casting blame rather than accepting responsibility. We’re terrible at step two because it requires sacrifice, change, reorienting client expectations, and potentially making less money in the short term.

So, the next time you find yourself in a meeting discussing data quality concerns, ask yourself: “Am I willing to guide my client toward a more impactful approach, or am I just hoping to find a magical solution that lets me keep doing everything exactly the same way?” If you’re in the “magical solution” camp, the recent indictment of senior leaders at Op4G and Slice should serve as a cautionary tale.

Because the problem with the data quality problem isn’t that we don’t know how to fix it — it’s that fixing it requires us to change. And change, as we all know, is the one thing market researchers resist more than a 30% increase in CPI.

Are you ready to be part of the solution? Start by cutting your next survey in half and writing questions that don’t sound like you’re talking to a robot. Your respondents (and your data quality) will thank you.