When Generalizability Breaks: Understanding Data Quality in Market Research
- Renato Silvestre

- Apr 20
Updated: May 7

The Duck Test No Longer Works
In a city that never sleeps, even a moment can make a statement. STRATEGENCE was in Times Square in collaboration with Verisoul, raising a question that has become harder for the research industry to ignore:
If it looks like a respondent and clicks like a respondent, is it actually a respondent?
Our billboard may have only been up for a moment, but the issue it represents is not new. We've been focused on sampling and generalizability since the early days of online research, long before data quality became the headline.
The problem is not only that bad actors enter surveys. The deeper problem is that bad data breaks inference. Once the respondent pool is biased, duplicated, inattentive, synthetic, or opaque, the study may still produce numbers, but those numbers may no longer generalize to the population the research claims to represent. That is where market research risk begins.
A quick rewind
From 2000 to 2004
It started during my time at Millward Brown Interactive in San Francisco in the early 2000s. Even then, many researchers resisted the move online, questioning the biases inherent in online research.
Survey Sampling International (SSI) was beginning to transition telephone-based sampling into online panels, and they weren't even the dominant player. Feasibility was always the question with their SurveySpot panel.
E-Rewards came knocking, but questions arose immediately about how representative the panel really was. Recruiting through loyalty programs likely skewed the sample toward frequent travelers and higher-income business professionals. In hindsight, the better question might have been, "How do I invest?" In my view, the company became one of the best-managed panel providers in the industry. E-Rewards controlled recruitment through email invitations and enforced clear participation rules.
Global Market Insite (GMI) offered another lesson. As a market research practitioner, I made it a point to join panels to understand how they operated from the respondent side. In one instance, I forgot my password and could not get back in. After a few reset attempts and even a new-account attempt, I found myself locked out, effectively blacklisted from rejoining or reactivating. Looking back, that level of control feels almost refreshing. It reflects a kind of discipline that feels less common in today’s convenience-driven sample ecosystem.
Greenfield Online marked another milestone. The company filed for an IPO in 2000, shelved the plan, and eventually went public in 2004. At the time, the online sample market was becoming scalable, investable, and commercially important. That shift was exciting, but it also exposed a question the industry still faces: What happens to research quality when the sample becomes purely a scale-to-exit business?
From 2005 to 2012
SSI and E-Rewards became two of my primary sample sources. SSI was often the go-to for consumer sample, while E-Rewards was especially useful for B2B audiences.
Early Quality Metrics
During this period, we began to see early signs that data quality challenges were becoming more systemic. In one large engagement from 2009 to 2013, we conducted research with more than 100,000 respondents across multiple panels. Because we were able to collect names and email addresses, we could observe patterns that are typically hidden from researchers.
We saw about a 3% overlap across the major panels. Roughly 5% of respondents made careless errors, such as entering an email incorrectly, misspelling it, or placing information in the wrong field. More notably, around 7% to 10% appeared to be fraudulent, manually attempting to participate more than once. Those were the good old days. The fraud was often clumsy, manual, and easier to see.
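To make the overlap and repeat-entry checks concrete, here is a minimal Python sketch of how such patterns could be surfaced once names and email addresses are available. It is not the method used in the engagement described above. The function names, the `(panel, email)` record layout, and the sample addresses are all assumptions for illustration, and simple normalization like this only catches the clumsy cases: trivial formatting differences and repeat attempts that reuse the same address.

```python
from collections import defaultdict


def normalize_email(raw: str) -> str:
    """Trim whitespace and lowercase so trivial variations compare equal."""
    return raw.strip().lower()


def find_cross_panel_overlap(records):
    """Map each email seen in more than one panel to the panels it appeared in.

    `records` is an iterable of (panel, email) pairs (hypothetical layout).
    """
    panels_by_email = defaultdict(set)
    for panel, email in records:
        panels_by_email[normalize_email(email)].add(panel)
    return {
        email: sorted(panels)
        for email, panels in panels_by_email.items()
        if len(panels) > 1
    }


def find_repeat_entries(records):
    """Map (panel, email) to its count when the same email recurs within a panel."""
    counts = defaultdict(int)
    for panel, email in records:
        counts[(panel, normalize_email(email))] += 1
    return {key: n for key, n in counts.items() if n > 1}


# Illustrative data only; these are not real respondents.
sample_records = [
    ("panel_a", "jo@example.com"),
    ("panel_b", "  JO@Example.com "),
    ("panel_a", "sam@example.com"),
    ("panel_a", "sam@example.com"),
]
overlap = find_cross_panel_overlap(sample_records)
repeats = find_repeat_entries(sample_records)
```

In this toy data, one address shows up in two panels and another is submitted twice in the same panel. Anything more deliberate, such as a respondent using different addresses, would need identity signals beyond an email match.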
Mobile Usage
In 2012, about 10-12% of respondents were accessing surveys on their phones. We monitored this group closely and, in some cases, restricted access, especially when concept images were involved. The devices at the time were nowhere near what they are today, and they introduced a real layer of response bias tied to screen size and form factor. That made survey design and look-and-feel even more critical. By 2017, the share of respondents coming in via smartphone had nearly tripled.
From 2013 to 2019
We conducted our own internal research on research (RoR), studying sampling, survey design, and data quality across both traditional panels and programmatic samples.
Overall Topline Learnings
The headline findings in our 2017 study were sobering:
Regardless of sample type, 15% of respondents exerted questionable effort, failing multiple trap questions. The problem wasn't unique to any one source. It was systemic.
Age mattered more than expected. About 30% of 18 to 34-year-olds showed questionable effort.
Respondents aged 55 to 70 had the lowest failure rates and took three to four minutes longer to complete the same survey. All things considered, older respondents simply seemed more engaged, as the quality of their data suggested.
Traditional Non-programmatic vs. Programmatic Sample
The programmatic sample presented its own challenges.
Traditional, non-programmatic panel respondents were twice as likely to complete a recontact survey.
Programmatic fielding was erratic. In one instance, 600 completed interviews came in within 20 minutes, followed by sporadic data collected for the remainder of the field period.
This uncontrolled velocity suggested that once a survey was inserted in the routing system, sources had little control over the sampling.
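One way to watch for this kind of burst is to track how many completes land inside any short window of the field period. The sketch below is an assumption about how such a check might look, not a description of any routing system or of the monitoring we used. The 20-minute window simply mirrors the example above, and the function names and timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def max_completes_in_window(timestamps, window=timedelta(minutes=20)):
    """Return the largest number of completes falling inside any span of `window`."""
    ordered = sorted(timestamps)
    best = 0
    start = 0
    for end in range(len(ordered)):
        # Slide the left edge forward until the span fits inside the window.
        while ordered[end] - ordered[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


def burst_share(timestamps, window=timedelta(minutes=20)):
    """Share of all completes that arrived in the single busiest window."""
    if not timestamps:
        return 0.0
    return max_completes_in_window(timestamps, window) / len(timestamps)


# Illustrative completion times, in minutes after launch (made-up values).
launch = datetime(2017, 1, 1, 9, 0)
completes = [launch + timedelta(minutes=m) for m in (0, 5, 10, 19, 60, 180)]
busiest = max_completes_in_window(completes)
share = burst_share(completes)
```

A high burst share early in the field period is a prompt to pause and inspect the incoming sample, not proof of a problem on its own.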
Survey Design & Implementation
We tested a three-flag QC approach: speed, question traps, and human review for duplicate IPs and verbatim responses. Here is what we learned:
Depending on the speed threshold applied, between 16% and 28% of respondents were flagged across all three methods.
The standard threshold at the time was catching only the most egregious offenders. The data made a clear case for tightening controls and monitoring systems in survey design and throughout the interview flow.
One finding that tends to get overlooked: questionable effort was highest on Day 1 of fielding and declined over time. The respondents who rushed in first were the most problematic.
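The three-flag approach can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not our production QC. The `Complete` record, the speed cutoff expressed as a fraction of the median interview length, and the rule that more than one failed trap counts as a flag are all hypothetical choices. The human review of verbatim responses cannot be automated this way, so duplicate IPs are only marked for a reviewer to look at.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import median


@dataclass
class Complete:
    respondent_id: str
    seconds: float      # interview length in seconds
    traps_failed: int   # number of trap questions failed
    ip: str


def flag_completes(completes, speed_fraction=0.5, max_traps_failed=1):
    """Return {respondent_id: set of flag names}; thresholds are illustrative."""
    cutoff = speed_fraction * median(c.seconds for c in completes)
    ip_counts = Counter(c.ip for c in completes)
    flags = {}
    for c in completes:
        reasons = set()
        if c.seconds < cutoff:
            reasons.add("speed")
        if c.traps_failed > max_traps_failed:
            reasons.add("trap")
        if ip_counts[c.ip] > 1:
            # Marked for human review, not automatic removal.
            reasons.add("duplicate_ip")
        flags[c.respondent_id] = reasons
    return flags


def flagged_share(completes, **thresholds):
    """Share of completes carrying at least one flag."""
    flags = flag_completes(completes, **thresholds)
    return sum(1 for reasons in flags.values() if reasons) / len(flags)


# Made-up completes for illustration only.
sample = [
    Complete("r1", 600, 0, "10.0.0.1"),
    Complete("r2", 200, 0, "10.0.0.2"),
    Complete("r3", 580, 2, "10.0.0.3"),
    Complete("r4", 620, 0, "10.0.0.4"),
    Complete("r5", 610, 0, "10.0.0.4"),
    Complete("r6", 350, 0, "10.0.0.6"),
]
loose = flagged_share(sample, speed_fraction=0.5)
strict = flagged_share(sample, speed_fraction=0.6)
```

Raising `speed_fraction` from 0.5 to 0.6 flags one more respondent in this toy data, which is the same sensitivity to the speed threshold described above, just at a much smaller scale.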
These insights continue to shape how we build sample frames, design surveys, and manage fieldwork.
The Landscape Today
What we were seeing back then was a warning sign. The shift from managed panels to programmatic and exchange-based sampling has created a fragmented ecosystem that is, in many cases, opaque. Efficiency improved while control and transparency declined. That tradeoff happened quietly, at scale, and the industry is still dealing with the consequences.
The "get paid to" affiliate model only accelerated it. When respondents are recruited through cash or reward-based offers on third-party sites and apps, the primary incentive is earning, not answering. Traffic gets routed into surveys with little regard for intent or identity.
Panel consolidation added another layer. What was once a competitive landscape with distinct approaches has narrowed into a handful of scaled players. SSI and Research Now merged in 2017, and the combined company later rebranded as Dynata. By 2024, Dynata was restructuring under bankruptcy pressure. Opinions 4 Good, once a known panel provider, saw its executives federally indicted in 2025 for large-scale survey fraud. These are not isolated events. They are signals of a system that prioritized throughput over integrity for too long.
The numbers reflect it. Without robust detection systems, up to 35% of collected data can be biased, duplicated, or fabricated. Compare that to the 7-10% manual fraud we were seeing in the early 2010s, when this was still done by hand. Technology didn't solve the fraud problem. It scaled it.
At SampleCon, an annual sample industry conference, a speaker tried to map the current provider ecosystem on a single slide. It was so fragmented that even the organizers couldn't agree on how to categorize the players. That image says more about the state of the industry than any white paper.
The billboard in Times Square asks a simple question using a familiar line. If it looks like a duck and quacks like a duck, it's probably a bot in your survey. That line works because it used to be obvious. Today, it takes a lot more than common sense to tell the difference, and that's exactly why working with the right partner and the right systems matters more than ever.



