The trouble with Internet surveys

Gary Langer, the director of polling at ABC News, shares the bad news regarding Internet surveys.

In the most extensive such analysis to date, David Yeager and Prof. Jon Krosnick compared seven non-random internet surveys with two others based instead on random or so-called probability samples. The non-probability internet surveys were less accurate, and customary adjustments did not uniformly improve them.

While the random-sample surveys were “consistently highly accurate,” the internet surveys based on self-selected or “opt-in” panels “were always less accurate, on average, than probability sample surveys, and were less consistent in their level of accuracy,” the researchers said. Further, they said, adjusting these samples to known population values had no effect on accuracy (and in one case even worsened it) as often as that process, known as weighting, improved it.

Also noteworthy:

While this paper is the first to evaluate the subject in such detail, intimations of these problems were posted in a blog item this summer by Reg Baker, COO of the research firm Market Strategies International. Estimates of smoking prevalence were similar in three probability samples, he reported, but less similar – with variation of as many as 14 points – in 17 opt-in online panels. In such panels, he said, “the results we get for any given study are highly dependent (and mostly unpredictable) on the panel we use. This is not good news.”

Yeager and Krosnick, meanwhile, provide one more eye-opener: The average highest weight for any one respondent across the opt-in online samples was 30 – one respondent, that is, standing for the equivalent of 30 in the full dataset. (And one went as high as 70.) The highest weights in the two probability samples, by contrast, were 5 and 8.
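
To get a feel for why a top weight of 30 is an eye-opener, here is a minimal sketch using Kish's approximation for effective sample size, (sum of weights)² / (sum of squared weights). The sample below is hypothetical and not taken from the Yeager and Krosnick paper: 1,000 respondents, one of whom carries a weight of 30.

```python
def kish_effective_n(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


# Hypothetical samples of 1,000 respondents each (illustrative numbers only).
equal_weights = [1.0] * 1000            # everyone counts once
one_heavy = [1.0] * 999 + [30.0]        # a single respondent counts 30 times

print(kish_effective_n(equal_weights))         # 1000.0
print(round(kish_effective_n(one_heavy), 1))   # 557.6
```

Under these assumptions, a single heavily weighted respondent cuts the effective sample size nearly in half, so the precision is closer to that of a survey of about 558 people than of 1,000.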

Nothing new or groundbreaking here, and yes, a little inside baseball, but relevant in light of all the web surveys showing that “Teens don’t tweet.” First, convenience-sampled web surveys can’t offer standard errors. Second, the weights used to adjust them inflate sharply in cells where data are sparse, which inflates the error of any estimate built on those cells. This sparseness commonly occurs when studying the behavior of a low-response population such as young people, and is compounded when studying an early-adoption phenomenon like tweeting.
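
A rough sketch of the sparse-cell problem, assuming simple post-stratification on one variable (weight = population share / sample share). The age groups, shares, and counts are made up for illustration and do not come from any of the surveys discussed here.

```python
def poststrat_weights(pop_share, sample_counts):
    """Post-stratification weight per group: population share / sample share."""
    n = sum(sample_counts.values())
    return {g: pop_share[g] / (sample_counts[g] / n) for g in sample_counts}


# Hypothetical: young people are 15% of the population but 2% of the panel.
pop_share = {"18-24": 0.15, "25+": 0.85}
sample_counts = {"18-24": 20, "25+": 980}

weights = poststrat_weights(pop_share, sample_counts)
for group, w in weights.items():
    print(group, round(w, 2))

# Share of the weighted total carried by ONE young respondent, in percent.
n = sum(sample_counts.values())
one_young_pct = 100 * weights["18-24"] / n
print(round(one_young_pct, 2))
```

In this made-up panel each young respondent gets a weight of about 7.5, so one person changing an answer moves the overall weighted estimate by roughly three quarters of a percentage point, and moves the estimate for young people alone by five points (1 in 20).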

Langer’s blog is a worthwhile resource if you’re interested in survey methods.  And I hope to resume blogging – updating my syllabus, posting some recent papers, etc. – when I get a spare moment.

via Study Finds Trouble for Internet Surveys – The Numbers.

One comment

  1. The relationship of these findings with “the Internet” is tenuous. One could say the same things about most surveys that have a self-selected and inherently biased sample.

    I know it’s easy to dismiss some of the standards for academic research as mere snobbery but many of those standards exist for good reasons. Much “marketing research” would never be accepted by academics not (just) because we’re snobs but because it’s simply not very meaningful or useful.
