In mid-October, a Gallup poll of likely voters nationwide showed former Massachusetts Gov. Mitt Romney leading President Barack Obama by 7 percentage points. That same week, a poll by the University of Connecticut and the Hartford Courant, covering virtually the same time period, showed Obama ahead of Romney by 3 points. That’s a 10-percentage-point disparity.
The Gallup poll reported a margin of error of plus or minus 2 percentage points, while the UConn/Hartford Courant poll reported a margin of 3 points. Even if you add the maximum claimed errors, that still leaves a 5-point disparity between the results. As some candidates have observed on the campaign trail, the math just doesn’t seem to add up.
But such disparities, in this election season of rapidly shifting tides, have not been all that unusual. So what explains them?
There may be several factors at work. For starters, the concept of “margin of error” is a bit more complex than the numbers usually quoted in media coverage.
What pollsters usually mean by margin of error is something more specific, called the margin of sampling error. That’s a statistical measure of how far a result based on interviews with a limited number of voters (the typical sample size is about 1,000 to 2,000 people) is likely to differ from the result that you would get if you were able to interview all the likely voters in the country.
“A sample of 1,000 or 1,500 people gives a reasonable estimate,” says Adam Berinsky, a professor of political science at MIT and editor of “New Directions in Public Opinion” (Routledge, 2011), a book about research on public opinion.
Most of the time, studies have shown, talking to 1,000 people gives a result very close to what you would get by polling the whole country. Extrapolating to a nation of 132 million voters — roughly the number of citizens who voted in the 2008 presidential election — gives a 1,000-person poll a margin of sampling error of 3.1 percent. Double the sample size, to 2,000 people, and the margin of sampling error falls to about 2.2 percent. But the improvement diminishes rapidly: A poll of 5,000 people gives about a 1.4 percent margin, and it takes a whopping 10,000-person sample to get the margin down to 1 percent.
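The figures in the preceding paragraph can be reproduced with the textbook formula for a 95 percent margin of sampling error, 1.96 × sqrt(p(1 − p)/n), evaluated at the worst case p = 0.5. The sketch below assumes a simple random sample, which real polls only approximate. The function name is illustrative, and the finite population correction is included only to show that the size of the electorate barely affects the answer.

```python
import math

Z_95 = 1.96  # z-score for a 95 percent confidence level


def margin_of_sampling_error(n, population=None, p=0.5, z=Z_95):
    """Return the margin of sampling error as a fraction (0.031 = 3.1 points).

    Assumes a simple random sample. p=0.5 is the worst case (largest margin).
    If population is given, the finite population correction is applied.
    """
    moe = z * math.sqrt(p * (1 - p) / n)
    if population is not None:
        moe *= math.sqrt((population - n) / (population - 1))
    return moe


if __name__ == "__main__":
    voters = 132_000_000  # approximate 2008 turnout cited in the article
    for n in (1_000, 2_000, 5_000, 10_000):
        moe = margin_of_sampling_error(n, voters)
        print(f"n={n:>6}: +/-{100 * moe:.1f} points")
```

Because the margin shrinks with the square root of the sample size, cutting it in half requires roughly four times as many interviews.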
One out of 20
What does that margin of error figure actually mean? The definition is that 95 percent of the time, the sampled result should fall within that margin of the result you’d get by sampling everybody. But that also means that one time out of 20, the results would fall outside of that range — even if sampling error were the only source of discrepancies.
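One way to see the one-in-20 effect is to simulate many polls of the same electorate and count how often the result lands inside the margin. The sketch below is an idealized illustration, not a model of any real poll: it assumes a perfectly random sample, a true level of support of exactly 50 percent, and no source of error other than sampling. The function name and parameters are hypothetical.

```python
import math
import random


def simulate_coverage(n_polls=2_000, sample_size=1_000, true_share=0.5, seed=0):
    """Fraction of simulated polls whose result lands within the 95% margin."""
    rng = random.Random(seed)
    margin = 1.96 * math.sqrt(true_share * (1 - true_share) / sample_size)
    hits = 0
    for _ in range(n_polls):
        # Each respondent backs the candidate with probability true_share.
        supporters = sum(rng.random() < true_share for _ in range(sample_size))
        if abs(supporters / sample_size - true_share) <= margin:
            hits += 1
    return hits / n_polls


if __name__ == "__main__":
    coverage = simulate_coverage()
    print(f"{coverage:.1%} of simulated polls fell within the margin")
```

Under these assumptions, the share of polls inside the margin should come out close to 95 percent, with the remainder missing by sampling chance alone.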
But it’s more complicated than that, because sampling error is not the only thing that can throw off poll results.
Another potential source of error, and one that’s hard to quantify, is the nonresponse error. Pollsters begin by attempting to reach a certain randomly selected set of people that is representative of the overall population — for example, by generating a list of random phone numbers. But there are two problems: Sometimes nobody answers the phone, and even when someone does answer, they often — and increasingly — refuse to respond.
The Pew Research Center for the People & the Press, for example, says that its response rate has plummeted in the last 15 years: Its total response rate to polls, which was 36 percent in 1997, is down to just 9 percent this year. “This is very worrisome to pollsters,” Berinsky says.
Most nonresponders are people who answer the phone but refuse to take the poll. This year, Pew says, 62 percent of people called by its pollsters answered the phone, but only 14 percent of those who answered agreed to answer questions.
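Multiplying the two Pew figures together gives a rough check on the overall response rate. This is a simplified back-of-the-envelope calculation, and the variable names are illustrative. Pew's published rate may be computed with more detailed rules than a single multiplication.

```python
contact_rate = 0.62      # share of people called who answered the phone (Pew)
cooperation_rate = 0.14  # share of those who then agreed to answer questions

# Overall response rate: reached AND willing to take the poll.
response_rate = contact_rate * cooperation_rate
print(f"Overall response rate: {response_rate:.1%}")
```

The product is a little under 9 percent, in line with the response rate Pew reports for this year.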
Who doesn’t answer?
A big question is whether the people who don’t answer are different in any meaningful way from those who do: Are the nonresponders more liberal, or more conservative, than those who do respond? If so, that could skew a poll’s results.
Pew has made a serious effort to assess the possible impact of nonresponse error on its poll results: For one sample, the organization made a concerted effort to follow up with as many nonresponders as possible, asking questions designed to see if they differed from those who had answered the first time. While Pew found few significant differences between poll responders and nonresponders, there were some: Those who answered, it turns out, were much more likely to volunteer for charitable organizations, attend a church, and contact their elected officials.
A common mistake in the reporting of poll results is applying the margin of sampling error for the entire poll to subsets of the population: women, men, Democrats, Republicans, independents. But each of these subgroups is smaller than the total sample, so its margin of error is larger, sometimes much larger. “A lot of journalists don’t understand that,” Berinsky says.
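To make the subgroup effect concrete, the sketch below applies the standard 95 percent margin formula to slices of a 1,000-person poll. The subgroup shares are hypothetical, chosen only for illustration, and the calculation assumes each subgroup behaves like a simple random sample of its own.

```python
import math


def margin_of_sampling_error(n, p=0.5, z=1.96):
    """95 percent margin of sampling error for a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    total = 1_000
    # Hypothetical subgroup shares of the full sample, for illustration only.
    subgroups = {
        "full sample": 1.0,
        "half the sample": 0.5,
        "one-third": 1 / 3,
        "one-tenth": 0.1,
    }
    for label, share in subgroups.items():
        n = round(total * share)
        moe = margin_of_sampling_error(n)
        print(f"{label:<16} n={n:>5}  +/-{100 * moe:.1f} points")
```

A subgroup that makes up half of a 1,000-person sample carries a margin of about 4.4 points rather than 3.1, and one that makes up a tenth of it carries a margin of nearly 10 points.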
Another potential problem is measurement error, which occurs when a poll’s questions are poorly worded, or prefaced by information that biases the responses. The major polling organizations take great care to avoid measurement error, but polls commissioned by partisan organizations sometimes suffer from such errors.
Overall, Berinsky counsels, the best strategy is not to focus on any particular poll, but to look at a rigorous aggregation of poll results, such as those published by Pollster.com or Real Clear Politics. Such averages smooth out the variations and errors that may exist in any given poll or sample. In the 2008 election, he says, “a simple average pretty much gave you the [actual] result.”
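As a toy illustration of a simple average, the sketch below combines only the two polls described at the top of this article. A real aggregator draws on many more polls and may weight or adjust them, so this shows the arithmetic only and is not a forecast.

```python
from statistics import mean

# Romney's lead over Obama in percentage points (negative = Obama ahead),
# taken from the two polls described at the top of this article.
polls = {
    "Gallup": 7,
    "UConn/Hartford Courant": -3,
}

average_lead = mean(polls.values())
print(f"Simple average: Romney {average_lead:+.1f} points")
```

The two results, 10 points apart, average out to a 2-point Romney lead, a figure that sits between the extremes of the individual polls.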
“If you average it all together,” Berinsky adds, “it all works out.”