Michael Bailey is a professor of American government at Georgetown University and the author of Polling at a Crossroads: Rethinking Modern Survey Research.
By enabling citizens to absorb and respond to one another’s views and interests, public opinion polling provides an essential platform for democratic deliberation. But in 2024, polls are even more important because they can legitimate an election outcome that many people are being encouraged – particularly by Donald Trump and his allies – to view with skepticism. In the best-case scenario, polling will muddle through; at worst, it will misfire spectacularly, inflaming an already volatile situation.
In their ideal form, polls provide remarkably good information. If pollsters query a random sample of voters and, crucially, they all answer, the results will be unbiased, and calculating the margin of error will be relatively straightforward. But in a world where many people no longer answer a phone call or respond to a text from an unknown number, random sampling has become a pipe dream. For example, the New York Times/Siena College poll of battleground states that contributed to President Joe Biden’s withdrawal from the race in July had 4,097 respondents – a mere one per cent of the 410,000 randomly selected people that pollsters called. (The response rate was lower among hard-to-reach – but vitally important – demographic groups, such as young people, Hispanics, and people without college degrees.)
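For a genuinely random sample, the margin-of-error arithmetic is indeed straightforward. Here is a minimal sketch in Python; the 4,097 sample size comes from the poll above, while the 50-per-cent proportion is a standard worst-case assumption, not a figure from the poll itself:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a proportion estimated from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

# 4,097 respondents, as in the NYT/Siena battleground poll cited above.
# p=0.5 is the worst case (largest standard error); z=1.96 is the normal
# critical value for 95% confidence.
print(f"{margin_of_error(4097):.1%}")  # about 1.5% -- but only if the sample is random
```

The catch, of course, is the final comment: the formula is valid only for the random sample that non-response makes impossible in practice.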
While probability-based pollsters reluctantly make do with samples rendered non-random by non-response, others have abandoned random sampling altogether, relying instead on non-probability techniques – namely, online opt-in polls. Never mind that recruiting respondents on the internet has been shown to introduce bias and produce misleading results.
Given the prevalence of non-random samples, both types of pollsters rely heavily on a wide range of adjustments. Chief among them is the use of weights to counteract the biases in their (often wildly unrepresentative) raw data. For example, the share of young people responding to surveys commonly falls far short of their share of the overall population, and – ominously for political polling – there are often too few Republicans in a sample. But who is to say whether the adjustments are correct?
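To make the mechanics concrete, here is a rough illustration of what weighting does (all figures below are invented for the example, not drawn from any real poll): each group's respondents are scaled up or down until the weighted sample matches the population.

```python
# Hypothetical illustration of weighting by age group. All numbers are
# invented; real pollsters weight on many variables simultaneously.

population_share = {"18-29": 0.20, "30-64": 0.60, "65+": 0.20}
sample_share     = {"18-29": 0.08, "30-64": 0.57, "65+": 0.35}  # young people scarce

# Each respondent's weight is their group's population share over its sample share.
weights = {g: population_share[g] / sample_share[g] for g in population_share}
print(weights)  # young respondents counted 2.5x; older respondents down-weighted

# Invented candidate support by group, to show how weighting moves the estimate:
support = {"18-29": 0.60, "30-64": 0.50, "65+": 0.45}
raw      = sum(sample_share[g] * support[g] for g in support)
weighted = sum(population_share[g] * support[g] for g in support)
print(f"raw: {raw:.1%}  weighted: {weighted:.1%}")  # about 49% vs 51%
```

Even in this toy case, the weighted answer is right only if the young people who did respond resemble the young people who did not.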
Polling must be rebuilt from the ground up. The first problem is that pollsters still think in terms of random samples and try to recreate them using statistical adjustments. But accepting that polls are non-random is a more realistic starting point. The theory of non-random sampling is well-developed and has provided important insights into survey research, including that small non-response errors within the demographic groups used for weights can significantly alter results.
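The reason is that weighting can only fix who is in the sample, not how the respondents within a group differ from the non-respondents they stand in for. A toy simulation (all numbers invented) makes the point:

```python
# Toy illustration, all numbers invented: weighting restores group shares,
# but it cannot fix bias among the respondents within a group.

# True population: 40% Republicans, of whom 90% back the Republican candidate;
# 60% others, of whom 10% do.
true_support = 0.40 * 0.90 + 0.60 * 0.10   # 42%

# Suppose the Republicans who actually answer the phone are a bit less
# supportive (85% instead of 90%) -- a small within-cell non-response error.
observed = {"rep": 0.85, "other": 0.10}

# Weighting puts the party split back to exactly 40/60 ...
weighted_estimate = 0.40 * observed["rep"] + 0.60 * observed["other"]  # 40%

# ... yet the estimate is still off by 2 points, and no weight can detect it.
print(f"true: {true_support:.1%}  weighted estimate: {weighted_estimate:.1%}")
```

A five-point gap between responding and non-responding Republicans moves the final estimate by two points, enough to flip the call in a close race, and the pollster's weights register nothing amiss.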
Restoring confidence in polling requires diagnosing non-response bias more accurately. Progress has been made on developing the tools to do so. But, in the meantime, one way to recognize the problem is to compare a poll of typical respondents (the one per cent discussed above) with one that draws in people who are less willing to participate in survey research.
Suppose that two polls for the 2024 presidential election are conducted simultaneously. One is business as usual, with a one-per-cent response rate. The other, for which people are paid to respond, has a 30-per-cent response rate. If the two groups support Kamala Harris at the same rate, then there is no sign that the willingness to respond to polls is related to the content of responses. If they differ, however, this is diagnostic evidence that the people who are less eager to respond to polls differ from the typical respondents.
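Statistically, this diagnostic amounts to comparing two independent proportions. A minimal sketch, with placeholder figures rather than real poll data:

```python
import math

def z_difference(p1, n1, p2, n2):
    """Two-sample z statistic for the gap between two poll proportions."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

# Placeholder figures: a conventional poll vs. a paid, high-response-rate poll.
z = z_difference(p1=0.48, n1=1500, p2=0.52, n2=1500)
print(f"z = {z:.2f}")  # |z| above ~1.96 flags a statistically significant gap
```

A significant gap does not say which poll is right, but it is a warning that willingness to respond is correlated with vote choice.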
There was some evidence of this before the 2020 election, suggesting that polls would understate support for Mr. Trump. By contrast, in 2022, The New York Times and Siena College ran a poll experiment in which they conducted two simultaneous surveys: one using standard methods (yielding a one-per-cent response rate) and one in which they paid people US$30 to respond (yielding a 30-per-cent response rate). They saw no differences between the paid and unpaid responses, which suggested that their conventionally weighted survey results would be accurate – as, in fact, they were.
The current state of political polling bears a strong resemblance to Mary Shelley’s Frankenstein: random sampling is dead, and our mad scientists are using weighting and modeling to jolt it back to life. They have unleashed this monster before, making the wrong calls for the U.S. presidential elections of 2016 and 2020. Pollsters were more accurate for the midterm election cycles in 2018 and 2022. With the most recent New York Times/Siena College poll of the 2024 election finding Ms. Harris and Mr. Trump locked in a dead heat, one can hope that the polling monster causes minimal damage this year. Whatever the outcome, the goal must be to keep it caged for good.
Copyright: Project Syndicate, 2024. www.project-syndicate.org