There is one group of prediction professionals with far more experience than political pollsters: the weather forecasting community. No one else has more day-to-day experience and success in using large amounts of data and in predicting the future.
Thus, the natural question is: can the insights of the highly experienced and technically sophisticated weather prediction community assist our embattled colleagues in political polling and forecasting? Perhaps.
As an example of how the political polls were challenged, below is a plot of the probabilities of Clinton winning over time. Most of the polls and combinations of polls during the last week were giving her an 80-95% chance of winning, with the exception of the 538 multi-poll guidance, which gave her only about a 65% chance of winning (hats off to Nate Silver). This was clearly a suggestion of considerable uncertainty (I certainly wouldn't get on a plane whose chance of crashing was 35%!).
Many, if not most, of the polls made their biggest mistakes in the Midwest and Pennsylvania, where they substantially underestimated Trump's strength.
Weather forecasting versus political polling
Weather prediction and political polling have some similarities and differences.
Both attempt to use limited real-world data to determine the "truth" about the current situation. Meteorologists use a range of observational assets (e.g., satellites, surface observations, aircraft data) to determine the current three-dimensional characteristics (e.g., temperature, humidity, winds) of the atmosphere. Pollsters' "observations" are the current opinions of the voting population, predominantly determined through telephone polling.
Meteorologists forecast the future by using complex equations that describe the evolution of the atmosphere if they are provided an accurate description of the current atmospheric state. Pollsters really can't predict the future but try to accurately estimate the current opinions of potential voters and elucidate short-term trends.
Both meteorologists and pollsters are heavy users of statistics. Meteorologists use statistics to assist in filling in the gaps between observations and to compensate for biases in the atmospheric models. Pollsters use statistics to combine and filter the polling information they gather and to use a limited sample of voters to provide an accurate picture of the larger voting population.
The Achilles Heels of Political Pollsters
Nearly all modern political prediction is based on one source of information: polling, which attempts to ask a representative sample of folks about their political intentions-- how they plan on voting. But there is a minefield of problems pollsters have to deal with, such as:
- Telephone calling is the main approach to polling, but technological changes are a big issue. More and more folks have moved from landlines to cell phones. Many people have caller ID and deliberately don't answer unknown callers. The person who answers a phone may not be a voter. The problems are endless and growing.
- Then there are the sampling problems: the number of calls is relatively small, but they have to provide sufficient information about the actual voter cohort.
- Changing demographics and communication technologies can result in the behavior of previous polling being less relevant to the current election.
- Because of the above and other issues, pollsters really can't provide useful uncertainty information about their polls.
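To put a rough number on the sampling issue alone, here is a minimal Python sketch of the textbook margin of error for a simple random sample. The function name `margin_of_error`, the sample size of 1,000, and the 95% z-value of 1.96 are illustrative assumptions, not figures from any particular poll. The formula covers only random sampling error, not the non-response and technology problems listed above.

```python
import math

def margin_of_error(share, sample_size, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    z=1.96 corresponds to a 95% confidence level (an assumed convention).
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(share * (1.0 - share) / sample_size)

# Hypothetical poll: 1,000 respondents, candidate at 50%.
moe = margin_of_error(0.50, 1000)
print(f"Margin of error: +/- {100 * moe:.1f} percentage points")
```

With these assumed inputs the sampling error alone is roughly three percentage points, which is already larger than the margin in a close race, and quadrupling the sample size only halves it.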
Can the Meteorological Community Aid Political Pollsters?
Both the meteorological and political polling communities deal with a central problem: using incomplete observational information to provide a complete picture of the current situation. For meteorologists, the 3D distribution of all weather variables. For pollsters, the intentions of the folks that will actually go to the polls.
I would suggest that meteorologists are far more sophisticated in estimating the current situation based on observations. Pollsters use only one type of information to estimate the election results: the expressed intentions of potential voters. Meteorologists go one step further: they use the correlations of other types of information to inform their estimates of different parameters.
Let me give you an example. Meteorologists need to describe the three-dimensional distributions of temperature, wind, humidity, and other parameters to serve as the initial state of weather prediction models. A difficult parameter is humidity (the amount of water vapor). It is critical, but we don't have many observations of it aloft, in contrast to wind and temperature. But weather folks have a powerful new tool to deal with that problem: we have found the correlations between humidity and parameters for which there are a lot more observations (e.g., wind). How we do this is a bit technical, but one approach is to use ensembles of many forecasts and make use of all those forecasts to determine correlations between various parameters and locations.
So by using all kinds of parameters at various locations, we can get a much better analysis of a poorly observed parameter, in this case humidity. I use this technology extensively for weather forecasting through something called Ensemble Kalman Filter data assimilation.
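For readers who want to see the core idea in code, here is a heavily simplified Python sketch. It is not an operational Ensemble Kalman Filter, just the scalar version of the update step: an ensemble of forecasts supplies the covariance between wind (observed) and humidity (not observed), and that covariance carries the wind observation over into the humidity estimate. The function name `update_unobserved_mean`, the five-member ensemble, and all of the numbers are made up for illustration.

```python
import statistics

def update_unobserved_mean(observed_members, unobserved_members,
                           observation, obs_error_var):
    """Adjust the ensemble-mean of an unobserved variable using an
    observation of a different, correlated variable."""
    n = len(observed_members)
    if n < 2 or n != len(unobserved_members):
        raise ValueError("need two equal-length ensembles with >= 2 members")
    obs_mean = statistics.fmean(observed_members)
    unobs_mean = statistics.fmean(unobserved_members)
    # Sample covariance between the two variables across ensemble members.
    cov = sum((o - obs_mean) * (u - unobs_mean)
              for o, u in zip(observed_members, unobserved_members)) / (n - 1)
    var_obs = statistics.variance(observed_members)
    # Kalman gain: how much of the observation-minus-forecast difference
    # is transferred to the unobserved variable.
    gain = cov / (var_obs + obs_error_var)
    return unobs_mean + gain * (observation - obs_mean)

# Hypothetical 5-member ensemble at one location (illustrative values only).
wind = [8.0, 9.0, 10.0, 11.0, 12.0]          # m/s, forecast by each member
humidity = [40.0, 45.0, 50.0, 55.0, 60.0]    # percent, forecast by each member

# A wind observation of 11 m/s with an assumed error variance of 2.5.
analysis_humidity = update_unobserved_mean(wind, humidity, 11.0, 2.5)
print(f"Updated humidity estimate: {analysis_humidity:.1f}%")
```

In this toy case the members with stronger wind also have higher humidity, so a wind observation above the ensemble mean nudges the humidity estimate upward even though humidity itself was never measured. The larger the assumed observation error, the smaller the nudge.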
So How Can Pollsters Use Weather Technology?
From what I have read, most political pollsters work the same way as meteorologists did in the old days, considering only one parameter during their analysis of data (also called data assimilation). Pollsters use telephone information (asking people how they will vote) to estimate the voting of the entire voter contingent.
But what if they worked more like meteorologists? Use other information to estimate what everyone cares about: who will win the election. Pollsters could select from a long list of potential "predictors"--or parameters that could be combined statistically to estimate who would win the election. Examples might include:
1. Demographics (ancestry, age, etc.)
2. Unemployment rate (long term, short term, trend)
3. Education level
4. Facebook activity for each candidate
5. Trend in economic activity
6. Crime rate and trend in crime rate.
...and many more.
You get the idea: parameters that can be estimated accurately can be used to predict who will win the election. There are a variety of statistical techniques (e.g., multilinear regression, neural nets, AI approaches) that can be used to select the most relevant "predictors" and find the relationship to who will win.
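As a rough illustration of the simplest of these techniques, here is a small Python sketch of multilinear regression fitted by least squares (normal equations solved with Gaussian elimination). Everything in it is hypothetical: the two predictors, the synthetic "vote share" values (generated from a known formula so the fit can be checked), and the helper names `solve_linear_system`, `fit_multilinear`, and `predict`. A real model would need real data, many more predictors, and careful validation.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_multilinear(rows, targets):
    """Least-squares fit of target = intercept + sum(coef_k * predictor_k)."""
    design = [[1.0] + list(r) for r in rows]  # prepend intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coefs, row):
    """Apply fitted coefficients to one row of predictor values."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))

# Synthetic, made-up data: (predictor_1, predictor_2) for six regions.
rows = [(1.0, 10.0), (2.0, 12.0), (3.0, 9.0),
        (4.0, 15.0), (5.0, 11.0), (6.0, 14.0)]
# Targets generated from an assumed relationship, NOT real election results.
targets = [50.0 + 2.0 * x1 - 0.5 * x2 for x1, x2 in rows]

coefs = fit_multilinear(rows, targets)
print("intercept and coefficients:", [round(c, 3) for c in coefs])
print("prediction for a new region:", round(predict(coefs, (7.0, 13.0)), 2))
```

Because the synthetic targets contain no noise, the fit simply recovers the formula that generated them. With real data the interesting work is in choosing predictors and checking that the relationship holds up on elections the model has not seen.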
Musing about this, I did a search and found that someone already tried something like this and claims he has gotten every presidential election correct for which he made a prediction: Professor Allan Lichtman of American University.
Professor Lichtman uses a series of parameters or "keys" to decide the winner and does not depend on polling information at all. I suspect an artificial intelligence (AI) engine like IBM's Watson could have a lot of potential.
A Final Note: Probabilistic Prediction
One lesson both meteorologists and political pollsters have learned is that the only way to forecast is to do so probabilistically. All forecasts are uncertain and we can't simply give our users a number: the high will be 65F, or Trump will win by 2%. We need to give the probabilities of various outcomes. But we have a problem: we still need to develop reliable approaches for calculating realistic probabilities, something that will keep all of us busy for the immediate future.
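One common way forecasters check whether stated probabilities are realistic is the Brier score, the mean squared difference between forecast probabilities and what actually happened (1 if the event occurred, 0 if not). The Python sketch below is only an illustration: the function name `brier_score` and the two hypothetical forecasters and their track records are made up.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    0.0 is a perfect score; lower is better.
    """
    if not probabilities or len(probabilities) != len(outcomes):
        raise ValueError("need equal-length, non-empty inputs")
    return sum((p - o) ** 2
               for p, o in zip(probabilities, outcomes)) / len(probabilities)

# Hypothetical record: four forecasts that an event will happen,
# followed by whether it did (1) or did not (0).
outcomes = [1, 1, 1, 0]
confident = brier_score([0.90, 0.90, 0.90, 0.90], outcomes)
cautious = brier_score([0.65, 0.65, 0.65, 0.65], outcomes)
print(f"confident forecaster: {confident:.3f}")
print(f"cautious forecaster:  {cautious:.3f}")
```

In this made-up record the event failed to happen one time in four, and the forecaster who left more room for that outcome ends up with the lower (better) score. A single event says little either way; the score only becomes meaningful over many forecasts.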
Wouldn't you consider the quality of the data as an important element in making a forecast, whether it's meteorological or political? One of the problems I heard a fair bit from pollsters and essentially amateur speculators is that people are much more savvy in how they provide answers to pollsters, so one could conclude that the quality of the data could be an issue, which makes it difficult to forecast anything. If the starting data is faulty, you're only going to get an errant forecast as a result.
ASOS machines have mechanisms to determine if a machine is faulty, and even without knowing, you can tell fairly quickly if one reading in a mesoscale network of automated weather stations might be an anomaly against the rest of the field. Meteorology has statistical analysis that can be employed to determine the reliability of both the initial data and the subsequent forecasts, and I wonder if polling might need to develop its own set of statistical checkpoints?
Cliff Mass weather blog should be renamed Cliff Mass political blog
A question and an opinion.
While weather forecasting has many quantifiable benefits, what purpose does predicting the outcome of an election serve?
Forecasting cannot, by itself, alter the occurrence of weather whereas election polling can influence the outcome.
Thank you for your efforts in writing this blog.
You left out one of the most important factors that all of the major political polling outfits mentioned in their post-mortems. Namely, that fears of social ostracism and/or disapproval from one of society's so-called elites caused many voters to give them incorrect answers, or in some cases no answers at all. We've seen the same dynamic at play in blogs and in the MSM overall, where anyone not toeing the PC-approved MO is immediately called names or has their character smeared.
As noted above, quality is highly unpredictable when people - unlike measurable weather factors - flat out lie or otherwise mask their true nature, the whole while masquerading as something they are not. The only good political "polls" during this election involved significant tweaking by correct intuitive guesses rather than purely statistical data processing.
To what degree can the same be said for weather forecasting?
As other commenters noted, a big difference is that weather predictions don't motivate measuring instruments to lie or the atmosphere to change its behavior. That said, thanks for the reference to https://en.wikipedia.org/wiki/Ensemble_Kalman_filter as an approach to assimilate the interplay between the model parameters, observational data, and predictions. I agree that political modelers should assimilate a wider range of data and an ensemble of models rather than try to simply estimate the probability a majority of some group will vote one way or the other. All the self-referential stuff makes the analysis messy, but it seems possible in principle to model and measure distrust of pollsters rather than just assuming that measurement errors are random and cancel out.
BTW, the mathematical meteorology pioneer Lewis Fry Richardson inspired many of the early efforts to use statistical analysis and mathematical models to understand political phenomena, especially conflict. When I was a grad student in this area long ago, many papers and dissertations started with his differential equations models of an arms race or the historical "statistics of deadly quarrels" he gathered. It looks like people are still taking him seriously, e.g. http://www.independent.co.uk/news/world/lewis-fry-richardsons-weather-forecasts-changed-the-world-but-could-his-predictions-of-war-do-the-9679295.html
Regarding "hats off to Nate Silver":
I highly recommend his book "The Signal and the Noise: Why So Many Predictions Fail-- But Some Don't"
He even discusses weather forecasting.
The reality is it is extremely difficult to forecast anything with complex and moving variables. Politics, the weather, and climate all fall into that category (so does the stock market, let me know when you have a forecast that works for that).
My biggest concern is the use of forecasting as a tool for political manipulation. This is happening all of the time now, particularly with weather and climate. One glaring example of a forecast used explicitly for political purposes was last fall when the DOE used a winter forecast here of “drought: continue or worsen” to foment fear tied to climate change: “The Department of Ecology's --- hopped on a call this morning to tell reporters that ‘the climate deck may be stacked against us,’ and she painted a pretty dire picture of what's already happened and what we can expect to happen next” (Stranger, 9/24/15).
I probably do not need to remind any readers of this blog that last winter was the wettest on record here.
We need to get politics out of the weather and climate sciences. Completely. Politics utterly corrupt the unvarnished truth of real science.
I forgot to add that the reliability of climate forecasting is positioned somewhere between weather and the others, due largely to quality of process, data and availability of feedback. Again, because the human factor is absent (human cognitive psychology, that is) a giant unknown variable is not in play in the forecasting, which is a good thing. On the other hand, the anthro GHG effects are actually measurable and predictable variables that can be reliably plotted in the climate models. In all cases, the big unknown that we consistently overestimate our ability to predict is what humans will do next!
I don't doubt that some of my observations are in error as I am not a meteorologist or market wizard. I have merely extensively researched acquisition of skill and cognitive decision making. I hope Cliff or someone else with skill can step in and confirm or deny.
Here's just one example of Governmental agencies becoming completely politicized:
http://www.eenews.net/stories/1060045642
So the poor little dears at the EPA needed their bankies, they're so upset at the election. Of course it follows that many took the day off, it was so upsetting to them. God forbid they start acting like adults and do their jobs. And they wonder why so many among their fellow citizens don't take them seriously anymore - does anyone remember that it was Nixon who originated the EPA in the first place? If people truly wish to have carbon laws put into effect regarding their concerns over GW, then they first have to realize that they must persuade others to their cause, and not hector, lecture and attempt to frighten them with apocalyptic scenarios if they don't do exactly as they wish.
Eric Blair, it is my understanding that those at EPA are upset because of what Trump has said he would do to their agency (see text and link below). Any scientist working for the agency should be very concerned when Trump appoints a climate change skeptic to head up the agency.
16 of the 17 hottest years on record have been in the 21st century. The other one was 1998. No need to fear "apocalyptic scenarios": the rising levels of CO2 and resulting warming temps, melting ice, etc. are already here. Employees of the EPA know that and now also know their agency is slated to be run by a pro-coal lobbyist.
"Trump said would get rid of the Environmental Protection Agency, the agency created in 1970 by President Richard Nixon that has become the nation’s main federal lever for mitigating the impacts of climate change. “Environmental Protection, what they do is a disgrace,” he told “Fox News Sunday” host Chris Wallace in October of last year. “Every week they come out with new regulations. They’re making it impossible.”
http://www.salon.com/2016/11/14/denying-climate-change-is-only-part-of-it-5-ways-donald-trump-spells-doom-for-the-environment_partner/
No. Because in this case both the weather forecasters and political pollsters have a predetermined consensus outcome to which they are wedded. True forecasting requires more objectivity than we have seen in either profession in many years.
Please ignore my previous comment. It refers to a prior commentary, entirely missing, which explains how it might seem nonsensical.
John, Trump cannot and will not get rid of the EPA, and if people who consider themselves scientists are so fragile that they must go home in order to nurse their hurt feelings then may I suggest they reconsider their current occupations. If things are so serious as you believe, running away to have Mommy and Daddy make the bad man go away is not going to help your case. This kind of behavior only strengthens the opposing side's arguments.