Tuesday, April 2, 2019

The Genius of Crowd Weather Forecasting

Last quarter I taught Atmospheric Sciences 101, and as a fun extra-credit activity students had the opportunity to participate in a forecasting competition in which they predicted temperatures and probability of precipitation at Sea-Tac Airport. The National Weather Service forecast was scored as well, to provide a comparison to highly trained and experienced forecasters. In addition, we averaged the predictions of all the students, producing what is known as a consensus forecast.

Now who do you think won? The pros at the Seattle National Weather Service office or the average of the inexperienced weather newbies in my class? The answer is found below: the consensus of the students was considerably superior to the Weather Service folks.

Students were number two overall and the NWS was in sixth place.

A fluke? No, it happens this way virtually EVERY YEAR. To illustrate this, here are the results for 2004. In that year, the average of the students was fourth and the NWS experts were in 10th place. You will notice that some individual students sometimes came in first or ahead of the NWS...that could be just random luck due to the brevity of the forecast contest (1-1.5 months).

This phenomenon is often called the Wisdom of Crowds and has been the subject of a number of journal articles. So why might an average of the students be better than an NWS forecaster? Some possibilities include:

1.  The average forecast of a group will tend to damp out forecast extremes, which produce very bad scores when they are wrong.
2.  Students look at many different sources of information, using weather information in different ways and viewing many different forecasts (e.g., from various private sector groups). Forecasts derived from an average of many different sources tend to be more skillful on average.
3.  Some of them might have taken a look at superior forecasts, say from weather.com or AccuWeather.
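
For the curious, the first point can be sketched in a few lines of Python. This is only an illustration, not the contest's actual scoring: the observed temperature, the number of students, and the size of their errors are all made-up assumptions, and the function name is hypothetical.

```python
import random
from statistics import mean


def consensus_vs_individuals(truth, forecasts):
    """Compare the error of the averaged forecast with the individual errors."""
    consensus = mean(forecasts)
    consensus_error = abs(consensus - truth)
    individual_errors = [abs(f - truth) for f in forecasts]
    return consensus_error, mean(individual_errors), max(individual_errors)


# Assumed values for illustration only (not data from the class contest).
rng = random.Random(42)
observed_high = 52.0  # hypothetical observed high temperature, deg F
student_forecasts = [observed_high + rng.gauss(0, 4) for _ in range(30)]

consensus_error, typical_error, worst_error = consensus_vs_individuals(
    observed_high, student_forecasts
)
print(f"consensus error:          {consensus_error:.2f}")
print(f"average individual error: {typical_error:.2f}")
print(f"worst individual error:   {worst_error:.2f}")
```

With absolute error, the consensus can never do worse than the average individual error, because the over-forecasts and under-forecasts partly cancel. It can still lose to a particular individual on a particular day.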

I can think of other possibilities...perhaps you can too. Several of the academic studies found that the crowds are only wise when the members make their decisions independently of each other, thus bringing a variety of viewpoints and approaches. If some folks are pushing their predictions on others or acting as "thought leaders", they could very well undermine the skill of the crowd forecasts.

Thus, diversity in prediction is important and should be fostered.    This finding should be kept in mind not only for weather forecasting, but for other prediction issues, such as climate prediction, where attempts to suppress diversity of viewpoints could produce negative results.

This wisdom of crowds finding is closely related to why we make ensemble forecasts, running models many times, each slightly differently. The average of these many forecasts is on average the best forecast to use.
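
Here is a minimal sketch of the ensemble idea, with a big caveat: it uses the logistic map, a toy chaotic equation, as a stand-in for a real weather model, and the starting value, perturbation size, and number of members are assumptions chosen purely for illustration.

```python
import random
from statistics import mean


def run_toy_model(x0, r=3.9, steps=15):
    """Iterate the logistic map, a toy chaotic system standing in for a forecast model."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x


rng = random.Random(0)
true_start = 0.4  # assumed "true" initial state
truth = run_toy_model(true_start)

# Each ensemble member starts from a slightly different initial state.
members = [run_toy_model(true_start + rng.gauss(0, 0.001)) for _ in range(20)]
ensemble_mean = mean(members)

# Squared error of the ensemble mean versus the average squared error of the members.
ensemble_sq_error = (ensemble_mean - truth) ** 2
avg_member_sq_error = mean([(m - truth) ** 2 for m in members])

print(f"ensemble-mean squared error:  {ensemble_sq_error:.5f}")
print(f"average member squared error: {avg_member_sq_error:.5f}")
```

Measured by squared error, the ensemble mean is never worse than the average of its members, though a single lucky member can still beat it on any one run.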

So next time you need a forecast, try averaging the guesses of your friends or classmates. Does this idea apply to elections? Now that is a subject I think I want to avoid.


KIm Van Vleet said...

The National Weather Service issues Zone Forecasts out to seven days, Fire Weather Planning Forecasts, Terminal Aerodrome Forecasts, Marine Forecasts, Hydrology Forecasts, and Recreation Forecasts; issues watches, warnings, and advisories; coordinates those forecasts; responds to media and public inquiries, etc., but your students are forecasting for a single point for a couple of parameters. Who would do better? A student who has to learn and is tested on one chapter from a textbook or a student who has to learn and is tested on the entire textbook?

John Vidale said...

The real test, of course, requires enough rounds to iron out random fluctuations. For example, if the students guess worse, but the variance of the student average is still big enough that 20% of the time their average beats the pros, who also have week-to-week variance, there haven't been enough rounds compiled for a definitive contest.

If a group of students without self-calibration and limited information outperforms the pros, the pros are screwing up badly.

Twinkle said...

Thank you, Cliff, for your interesting blog on Crowd Weather Forecasting. As far as elections go...that's something even I, a non-scientist, can easily "forecast". Given the fact that our political "climate" hasn't changed in a very, very long time and doesn't look like it's going to change soon, I can guarantee it's going to be more of the same.

Ludwigs60 said...

Fascinating. I'm going to get the book from my library. Thanks for the tip.

Somebody said...

This is very similar to a Cold War problem/solution. A Russian sub sank and we wanted to find it to steal it before the Russians could. So pinning down a location and focusing a search area was the most important thing. So the Navy took a room full of experts in Soviet tactics, movements, wrecks and hydrology. The admiral in charge asked them to work independently and pinpoint exactly where they thought the sub had sunk based on its last known location. The admiral then averaged all the "guesses". The sub was nearly exactly where the average of the guesses pinned it. We then took on the task of attempting to raise a missile from the bottom of the ocean. Read "Blind Man's Bluff"...a fascinating book.

Sharon said...

Interesting, Cliff!

Fold-It, if I remember the name correctly, was developed at UW Computer Science and Engineering. It was a video game designed to solve difficult scientific questions. They had some success using a group of gamers who solved their first question. I don't remember all of the details, but your example and theirs reminds us people can sometimes achieve more working together.

Snape said...

Or maybe they just watched the Weather Channel?? According to ForecastAdvisor, the Weather Channel was the tops for Seattle last month. Near the top for 2018.