October 10, 2018

The U.S. GFS Beats the European Center Model on Hurricane Track--Again

Today, an extraordinarily intense Hurricane Michael hit the Gulf coast near Panama City, Florida, with devastating effects on some coastal towns, and plenty of damage inland.   The central pressure of Michael was roughly 919 hPa, making it one of the deepest (lowest pressure) hurricanes to strike the U.S. in a century.   To provide some perspective, a deep Pacific low hitting the Northwest coast might have a central pressure of 980 hPa, and our strongest storm in a century...the Columbus Day Storm of  1962...bottomed out at around 955 hPa.

But let's talk about the forecasts for Michael. First, the track forecasts were stunningly good in the days before landfall. The graphic below shows the forecasts by a number of numerical forecast models, all starting at 5 PM on Saturday. Most of the models nailed the track. The blue line is the current US global model (the GFS). Of some worry, the proposed new global model (FV3, brown) did not do as well.

But let's look at the track error a bit more quantitatively, by examining a graphic of the mean absolute track error of various modeling systems for 12, 24, 48, and 72 h into the future (from the excellent website of Professor Brian Tang of Univ. at Albany). AVNO is the current GFS model and ECMF is the famed European model. Both are similar and quite good (less than 50 km error) at 12 h, but by 72 h, the European is MUCH worse (340 km error) compared to the spectacular GFS results (75 km error).
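For readers curious how a number like "75 km error at 72 h" can be computed, here is a minimal sketch. It assumes the track error is the great-circle distance between the forecast and observed storm centers, averaged over all forecasts verified at one lead time. The function names and the spherical-Earth radius are my own choices for illustration, not taken from Professor Tang's site.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical-Earth assumption


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_absolute_track_error(forecast_positions, observed_positions):
    """Average distance (km) between paired forecast and observed centers.

    Each argument is a list of (lat, lon) tuples, one pair per forecast
    that verifies at the same lead time (for example, every 72 h forecast).
    """
    errors = [
        great_circle_km(f_lat, f_lon, o_lat, o_lon)
        for (f_lat, f_lon), (o_lat, o_lon) in zip(forecast_positions, observed_positions)
    ]
    return sum(errors) / len(errors)
```

Calling `mean_absolute_track_error` once per lead time (12, 24, 48, 72 h) for each model would give one error curve per model, which is the kind of comparison the graphic makes.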

Interestingly, the European Center did much worse on Hurricane Rosa as well.

So my colleagues at the National Weather Service should be very pleased with the GFS track forecasts lately. But there is a cloud on the horizon. The new U.S. global model (FV-3) did worse on track, and as shown below, it did much worse on intensity, measured by the central pressure of Michael.

This figure shows the forecasts of central pressure from various models (colored lines) for forecasts starting 5 PM Saturday versus the observed central pressure (black line). The National Weather Service's flagship high-resolution hurricane model (HWRF, purple line) did a decent job, but did not capture the final intensification of the storm. The current U.S. model (GFS, blue line) did almost as well. But the new FV-3 model never really revved up the storm...a major deficiency.
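A simple way to put numbers on a comparison like this is to compute the bias and the mean absolute error of forecast central pressure against observations at matching times. The sketch below is only an illustration of that arithmetic; the function name is hypothetical, and any values passed to it would need to come from real forecast and best-track data.

```python
def pressure_errors(forecast_hpa, observed_hpa):
    """Return (bias, mean absolute error) of central pressure in hPa.

    Both arguments are equal-length lists of pressures valid at the same times.
    A positive bias means the forecast pressure is too high, i.e. the
    modeled storm is too weak.
    """
    diffs = [f - o for f, o in zip(forecast_hpa, observed_hpa)]
    bias = sum(diffs) / len(diffs)
    mae = sum(abs(d) for d in diffs) / len(diffs)
    return bias, mae
```

A model that never deepens the storm would show a large positive bias that grows toward landfall, while a model that is sometimes too deep and sometimes too shallow could have a bias near zero but still a large mean absolute error.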

This is not the first time we have seen this behavior in FV-3, and it has me worried. Is there a problem with FV-3 that makes it unable to properly simulate hurricane intensification? My colleagues at the NWS will have to examine this carefully. FV-3 is a young model and may need some work to be ready for the next hurricane season, when it will be the operational model.


  1. Cliff, if you haven't already, I recommend you review the final MEG evaluation and Office of the Director brief for FV3GFS (available at the top of the FV3GFS evaluation page: http://www.emc.ncep.noaa.gov/users/Alicia.Bentley/fv3gfs/).

    Short version: the original FV3 parallel used hord (horizontal advection) option #6. This resulted in TCs that were too weak. After consultation with GFDL, the option was changed to #5, which eliminated much of the problem. We shouldn't expect current global resolutions to match the actual central pressure of TCs. GFS has a well known bias of over-strengthening TCs, so even in cases where it is "better", like Michael, it is often for the wrong reasons. The wind-pressure relationship is much improved in FV3.

    The more concerning issue with TCs in FV3 is a degradation in the longer range (> 5 days). The current hypothesis is that this is due to FV3 being too progressive with mid-latitude systems that are causing some TCs to recurve earlier, resulting in large position errors. This is something that is being investigated (along with the fast mid-latitude storms in general).

  2. When the NWS sent out the slides for the FV3 final review - there were noteworthy concerns from NHC...


  3. It makes one wonder how the FV-3 modeling algorithms are tested. Surely the computing scientists who created this know how to properly test their work...or do they?

  4. Didn't the FV3 do the best on the last hurricane? Is this just random variation among the models? If so, is there a limit on how well the models can do (the butterfly effect)?

