Sunday, February 16, 2020

U.S. Operational Weather Prediction is Crippled By Inadequate Computer Resources

U.S. global numerical weather prediction has now fallen into fourth place, with national and regional prediction capabilities a shadow of what they could be.

There are several reasons for these lagging numerical weather prediction capabilities, including lack of strategic planning, inadequate cooperation between the research and operational communities, and too many sub-optimal prediction efforts.

But there is another reason of equal importance: a profound lack of computer resources dedicated to numerical weather prediction, both for  operations and research. 


The bottom line:  U.S. operational numerical weather prediction resources used by the National Weather Service must be increased 10 times to catch up with leading efforts around the world and 100 times to reach state of the science.  

Why does the National Weather Service require very large computer resources to provide the nation with world-leading weather prediction?

Immense computer resources are required for modern numerical weather prediction.  For example, NOAA/NWS TODAY is responsible for running:

  • A global atmospheric model (the GFS/FV-3) running at 13-km resolution out to 384 hours.
  • Global ensembles (GEFS) of many forecasts (21 members) at 35-km resolution
  • The High-Resolution Rapid Refresh and RAP models out to 36 h.
  • The atmosphere/ocean Climate Forecast System model out to 9 months.
  • The National Water Model (combined WRF and hydrological modeling)
  • Hurricane models during the season
  • Reanalysis runs (rerunning past decades to provide calibration information)
  • The North American Mesoscale model (NAM)
  • The Short-Range Ensemble Forecast System (SREF)

This is not a comprehensive list.  And then there is the need for research runs to support development of the next-generation systems.  As suggested by the world-leading European Centre for Medium-Range Weather Forecasts (ECMWF), research computer resources should be at least five times greater than the operational requirements to be effective.

NY Times Magazine: 10/23/2016

How Lack of Computing Resources is Undermining NWS  Numerical Weather Prediction

The current modeling systems (some described above) used by the National Weather Service are generally less capable than they should be because of insufficient computer resources.  Some examples:

1.  Data Assimilation.  The key reason the U.S. global model is behind the European Center and the other leaders is that they use an approach called 4DVAR, a resource-demanding technique that involves running the modeling systems forward and backward in time multiple times.  Inadequate computer resources have prevented the NWS from doing this.

2.  High-resolution ensembles.   One National Academy report after another, one national workshop committee after another, and one advisory committee after another have told NWS management that the U.S. must have a large high-resolution ensemble system (4-km grid spacing or finer, 30-50 members) to deal with convection (e.g., thunderstorms) and other high-resolution weather features.  But the necessary computer power is not available.

European Center Supercomputer

3.  Global ensembles.  A key capability of any first-rate global prediction center is to run a large global ensemble (50 members or more), with sufficient resolution to realistically simulate storms and the major impacts of terrain (20-km grid spacing or better).  The European Center has a 52-member ensemble run at 18-km grid spacing.  The U.S. National Weather Service?  21 members at 35-km resolution.  Not in the same league.
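To get a feel for how far apart those two ensembles are, here is a back-of-envelope sketch. It assumes a common rule of thumb, not an official NWS or ECMWF figure: the cost of a single forecast grows roughly as the inverse cube of the grid spacing (two horizontal dimensions plus a proportionally shorter time step). The function name is hypothetical, and the estimate ignores vertical levels, forecast length, and model physics.

```python
# Back-of-envelope comparison of the two global ensembles quoted above.
# ASSUMPTION: cost per member ~ (1 / grid spacing)^3. Halving the grid spacing
# quadruples the number of horizontal grid points and roughly halves the time step.
# Vertical levels, forecast length, and physics costs are ignored.

def relative_ensemble_cost(members: int, grid_spacing_km: float) -> float:
    """Relative compute cost of an ensemble, in arbitrary units (hypothetical helper)."""
    return members * (1.0 / grid_spacing_km) ** 3

us_gefs = relative_ensemble_cost(members=21, grid_spacing_km=35.0)
ecmwf_ens = relative_ensemble_cost(members=52, grid_spacing_km=18.0)
cost_ratio = ecmwf_ens / us_gefs

print(f"ECMWF ensemble: roughly {cost_ratio:.0f}x the compute of the U.S. ensemble")
```

Under that assumption, the European ensemble takes well over ten times the computing of the U.S. one, which is in line with the size of the resource gap described in this post.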

I spend a lot of time with NOAA and National Weather Service model developers and group leaders.  They complain continuously about how they lack computer resources for development and testing.  They tell me that such resource deficiency prevents them from doing the job they know they could. These are good people, who want to do a state-of-the-art job, but they can't due to inadequate computer resources.

NOAA/NWS computer resources are so limited that university researchers with good ideas cannot test them on NOAA computers or in facsimiles of the operational computing environment.  NOAA grant proposal documents make it clear:  NOAA/NWS cannot supply the critical computer resources university investigators need to test their innovations (below is a quote from a recent NOAA grant document):


So if a researcher has a good idea that could improve U.S. operational weather prediction, they are out of luck:  NOAA/NWS doesn't have the computer resources to help.  Just sad.

U.S. Weather Prediction Computer Resources Stagnate While the European Center Zooms Ahead

The NOAA/NWS computer resources available for operational weather prediction are limited to roughly 5 petaflops (pflops).   Until Hurricane Sandy (2012), National Weather Service management was content to possess one tenth of the computer resources of the European Center, but after the scandalous situation went public following that storm (including coverage on the NBC nightly news), NOAA/NWS management managed to get a major increment to the current level--which is just under what is available to the European Center.

Image courtesy of Rebecca Cosgrove, NCEP Central Operations

But the situation is actually much worse than it appears.   The NWS computer resources are split between operational and backup machines and depend on an inefficient collection of machines of differing architectures (Dell, IBM, and Cray).  There is an I/O (input/output) bottleneck on these machines (which means information can't get into and out of them efficiently), and storage capabilities are inadequate.

There is no real plan for seriously upgrading these machines, other than a 10-20% enhancement over the next few years.

In contrast, the European Center now has two machines with a total of roughly 10 pflops peak performance, with far more storage, and better communication channels into and out of the machines.

And keep in mind that ECMWF computers have far fewer responsibilities than the NCEP machines.  NCEP computers have to do EVERYTHING from global to local modeling, from hydrological prediction to seasonal time scales.  The ECMWF computers only have to deal with global model computing.

To make things even more lopsided, the European Center is now building a new computer center in Italy and they recently signed an agreement to purchase a new computer system FIVE TIMES as capable as their current one.
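The arithmetic behind the widening gap can be sketched in a few lines. The figures come from this post; the one assumption is that "five times as capable" refers to peak petaflops, which the announcement may or may not mean.

```python
# Peak-performance figures quoted in this post (petaflops).
nws_operational_pflops = 5.0    # "roughly 5 petaflops" for NOAA/NWS operations
ecmwf_current_pflops = 10.0     # "roughly 10 pflop peak" across two ECMWF machines

# ASSUMPTION: "FIVE TIMES as capable" is read as five times the peak petaflops.
ecmwf_upgrade_factor = 5.0

ecmwf_future_pflops = ecmwf_current_pflops * ecmwf_upgrade_factor
gap_today = ecmwf_current_pflops / nws_operational_pflops
gap_after_upgrade = ecmwf_future_pflops / nws_operational_pflops

print(f"ECMWF after upgrade: about {ecmwf_future_pflops:.0f} pflops")
print(f"Gap today: {gap_today:.0f}x; gap after the ECMWF upgrade: {gap_after_upgrade:.0f}x")
```

Read that way, the new European system would have about ten times the current NOAA/NWS operational capacity, consistent with the tenfold increase called for at the top of this post.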


They are going to leave NOAA/NWS weather prediction capabilities in the dust.  And it did not have to happen.

And I just learned today that the UK Met Office, number two in global weather prediction, just announced that it will spend 1.2 BILLION pounds (that's 1.6 billion dollars) on a new weather supercomputer system, which will leave both the European Center and the U.S. weather service behind.   U.S. weather prediction will drop back into the third tier.


Fixing the Problem

Past NOAA/NWS management bears substantial responsibility for this disaster, with Congress sharing some blame for not being attentive to this failure.  Congress has supplied substantial funding to NOAA/NWS in the past for model development, but such funding has not been used effectively.

Importantly, there IS bipartisan support in Congress to improve weather prediction, something that was obvious when I testified at a hearing for the House Environment Subcommittee last November.  They know there is a problem and want to help.

There is bipartisan support in Congress for better weather modeling

A major positive is that NOAA is now led by two individuals (Neil Jacobs and Tim Gallaudet), who understand the problem and want to fix it. And the President's Science Adviser, Kelvin Droegemeier,  is a weather modeler, who understands the problem. 

So what must be done now?

(1)  U.S. numerical prediction modeling must be reorganized, since it is clear that the legacy structure, which inefficiently spreads responsibility and support activities, does not work.  The proposal of NOAA administrator Neil Jacobs to build a new EPIC center to be the centerpiece of U.S. model development should be followed (see my blog on EPIC here).

(2) NOAA/NWS must develop a detailed strategic plan that not only makes the case for more computer resources, but demonstrates how such resources will improve weather prediction.  Amazingly, they have never done this.  In fact, NOAA/NWS does not even have a document describing in detail the computer resources they have now (I know, I asked a number of NOAA/NWS managers for it--they admitted to me it doesn't exist).

(3)  With such a plan, Congress should invest in the kind of computer resources that would enable U.S. weather prediction to become first rate.  Ten times the computer resources (costing about 100 million dollars) would bring us up to parity; 100 times would allow us to be state of the science (including such things as running global models at convection-permitting resolution, something I have been working on in my research).

Keep in mind that a new weather prediction computer system would be no more expensive than a single high-tech jet fighter.  Which do you think would provide more benefit to U.S. citizens?  And remember, excellent weather prediction is the first line of defense against severe weather that might be produced by global warming.

82 million dollars apiece

(4)  Future computer resources should be divided between high-demand operational forecasting, which requires dedicated large machines, and less time-sensitive research/development runs, which could make use of cloud computing.  Thus, future NOAA computer resources will be a hybrid.

(5)  Current operational numerical prediction in the National Weather Service is carried out at the NCEP Central Operations Center.  This center has not been effective, has unnecessarily slowed the transition to operations of important changes, and must be reorganized or replaced with a more facile, responsive entity.


U.S. citizens can enjoy far better weather forecasts, saving many lives and tens of billions of dollars per year.   But to do so will require that NOAA/NWS secure vastly increased computer resources, and reorganize weather model development and operations to take advantage of them.








25 comments:

  1. It is all very simple. For the typical American, computers employed doing weather prediction are not seen as money-makers, so they don't get any support.

    As for fighter jets, they are OK because fighter jets "blow things up".

    End of story.

  2. This should be left to the private sector. If there is no political will, then a profit motive is the next best thing. There certainly is NO political will as long as Trump and his posse of sycophants cling to the ideological bent that climate change is a hoax, a liberal conspiracy as well as a job killer. Hence money for ANY climatology research is anathema. If anything, expect more cuts. Not funding boosts.

    As far as bipartisanship...there is also bipartisan support for roads, bridges and other infrastructure needs. Yah, not much happening there either other than a wall on the southern border. There will be no surprise when the day comes that a "Go Fund Me" page goes up to pay for a new bridge on the interstate. Disappointing, but not surprising.

    It isn't just a lack of faith in federal government weather prediction, Cliff. It's a general lack of faith in the federal government, period. It's broken, dysfunctional and toxic. Programs that are designed for the common good are just pawns to get shuffled around in the political game. If a program or agency doesn't fit into Trump's agenda, then it's a funding source to be cut to pay for what is. Nothing will change until that big picture becomes much more positive. In the meantime, IBM seems to be doing a good job. Maybe they should be encouraged to take it up a notch. After all, they probably BUILD the hardware that NOAA needs.

  3. Chris, have you been following the DoD weather forecasting?

    I was at a DARPA ERI conference and a member of the DoD weather forecasting group was presenting. They run forecasts every 6 hours for all locations where the military is present. They had a 5-day accuracy, increased the amount of compute power by 1000x, and only increased accuracy to 6 days.

  4. Privatize everything. This is the way.

  5. One strike of lightning at Bellevue about 7:20 PM

  6. Yes, it's worthwhile. It's a big country; capacity to compute verification feedback of observation data is essential. No thumbs on scales, of course. Real, honest, agenda-free science - transparent, testable and untainted (integrity all the way).

    Breaks my heart to see some politicize something we all want and need. If there ever were a unifying cause, this sure looks like it.

    Constructive piece, Dr. M. Thanks for the update.

  7. Which would provide more benefit to U.S. citizens is a question that can be definitively answered only given a specific context. It's also a question for which the answer can only be derived via the exercise of the uniquely human capacity for reason. However, the unique faculty of reason possessed by humans is just a(n occasionally fortunate) byproduct of our highly evolved aptitude for navigating complex social interactions/networks. A related/relevant question that can easily be answered without caveat or reservation is: does the average American taxpayer find a French-made BullSequana XH2000 supercomputer or an American-made F-22 Raptor sexier and more compelling? The F-22 Raptor is a killing machine. The BullSequana XH2000 is a computing machine and, whether rational or not, the average U.S. taxpayer only views one of these as a safeguard at the threshold of life and death. Until John Q. Public believes that the weather is a greater threat to his and his family's immediate safety than foreign terrorists, we can all rest fitfully knowing that taxpayer dollars will be disproportionately prioritized toward thrust-vectoring, radar-absorbing paint and laser-guided munitions over computational efficiency, power and storage capability. Our ability to use reason to discern objective reality is a function of the degree to which our perception of objective truth is socially beneficial.

    Replies
    1. Point of information: the image above is an image of the JSF. The F-22 costs roughly 2.5x more per aircraft than the JSF.

  8. Taking a PNW focus, we also need more and better data to initialize the models. Things like another radar covering the Oregon coast, but also a better source of data from the North Pacific than our current, utterly decrepit buoy network. As they say, garbage in, garbage out. And right now we're poised to put any new computing resources on the data version of a beer-and-twinkies diet.

  9. It seems to me that in terms of the US budget, the solution would simply be to buy Cray's latest new machine constantly, since now they can only buy Cray machines thanks to the USA ONLY requirement for hardware.

    It's kind of disturbing when I compare all of NOAA and NCAR to our little 15 person company in terms of computation power. And, of course, since weather is a very, very parallel problem, a proper board full of multicore chips can provide a frightening amount of compute power if one can afford to redesign the mother board appropriately.

  10. Cliff, Have you considered getting this blog post to Bloomberg's people?

    He seems open to climate change, and has plenty of money. Perhaps our state senators could help you be seen and heard.

    You might consider coupling this issue with the Oregon coast radar need too.

  11. What are the chances of getting NOAA to cease and desist LYING about the global warming HOAX, manipulating and falsifying data like they have been doing for YEARS!!!

  12. Yeah Sharon, that's just what we need, an oligarch trying to buy his way to the presidency via his willing supplicants in the NGO's and in the MSM. What could go wrong?

  13. Private industry can do weather prediction, but no company would ever take on the responsibility for public weather warnings, due to the extreme liability. This is why it falls to government, both in this country and around the world.

    Since we need NWS to perform this life and property saving mission, let's give them the tools they need to do the job properly.

  14. Is there any proof of said lies?

    Also, is there a weather modeling/forecasting source you would rather utilize that conforms with your politics, or are you happy with just looking out the window before you set off for your day's activities?

    Something something about this is why we can't have nice things.....

  15. www.engadget.com/amp/2020/02/17/uk-weather-supercomputer-most-powerful-yet/

  16. "It isn't just a lack of faith in federal government weather prediction, Cliff. It's a general lack of faith in the federal government, period. It's broken, dysfunctional and toxic. Programs that are designed for the common good are just pawns to get shuffled around in the political game. If a program or agency doesn't fit into Trump's agenda, then it's a funding source to be cut to pay for what is. Nothing will change until that big picture becomes much more positive. In the meantime, IBM seems to be doing a good job. Maybe they should be encouraged to take it up a notch. After all, they probably BUILD the hardware that NOAA needs."

    I can shed a little light on this.

    Until recently, I was an infrastructure/Linux engineer for the NWS. (Grew weary of it and bailed to better things. And let me tell you, they are better indeed. The gummint paycheck isn't worth it, folks.) While weather is interesting to me in a certain academic and scientific sense, I don't care about it too much, it being a special case of a general problem in my field: how to make sure that "infrastructure needs" == "infrastructure actuality". The simple truth is this: the models and the science are largely irrelevant, since you cannot scale past your infrastructure. If you don't solve the infrastructure problem first, whatever dreams of "better models" you have are doomed to failure.

    You can attempt to solve this problem by throwing huge amounts of money at it. Mostly, all that does is make your vendor rich; it doesn't seem to save any money in either manpower or capital, blather about "core competencies" notwithstanding. I have observed this pattern enough for it to be a presumptive law, in my opinion. This doesn't work if you don't have the money.

    So, I pushed for an OpenStack-based private cloud platform using recycled AWIPS hardware as "seed" hardware, which is still in use in an NWS Region (though probably not for long). (con't)

    Replies
    1. The virtue of this approach (OpenStack, private cloud) is that once implemented, computing resources can be provisioned from a pool that can be added to with relative transparency, exactly the same as you would do with Amazon; then, for the amount of money you would spend for a Cray, or on m4.2xlarge instances, or whatever, you can toss a metric shitload more commodity compute at whatever problem you have. We cut down server provisioning times from "years" to "hours". Now, this doesn't scratch every computational itch, of course - interconnects are still important - but it's amazing what you can accomplish with a technically-suboptimal solution that is more ergonomic and simple than a theoretically-better-performing but less-friendly solution. (Think "Ethernet" vs. "Token Ring", or "TCP/IP" vs. "XNS".) Couple that with an institutional mandate to purge and refactor all of the shit-tier "grad-student" code that seems to be the underpinnings of nearly every scientific application in the Weather Service, and enforcing development standards based on best practices from this century and not the 18th, and the US might have some opportunity to not have to rent Europe's weather-prediction models.

      Since there is no institutional motivation for such a thing, though - the folks in Silver Spring would prefer to focus on "centralizing all the things" (in opposition to observed reality and human behavior since the year dot) and glitzy national models that divide the world into "the ACELA corridor" and "the rest of the planet, that we don't care about" (while still underperforming; FV3, quod vide); the vendors would like to continue milking this cash cow and delivering as little as possible until the contracts run out; the managers want to climb the ladder to maxx their High-3; and, the employees mostly want to just get to that Magical Government Pension - this is not going to happen. Which is kind of too bad, because I know that there are good people working there that would *like* to make something awesome, like we used to, back when the government actually made interesting things that worked instead of gargantuan world-melting clusterfucks. So, c'est la vie. Plus ca change, plus ca doesn't. Sad. Many such cases.

    2. Everyone has an agenda, when one has power one uses that power to further their agenda. This president and every other has done the same.

  17. You need to update your price sheets for jet fighters Cliff. The new B-2 Spirit fighters cost $737 million per each, that we know of. Other costs are probable but hidden as development or some such.

  18. Today, NOAA announced: "U.S. to triple operational weather and climate supercomputing capacity"

    https://www.noaa.gov/media-release/us-to-triple-operational-weather-and-climate-supercomputing-capacity

  19. Perhaps because voters and representatives know that if they did approve funding it would likely be used to further a hysterical climate change agenda instead of solid improvements in weather prediction.

  20. This comment didn't age well...even a few days!

    "There is no real plan for seriously upgrading these machines, other than a 10-20% enhancement over the next few years."

    How's 12 petaflops per system?
