April 07, 2020

Flying Blind on Coronavirus: Why Random Testing is so Important.

The coronavirus crisis is one of the most disruptive events to hit the U.S. in a very long time.

Major sectors of the economy are being shuttered, people's lives are being altered in the most profound ways, and the nation is facing extreme stress, the implications of which we do not understand.

Extraordinarily serious decisions are being made without key information:  how many individuals have active infections?  How many have had the virus and now have immunity?  What percentage of infected individuals have few or no symptoms?  Who is currently infected and needs to be quarantined?  Is the current reduction in cases in Washington and elsewhere mainly the result of social distancing or the herd immunity of an increasing number of individuals that have had the virus?

For all these questions, we do not know the answer. 

Our best medical scientists and epidemiologists, including a highly respected group at the University of Washington, are making projections of the future progression of the pandemic, but without sufficient information for initializing their models, resulting in a rapid decline of accuracy with time and large uncertainty in the projections.

We are flying blind.

Lack of information undermines our ability to manage the crisis.  And it doesn't have to be this way.

The bottom line of this blog is that we must immediately begin random sampling of the population using both PCR testing, which tells us about active infections, and antibody tests, which tell us who has been infected in the past.

Random sampling of populations is an essential tool for the social, biological, and physical sciences. 

Political pollsters randomly sample potential voters to predict the outcomes of elections.   They don't provide election projections by counting the number of avowed Democrats or Republicans who come knocking on their door.   In a variety of fields, random sampling of populations is a key tool for decision making.  But in the coronavirus situation, we are content not to use this powerful tool, even when we make the gravest decisions.  It doesn't make sense.

Here in Washington State, virtually all the testing is being used to determine whether individuals suffering from respiratory ailments have active coronavirus infections.  There is, of course, good reason to test sick individuals:  their treatment plan can be enhanced with such knowledge and their caregivers need to know whether they require protection (PPE, personal protective equipment).  We know how many folks are getting tested (and not everyone who is sick is getting tested) and the percentage of those tests that are positive for active COVID infections.  That is not enough.

The current testing regime leaves decision makers poorly informed.   How many individuals currently have active infections, with or without symptoms?  How many people have had the virus and are now potentially immune?  We don't know.  And without such information, it is nearly impossible to project the future.

Some of the best information we have on infections locally comes from the testing done by the UW Virology Department.  As of yesterday, they had tested about 51,000 individuals, with roughly 10% testing positive.  Surely, the percentage of the total population that is currently infected with COVID-19 must be less than this percentage derived from sick folks (who can be ill for a number of reasons)--but how much less?  Testing ramped up in March and has gone up and down since.

Below is a figure that has never been shown in the media:  the ratio of positive to negative results from the UW Virology testing.  The ratio increased substantially between mid-March and the end of the month (from roughly 6% to 17%), suggesting that a higher proportion of tested sick people were suffering from COVID-19.  Importantly, that number has been declining rapidly in April, suggesting major progress.

The Washington State Department of Health provides statewide totals of confirmed cases (see below), which have increased substantially since early March.   But how much of the increase in the number of confirmed infections is the result of vastly increased testing during the past several weeks (as shown by the second figure) and how much by increased presence of COVID-19 in the population?  We simply don't have that information.

Epidemiological projection, like numerical weather prediction, is an initial value problem.  You start with an estimate of the initial state and your model, which contains information about the processes of the phenomenon in question, attempts to project the initial state into the future.  If your initial state is uncertain, so is your forecast.
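To make the initial value problem concrete, here is a minimal sketch using a textbook SIR model stepped forward one day at a time.  This is not the IHME model (theirs is statistical), and every number in it is an assumption chosen only for illustration: the population of 7.6 million, the transmission rate `beta`, the recovery rate `gamma`, and the two guesses of 1,000 and 4,000 initial infections.

```python
def project_infections(initial_infected, population=7_600_000,
                       beta=0.25, gamma=0.1, days=30):
    """Project active infections with a basic SIR model (forward Euler, 1-day steps).

    All parameter values are illustrative assumptions, not fitted to any data.
    """
    susceptible = population - initial_infected
    infected = float(initial_infected)
    for _ in range(days):
        new_infections = beta * susceptible * infected / population
        new_recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - new_recoveries
    return infected


# Two equally plausible guesses at the unknown initial state.
low = project_infections(1_000)
high = project_infections(4_000)
print(f"30-day projection from 1,000 initial infections: {low:,.0f}")
print(f"30-day projection from 4,000 initial infections: {high:,.0f}")
print(f"Ratio of the two projections: {high / low:.2f}")
```

In this sketch the model and its parameters are identical in both runs; only the starting number of infections differs.  The gap between the two starting guesses carries through to the 30-day projection, which is why pinning down the initial state by sampling matters so much.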

There is an intrepid group at the University of Washington (IHME) attempting to do such projections using a statistical model, with COVID-19 forecasts for both individual states and the nation. Decision makers are making heavy use of their estimates.   Here is a sample of their latest forecasts for Washington State deaths from the virus. The observed COVID deaths are shown by the solid red line.  The dashed line shows their best estimate of the future and the shading indicates the uncertainty in the projections.  HUGE uncertainties, in part because of the uncertainties in our knowledge of  past and current active infections and the  "herd immunity" of folks that have recovered from the virus.

But poor initial data (lack of knowledge of infections in the total population) has resulted in their projections changing substantially over time, undermining their value to decision makers.

Below are their projections starting on March 26, April 1, and April 5.  Huge differences are apparent. Their projection made on March 26th (when they started this valuable service) is shown below. The light gray line is their best projection (actually the median of their distribution) and the uncertainty is noted by light gray shading.  This projection shows a peak in late April of around 27 deaths a day, and high numbers continuing into May.

The April 5th projection (solid red line, again the median of their forecast distribution) is much more benign, showing the epidemic collapsing by the end of April with far fewer than half the deaths, and the peak being in early April.  According to their latest projections, we are now probably past the peak, something consistent with the drop in hospitalizations in our state due to the virus (see below).  I suspect that the next projection will be even lower.

Washington State is rapidly getting out of this terrible situation due to the combination of social distancing and the building immunity of the population.  But we are pretty much ignorant about the magnitude of the herd immunity because we do not know how many have had and currently have the virus.

Some researchers are trying to indirectly secure an idea of what has happened using a combination of genetic testing of viral samples coupled with epidemiological modeling.  A recent paper by Bedford et al (2020, not yet peer reviewed) suggests that the virus reached Washington State during mid to late January and then spread throughout the local population, asymptomatic for many. 

Since random sampling was not available, they used mathematical models to estimate spread (see below) using a large number of simulations informed by their genetics testing.  The red line is the median of these simulations (you might consider that to be their best estimate).  They suggest that by mid-March roughly 2000 Washington citizens were infected as the infection increased exponentially in the population.  Without any influence of social distancing, their work suggests the potential for roughly 10,000 cases by April 1 (simply by extrapolating the exponential rise of the red line).
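As a rough check on what that extrapolation implies, the sketch below computes the doubling time needed to go from about 2,000 to about 10,000 cases.  It assumes "mid-March" means March 15, so 17 days elapse before April 1, and it assumes pure exponential growth; the function name is mine, not something from the paper.

```python
import math


def doubling_time_days(cases_start, cases_end, days_elapsed):
    """Doubling time implied by pure exponential growth between two case counts."""
    growth_rate_per_day = math.log(cases_end / cases_start) / days_elapsed
    return math.log(2) / growth_rate_per_day


# Assumption: mid-March is taken as March 15, i.e. 17 days before April 1.
implied = doubling_time_days(2_000, 10_000, 17)
print(f"Implied doubling time: {implied:.1f} days")
```

Under those assumptions the implied doubling time is a little over a week.  Moving the starting date a few days either way changes the answer noticeably, which is one more illustration of how sensitive these estimates are to what we assume about the initial state.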

Unfortunately, we lack the information to know what has happened here in Washington.   But it is clear that this virus has been spreading among us for at least two months, with many individuals unaware of being infected.  There is substantial evidence for this asymptomatic spread, such as the example of the Diamond Princess cruise ship, a vessel in which the disease ran rampant and nearly everyone was tested.  Roughly 50% of those infected, in a population heavily skewed towards older folks, had no symptoms.  Could it be even higher in a younger, healthier population?  We don't know.

The wake-up call for our State came when the spreading virus randomly hit a nursing home full of elderly, sick individuals in Kirkland, producing dozens of deaths and serving as a focus for spread into the community.  What would have happened without that alarming situation?

We need random sampling of the population.

Not knowing what is happening in the general population is crippling society's ability to deal with the virus in an informed, smart way.  The capability to test is increasing now and our State and nation must give random testing of the population very high priority immediately, devoting a significant percentage of the rapidly increasing testing capability to random sampling.
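How much testing capacity would random sampling actually need?  Here is a minimal sketch using the standard normal-approximation formula for estimating a proportion from a simple random sample.  The 2% prevalence guess and the half-point margin of error are assumptions picked for illustration, not estimates for Washington, and the approximation gets rough when prevalence is very low or when people decline to be tested.

```python
import math
from statistics import NormalDist


def required_sample_size(expected_prevalence, margin_of_error, confidence=0.95):
    """People to test so the prevalence estimate is within +/- margin_of_error.

    Uses the normal approximation for a proportion from a simple random sample.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    variance = expected_prevalence * (1 - expected_prevalence)
    return math.ceil(z ** 2 * variance / margin_of_error ** 2)


def margin_of_error(expected_prevalence, sample_size, confidence=0.95):
    """Half-width of the confidence interval for a given sample size."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    variance = expected_prevalence * (1 - expected_prevalence)
    return z * math.sqrt(variance / sample_size)


# Assumed values: 2% true prevalence, estimate wanted within +/- 0.5 points.
needed = required_sample_size(0.02, 0.005)
print(f"Random tests needed: {needed:,}")
print(f"Margin of error with 1,000 tests: {margin_of_error(0.02, 1_000):.4f}")
```

Under these assumptions the required sample runs to a few thousand tests, a small fraction of the roughly 51,000 tests the UW Virology lab had already run.  The lower the true prevalence, the more tests are needed to pin it down to the same relative accuracy.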

Such sampling offers enormous benefits.  First, it will greatly improve the quality of our projections of the disease.  Second, it will tell us why the situation is improving (social distancing versus herd immunity) and guide governmental actions.  Third, it will enable us to squash the disease spread by quarantining every individual who tests positive for an active infection and tracing (and quarantining) his/her contacts.  South Korea has shown the huge value of the widespread testing/quarantining approach.

It is the only way we can stay on top of the virus when we inevitably have to loosen the restrictions on our population, which we will have to do to prevent economic collapse and social disorder.

Here in Seattle, we have some of the most sophisticated mathematical modelers, statistical analysts, medical experts, and private sector analytic experts in the world, yet we are content to cripple our economy and deal with such an historic event in nearly complete ignorance of the situation. We, of all places in the world,  can do much better.    It is time for random sampling of our population for COVID-19 looking for both active and past infections.

When the story of this event is written, the inability to rapidly initiate and actively use large scale random testing back in February will be seen as a terrible failure of the Centers for Disease Control,  state health organizations, and our Federal/state leadership that did not demand it.  And this failure was compounded by other issues, such as not encouraging all Americans to wear masks early in the event and not moving immediately to secure necessary PPE and other supplies.

But we are where we are and it is time now to move from a reactive to a more reasoned approach, with random testing of the population being a measure that needs immediate priority.


  1. A big question is how many people need to be tested to come up with significance. If the percent of people who had the disease is still very low, the number to be tested could be very high. If they tested all health workers and grocery workers, who have a higher possibility of contracting the virus, and then assessed how much less exposure the general population has, that could be useful information.

    Health workers with this degree of exposure have contracted the disease at x%
    Grocery workers with their degree of exposure have contracted the disease at y%
    Perhaps there is a 3rd or fourth group who could also provide independent numbers

    Thus people with alpha to delta amount of exposure likely are infected at certain rates

  2. It would be great if they'd expand SCAN(scanpublichealth.org) to the whole state and add an antibody test.

  3. Lovely analysis. We need universal testing or at least random sampling of the population with an antibody test, not just a virus test. It would be very useful if we found that a substantial number of people had recovered from the disease and herd immunity was further along than we thought.

  4. Cliff, a statistical clarification: the April 5th projection is not "radically more benign." It is more precise. The uncertainty intervals for all three projections almost entirely overlap. The projections from April 5th are completely consistent with the projections of March 26th. The light grey line is not their "best projection" it is simply the median or mean of their posterior. That line is no more likely than a line drawn at the 10th or 90th percentile. Your point that more data would likely lead to a better model is valid. That's one of the reasons why the precision of the model increased between March 26th and April 5th: they had more data on April 5th. Another reason is simply that predictions closer in time will be more accurate than predictions further in time, so the prediction for April 10th made on March 26th will always be less accurate than the prediction made on April 5th.

    Source: I am a professional statistician.

  5. Cliff
    Those of us who follow your blog know that you advocated for this same testing over 3 weeks ago and still nothing has been done. You are a smart and reasoned professor and it is too bad more of our thought leaders don't take your advice.

  6. Thank you! Thank you! Thank you for explaining this. I am not an expert, but I get it. Unfortunately, a lot of people don't.

  7. I am on board with the conceptual notion of random sampling and the benefits thereof. I guess the key question is how do you operationalize the execution of random sampling? Let's say you want to test 1,000 people. How do you determine who those 1,000 people are? State health authorities could walk through neighborhoods and knock on every nth door, but they cannot compel people to submit to a test. If a high percentage submitted to the test, then it's probably close enough to random to be valid, but if not, then the opt-in nature of the testing may be a biased result. Alternatively, if you put out an ask for volunteers to be tested, then your sample is no longer random, because people exhibiting symptoms might be more likely to volunteer.

  8. Although I'm a 'soft science' baby (I did have to take stats for my MA) I have wondered this very thing for some time. It is the only way to get a good idea of prevalence. So many things at play here minimizing this possibility. In Oregon, as far as I can tell only people admitted to the hospital are being tested! I know several folks who were ill and turned down for testing. How can we get at accuracy with this kind of thing happening!

  9. If only we had leaders. Instead we have a group in office who only care to follow. As soon as Cali or NY decide to rein in the apocalypse I'm sure Jay and his group will follow quickly behind. And of course claim a victory.

    If we're lucky.. before the middle class is cut from the economy. some brave group of enlightened people will storm the capitol and throw the numbskulls out .. not likely, but can dream.

    Welcome everyone to Jay's socialist state of Washington. Where we're all unemployed.. unless minimum wage, or able to work from a screen at home.

    Very sad days ahead.. we've elected people who made the bold, brave decision to copy China. Of all the systems to copy.. China. Inspired work guys. Should be proud of yourselves.

    Peace and health to us all. Freedom not available anymore.. had to sacrifice that for the greater good (flatten the curve). Once the gov takes something, very rare for them to return. Was it worth it?

    I say no

    Submitted for your review by a grumpy small business owner. Good luck everyone!

  10. 15 minute tests at the airport will keep air travel safe and prevent additional cases from arriving. The Taiwan model needs to be adopted. Please enforce mandatory tests to board a plane and for all new arrivals. The airports are the reason this has spread so far and wide.

  11. Cliff - I am an occupational medicine physician and have this to contribute. I have been processing "return to work" cases for people who are recovering from or have recently recovered from various coughs and colds including confirmed COVID-19. The lack of availability of testing is just incredible. I am "flying blind" in terms of providing reassurance to the companies that we serve that their workers are free of COVID-19.

    I think we all have a very false sense of security in thinking that we're safe because we're only going out when we need essentials like groceries. But when you think about the fact that hundreds of "asymptomatic" shoppers are funneling through a few dozen "asymptomatic" cashiers in a given day, the potential for spreading is absolutely terrifying.

  12. How do we take action and get this antibody, random testing done?! What can we do? It needs to happen fast and bad! We need our economy up and running very badly!

  13. I think it is highly unlikely that herd immunity is having any measurable effect at this point. Here is why:
    * True herd immunity, where a disease may pop up but not spread, requires something like 70% - 90% of the population to be immune.
    * I would estimate that we need >10% immunity before we start to see a measurable decline in infection rate
    * Currently, 0.2% of King County has been diagnosed positive because they felt sick enough to get tested.
    * So if 10% of King County is immune, there are / were nearly 50 positive-but-untested people for every positive test.
    * If that were truly how the virus is spreading (vast majority not knowing they have it), then contact tracing would be completely useless: everyone who is sick would have a high likelihood of having contracted it from someone who wasn't sick. But we see Korea and Taiwan (and maybe China, if we believe their data) successfully using contact tracing to isolate cases and slow spread.
    * I've read that experts estimate maybe 25% of cases are asymptomatic (not 5000%), so we are still a few orders of magnitude away from any herd immunity.

    1. hi Ted, consider that only 2.5% of all PCR+ cases have been connected to another known case. In other words, almost everybody that we have found to be sick (at the hospital) with COVID-19 got it from someone that we don't know about, that is, is NOT being accounted for in our statistics and projections. So, we are likely very close to herd immunity. Also, 25% as ASYMPTOMATIC is a very high rate, and if you pile on the rest of the people who ARE symptomatic but not sick enough to go to the doctor/hospital, then we are almost certainly at herd immunity levels.

    2. This is great analysis. Cliff Mass Weather Blog, you should clean this up and submit it to local news stations, The National Review, and other points of distribution that haven't adopted social distancing as their new religion. We need a plan to do random sampling with PCR and antibody YESTERDAY.

