COVID-19 Risk and Incidence

Recent case incidence is useful for estimating risk.

In these days and months of COVID-19, we wonder about risk. When I go out in public, what is my risk? The answer in part depends on prevalence: How many people in my area are infectious?

In the United States, we don’t have an adequate testing program, and we don’t know how high the prevalence is. What we do know is case incidence: the number of cases found today and in recent days. These counts are often listed in news sources or displayed like this:


The graph is a bit cluttered, and it’s not obvious what to do with the information. The daily counts fluctuate with day-of-the-week reporting and other factors that carry little information. Most of these graphs include a moving average of the daily counts, drawn as a curve, to smooth out these fluctuations.
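The smoothing is easy to sketch. The example below computes a trailing moving average. The 7-day window is my assumption (it covers one full reporting week), and the counts are made up for illustration.

```python
def moving_average(daily_counts, window=7):
    """Trailing moving average; days before a full window are None."""
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    running_total = 0
    for i, count in enumerate(daily_counts):
        running_total += count
        if i >= window:
            # Drop the day that just fell out of the window.
            running_total -= daily_counts[i - window]
        averages.append(running_total / window if i >= window - 1 else None)
    return averages


# Made-up counts with a weekend reporting dip, for illustration only.
counts = [120, 130, 125, 140, 135, 60, 55, 125, 135, 130, 145, 140, 65, 60]
smoothed = moving_average(counts)
```

With a window equal to one week, every average contains exactly one of each weekday, so the weekend dip no longer shows up in the curve.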

The case counts go up and down, but what does that tell us about current risk? The experts at Harvard Global Health Institute have developed a framework that adds context. It categorizes COVID-19 risk level as Green, Yellow, Orange, or Red based on the daily case count per 100,000 people.
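As a rough sketch, the categorization is a per-capita rate plus a lookup. The cutoffs used here (1, 10, and 25 daily cases per 100,000) are my recollection of the published framework, not something stated in this post, so check them against the source before relying on them. The function names are hypothetical.

```python
def daily_cases_per_100k(average_daily_cases, population):
    """Convert a (smoothed) daily case count to a rate per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return average_daily_cases * 100_000 / population


def risk_level(cases_per_100k):
    """Map a daily rate to a color. Cutoffs are assumed, see the note above."""
    if cases_per_100k < 1:
        return "Green"
    if cases_per_100k < 10:
        return "Yellow"
    if cases_per_100k < 25:
        return "Orange"
    return "Red"


# Example: 50 smoothed daily cases in an area of 500,000 people.
rate = daily_cases_per_100k(50, 500_000)
level = risk_level(rate)
```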

For our use, it is valuable to combine the smoothed daily case count with this risk scale. COVID Action Network provides graphs that look like this:

CAN example

The stylish color change in the curve would delight data visualizers. Unfortunately it is likely to take too much cognitive effort for other viewers. Something simple and bold is probably better.


Alternative to captions

According to The Economist, India is held back economically because it has a weak middle class.

The article is 3000 words and has three graphs. How to connect the parts of the text to the appropriate graphs? Numbered captions are one standard technique, but keeping the figure together with a legible caption can be a challenge.

Running the figures inline and referring to them as the next figure or the figure below can work but has issues for multiple columns and page breaks.

A convenient solution is to place a figure number in the chart and use text like (see chart 2).

This works especially well with charts that are largely self documenting.

Compare to expectation

Spotify dominates streaming music, which has been a good thing for the music business.

This domination is even stronger than we might expect. The sizes of the top-ranked members of a collection, for example the largest cities, are often related by a power law in their rank. This observation is called Zipf’s Law.

The number of Spotify subscribers is substantially larger than Zipf predicts.
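One way to make the comparison concrete: under Zipf's Law the size at rank r is proportional to 1 / r^s, with s close to 1. The sketch below predicts the leader's size from the runner-up. The exponent of 1, the choice of the runner-up as anchor, and the subscriber numbers are all assumptions for illustration, not data from the chart.

```python
def zipf_prediction(anchor_size, anchor_rank, rank, exponent=1.0):
    """Size Zipf's Law predicts at `rank`, given a known size at `anchor_rank`."""
    if anchor_rank < 1 or rank < 1:
        raise ValueError("ranks start at 1")
    return anchor_size * anchor_rank ** exponent / rank ** exponent


# Made-up subscriber counts in millions, for illustration only.
leader_actual = 90.0
runner_up_actual = 30.0

leader_expected = zipf_prediction(runner_up_actual, anchor_rank=2, rank=1)
excess_ratio = leader_actual / leader_expected
```

An excess ratio well above 1 means the leader is larger than the rest of the ranking would suggest.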



Stacked area chart

Bitcoin is not the only digital currency. An article in The Economist discusses the most popular crypto-currencies. There are now approximately 1492!

The growing number of these currencies has reduced the market share of bitcoin.

As is usual with area charts, it is easy to follow the base area and the top area, but more challenging to understand the others. Bitcoin is dropping rapidly and is now well under 40%. The basket of other currencies is increasing and now amounts to approximately 30%.

The data source provides an alternate presentation, an overlapping area chart.

With this display, it is easier to follow the individual coins, but the chart includes too many of them. More important, almost all area charts are the stacked variety, so this one might be misinterpreted. It would probably be better to use lines.

It can be a bit easier to see which coin is most eating away at Bitcoin by reversing the stacking order.
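To see what the stacking order changes, it helps to compute where each layer starts and ends. This is a small sketch for a single date, with made-up shares and placeholder coin names.

```python
def stack_bounds(shares, order):
    """Return {name: (bottom, top)} when layers are stacked in the given order."""
    bounds = {}
    bottom = 0.0
    for name in order:
        top = bottom + shares[name]
        bounds[name] = (bottom, top)
        bottom = top
    return bounds


# Made-up market shares in percent for one date, for illustration only.
shares = {"Bitcoin": 40.0, "Coin A": 20.0, "Coin B": 10.0, "Others": 30.0}

original = stack_bounds(shares, ["Bitcoin", "Coin A", "Coin B", "Others"])
reversed_order = stack_bounds(shares, ["Others", "Coin B", "Coin A", "Bitcoin"])
```

Only the bottom layer sits on a flat baseline and only the top layer ends at a flat 100% line. Every layer in between rides on whatever lies beneath it, so changing the order changes which series can be read directly.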



Use relevant evidence

In October 2017, the Council of Economic Advisers, an agency within the office of the President, released a short report on the effect of a large proposed corporate tax cut on wage growth. The only graphical evidence included is this graph:

The presentation is clear, but a graph like this raises the obvious questions: Are these countries similar to the United States? What is significant about these four years?

Presumably the CEA has access to more clearly relevant data.

An analysis by the Economic Policy Institute comes to a very different conclusion. It includes this graph:

The evidence is less compelling. You could argue that the 1986 corporate tax cut led to a modest increase in compensation, perhaps through productivity growth. Or perhaps the tax cut interrupted the decades-long decline of compensation growth. In any case the United States has not experienced compensation growth of 2% or more for at least 40 years.

Stacked bar chart

A recent article in The Economist discusses the spread and possible future struggle of Amazon and Alibaba. Many countries offer room for growth.

This is an excellent example of a stacked bar chart. It avoids common problems.

The bars sum to 100%, so their heights are uniform and comparison is easy. With only 3 categories, there are not too many labels or tiny bars.
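Getting bars that sum to 100% just means converting each country's counts to shares before plotting. A minimal sketch, with hypothetical category names and made-up counts:

```python
def to_percent_shares(counts):
    """Convert raw category counts to percentages that sum to 100."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("total must be positive")
    return {name: 100 * value / total for name, value in counts.items()}


# Made-up population counts in millions for one country, illustration only.
country = {"not online": 100, "online, not shopping": 100, "shopping online": 200}
percent = to_percent_shares(country)
```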

The messages are clear. In India and China, many people are not yet online. In every country except India, a strong majority of people online are shopping there.

Log scale

Bitcoin is a major phenomenon of our time. There is nothing like very rapid growth to generate excitement.

As is often the case, this linear scale plot obscures rate of change and the small early values. It is worth also looking at a log scale plot.
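The reason a log scale shows rate of change is that equal ratios become equal distances. A quick sketch with a made-up series that doubles every period:

```python
import math


def steps(values):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


# Made-up series that doubles each period, for illustration only.
prices = [1, 2, 4, 8, 16, 32]

linear_steps = steps(prices)                             # keeps growing
log_steps = steps([math.log10(p) for p in prices])       # stays constant
```

On the linear scale the early steps are too small to see. On the log scale each doubling moves the curve up by the same amount, about 0.3, so steady growth is a straight line and a change in slope is a change in growth rate.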

The early adopters had the best earning potential.

Some people talk about the promise of bitcoin as the new money. Of course this price history shows that it would have been amazingly deflationary. Hardly a good thing.

The price of bitcoin is financial, but its cost is environmental. “Mining” bitcoin takes a huge amount of electricity, and the major mining countries rely heavily on coal-fired power plants.

The scale of this electricity consumption can be compared to countries.

The choice of a bar chart is appropriate, as is ordering the countries so that the longest bar is at the bottom, nearest to the scale. The small drop shadows, however, only add blur.

Meaningless bar chart

In October 2017, the Council of Economic Advisers, an agency within the office of the President, released a report on the effect of a proposed corporate tax cut on wage growth. It was not widely praised. The report predicts that:

“Reducing the statutory federal corporate tax rate from 35 to 20 percent would, the analysis below suggests, increase average household income in the United States by, very conservatively, $4,000 annually.”


“Using 2016 household income as the baseline, these effects translate into an increase in average household income from $83,143 in 2016 to between $87,520 and $92,222, an increase of $4,000 to $9,000 in wage and salary income alone. (See Figure 2.)”

The bar chart in Figure 2 clearly and compellingly shows that 9000 is more than twice as big as 4000. The competition is strong, but this is clearly a candidate for dumbest graph of 2017.

One of the basic guidelines in graph design is that the quantity axis of a bar chart should include zero. In this case, that means plotting income and changes. If a bar chart is needed, it should look more like this.

The predicted increases are 5 and 11%. The authors do not state when the income increase will be fully achieved, but the report suggests 7 years out. This would correspond to annual growth of 0.6 to 1.3%, similar to the 1.1% average rate that occurred from 2008 to 2016.
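The arithmetic can be checked with a few lines. This sketch assumes the increase compounds at a constant annual rate; the number of compounding years is a parameter because the report does not pin it down.

```python
def percent_increase(baseline, increase):
    """Increase as a percentage of the baseline."""
    return 100 * increase / baseline


def annualized_growth_percent(total_percent, years):
    """Constant annual rate that compounds to total_percent over `years`."""
    if years <= 0:
        raise ValueError("years must be positive")
    return 100 * ((1 + total_percent / 100) ** (1 / years) - 1)


BASELINE = 83_143  # 2016 average household income quoted in the report

low_total = percent_increase(BASELINE, 4_000)
high_total = percent_increase(BASELINE, 9_000)

for years in (7, 8):
    low_rate = annualized_growth_percent(low_total, years)
    high_rate = annualized_growth_percent(high_total, years)
    print(f"{years} years: {low_rate:.1f}% to {high_rate:.1f}% per year")
```

The result is sensitive to how the horizon is counted: compounding over 8 periods gives about 0.6 to 1.3% a year, while exactly 7 periods gives about 0.7 to 1.5%. Either way the rates are of the same order as recent history.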

Perhaps the authors are asserting that the income growth rates would add so that the total would be about 2%. That is a growth rate not seen since 1974, when the corporate tax rate was much higher.

Who are these people? Is that the best they can do for justification? One hint is that the Chair is Kevin Hassett, perhaps best known as the co-author of Dow 36,000 in 1999.

Unnecessary combined chart

The National Institutes of Health tracks the number of deaths from opioids. The statistics are discouraging.

It is relatively common to use a combined bar and line graph when there are two vertical axes with differing scales. This example is more puzzling. A stacked bar graph or a line graph with 3 lines would convey the data in a more standard way.

It is not clear that the sex curves are even needed. There appears to be a consistent 60-40 split that could just be mentioned in the text.

The title and subtitle could be rewritten to point out that there has been a 2.8x increase from 2002 to 2015.


The Nikkei 225 recently reached a 21-year high according to The Economist in the October 12, 2017 issue.

The graph is a bit misleading. Because the vertical axis starts at 5000 rather than 0, the index may appear to have tripled since its low rather than doubled.
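The distortion is simple to quantify: the eye compares heights above the bottom of the axis, not above zero. The sketch below uses round hypothetical values, not the actual index levels, chosen so that the true change is a doubling.

```python
def apparent_ratio(low, high, axis_start=0.0):
    """Ratio of plotted heights when the vertical axis starts at axis_start."""
    if low <= axis_start:
        raise ValueError("low must be above the start of the axis")
    return (high - axis_start) / (low - axis_start)


# Hypothetical round numbers for illustration, not real Nikkei values.
low, high = 10_000, 20_000

true_ratio = apparent_ratio(low, high)                    # axis starts at zero
truncated = apparent_ratio(low, high, axis_start=5_000)   # axis starts at 5000
```

With these numbers the plotted heights are 5,000 and 15,000, so a doubling looks like a tripling.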

It is certainly okay to start an axis at a value other than zero for a line graph. But in this case, there is no downside in starting at zero. And our initial impression would be more accurate.

There is another concern: What is special about 1996? Was it a high point?


Including the high point in 1989 would have made for a more complete, if longer, story.