The U.S. is a small fraction of the global area …

In response to Steve McIntyre's discovery of errors in the temperature data provided by the GISS, Gavin Schmidt has stated that the errors and the resulting corrections are not material as:

There were some very minor knock on effects in earlier years due to the GISTEMP adjustments for rural vs. urban trends. In the global or hemispheric mean, the differences were imperceptible (since the US is only a small fraction of the global area).

The emphasis is mine.

On the other hand, Hansen et al. (2001) state:

Although the contiguous U.S. represents only about 2% of the world area, it is important that the analyzed temperature change there be quantitatively accurate for several reasons. Analyses of climate change with global climate models are beginning to try to simulate the patterns of climate change, including the cooling in the southeastern U.S. [Hansen et al., 2000]. Also, perceptions of the reality and significance of greenhouse warming by the public and public officials are influenced by reports of climate change within the United States.

As before, the emphasis is mine.

I am not going to delve into the methodological problems with calculating a global average temperature while ignoring missing data. (Most stations in the GHCN data set, which the GISS uses, do not have continuous data for every month; for example, see Bourges, France.)

Table 1 below shows the top ten countries ordered by the WMO station locations they contribute to the GHCN data set. The figures below lump together multiple stations contributing to the same location (as identified by the WMO station number).

Table 1: Top ten countries by the number of stations in the GHCN

| Country | Number of WMO station IDs in the GHCN | Percent of total |
| --- | --- | --- |
| United States of America | 481 | 10.7% |
| Australia | 463 | 10.3% |
| China | 380 | 8.5% |
| Canada | 232 | 5.2% |
| Japan | 165 | 3.7% |
| Russian Federation (Asian Sector) | 151 | 3.4% |
| Turkey | 87 | 1.9% |
| Italy | 83 | 1.9% |
| Germany | 73 | 1.6% |
| Argentina | 72 | 1.6% |
| Total: | 4,495 | 48.8% |

That is, more than one-third of the WMO stations in the GHCN are located in five countries, and about half of them are located in ten countries (see GISS Station Locations). Note that in the GHCN data set there are sometimes multiple series corresponding to a single WMO station identifier; I simply lumped those together.
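The shares quoted above can be rechecked from the counts in Table 1. The following is only a minimal sketch: the variable and function names are mine, and I am assuming that the 4,495 in the "Total" row is the number of WMO station IDs in the whole GHCN rather than the sum over the ten countries listed.

```python
# Station counts copied from Table 1 (WMO station IDs per country).
# ASSUMPTION: 4,495 is the number of WMO station IDs in the whole GHCN.
TOTAL_STATION_IDS = 4495

top_ten = [
    ("United States of America", 481),
    ("Australia", 463),
    ("China", 380),
    ("Canada", 232),
    ("Japan", 165),
    ("Russian Federation (Asian Sector)", 151),
    ("Turkey", 87),
    ("Italy", 83),
    ("Germany", 73),
    ("Argentina", 72),
]


def cumulative_share(counts, n, total=TOTAL_STATION_IDS):
    """Percent of all station IDs held by the first n countries in the list."""
    return 100.0 * sum(count for _, count in counts[:n]) / total


print(f"Top 5 countries:  {cumulative_share(top_ten, 5):.1f}% of station IDs")
print(f"Top 10 countries: {cumulative_share(top_ten, 10):.1f}% of station IDs")
```

The ten-country figure computed this way can differ by about a tenth of a percentage point from the 48.8% in the table, which appears to be the sum of the already rounded per-country percentages.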

Well, what does this table show?

Surely, the U.S. is a small fraction of the global area. However, it contributes the most data to the GHCN data set (both in terms of number of locations and continuity of series).

As explained in GISS analysis of surface temperature change, the data from individual stations are then mapped to a grid with 2°x2° resolution, and stations located within 1200 km of the grid point are employed (p. 5).

Some simple implications follow. In geographical areas where there are a lot of records, all those records get aggregated into a single grid temperature anomaly series. On the other hand, there are places around the world where only a handful of locations contribute the data for large areas. In addition, as noted by Hansen et al. in the same article, the number of stations has been declining since the late 1990s.

To sum it all up, data from a single station in a sparsely covered area, such as Tabuk, Saudi Arabia, will carry much more weight in the calculations than data from any one station in an area with many more stations.
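To make the point concrete, here is a rough sketch of distance-weighted gridding. It is not the GISS code. The 1200 km radius comes from the passage quoted above; the linear taper of the weight from 1 at the grid point to 0 at 1200 km is my assumption about the weighting, and all station coordinates and anomaly values are made up for illustration. The sketch asks how far a grid point's value moves when one contributing station is off by 1°C.

```python
import math

EARTH_RADIUS_KM = 6371.0
CUTOFF_KM = 1200.0  # radius quoted from the GISS paper in the text


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def station_weight(distance_km, cutoff_km=CUTOFF_KM):
    # ASSUMPTION: weight falls linearly from 1 at the grid point to 0 at the cutoff.
    return max(0.0, 1.0 - distance_km / cutoff_km)


def grid_anomaly(grid_lat, grid_lon, stations):
    """Weighted mean anomaly at a grid point; stations are (lat, lon, anomaly)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for lat, lon, anomaly in stations:
        w = station_weight(great_circle_km(grid_lat, grid_lon, lat, lon))
        weighted_sum += w * anomaly
        weight_total += w
    return weighted_sum / weight_total if weight_total > 0 else None


def shift_from_error(grid_lat, grid_lon, stations, error=1.0):
    """Change in the grid value when the first station is off by `error`."""
    lat, lon, anomaly = stations[0]
    corrupted = [(lat, lon, anomaly + error)] + stations[1:]
    return (grid_anomaly(grid_lat, grid_lon, corrupted)
            - grid_anomaly(grid_lat, grid_lon, stations))


# Hypothetical data: one lone station versus ten stations around a grid point.
sparse = [(28.5, 36.5, 0.5)]
dense = [(40.0 + 0.5 * i, -100.0 + 0.5 * i, 0.5) for i in range(-5, 5)]

sparse_shift = shift_from_error(29.0, 37.0, sparse)
dense_shift = shift_from_error(40.0, -100.0, dense)

print(f"Shift with one station:  {sparse_shift:.3f} degrees")
print(f"Shift with ten stations: {dense_shift:.3f} degrees")
```

With a lone station, the whole error passes straight through to the grid value. With ten stations, the same error is diluted by the other nine. The actual analysis does more than this (it combines and adjusts series before averaging), so treat the sketch as an illustration of the weighting argument only.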

We can all agree that the recently discovered error in the GISS data set did not change global averages that much. In that regard, it was not a major error in magnitude. On the other hand, this error has further undermined my confidence in the underlying data that go into those calculations.

The effect of any errors that might exist in the data from the rest of the world could be even more significant than the error that was recently discovered in the GISS's handling of GHCN data. Unfortunately, we may never find out whether there are errors in the data from various countries, because the government institutions that keep those data are much less open to scrutiny than NASA or NOAA.

Update: After posting this, I got curious and checked how many WMO station locations have data for January 2007 in the GHCN data set. The number is 1,064, excluding Antarctica and Ship Stations. This corresponds to 125 countries out of 225 in the data set. If we also exclude the United States, the number of locations with temperature data for January 2007 goes down to 930, out of a total of 3,974 with some data in the GHCN.

Another update: There are 7,086,372 monthly observations in the GHCN. Of these, 365,755 are missing values. Fully 2,244,733 non-missing monthly mean temperatures are from U.S. stations. That is, about a third of the data in the GHCN is from the U.S. Pretty good for a country which covers a small fraction of the globe.
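For anyone who wants to check the arithmetic in the two updates, here is a small sketch using only the numbers quoted above. The variable names are mine. I am assuming that the 7,086,372 figure includes the missing values, and that the 3,974 total already excludes the United States, Antarctica and Ship Stations.

```python
# Figures quoted in the second update.
total_monthly_records = 7_086_372   # ASSUMPTION: includes the missing values
missing_records = 365_755
us_non_missing = 2_244_733

non_missing_records = total_monthly_records - missing_records
us_share_of_non_missing = 100.0 * us_non_missing / non_missing_records
us_share_of_all = 100.0 * us_non_missing / total_monthly_records

# Figures quoted in the first update (January 2007 coverage).
locations_jan_2007 = 1_064          # excludes Antarctica and Ship Stations
locations_jan_2007_ex_us = 930      # same, also excluding the United States
locations_any_data_ex_us = 3_974    # ASSUMPTION: total on the same non-U.S. basis

us_locations_jan_2007 = locations_jan_2007 - locations_jan_2007_ex_us
non_us_reporting_share = 100.0 * locations_jan_2007_ex_us / locations_any_data_ex_us

print(f"Non-missing monthly records: {non_missing_records:,}")
print(f"U.S. share of non-missing records: {us_share_of_non_missing:.1f}%")
print(f"U.S. share of all records: {us_share_of_all:.1f}%")
print(f"U.S. locations with January 2007 data: {us_locations_jan_2007}")
print(f"Non-U.S. locations reporting in January 2007: {non_us_reporting_share:.1f}%")
```

Either way of counting puts the U.S. at roughly a third of the monthly records, while fewer than a quarter of the non-U.S. locations had data for January 2007.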