NoTricksZone reader Indomitable Snowman, a scientist who wishes to remain anonymous, has submitted an analysis of June temperature in Germany measured by the DWD German Weather Service.
German June Temperature Data – Statistical Analysis
By: The Indomitable Snowman
Recently, Pierre posted an article that included temperature data for the month of June in Germany. By inspection, the sequential plot of the data appeared to show no long-term trend. However, more insight can be gained via quantitative statistical analysis.
Using the tabulated data (graciously provided by Josef Kowatsch) for the sequential years 1930 – 2015 (86 data points in total), it is readily found that the mean value of the data set is 15.7 °C, while the standard deviation is 1.20 °C.
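For readers who want to reproduce these two numbers from their own copy of the table, a minimal sketch follows. It assumes the sample standard deviation (n - 1 denominator); the article does not say which form was used. The function name `summarize` and the short list of values are placeholders for illustration, not the DWD data.

```python
import statistics


def summarize(temps):
    """Return (mean, sample standard deviation) for a list of temperatures."""
    mean = statistics.fmean(temps)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    sd = statistics.stdev(temps)
    return mean, sd


# Placeholder values for illustration only, NOT the DWD June series.
example = [15.0, 16.0, 17.0, 14.0, 18.0]
mean, sd = summarize(example)
print(f"mean = {mean:.1f}, standard deviation = {sd:.2f}")
```

With 86 points, the choice between the sample and population form changes the result only slightly.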
Using the data points and the above information, it is a simple task to construct a “trend chart” – in which the data points are plotted, but with the inclusion of horizontal lines for the mean value, +/- one standard deviation, and +/- three standard deviations:
The trend chart clearly shows that the “system” is statistically well-behaved, with the points clustering close to the +/- one-standard-deviation band – indicating that there is no secular change in the underlying system over the time span, and that the variability about the mean can be solely attributed to statistical fluctuations. (It is also a well-known problem that when such a stable system is sequentially sampled, apparent-but-phantom “trends” will seem to appear; these can be seen in the plot, but they are not meaningful – they are artifacts of the sequential sampling.)
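The horizontal lines of such a chart, and a check for points falling outside the three-standard-deviation limits, could be sketched as below. This is an illustration under assumptions, not the author's own procedure: the names `control_lines` and `outside_three_sd` are hypothetical, and the sample standard deviation is assumed.

```python
import statistics


def control_lines(values):
    """Levels of the horizontal lines: mean, +/- 1 sd, +/- 3 sd."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: sample standard deviation
    return {
        "mean": mean,
        "-1sd": mean - sd,
        "+1sd": mean + sd,
        "-3sd": mean - 3 * sd,
        "+3sd": mean + 3 * sd,
    }


def outside_three_sd(values):
    """Return (index, value) pairs lying beyond the +/- 3 sd lines."""
    lines = control_lines(values)
    return [
        (i, v)
        for i, v in enumerate(values)
        if v < lines["-3sd"] or v > lines["+3sd"]
    ]
```

An empty list from `outside_three_sd` means no point lies beyond the three-standard-deviation limits.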
The proper way to group the data in such a system is grouping by standard deviations, something that is best presented in a simple histogram:
The center bar is the number of occurrences within one standard deviation of the mean, while the other bars (moving outward) show the number of occurrences from one to two standard deviations (plus and minus), etc. Even though the number of data points is relatively small for the emergence of complete statistical behavior (i.e., the system is undersampled), a Gaussian profile (expected of a system that has a stable mean and statistical fluctuations about that mean) is clearly discernible.
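The grouping behind such a histogram can be sketched as follows. The function name `sd_bin_counts` is hypothetical, the sample standard deviation is assumed, and the treatment of points lying exactly on a band edge (they go to the outer band) is a choice made here, not something the article specifies.

```python
import math
import statistics
from collections import Counter


def sd_bin_counts(values):
    """Count values per standard-deviation band around the mean.

    Key 0 is the center bar (within one sd of the mean); +1 / -1 are the
    bars from one to two sd above / below the mean, and so on.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: sample standard deviation
    counts = Counter()
    for v in values:
        z = (v - mean) / sd
        # Truncating toward zero merges (-1, 1) into the single center bar.
        counts[math.trunc(z)] += 1
    return counts
```

Adding the counts for keys +k and -k gives the number of occurrences in each symmetric band.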
Further, simple Gaussian statistics would predict approximately the following number of occurrences for a Gaussian system with 86 data points: 59 within one standard deviation of the mean, 23 between one and two standard deviations, and 4 beyond two standard deviations. The actual numbers from the data are 55, 28, and 3, reasonably close to the simple Gaussian expectations, even though the system has been undersampled.
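Expected counts of this kind follow directly from the normal cumulative distribution function. The sketch below computes them for any sample size using only the standard library; the helper names `normal_cdf` and `expected_band_counts` are illustrative.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_band_counts(n):
    """Expected counts within 1 sd, between 1 and 2 sd, and beyond 2 sd."""
    within_1 = normal_cdf(1.0) - normal_cdf(-1.0)
    within_2 = normal_cdf(2.0) - normal_cdf(-2.0)
    return (
        n * within_1,
        n * (within_2 - within_1),
        n * (1.0 - within_2),
    )


labels = ("within 1 sd", "1 to 2 sd", "beyond 2 sd")
for label, count in zip(labels, expected_band_counts(86)):
    print(f"{label}: {count:.1f}")
```

The expected values are generally not whole numbers, so they can only be compared with the observed counts after rounding.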
Statistical analysis of the data indicates that the system in question has been stable over the entirety of the sampling period (1930 – 2015) and is not changing. In particular, the system, even though undersampled, produces results that are in close agreement with the results expected for a system that is stable around a central mean, with the variability between individual samples being attributable to simple statistical variability.