El Niño and Probability

Prashant Sardeshmukh, Gilbert Compo, and Cécile Penland
NOAA/ESRL Physical Sciences Laboratory

Science Writer: Susan Bacon
University of Colorado

Benefits of Large Datasets

The importance of using a large data set can be illustrated with a simple study of coin tosses. A fair coin has a 50-50 chance of landing on heads or tails. If such a coin is flipped 100 times, it is likely that about 50 of the flips will show heads and about 50 will show tails, so the observed results give a good estimate of the true 50 percent probability of heads (or tails) on any one flip. But if the coin is flipped only a few times, the observed split may be far from 50-50. It is easy to imagine that 5 tosses could produce 1 head and 4 tails, or 3 heads and 2 tails, or all heads and no tails. Drawing a conclusion about the probability of heads or tails on any toss from only 5 flips, a small sample size, can lead to inaccurate results.
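The coin-toss argument can be checked with a short simulation. The Python sketch below is only an illustration and is not part of the original study: the function names, the number of repeated experiments, and the random seed are all assumptions chosen for the example. It repeats the "flip a fair coin n times" experiment many times and measures how much the estimated probability of heads varies from one experiment to the next.

```python
import random


def heads_fraction(n_flips, rng):
    """Flip a simulated fair coin n_flips times; return the fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


def spread_of_estimates(n_flips, n_experiments=2000, seed=0):
    """Repeat the n_flips experiment many times.

    Returns (mean, standard deviation) of the estimated probability of heads.
    n_experiments and seed are illustrative choices, not values from the article.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = [heads_fraction(n_flips, rng) for _ in range(n_experiments)]
    mean = sum(estimates) / len(estimates)
    variance = sum((e - mean) ** 2 for e in estimates) / len(estimates)
    return mean, variance ** 0.5


if __name__ == "__main__":
    for n in (5, 100, 1000):
        mean, spread = spread_of_estimates(n)
        print(f"{n:5d} flips: mean estimate = {mean:.3f}, spread = {spread:.3f}")
```

For a fair coin, the spread of the estimate is expected to be close to sqrt(0.25 / n): roughly 0.22 for 5 flips, 0.05 for 100 flips, and 0.016 for 1000 flips. The average estimate stays near 0.5 in every case, but any single small experiment can land far from it.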

Similar challenges arise when using a small data set to understand the effects of ENSO [El Niño-Southern Oscillation, a phenomenon in which sea surface temperatures in the Eastern Pacific Ocean become higher or lower than the long-term average, causing changes that reverberate through the atmosphere and perturb weather patterns] on a system as complex as climate. But now, with increasingly sophisticated models and greater computing power, researchers don't have to rely only on the instrumental record. The climate science community can also create large data sets (ensembles) to identify the links between ENSO and the resulting weather patterns more accurately.
