How much data do we really have?

Richard Dixon
Oct 21, 2016

Hurricane modelling following the active years of 2004 and 2005 was enlivened by the recognition of - and arguments around - the decadal see-sawing of sea-surface temperatures in the Atlantic Basin and its possible influence on hurricane frequency. However, I'm not here to discuss this rather thorny issue, given recent reductions in landfall activity. Rather, I want to use it as an example that highlights the uncertainty around the data we use, even in a "landfall-rich" area such as Florida, where we would expect to have fairly good confidence in catastrophe model output.

The most commonly used index to highlight the warm and cold fluctuations of sea-surface temperature in the Atlantic hurricane region is the Atlantic Multidecadal Oscillation (AMO) index. Using this index, we can identify 'warm' and 'cold' periods of the Atlantic Ocean and then look at hurricane behaviour across history to understand how warm sea-surface temperatures modulate hurricane activity, specifically landfalling behaviour.

The chart below shows the AMO index (on the y-axis) since 1900. Warm and cold phases of the AMO are in red and blue, respectively. For example, the sea has typically been warmer than average since 1995.

The chart essentially shows how we can look at the AMO data using two different methods to diagnose warm sea surface temperature (which I'll shorten to WSST) years:

- "Years" Method: using the AMO index for each year to decide whether that year was a WSST year. Here, we have looked at the average AMO across the hurricane season (see the black bars in the above chart).

- "Blocks" Method: using broad-scale trends to assign "blocks" of time when the sea was warm (see the red/blue regions in the above chart).

Both are feasible routes to selecting the years that can be classed as "warm". For each method, we can then compare the rate of landfalling hurricanes in its WSST years with the long-term average.
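To make the two definitions concrete, here is a minimal sketch of how each could be applied to a table of yearly values. Everything in it is an assumption for illustration: the function names, the zero threshold for a "warm" year, and the toy numbers (eight made-up years, not real AMO or landfall records).

```python
from statistics import mean

def warm_years_by_year(season_amo, threshold=0.0):
    """'Years' method: a year is warm if its season-mean AMO exceeds the threshold.

    The threshold of 0.0 is an assumption; the article does not state one.
    """
    return {year for year, amo in season_amo.items() if amo > threshold}

def warm_years_by_blocks(blocks):
    """'Blocks' method: every year inside a warm block counts as warm."""
    return {year for start, end in blocks for year in range(start, end + 1)}

def landfall_rate(landfalls_per_year, years):
    """Mean number of landfalls per year over the given set of years."""
    return mean(landfalls_per_year.get(year, 0) for year in years)

# Toy data only: years are labelled 1..8 and all values are invented.
season_amo = {1: -0.2, 2: -0.1, 3: 0.1, 4: -0.05, 5: 0.2, 6: 0.3, 7: -0.1, 8: 0.25}
landfalls = {2: 1, 5: 1, 7: 2, 8: 1}   # years not listed had no landfall
warm_blocks = [(5, 8)]                 # one broad warm phase

baseline = landfall_rate(landfalls, set(season_amo))
by_year = landfall_rate(landfalls, warm_years_by_year(season_amo))
by_block = landfall_rate(landfalls, warm_years_by_blocks(warm_blocks))

print(f"baseline {baseline:.3f}, years method {by_year:.3f}, blocks method {by_block:.3f}")
```

Even in this tiny invented record the two methods pick different sets of years, so they give different warm-year rates from the same landfalls.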

I'll use the example of Florida. It has had numerous hurricane landfalls on which we've built hurricane models for more than 25 years now.

Let's explain the above graph. It effectively shows, using historical data going back to 1900, the increase in landfalling hurricane activity in WSST years based on whether you define WSST years in "blocks" or by the year.

For all hurricanes (the two left-hand columns), if you define WSST years by yearly AMO data, you actually see a marginal decrease in landfalling activity in WSST years. If you define the WSST years by blocks, you see around a 15% increase in landfalling activity in WSST years.

The difference is starker if we look at major hurricanes (the two right-hand columns): the "Years" method for assigning historical WSST years gives a 10% increase in major hurricane landfalls in WSST years. The "Blocks" method, however, produces a larger increase in landfalls in WSST years - more like 35%. Quite a difference of opinion.

However, let's just take a look at the data that went into producing the above graph and go straight to the example of calculating major hurricane rates in WSST years:

You can see that, depending on whether you define your WSST years by blocks or years, the rate of landfalling major hurricanes is either 0.33 (blocks) or 0.28 (years) per year. Comparing these two numbers to the baseline of 0.25 major hurricane landfalls per year since 1900 gives us the 35% or 10% increase in the baseline risk shown in the graph above.
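The arithmetic behind those percentages is just the ratio of each warm-year rate to the baseline. A minimal sketch, using only the rounded rates quoted above (the function name is hypothetical):

```python
def relative_increase(rate, baseline):
    """Percentage change of a warm-year rate relative to the long-term rate."""
    return (rate / baseline - 1.0) * 100.0

baseline = 0.25  # major hurricane landfalls per year since 1900
rates = {"blocks": 0.33, "years": 0.28}

for method, rate in rates.items():
    print(f"{method}: {relative_increase(rate, baseline):+.0f}%")
```

With the two-decimal rates this gives roughly +32% and +12%. The quoted figures of about 35% and 10% were presumably computed from unrounded rates and then rounded for the graph; that is an assumption on my part, but either way the two methods sit some 20 percentage points apart.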

However, look at the data behind these rates (in blue text). There are only 29 landfalling major hurricanes, and 20 or 16 of those (depending on WSST definition) occur in WSST years.

We could simply argue that the difference in results between the two methods comes down to the fact that there really isn't a lot of data here.
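One rough way to put a number on "not a lot of data" is to treat landfall counts as approximately Poisson. That is a common simplification and an assumption here, not something the analysis above relies on. Under it, a count of n events has a standard error of about sqrt(n), so a rate built on that count has a relative error of about 1/sqrt(n):

```python
import math

def poisson_relative_error(count):
    """Approximate relative standard error of a rate based on `count` events.

    Assumes the events follow a Poisson process (an illustrative assumption).
    """
    return 1.0 / math.sqrt(count)

# Event counts quoted in the text: 29 in total, 20 or 16 in WSST years.
for label, count in [("all years", 29), ("WSST years, higher count", 20),
                     ("WSST years, lower count", 16)]:
    print(f"{label}: n={count}, relative error ~{poisson_relative_error(count):.0%}")
```

With only 16 to 20 events, the sampling uncertainty on each warm-year rate is in the region of 22% to 25%, which is about as large as the gap between the two methods' answers.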

Remember that this uncertainty, where two similar methods disagree for lack of data, arises in a state with a relatively data-rich hurricane history, and one where the average yearly insured loss potential is arguably higher than anywhere else on the planet. What can we do to better understand this source of uncertainty?

Naturally, we could wait another 50 years to obtain more hurricane data, observe the waxing and waning of the AMO and reduce this uncertainty to a certain extent. We can dig more into the data to try and find firmer statistical relationships - but when it comes down to it, we’ll still only have that same old 100-or-so years of data to draw upon. In an ideal world we’d have, say, 1000 years of hurricane history to refer back to, to draw firmer conclusions and to identify climatological linkages so that we could understand the risk better. But obviously this doesn’t exist.

Or does it?

The most high-powered climate models are being run in academic institutions at resolutions approaching those that can start to resolve hurricanes, albeit maybe currently not at the appropriate intensity: there is still some way to go here. The example below shows NOAA's GFDL HiFLOR model simulations to demonstrate the level of complexity we are starting to see as computing power continues to improve:

High-resolution climate models could soon start to offer us "surrogate" hurricane seasons to help us better understand hurricane behaviour, for example, in warm SST years.

If the above landfalling hurricane study is anything to go by, and we want a better understanding of hurricane season variability, then simulated hurricane seasons in high-resolution climate models seem to be the route our industry needs to invest in, partnering more closely with academia to better understand hurricane risk.

Do we really want to wait another 100 years to reduce the uncertainty in our hurricane models?