Joining Daymet Tiles with NCL
Introduction to Daymet Gridded Data
The Daymet data set provides high-resolution gridded estimates of daily weather parameters for North America. Included are daily continuous surfaces of minimum and maximum temperature, precipitation occurrence and amount, humidity, shortwave radiation, snow water equivalent, and day length.
Daymet is archived and distributed by the Distributed Active Archive Center of Oak Ridge National Laboratory. Please see the Daymet website for data description, format documentation, and data access.
Original Daymet daily gridded data are distributed as small NetCDF files. Each file covers one "tile" of 2 by 2 degrees within North America and one calendar year. The NetCDF files can be downloaded efficiently as compressed tar files, each containing all years for one tile, via the map interface on the Daymet access page.
Daymet files obtained this way are identified by subdirectory name and file name. The subdirectory name includes the tile ID number and the calendar year. Within each subdirectory, NetCDF files are named after the individual data variables or fields: prcp.nc, tmin.nc, etc.
Other access methods are available, including a THREDDS data server. These may produce file hierarchies and naming conventions that differ from what is described here. However, the programs below can be adapted as needed.
Data arrays within Daymet files are on a Lambert conformal conic projection centered over North America. In NCL documentation, this is called a curvilinear grid or a native grid. This grid is supported by a dual set of coordinates within Daymet files. These consist of 1-D projection coordinates x and y in meters, and 2-D terrestrial coordinates lat and lon in degrees.
In the NCL plotting methods shown here, it is the 2-D coordinates that are used to display the data in the correct locations over the base map. The projection information and the 1-D coordinates are not used for making plots. However, the 1-D coordinates are essential for the join script. The join script aligns tiles by their 1-D X and Y coordinates, not by any 2-D coordinates.
The Daymet data arrays are padded around the edges of X and Y with triangular areas of missing values. These areas are the difference between 2 x 2 degree rectangles in latitude and longitude coordinates, and the enclosing areas on the curvilinear Daymet grid.
This figure shows one original Daymet tile over northwest Wyoming (United States). Note that the map itself is on a cylindrical equidistant projection, but the Daymet data grid is curvilinear. The padding areas of missing values are displayed in light purple. The outer bounds of the padding areas are the edges of the enclosing rectangle on the Daymet data grid. The appearance of straight edges around the actual non-missing data is an illusion due to the inserted padding areas. The padding or buffer areas are explained further on page 5 and figure 2 of the Daymet user guide.
Daymet tiles over coastlines and islands typically have data coverage that is much smaller than the 2 x 2 degree boundaries. Daymet reduces the actual X/Y grids for each tile to the minimum needed for grid points over land only.
The following NCL script and list file display the 12 original tiles for Wyoming, like the one above, in successive frames in an X11 window. See the usage instructions at the top of the NCL script. Users must download their own original Daymet files as described above.
Spatially joining Daymet tiles is complicated by overlaps, skewed alignment, padding areas, and missing coordinates. Here is an NCL script that will correctly join or "mosaic" a number of Daymet NetCDF files into a single large NetCDF file. The method is to determine offset subscripts for each tile, then overlay the tile input arrays into the correct positions within a properly sized output array. This version also uses a list file, in a slightly different format, to specify the tile ID numbers to be joined.
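The NCL join script itself is not reproduced here. As a rough illustration of the offset-and-overlay method only, the following Python sketch mosaics two tiny made-up tiles by their 1-D x and y coordinates. Everything in it is an assumption for illustration: the function name `join_tiles`, the dict layout, the uniform 1000 m grid spacing, y coordinates stored north to south, and the rule that missing values never overwrite real data where tiles overlap.

```python
# Illustrative sketch only: this is NOT the NCL join script.
# Tiles are plain dicts holding 1-D projection coordinates and a 2-D grid.

FILL = None  # stand-in for the NetCDF missing value


def join_tiles(tiles, spacing):
    """Mosaic tiles onto one grid, aligning them by 1-D x and y coordinates.

    Each tile is a dict with (assumed layout):
      "x":    ascending 1-D x coordinates in meters
      "y":    descending 1-D y coordinates in meters (north to south)
      "data": list of rows, where data[j][i] sits at (y[j], x[i])
    """
    # Bounds of the rectangle that encloses every tile.
    x_min = min(t["x"][0] for t in tiles)
    x_max = max(t["x"][-1] for t in tiles)
    y_max = max(t["y"][0] for t in tiles)
    y_min = min(t["y"][-1] for t in tiles)

    # Size and coordinates of the output array.
    nx = round((x_max - x_min) / spacing) + 1
    ny = round((y_max - y_min) / spacing) + 1
    out_x = [x_min + i * spacing for i in range(nx)]
    out_y = [y_max - j * spacing for j in range(ny)]
    out = [[FILL] * nx for _ in range(ny)]

    for t in tiles:
        # Offset subscripts of this tile's first row and column in the output.
        i0 = round((t["x"][0] - x_min) / spacing)
        j0 = round((y_max - t["y"][0]) / spacing)
        for j, row in enumerate(t["data"]):
            for i, value in enumerate(row):
                # Skip padding so it cannot erase data from an overlapping tile.
                if value is not FILL:
                    out[j0 + j][i0 + i] = value
    return out_x, out_y, out


# Two made-up tiles that overlap at the grid point x=2000, y=1000.
tile_a = {"x": [0, 1000, 2000], "y": [2000, 1000],
          "data": [[1, 2, FILL], [3, 4, FILL]]}
tile_b = {"x": [2000, 3000], "y": [1000, 0],
          "data": [[5, 6], [7, 8]]}

out_x, out_y, out = join_tiles([tile_a, tile_b], spacing=1000)
for row in out:
    print(row)
```

Grid points covered by no tile stay at the fill value in this sketch, and its output coordinates are only the 1-D x and y. Filling in complete 2-D lat and lon coordinates for those gaps is a separate step that the sketch does not attempt.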
This version requires a supplemental file with complete 2-D coordinate grids over the North America domain. This file can be downloaded from the Community Tools page on the Daymet website. This file is needed to fill in gaps in 2-D coordinates around the edges and in between joined tiles. Complete 2-D coordinates are essential for correct operation of NCL graphics routines and other plotting software.
The join result for the 12 tiles over Wyoming is shown here. Once again, this figure is enhanced to show the padding areas around the edges in light purple. Note that the coordinates for large portions of the padding areas are not present in original input files. These missing coordinates are obtained from the supplemental coordinate file. Compare these padding areas with those in the previous figure for only one tile.
This plot was made by the following NCL script and demonstration data file. You can modify this script and use it to check your own merge results. Remember to change the embedded map boundary coordinates to include your own set of tiles:
- prcp.wyoming.jan25.nc (10 MB, single grid for one date, extracted from the merged NetCDF file)
With this version of the join script, attempting to join too many Daymet tiles may result in NCL memory overflow. This will probably show up as a malloc error and program abort. Of course, this is also a function of your computer memory size. One user reports memory overflow at somewhere between 45 and 77 tiles (thanks to Kamal M. for this report).
Possible solutions include the following. Several share a common goal: reducing the size of the in-memory copy of the super array, the output array that holds all joined tiles.
- First, check for errors in the list of tiles. The super array is allocated as an X/Y rectangle that contains all grid points in all specified tiles. A rogue tile outside of your expected area can cause a huge increase in requested memory, and a program crash.
- Loop over time, processing one time step at a time.
- Reduce the number of time steps by adding subscripting on the time dimension.
- Install more computer memory.
- Move to a computer with more memory.
- Install a better virtual memory manager.
- Improve settings for virtual memory manager.
- Do not use a super array in memory. Change the script to perform overlay operations directly on the NetCDF file variable.
- Reduce the number of tiles included.
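To see why a rogue tile or a long time dimension matters so much, it can help to estimate the super array size before running the join. The Python sketch below is a back-of-envelope calculation under stated assumptions, not a measurement of NCL's actual memory use: the helper name `super_array_bytes`, the tile extents, the 1000 m grid spacing, the 365 time steps per file, and the 4-byte values are all hypothetical.

```python
# Illustrative sketch only; all tile extents and sizes are made-up numbers.

def super_array_bytes(tile_extents, spacing, n_times, bytes_per_value=4):
    """Estimate the size of one in-memory output array covering all tiles.

    tile_extents: list of (x_min, x_max, y_min, y_max) in meters, one per tile.
    """
    # The output array is the X/Y rectangle that encloses every tile.
    x_min = min(e[0] for e in tile_extents)
    x_max = max(e[1] for e in tile_extents)
    y_min = min(e[2] for e in tile_extents)
    y_max = max(e[3] for e in tile_extents)
    nx = round((x_max - x_min) / spacing) + 1
    ny = round((y_max - y_min) / spacing) + 1
    return nx * ny * n_times * bytes_per_value


# Two adjacent tiles, each 200 x 200 grid points.
good = [(0, 199000, 0, 199000), (200000, 399000, 0, 199000)]
# The same list plus one tile far outside the expected area.
rogue = good + [(2000000, 2199000, 2000000, 2199000)]

full_year = super_array_bytes(good, spacing=1000, n_times=365)
with_rogue = super_array_bytes(rogue, spacing=1000, n_times=365)
one_step = super_array_bytes(good, spacing=1000, n_times=1)

print(full_year)   # 116800000 bytes, about 117 MB
print(with_rogue)  # 7066400000 bytes, about 7 GB
print(one_step)    # 320000 bytes
```

In this made-up case, one misplaced tile raises the request from roughly 117 MB to roughly 7 GB, while processing a single time step needs well under 1 MB.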
NCL (NCAR Command Language) is a free programming language with strong NetCDF support, created and distributed by the National Center for Atmospheric Research.