NOAA NASA Joint Archive (NNJA) of Observations for Earth System Reanalysis
About
A team of scientists from NOAA and NASA is collaborating on a curated joint observation archive containing Earth system data from 1979 to present. The goal is to foster collaboration across organizations and enable direct comparison of Earth system reanalysis results.
Providing a single dataset of input observations will allow reanalyses to be compared on their unique development qualities by removing the variation that comes from using different observations.
About the dataset
The archive is hosted in an Amazon Web Services (AWS) S3 bucket and contains ocean, ice, land, and atmosphere observations in formats including BUFR, IODA, and NetCDF. Some data is reprocessed as appropriate. The dataset also includes specifications of observational errors and a black/white list for historic observations.
Near-real-time observations are uploaded daily with a 72-hour delay. Additional data sources are still being evaluated and added to the archive as appropriate.
The number of observations for selected components of the Earth system over time can be seen below. Further details about the data contained in the archive can be seen in the Inventories tab, updated daily.
Additionally, a subset of the NNJA data is already being converted into AI-ready cloud optimized formats through a partnership with Brightband. NNJA-AI is available via Google Cloud Storage.
NNJA Observation Count for Selected Components
Future state
Future goals include expanding the observation types in the archive (including additional sources of ocean data), converting data into IODA format, applying additional quality control to the dataset, aligning data availability with ongoing reanalysis development at NOAA and NASA, and continuing to develop improved diagnostic tools for comparing experimental results with observational data.
For the NNJA-AI dataset, the goal is to continue providing and expanding NNJA data in AI-accessible forms.
We anticipate that the open dataset will be of value to the wider Earth System community, including future applications outside of the original goal of reanalysis production.
The end result of the project will be a publicly accessible dataset for observations and tools for consistent comparisons between reanalysis products.
How to cite
Example: NOAA NASA Joint Archive (NNJA) was accessed on [DATE] from https://psl.noaa.gov/data/nnja_obs/
Please go to the How to Cite tab for license and additional information.
Inventories
The following inventory plots are updated daily.
Notes
- While the majority of the data is represented in the above inventory, some available data, including some ocean and snow data, is not.
- For inventory graphics that include a file path on the right side of the graphic, this represents the folder structure for locating the data in NNJA.
- For satellite data which is currently being assimilated in the ongoing NOAA reanalysis products, the inventory lines will appear in blue in accordance with the satellite black/white lists available on our data page.
- A thick line, whether black or blue, indicates that the data is in our archive. A thin blue line indicates data that is not yet in the archive but that our system will assimilate once it is added.
- For conventional sensors, more information about each type can be found in the PREPBUFR code table.
Data Access
This guide explains how to locate and download data from the NOAA-NASA Joint Archive (NNJA) of Observations for Earth System Reanalysis public dataset on AWS S3 for independent researchers and the general public. If you are using data from NNJA, please refer to the How To Cite tab for licensing information and appropriate attribution.
Overview
The NNJA dataset is hosted as a public dataset on AWS S3. The associated black/white list is publicly available on GitHub. Neither requires an account or credentials: anyone can access and download the data.
Data Locations
- NNJA AWS S3 Bucket
  - General bucket structure: noaa-reanalyses-pds/observations/reanalysis/{sensor}/{source(s)}/YYYY/MM/{file format}/{data files}
  - Resource: How to download objects from AWS S3 buckets
- GitHub Black/White List
  - The information from the ‘satinfo’ files is reflected in our satellite-based graphics on the Inventories tab.
- NNJA-AI via GCS
  - A subset of the NNJA archive has been converted into AI-ready cloud optimized formats through a partnership with Brightband, available via Google Cloud Storage.
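The general bucket structure above can be turned into a concrete S3 prefix with a small shell helper. This is a minimal sketch: `nnja_prefix` is a hypothetical name used only for illustration, and the arguments are passed as plain strings, so the month must already be zero-padded (for example `08`).

```shell
# nnja_prefix SENSOR SOURCE YEAR MONTH FORMAT
# Hypothetical helper: prints the S3 prefix for one sensor/source/month,
# following the general bucket structure
#   {sensor}/{source(s)}/YYYY/MM/{file format}/
# All arguments are used verbatim, so pass the month zero-padded ("08").
nnja_prefix() {
    printf 's3://noaa-reanalyses-pds/observations/reanalysis/%s/%s/%s/%s/%s/\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Example (MLS ozone NetCDF files for August 2004):
#   nnja_prefix ozone nasa/mls 2004 08 netcdf
```

The resulting prefix can be passed to `aws s3 ls` or `aws s3 sync` together with `--no-sign-request`.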
Downloading NNJA Observation Data from AWS S3
Prerequisites
1. AWS CLI (Command Line Interface)
This is the main tool you need to download data from S3.
Check if installed:
aws --version
Installation by Operating System:
Linux:
# Option 1: Using pip (recommended)
pip install awscli --user
# Option 2: Using pip3
pip3 install awscli --user
# Option 3: AWS installer (AWS CLI v2)
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
macOS:
# Using Homebrew (easiest)
brew install awscli
# Or using pip
pip3 install awscli --user
# Or AWS installer
curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"
sudo installer -pkg AWSCLIV2.pkg -target /
Windows:
- Download the installer: https://awscli.amazonaws.com/AWSCLIV2.msi
- Run the downloaded MSI installer
- Or use pip:
pip install awscli
2. Python
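Python is needed only for the pip-based installs of AWS CLI v1 shown above; the AWS CLI v2 installers bundle their own Python runtime. Use a current Python 3 release, since Python 2.7 is no longer supported by the AWS CLI.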
Check if installed:
python --version
# or
python3 --version
What You DO NOT Need
Since this is a public dataset, you do NOT need:
- ❌ AWS account
- ❌ AWS credentials/access keys
- ❌ Credit card or payment method
- ❌ VPN or special network access
Quick Setup Test
Run these commands to verify your setup:
# 1. Check AWS CLI is installed
aws --version
# 2. Test access to the public bucket
aws s3 ls s3://noaa-reanalyses-pds/observations/reanalysis/ --no-sign-request
# 3. If both work, you're ready to download!
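The same two checks can be wrapped in a function that reports which step failed. This is a sketch only: `have_tool` and `nnja_preflight` are hypothetical names, and the return codes are arbitrary choices for illustration.

```shell
# have_tool NAME: succeed if NAME is found on PATH (hypothetical helper).
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# nnja_preflight: check that the AWS CLI is installed and that the public
# bucket can be listed anonymously. Returns 1 if the CLI is missing and
# 2 if the listing fails.
nnja_preflight() {
    if ! have_tool aws; then
        echo "aws: not found on PATH; install the AWS CLI first" >&2
        return 1
    fi
    if ! aws s3 ls s3://noaa-reanalyses-pds/observations/reanalysis/ \
            --no-sign-request >/dev/null; then
        echo "could not list the bucket; check network access" >&2
        return 2
    fi
    echo "setup looks good"
}
```

Running `nnja_preflight` after defining both functions prints a single line on success or a diagnostic on standard error.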
Usage Examples
Download a Single File
aws s3 cp \
s3://noaa-reanalyses-pds/observations/reanalysis/ozone/nasa/mls/2004/08/netcdf/MLS-v5.0-oz.20040801_00z.nc \
. \
--no-sign-request
List Available Files
# List files in a specific directory
aws s3 ls \
s3://noaa-reanalyses-pds/observations/reanalysis/ozone/nasa/mls/2004/08/netcdf/ \
--no-sign-request
Download Multiple Files (Entire Directory)
# Download all files from a specific month
aws s3 sync \
s3://noaa-reanalyses-pds/observations/reanalysis/ozone/nasa/mls/2004/08/netcdf/ \
./mls_2004_08/ \
--no-sign-request
Download Recursively
# Copy all files recursively
aws s3 cp \
s3://noaa-reanalyses-pds/observations/reanalysis/ozone/nasa/mls/2004/08/netcdf/ \
. \
--recursive \
--no-sign-request
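To fetch several months at once, the sync command can be generated in a loop. The sketch below only prints the commands so they can be reviewed first; `nnja_sync_commands` is a hypothetical helper, months are given as plain integers without leading zeros, and paths containing spaces are not handled.

```shell
# nnja_sync_commands PREFIX YEAR FIRST_MONTH LAST_MONTH FORMAT DEST
# Hypothetical helper: prints one "aws s3 sync" command per month
# without running it. Months are integers without leading zeros (8, not 08).
nnja_sync_commands() {
    prefix=$1; year=$2; month=$3; last=$4; format=$5; dest=$6
    while [ "$month" -le "$last" ]; do
        mm=$(printf '%02d' "$month")   # zero-pad to match the MM folder names
        printf 'aws s3 sync %s/%s/%s/%s/ %s/%s/%s/ --no-sign-request\n' \
            "$prefix" "$year" "$mm" "$format" "$dest" "$year" "$mm"
        month=$((month + 1))
    done
}

# Example: print the commands for August through October 2004
#   nnja_sync_commands \
#       s3://noaa-reanalyses-pds/observations/reanalysis/ozone/nasa/mls \
#       2004 8 10 netcdf ./mls
```

Once the printed commands look right, they can be executed by piping the output to `sh`.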
Important Notes
- Always include --no-sign-request: This flag is essential for accessing public datasets without AWS credentials
- Data transfer costs: Downloading data from AWS S3 is free for the user (AWS covers egress costs for public datasets)
- File sizes: NetCDF files can be large; check available disk space before downloading multiple files
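One way to check disk space before a large download is to compare the total size reported by `aws s3 ls --recursive --summarize` with the free space reported by `df`. The sketch below assumes the summary contains a `Total Size:` line giving a byte count (that is, `--human-readable` is not used); `summarize_bytes` and `fits_on_disk` are hypothetical helper names.

```shell
# summarize_bytes: read "aws s3 ls --recursive --summarize" output on stdin
# and print the number from its "Total Size:" line.
# Assumes the size is reported in bytes (no --human-readable).
summarize_bytes() {
    awk '/Total Size:/ { print $3 }'
}

# fits_on_disk BYTES [DIR]: succeed if DIR (default: current directory)
# has at least BYTES of free space according to "df -Pk".
fits_on_disk() {
    free_kb=$(df -Pk "${2:-.}" | awk 'NR == 2 { print $4 }')
    [ "$1" -le $((free_kb * 1024)) ]
}

# Example:
#   needed=$(aws s3 ls \
#       s3://noaa-reanalyses-pds/observations/reanalysis/ozone/nasa/mls/2004/08/netcdf/ \
#       --recursive --summarize --no-sign-request | summarize_bytes)
#   fits_on_disk "$needed" . && echo "enough space" || echo "not enough space"
```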
Troubleshooting
"Command not found: aws"
- AWS CLI is not installed or not in your PATH
- Try reinstalling or adding to PATH:
export PATH=$PATH:~/.local/bin
"InvalidAccessKeyId" error
- You forgot to include the --no-sign-request flag
- Add the flag to bypass credential requirements
Connection timeout
- Check your internet connection
- Verify firewall allows HTTPS traffic on port 443
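If the AWS CLI cannot be installed, a single object can usually also be fetched over plain HTTPS. The sketch below converts an `s3://` URI into the virtual-hosted-style URL form used by S3. It assumes the bucket permits anonymous HTTPS reads in the same way it permits `--no-sign-request`, and that the object key needs no URL-encoding; `s3_to_https` is a hypothetical helper name.

```shell
# s3_to_https URI: convert s3://BUCKET/KEY into the virtual-hosted-style
# URL https://BUCKET.s3.amazonaws.com/KEY (hypothetical helper).
# Keys containing spaces or other special characters are not URL-encoded.
s3_to_https() {
    rest=${1#s3://}       # strip the scheme
    bucket=${rest%%/*}    # text before the first slash
    key=${rest#*/}        # text after the first slash
    printf 'https://%s.s3.amazonaws.com/%s\n' "$bucket" "$key"
}

# Example: download one file with curl instead of the AWS CLI
#   curl -fLO "$(s3_to_https \
#       s3://noaa-reanalyses-pds/observations/reanalysis/ozone/nasa/mls/2004/08/netcdf/MLS-v5.0-oz.20040801_00z.nc)"
```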
Additional Resources
Data Sources
Variables are listed by category, each linked to more information on its respective source. All links are external.
Atmosphere
- adt
- airs
- amsua
- amsub
- amv
- atms
- avhrr
  - AVHRR
  - AVHRR/2
  - AVHRR/3
- conv
- cris
- geo
  - IMAGER (GOES 8-11)
  - IMAGER (GOES 12-15)
  - SOUNDER
- gmi
- gps
  - ROM SAF Metop
  - ROM SAF COSMIC
  - ROM SAF GRACE
  - ROM SAF CHAMP
  - GRAS
  - IGOR (COSMIC)
  - BlackJack (GRACE)
  - BlackJack (CHAMP)
- hirs
  - HIRS/2
  - HIRS/3
  - HIRS/4
- iasi
- mhs
- msu
- ozone
  - GOME
  - GOME-2
  - OMPS-limb
  - OMPS-nadir
  - MLS
  - MLS (EOS-Aura)
  - OMI
  - SBUV
  - SBUV/2
  - MLS NRT
  - OMI NRT
  - OMPS NRT
- saphir
- seviri
- ssu
- trmm
- Concentration
- Free board
How to cite
NOAA NASA Joint Archive (NNJA) of Observations for Earth System Reanalysis data is distributed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
To provide appropriate attribution under this license, please cite the link to this PSL website.
For example: NOAA NASA Joint Archive (NNJA) was accessed on [DATE] from https://psl.noaa.gov/data/nnja_obs/
Additional Acknowledgements by stream
NNJA also hosts data sourced from external partners under their own CC BY 4.0 licenses. To provide appropriate attribution for these datasets, please also include the acknowledgements below when using data located in the following folders. Further license details and sources for these data are provided in the README within the folder of each dataset in NNJA.
/ssmi/eumetsat/ and /ssmis/eumetsat/
Acknowledgement: The work performed was done (i.a.) by using data from EUMETSAT’s Satellite Application Facility on Climate Monitoring (CM SAF). Fennig, Karsten; Schröder, Marc; Hollmann, Rainer (2017): Fundamental Climate Data Record of Microwave Imager Radiances, Edition 3, Satellite Application Facility on Climate Monitoring, DOI:10.5676/EUM_SAF_CM/FCDR_MWI/V003, https://doi.org/10.5676/EUM_SAF_CM/FCDR_MWI/V003.
Please also note the following disclaimer: EUMETSAT offers no warranty and accepts no liability in respect of the SAF on Climate Monitoring (CM SAF) products. EUMETSAT neither commits to nor guarantees the continuity, availability, or quality or suitability for any purpose of, the CM SAF products.
/gps/eumetsat/
Acknowledgement: EUMETSAT ROM SAF Radio Occultation Climate Data Record (v1.0, 2019), GRM-29-R1, DOI:10.15770/EUM_SAF_GRM_0002, https://doi.org/10.15770/EUM_SAF_GRM_0002.
/amv/merged/
Acknowledgement: EUMETSAT (2021): Atmospheric Motion Vectors Climate Data Record Release 2 - MFG and MSG - 0 degree, European Organisation for the Exploitation of Meteorological Satellites, DOI: 10.15770/EUM_SEC_CLM_0020. http://doi.org/10.15770/EUM_SEC_CLM_0020
/sst/esacci/
Acknowledgement: Embury, O., Merchant, C.J., Good, S.A., Rayner, N.A., Høyer, J.L., Atkinson, C., Block, T., Alerskans, E., Pearson, K.J., Worsfold, M., McCarroll, N., Donlon, C. (2024). Satellite-based time-series of sea-surface temperature since 1980 for climate applications. Sci Data 11, 326. doi: https://doi.org/10.1038/s41597-024-03147-w