Existing observation operators¶
CHEEREIO ships with observation operators that have produced by the community. Users can add their own operators by following the instruction on the Workflow to add a new observation operator page, or they can use the operators listed below.
TROPOMI tools¶
The TROPOspheric Monitoring Instrument (TROPOMI) onboard Sentinel-5 Precursor satellite satellite measures criteria air pollutants and other trace gases of interest to CHEEREIO users. Currently, there are two TROPOMI operators written for CHEEREIO: CH4and CO (the latter contributed by Sina Voshtani). Users can follow the pattern in tropomi_tools.py
to add support for additional species. An NO2operator is partially written but not yet functional as of version 1.2.
To activate TROPOMI observations, list “TROPOMI” as an observation type in the OBS_TYPE
setting, as described on the Observation settings page.
TROPOMI fields available to save and plot in CHEEREIO¶
All observation operators save standard data from observations and GEOS-Chem and pass it on to CHEEREIO (including latitude, longitude, observation values, GC simulated observation values, and timestamps). However, individual operators might also be able to save additional data, such as albedo in remote sensing cases or site IDs in surface observation cases. Users can list the additional data they would like to save in the EXTRA_OBSDATA_FIELDS_TO_SAVE_TO_BIG_Y
setting in ens_config.json
, following the instructions on the Postprocessing settings page. With the TROPOMI operator, all data read in by the read_tropomi function are available to be saved and/or plotted. Below is a subset of the supported fields which users can save in TROPOMI:
-
qa_value
¶
Quality assurance value.
-
column_AK
¶
Satellite averaging kernel.
-
albedo_swir
¶
Shortwave infrared albedo.
-
albedo_nir
¶
Near infrared albedo.
-
blended_albedo
¶
Blended albedo.
-
methane_profile_apriori
¶
A priori methane profile. (Methane only)
TROPOMI operator support functions¶
The TROPOMI observation operator calls a handful of utility functions to process observations from file and format them in such a way that the gcCompare
method of the TROPOMI_Translator
class can perform relevant computations.
The first two utility functions are designed to remap GEOS-Chem pressure levels to satellite pressure levels and apply the averaging kernel.
-
GC_to_sat_levels
(GC_SPC, GC_edges, sat_edges, species, chunk_size=10000)¶ Takes as input GEOS-Chem data and pressure level edges, as well as satellite pressure levels, and calculate GEOS-Chem data values on satellite pressure levels
- Parameters
GC_SPC (array) – A NumPy array containing GEOS-Chem columns.
GC_edges (array) – A NumPy array containing GEOS-Chem pressure level edges.
sat_edges (array) – A NumPy array containing TROPOMI pressure level edges
species (str) – Species to be processed.
chunk_size (int) – For CO, the number of observations to be processed at once. This is to save memory for TROPOMI observations with high vertical resolution.
- Returns
A NumPy array containing GEOS-Chem columns remapped to be on the TROPOMI pressure levels.
- Return type
array
-
apply_avker
(sat_avker, sat_pressure_weight, GC_SPC, sat_prior=None, filt=None)¶ Apply the averaging kernel
- Parameters
sat_avker (array) – TROPOMI averaging kernel.
sat_prior (array) – The satellite prior profile in ppb, optional (used for CH4).
sat_pressure_weight (array) – The relative pressure weights for each level
GC_SPC (array) – The GC species on the satellite levels, output by GC_to_sat_levels
filt (array) – A filter, optional
- Returns
A NumPy array containing simulated GEOS-Chem values such that they are directly comparable to TROPOMI (i.e. because the averaging kernel has been applied).
- Return type
array
The remaining three utility functions are very similar. They read TROPOMI level 2 observations from file, but for methane there are a variety of TROPOMI observations with a variety of formattings (operational, science product, Harvard-specific standard). Each function is designed to read a different formatting. Users can select which function they would like to use by specifying the WHICH_TROPOMI_PRODUCT
setting in the TROPOMI_CH4_extension.json
file. DEFAULT
selects the TROPOMI operational product, ACMG
for the ACMG/Harvard TROPOMI product, and BLENDED
for Belasus et al., 2023 which also works for the TROPOMI science product. For CO, users must select the DEFAULT
option.
-
read_tropomi
(filename, species, filterinfo=None, includeObsError=False)¶ Designed for the TROPOMI operational level 2 product. A utility function which loads TROPOMI observations from file, filters them, and returns a dictionary of important data formatted for input into the
gcCompare
method of theTROPOMI_Translator
class. This function is selected when users supply theDEFAULT
value to theWHICH_TROPOMI_PRODUCT
setting in theTROPOMI_CH4_extension.json
file.- Parameters
filename (str) – NetCDF file containing TROPOMI observations to be loaded. Expects standard level 2 data.
species (str) – Name of species to be loaded. Currently, only “CH4” is supported, though an “NO2” operator is partially written.
filterinfo (dict) – A dictionary of information about data filtering which is passed to a standard observation operator utility function. See (6) [optional] Add observation filters via an extension for more information
includeObsError (bool) – True or False, read the errors associated with individual observations.
- Returns
A dictionary containing observation values and metadata, ready for input into the
gcCompare
method of theTROPOMI_Translator
class.- Return type
dict
-
read_tropomi_acmg
(filename, species, filterinfo=None, includeObsError=False)¶
Designed for the Harvard/ACMG version of the TROPOMI operational level 2 product. A utility function which loads TROPOMI observations from file, filters them, and returns a dictionary of important data formatted for input into the
gcCompare
method of theTROPOMI_Translator
class. This function is selected when users supply theACMG
value to theWHICH_TROPOMI_PRODUCT
setting in theTROPOMI_CH4_extension.json
file.- Parameters
filename (str) – NetCDF file containing TROPOMI observations to be loaded. Expects standard level 2 data.
species (str) – Name of species to be loaded. Currently, only “CH4” is supported, though an “NO2” operator is partially written.
filterinfo (dict) – A dictionary of information about data filtering which is passed to a standard observation operator utility function. See (6) [optional] Add observation filters via an extension for more information
includeObsError (bool) – True or False, read the errors associated with individual observations.
- Returns
A dictionary containing observation values and metadata, ready for input into the
gcCompare
method of theTROPOMI_Translator
class.- Return type
dict
-
read_tropomi_gosat_corrected
(filename, species, filterinfo=None, includeObsError=False)¶
Designed for the Belasus et al., 2023 version of the TROPOMI operational level 2 product, but also works for the TROPOMI science product. A utility function which loads TROPOMI observations from file, filters them, and returns a dictionary of important data formatted for input into the
gcCompare
method of theTROPOMI_Translator
class. This function is selected when users supply theBLENDED
value to theWHICH_TROPOMI_PRODUCT
setting in theTROPOMI_CH4_extension.json
file.- Parameters
filename (str) – NetCDF file containing TROPOMI observations to be loaded. Expects standard level 2 data.
species (str) – Name of species to be loaded. Currently, only “CH4” is supported, though an “NO2” operator is partially written.
filterinfo (dict) – A dictionary of information about data filtering which is passed to a standard observation operator utility function. See (6) [optional] Add observation filters via an extension for more information
includeObsError (bool) – True or False, read the errors associated with individual observations.
- Returns
A dictionary containing observation values and metadata, ready for input into the
gcCompare
method of theTROPOMI_Translator
class.- Return type
dict
OMI tools¶
NASA’s Ozone Monitoring Instrument (OMI) onboard the Aura satellite measures criteria air pollutants and other trace gases of interest to CHEEREIO users. Currently, NO2is the only OMI operator written for CHEEREIO, but users can follow the pattern in omi_tools.py
to add support for additional species.
To activate OMI observations, list “OMI” as an observation type in the OBS_TYPE
setting, as described on the Observation settings page.
OMI fields available to save and plot in CHEEREIO¶
All observation operators save standard data from observations and GEOS-Chem and pass it on to CHEEREIO (including latitude, longitude, observation values, GC simulated observation values, and timestamps). However, individual operators might also be able to save additional data, such as albedo in remote sensing cases or site IDs in surface observation cases. Users can list the additional data they would like to save in the EXTRA_OBSDATA_FIELDS_TO_SAVE_TO_BIG_Y
setting in ens_config.json
, following the instructions on the Postprocessing settings page.
Currently, no additional ObsData fields are available to be saved and/or plotted with the OMI operator as written. Additional fields can be added by saving more metadata into the ObsData object with the addData
function within the gcCompare
method of the OMI_Translator
class. See the Observations page for more details.
OMI operator support functions¶
The OMI observation operator calls two utility functions to process observations from file and format them in such a way that the gcCompare
method of the OMI_Translator
class can perform relevant computations. They are documented below:
-
read_omi
(filename, species, filterinfo=None, includeObsError=False)¶ A utility function which loads OMI observations from file, filters them, and returns a dictionary of important data formatted for input into the
gcCompare
method of theOMI_Translator
class.- Parameters
filename (str) – NetCDF file containing OMI observations to be loaded. Expects standard level 2 data.
species (str) – Name of species to be loaded. Currently, only “NO2” is supported.
filterinfo (dict) – A dictionary of information about data filtering which is passed to a standard observation operator utility function. See (6) [optional] Add observation filters via an extension for more information
includeObsError (bool) – True or False, read the errors associated with individual observations.
- Returns
A dictionary containing observation values and metadata, ready for input into the
gcCompare
method of theOMI_Translator
class.- Return type
dict
-
clearEdgesFilterByQAAndFlatten
(met)¶ A utility function takes in partially formatted OMI data and does additional processing, outputting a flattened set of arrays which are compatible with CHEEREIO. In the process, the function removes swath edges and bad retrieval values.
- Parameters
met (dict) – A dictionary with keys naming important observation data and metadata, and values of raw 2D swath data from OMI.
- Returns
A dictionary containing flattened observation values and metadata, with bad data removed, ready for input into the gcCompare` method of the
OMI_Translator
class.- Return type
dict
ObsPack tools¶
ObsPack <https://doi.org/10.5194/essd-6-375-2014>`__is a standardized dataset containing measurements from surface monitors distributed around the world, aimed at carbon cycle studies. CHEEREIO users often use ObsPack data for validation of CO or CH4 inversions, or as observations for the inversion itself. The ObsPack observation operator, contained in the ``obspack_tools.py` file, wraps around the the ObsPack diagnostic produced by GEOS-Chem and translates it into a form acceptable to CHEEREIO.
To activate ObsPack, see the Observation settings page for information on the correct settings for ens_config.json
.
ObsPack fields available to save and plot in CHEEREIO¶
All observation operators save standard data from observations and GEOS-Chem and pass it on to CHEEREIO (including latitude, longitude, observation values, GC simulated observation values, and timestamps). However, individual operators might also be able to save additional data, such as albedo in remote sensing cases or site IDs in surface observation cases. Users can list the additional data they would like to save in the EXTRA_OBSDATA_FIELDS_TO_SAVE_TO_BIG_Y
setting in ens_config.json
, following the instructions on the Postprocessing settings page. Below are a list of the supported fields which users can save in ObsPack:
-
utc_conv
¶
A conversion constant that can change UTC timestamps into local time.
-
altitude
¶
Altitude of ObsPack site.
-
pressure
¶
Pressure observed at ObsPack site.
-
obspack_id
¶
Unique identifier of obspack observation.
-
platform
¶
Obspack platform.
-
site_code
¶
Unique identifier of obspack site.
Users wishing to aggregate ObsPack results by site, either for plotting or other analysis, will want to save site_code
. Follow the instructions on the Postprocessing settings to ensure this is done successfully.
Additional fields can be added by saving more metadata into the ObsData object with the addData
function within the gcCompare
method of the ObsPack_Translator
class. See the Observations page for more details.
ObsPack preprocessing functions¶
CHEEREIO has built in preprocessing functions to translate raw ObsPack data, as downloaded from NOAA, into a form compatible with the GEOS-Chem ObsPack diagnostic. To use this functionality, set preprocess_raw_obspack_files
to true
in ens_config.json
and provide a path to the raw files in the raw_obspack_path
entry. However, some users report that NOAA ObsPack data is not quite standardized. If you run into preprocessing errors, you should set preprocess_raw_obspack_files
to false
and supply an already populated directory of manually preprocessed files. Details for how to do this are provided in the ObsPack diagnostic <https://geos-chem.readthedocs.io/en/stable/gcclassic-user-guide/obspack.html> documentation for GEOS-Chem; feel free to use the code provided in CHEEREIO as a model. See the Observation settings page for more information on ensemble configuration settings for ObsPack.
Descriptions of the ObsPack preprocessing functions are below.
-
make_filter_fxn
(start_date, end_date, lat_bounds=None, lon_bounds=None)¶ Generate a function that will filter raw ObsPack data (i.e. downloaded directly from NOAA) and keep only data within certain date and location bounds. The output filter function also does additional filtering and reformatting regardless of these bounds.
- Parameters
start_date (datetime) – Date of earliest ObsPack data to include
end_date (datetime) – Date of latest ObsPack data to include
lat_bounds (list) – If filtering by latitude, a list of two latitudes representing minimum and maximum latitude to be kept. If None, ignore.
lon_bounds (list) – If filtering by longitude, a list of two longitudes representing minimum and maximum longitudes to be kept. If None, ignore.
- Returns
A filter function for filtering and formatting raw ObsPack data.
- Return type
Function
-
prep_obspack
(raw_obspack_dir, gc_obspack_dir, filename_format, start_date, end_date)¶ This is a preprocessing function, designed to take raw ObsPack files as downloaded from NOAA and process them into files compatible with CHEEREIO and the GEOS-Chem ObsPack diagnostic.
- Parameters
raw_obspack_dir (str) – Directory where raw ObsPack data as downloaded from NOAA is stored. CHEEREIO takes this by default from the
raw_obspack_path
path inens_config.json.
gc_obspack_dir (str) – Directory where processed ObsPack compatible with the GEOS-Chem ObsPack diagnostic will be saved. CHEEREIO takes this by default from the
gc_obspack_path
path inens_config.json.
filename_format (str) – File format which CHEEREIO will use to save the preprocessed ObsPack data. CHEEREIO takes this by default from the
obspack_gc_input_file
entry inens_config.json.
start_date (datetime) – Date of earliest ObsPack data to include. CHEEREIO takes this by default from the
START_DATE
entry inens_config.json.
end_date (datetime) – Date of latest ObsPack data to include. CHEEREIO takes this by default from the
END_DATE
entry inens_config.json.