This document describes variable labels and file formatting (FP-In) for uploading continuously sampled data to AmeriFlux and the European Fluxes databases.
Use these instructions to prepare FP-In file(s) containing data that are continuously sampled at half-hourly or hourly intervals* for a certain period of time (e.g., a month, a year). Note: data files must have the same time interval between any two sequential values.
We refer to the general formats described in Data Variables and add instructions specific to uploading for:
- Data processing
- Temporal representativeness and timestamps
- File format and content
- Data Variable: Base names
- Data Variable: Qualifiers
* Contact [email protected] if you need to upload data reported at other intervals.
1. Data Processing
Some data processing is necessary before uploading half-hourly / hourly fluxes and meteorological data to the network. Please follow the guidelines below to ensure generation of derived data products by the network. Provide data processing information in the BADM Instrument Ops template in an INSTOM_COMMENT associated with an INSTOM_VARIABLE_H_V_R entry.
1.1 Data Quality Control
Apply quality control to the variables based on assessment by the tower team. This includes removing bad data points (e.g., those caused by sensor failures or those that fall outside physical thresholds). The only exceptions to this guideline are listed below. Note: QC flags generated by the tower team are not currently supported; support is being developed.
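As an illustration of a physical-threshold check, the sketch below replaces out-of-range values with the missing-value code -9999. The function name `apply_range_check` and the limits used in the example are hypothetical; appropriate thresholds depend on the variable and the site and are for the tower team to decide.

```python
MISSING = -9999  # missing-value code used in uploaded files


def apply_range_check(values, lower, upper):
    """Return a copy of values with out-of-range entries set to MISSING.

    Entries that are already MISSING are kept as MISSING.
    """
    checked = []
    for v in values:
        if v != MISSING and lower <= v <= upper:
            checked.append(v)
        else:
            checked.append(MISSING)
    return checked


# Hypothetical limits for a CO2 mole fraction series, for illustration only.
co2 = [391.1, -50.0, 5000.0, MISSING]
print(apply_range_check(co2, lower=300.0, upper=1000.0))
```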
1.2 USTAR Filtering
Do not apply USTAR filtering to flux variables. The network team will use standardized methods to compute and apply USTAR thresholds for each site. The only exception is for gap-filled data (see below).
Non-gap-filled data (without USTAR filtering [1]) must be provided. In addition, gap-filled versions of the data can be provided in the upload.
Gap-filled data must be identified using the _F variable qualifier (see Data Variable: Qualifiers). Please also provide documentation describing the gap-filling method in the BADM Instrument Ops template.
[1] If applicable to the variable being gap-filled.
2. Temporal representativeness and timestamps
Follow the general instructions described in Data Variables, with the following reminders and specific upload requirements:
- A data file must contain the same time interval throughout the file. Make separate files to upload data reported at different time intervals.
- Use TIMESTAMP_START and TIMESTAMP_END with the YYYYMMDDHHMM format.
  Sample half-hourly data file:
  TIMESTAMP_START,TIMESTAMP_END,CO2,...
  201507281700,201507281730,391.1,...
  201507281730,201507281800,391.8,...
  ...
  Sample hourly data file:
  TIMESTAMP_START,TIMESTAMP_END,CO2,...
  201507281700,201507281800,391.1,...
  201507281800,201507281900,391.8,...
  ...
- Always put TIMESTAMP_START and TIMESTAMP_END as the first two columns.
- Use local standard time without Daylight Saving Time. Specify the time zone using the Site General Information BADM [2] for the site.
- Include data for all days in a leap year.
- Report missing data using -9999 as the replacement value [3].
[2] Biological, Ancillary, Disturbance, and Metadata (see BADM Templates).
[3] Other values such as -6999, N/A, or NaN are not acceptable as an indication of a missing value for any reason.
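The timestamp and missing-value requirements above can be sketched as follows. This is a minimal illustration, not a reference implementation: the helper names `timestamp_rows` and `encode_missing` are hypothetical. Naive `datetime` objects carry no time zone, so no Daylight Saving Time shifts are applied, and leap days are produced automatically.

```python
import math
from datetime import datetime, timedelta

FMT = "%Y%m%d%H%M"  # YYYYMMDDHHMM
MISSING = -9999


def timestamp_rows(first_start, n_rows, minutes=30):
    """Return (TIMESTAMP_START, TIMESTAMP_END) string pairs for consecutive intervals."""
    step = timedelta(minutes=minutes)
    rows = []
    for i in range(n_rows):
        start = first_start + i * step
        rows.append((start.strftime(FMT), (start + step).strftime(FMT)))
    return rows


def encode_missing(value):
    """Map None or NaN to -9999; return any other value unchanged."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return MISSING
    return value


for start, end in timestamp_rows(datetime(2015, 7, 28, 17, 0), n_rows=2):
    print(start, end)
print(encode_missing(float("nan")))
```

Passing `minutes=60` gives the hourly layout.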
3. File format and content
3.1 File structure
Format each file to be uploaded as an ASCII [4] text file using a CSV (comma-separated values) format, i.e., a tabular text format using a comma character to separate values.
Start each submitted file with a row of variable names. No variable name should contain blank spaces. Do not use surrounding quotes. Do not include additional header rows or a row of variable units (see section 4.3 Units).
Use a point as the numeric decimal separator (not a comma); this avoids conflict with the commas used for the CSV format.
TIMESTAMP_START,TIMESTAMP_END,CO2,H2O,FC,...
200210070600,200210070630,375.0026343,13.81902137,2.225711711,...
200210070630,200210070700,375.6178651,13.81904135,1.611090395,...
200210070700,200210070730,375.1484745,13.77998531,1.11762877,...
200210070730,200210070800,374.0334503,13.73454349,0.236125726,...
...
[4] Note that using a UTF-8 encoding and using only the variable labels defined in this document and numeric values will automatically result in an ASCII file, and thus will be compatible.
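One way to produce a file with this structure is Python's standard `csv` module, sketched below. The function name `write_fp_in` is hypothetical. `csv.QUOTE_NONE` prevents surrounding quotes, and Python formats floats with a point as the decimal separator regardless of locale.

```python
import csv
import io


def write_fp_in(header, rows, stream):
    """Write one header row of variable names followed by data rows, unquoted."""
    for name in header:
        if " " in name:
            raise ValueError(f"variable name contains a blank space: {name!r}")
    writer = csv.writer(stream, quoting=csv.QUOTE_NONE, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


header = ["TIMESTAMP_START", "TIMESTAMP_END", "CO2", "H2O"]
rows = [
    ["200210070600", "200210070630", 375.0026343, 13.81902137],
    ["200210070630", "200210070700", 375.6178651, -9999],
]
buffer = io.StringIO()
write_fp_in(header, rows, buffer)
buffer.getvalue().encode("ascii")  # raises UnicodeEncodeError if not pure ASCII
print(buffer.getvalue(), end="")
```

To write to disk instead of an in-memory buffer, pass a file opened with `open(path, "w", newline="", encoding="ascii")`.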
3.2 File naming
Format the filename for file uploads as follows (note: csv file extension):
<SITE_ID>_<RESOLUTION>_<TS-START>_<TS-END>_<OPTIONAL>.csv
<SITE_ID>: Use the AmeriFlux / Fluxnet Site ID in the form CC-AAA. CC is the country code (e.g., US, CA, etc). AAA is the three alphanumeric characters associated with the site. The site ID is determined as part of the site registration process.
<RESOLUTION>: The time interval used throughout the file. Allowed resolutions are HH (for half-hourly) or HR (for hourly). If you need to upload data at a resolution other than half-hourly or hourly, please contact us at [email protected].
<TS-START>: The timestamp for the file’s earliest data in format of YYYYMMDDHHMM. It is the same as the first entry in the TIMESTAMP_START column.
<TS-END>: The timestamp of the last data entry in format of YYYYMMDDHHMM. It is the same as the last entry of the TIMESTAMP_END column.
<OPTIONAL>: A parameter to indicate additional information. This parameter will be removed from the filename upon processing.
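The filename components can be assembled and checked as in the sketch below. It assumes the components are joined with underscores in the order listed, and the site ID `US-XYZ` is a made-up placeholder. The function name `build_filename` and the regular expressions are illustrative, not part of any official tool.

```python
import re

SITE_ID_RE = re.compile(r"^[A-Z]{2}-[A-Za-z0-9]{3}$")  # CC-AAA
TIMESTAMP_RE = re.compile(r"^\d{12}$")                 # YYYYMMDDHHMM
RESOLUTIONS = {"HH", "HR"}                             # half-hourly, hourly


def build_filename(site_id, resolution, ts_start, ts_end, optional=None):
    """Join the filename components with underscores (assumed separator)."""
    if not SITE_ID_RE.match(site_id):
        raise ValueError(f"site ID is not in the form CC-AAA: {site_id!r}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be HH or HR: {resolution!r}")
    for ts in (ts_start, ts_end):
        if not TIMESTAMP_RE.match(ts):
            raise ValueError(f"timestamp is not YYYYMMDDHHMM: {ts!r}")
    parts = [site_id, resolution, ts_start, ts_end]
    if optional:
        parts.append(optional)
    return "_".join(parts) + ".csv"


print(build_filename("US-XYZ", "HH", "201507281700", "201507281800"))
```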
3.3 File contents
Timestamps must be continuous. A file may start or end at any time (e.g., mid-year), but it must not have missing timestamps in the middle of the file. For example, if all variables are missing for an entire week, set the data variables to -9999 for that week and keep valid values in the timestamp columns.
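A minimal sketch of filling timestamp gaps is shown below. It assumes each row is a list of the form [TIMESTAMP_START, TIMESTAMP_END, value, ...], that rows are sorted by time, and that the interval is constant. The function name `fill_timestamp_gaps` is hypothetical.

```python
from datetime import datetime, timedelta

FMT = "%Y%m%d%H%M"  # YYYYMMDDHHMM


def fill_timestamp_gaps(rows, n_vars, minutes=30, missing=-9999):
    """Insert rows of missing values wherever a timestamp is absent."""
    step = timedelta(minutes=minutes)
    by_start = {row[0]: row for row in rows}
    current = datetime.strptime(rows[0][0], FMT)
    last = datetime.strptime(rows[-1][0], FMT)
    filled = []
    while current <= last:
        key = current.strftime(FMT)
        if key in by_start:
            filled.append(list(by_start[key]))
        else:
            # Valid timestamps, missing-value code for every data variable.
            end = (current + step).strftime(FMT)
            filled.append([key, end] + [missing] * n_vars)
        current += step
    return filled


rows = [
    ["201507281700", "201507281730", 391.1],
    ["201507281800", "201507281830", 391.8],  # 17:30 to 18:00 is absent
]
for row in fill_timestamp_gaps(rows, n_vars=1):
    print(row)
```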
Data of different time periods can be uploaded using separate files (e.g., months, years). AmeriFlux suggests that data files typically contain at least 3 months or an entire year of data, as well as all variables measured at the site.
If an incomplete data record for the site is uploaded, newly uploaded data may be merged with previous data, if needed, to create a complete site record. For example:
- If the entire data time period previously submitted is re-uploaded, the newly uploaded data will be processed and will replace the previously submitted and processed data for that entire time period.
- If the uploaded data contains new data or has only partial overlap with the existing data, the newly uploaded data will be added to the previously submitted and processed data. New data will replace existing data in the case of overlap.
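The merge behavior described above can be pictured with the toy sketch below. It only illustrates the stated rule (new data are added, and new data replace existing data where they overlap) and is not the network's actual processing code. Records are keyed by TIMESTAMP_START, and the function name `merge_records` is hypothetical.

```python
def merge_records(existing, uploaded):
    """Combine two records keyed by TIMESTAMP_START; uploaded values win on overlap."""
    merged = dict(existing)
    merged.update(uploaded)
    # Fixed-width YYYYMMDDHHMM strings sort chronologically.
    return dict(sorted(merged.items()))


existing = {"201507281700": 391.1, "201507281730": 391.8}
uploaded = {"201507281730": 392.0, "201507281800": 392.5}
print(merge_records(existing, uploaded))
```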
If you are sending only a portion of your site’s data and do not want the uploaded data merged with previously submitted and processed data, please contact [email protected].
4. Data Variable: Base names
Variables indicate fundamental quantities that are either measured or calculated / derived. They can also indicate quantified quality information.
4.1 Standard Variable Base names
Use the Data Variable: Base names specifications as described in the Data Variables documentation, with the following exception:
- The only accepted Timekeeping variables are TIMESTAMP_START and TIMESTAMP_END.
4.2 New Variable Base names
Include measurements not currently described in the standardized list of variables (Data Variable: Base names) in the same file as the standard variables. The non-standard variables will be saved for future publication.
Please also contact us at [email protected] so that we can start the process of agreeing on variable base names and units so they can be included in the standardized list.
4.3 Units
Convert data to the units described in the Data Variable: Base names description before uploading. The network will not do conversions. Notably:
- Please use percentages for ALB, RH, SWC, LEAF_WET and for most variables that could be reported in fractions or percentages.
- Please double-check the units of H2O_SIGMA, CO2_SIGMA, CH4, FCH4, VPD, D_SNOW, WTD.
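For variables recorded as fractions, the conversion to percentages can be sketched as below. The function name `fraction_to_percent` is hypothetical. The one detail worth noting is that the missing-value code -9999 must pass through unscaled.

```python
MISSING = -9999


def fraction_to_percent(values):
    """Scale fractions (0 to 1) to percentages, leaving -9999 untouched."""
    return [v if v == MISSING else v * 100.0 for v in values]


# Example: relative humidity logged as a fraction.
rh_fraction = [0.25, 0.5, MISSING]
print(fraction_to_percent(rh_fraction))
```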
4.4 Sign conventions
Please follow the sign conventions specified in the Data Variable: Base names description. Notably:
- For all turbulent flux variables (e.g., gas, heat, momentum flux), a negative value indicates net flux of matter, energy, or momentum from the atmosphere to the ecosystem (flora and fauna). The same convention applies to NEE. GPP and RECO are always positive values, where NEE = RECO – GPP.
- For gas and heat storage fluxes, a positive value indicates net increase of matter or energy storage within the ecosystem.
- For NETRAD, a positive value indicates net energy input from the atmosphere to the ecosystem.
- For RUNOFF, a positive value indicates a net outflow from the ecosystem.
- For G (soil heat flux), a positive value indicates net energy input from the ecosystem to the deep soil.
- For WTD (water table depth), a positive value indicates a water level above the ground surface, and a negative value indicates a water level below it.
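The relationship NEE = RECO – GPP and its sign convention can be checked with a short sketch. The function name `nee_from_partition` and the numeric values are illustrative only.

```python
MISSING = -9999


def nee_from_partition(reco, gpp):
    """Return NEE = RECO - GPP, or -9999 if either input is missing."""
    if reco == MISSING or gpp == MISSING:
        return MISSING
    if reco < 0 or gpp < 0:
        raise ValueError("RECO and GPP are expected to be positive")
    return reco - gpp


# Uptake (GPP) exceeds respiration (RECO): NEE is negative,
# i.e., net flux from the atmosphere to the ecosystem.
print(nee_from_partition(reco=2.0, gpp=5.0))
```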
5. Data Variable: Qualifiers
Qualifiers are suffixes appended to variable base names that provide additional information about the variable. Use the same data variable qualifier specifications as described in the Data Variables, with the following modifications:
- The only general qualifier accepted in uploaded data is _F (Gap-filled). Describe the gap-filling method applied in the BADM Instrument Ops template (see Section 1. Data Processing).
- Positional qualifiers are accepted as described in the Data Variables description.
- Aggregation qualifiers are accepted as described in the Data Variables description.
- Ordering of qualifiers follows the specification described in the Data Variables description.
Positional qualifiers (e.g., _H_V_R) are generally expected to remain the same at a site. When adding sensors, do not renumber the positional qualifiers without discussing with the AmeriFlux Data Team ([email protected]).