Format QA/QC Tests
Format QA/QC tests are used in the AmeriFlux BASE QA/QC processing pipeline to assess the compliance of uploaded files with the required FP-In format (a.k.a., Half-Hourly / Hourly Data Upload Format).
|Any problems reading file?||If the uploaded file is malformed, it cannot be read.||Error reading data from the file.|
|Is Filename Format valid?||Checks the uploaded filename against the FP-In filename format.||These filename components are not in the standard AmeriFlux format: optional parameter included (will be removed in autocorrected file)|
|Do filename time components match file time period?||The TIMESTAMP_START value in the first data row must match the ts_start component of the filename. The TIMESTAMP_END value in the last data row must match the ts_end component of the filename.||TIMESTAMP_START 199912312330 does not match filename ts_start 200001010000 time.|
|Any invalid Missing-Value Formats?||Looks for common missing value formats, including -6999, NaN, NA, and empty values. Reports the names of the variables in which invalid missing values are found, with the number of occurrences in parentheses.||Missing values are not indicated with -9999 for these variables (number of timestamps): TA (2); FC (41); TS_1_1_1 (12)|
|Are Timestamp variables as expected?||TIMESTAMP_START and TIMESTAMP_END must be in columns 1 & 2. If they are not, this check reports the variables that are found.||These unexpected variables were found in columns 1 & 2 instead of TIMESTAMP_START and TIMESTAMP_END: YEAR, DAY|
|Are Timestamp variables present?||Looks for TIMESTAMP_START and TIMESTAMP_END in any column. If one or both are missing, this check reports the missing variable.||Expected timestamp variable(s) TIMESTAMP_END is / are missing.|
|Is all Data Missing?||If there is no data present in the file, this test reports an informational message. During Data QA/QC we combine files to create the entire data record. For each timestamp, the most recent value received that has passed Format QA/QC is used. Thus, values in recently uploaded files will overwrite those in previously uploaded files (for the same time period) even if the newer value is a missing value (-9999).||All 20 data variables found in the file have only missing values. Previously uploaded data with the same time period will be overwritten.|
|Any Variables with ALL Data Missing?||Reports variables with all missing values (-9999) as an informational message. During Data QA/QC we combine files to create the entire data record. For each timestamp, the most recent value received that has passed Format QA/QC is used. Thus, values in recently uploaded files will overwrite those in previously uploaded files (for the same time period) even if the newer value is a missing value (-9999).||These variables have all data missing: TA_1; TS_1_1_1. Previously uploaded data with the same time period will be overwritten.|
|Are primary flux Data Variables present?||In most cases, we require at least one of the primary flux variables FC, LE, or H to be present in the uploaded file.||None of the primary flux variables (FC, LE, H) are present.|
|Are any standard AmeriFlux Data Variable names present?||Files are not accepted unless they contain at least a few data variables in the AmeriFlux FP-In format.||No data variables in the standard AmeriFlux format are present.|
|Are Data Variable names in correct format?||Checks compliance of variable names with AmeriFlux FP-In format. See Data Variables: Base names for a list of variable base names. Note that variable names should not be submitted with the “_PI” qualifier.||These variable names are not in standard AmeriFlux format: TSOIL, NRad, FC_PI;. They will not be included in the standard AmeriFlux data products. Non-standard variables will be saved for a non-standard data product that will be available in future.|
|Any Variables suspected gap-fill?||Reports variables that have no missing values as an informational message to confirm that the variables are not gap-filled. If the variables are gap-filled, use the “_F” variable qualifier. While gap-filled versions of variables are accepted, non-filled data must be submitted for primary flux variables (FC, LE, H). Please also consider submitting non-filled data for all other variables.||These variables are suspected to be gap-filled because they have no missing values: NEE, PREC|
|Are quotes found in all variable names?||Detects the use of quotes around variable names. Quotes around variable names or data values are not permitted.||All variable names have quotes. Quotes are not permitted in the standard AmeriFlux format.|
|Are non-filled data present for primary flux, gap-filled Data Variables?||Primary flux variables (FC, FCH4, LE, H) must be submitted without gap-filling. Gap-filled data can be submitted in addition to the non-filled data.||These primary flux variables are marked gap-filled: FC_F_1_1_1, LE_F_1_1_1. Corresponding non-filled data could not be identified and must also be submitted.|
|Any duplicate Variable names?||Duplicate variable names are not allowed. We temporarily rename the variable by adding a “_d#” suffix so that the remaining Format QA/QC tests can be completed to identify other issues. The temporary names may be referenced in other tests.||Duplicate variable names are present and are temporarily renamed as follows for Format QA/QC reporting: 1 duplicate instance of FC is temporarily renamed FC_d1.|
|Are Timestamps in correct format?||AmeriFlux FP-In format must be used for timestamps. A common error is that timestamp values are treated as floats (e.g., YYYYMMDDHHMM.00), whereas they should be integers or text. This issue can be autocorrected.||275 timestamps in TIMESTAMP_START have invalid format (YYYYMMDDHHMM is standard AmeriFlux format).|
|Any Timestamp duplicates?||Reports duplicated timestamp values. We attempt to autocorrect this issue by removing the duplicate’s entire data row. Gap-filling with missing values (-9999) may be done to fill any resulting time gaps.||4 duplicate timestamps found in TIMESTAMP_START.|
|Is Timestamp resolution OK?||Reports inconsistencies in the timestamp resolution between rows of timestamp values, as well as within a row (i.e., between TIMESTAMP_START and TIMESTAMP_END values in the same row). We attempt to autocorrect this issue by removing the entire erroneous data row. Gap-filling with missing values (-9999) may be done to fill any resulting time gaps.||3 timestamps in TIMESTAMP_START have invalid resolution HH within or between rows
3 timestamps in TIMESTAMP_END have invalid resolution HH within or between rows|
|Timestamp problem encountered.||Reports that timestamp tests could not be completed.||These Format QA/QC assessments could not be completed: Do filename time components match file time period? Is Timestamp resolution OK? Any Timestamp duplicates?|
|File Conversion Successful?||Reports an issue while extracting contents of a .zip or .7z file.||File with zip extension does not appear to contain any files.|
|Autocorrections that can be attempted if failed issues are addressed in replacement file.||Reports issues that we can attempt to automatically correct if blocking issues are corrected in a replacement file. See examples.||Changed dat extension to CSV.
Fixed invalid variable name TIMESTAMP_END with TIMESTAMP_END: whitespace removed
Fixed invalid variable name FC_1_1_1_F with FC_F_1_1_1: qualifiers re-ordered|
|Issues that cannot be autocorrected.||Reports issues that were found and cannot be corrected.||Timestamps are in scientific notation and cannot be fixed
File could not be converted/extracted to csv.|
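The three timestamp tests in the table above (format, duplicates, resolution) can be reproduced locally before uploading. The following is a minimal sketch, not the AmeriFlux pipeline's implementation. The function names `parse_timestamp` and `check_timestamps` are hypothetical, a half-hourly (HH) resolution is assumed by default, and a between-row step is treated as consistent when it is a positive multiple of the resolution (so plain time gaps are not flagged); that last rule is an assumption, since the table does not say how gaps are handled.

```python
from datetime import datetime, timedelta

TS_FORMAT = "%Y%m%d%H%M"  # YYYYMMDDHHMM


def parse_timestamp(value):
    """Return a datetime for a 12-digit YYYYMMDDHHMM string, else None."""
    text = str(value).strip()
    if len(text) != 12 or not text.isdigit():
        return None  # e.g. "200001010000.00" or scientific notation
    try:
        return datetime.strptime(text, TS_FORMAT)
    except ValueError:
        return None  # 12 digits, but not a real date/time


def check_timestamps(rows, resolution_minutes=30):
    """Count timestamp problems in (TIMESTAMP_START, TIMESTAMP_END) pairs.

    rows: list of (start, end) strings in file order.
    resolution_minutes: 30 for HH files, 60 for HR files (assumed mapping).
    """
    step = timedelta(minutes=resolution_minutes)
    counts = {"invalid_format": 0, "duplicates": 0, "bad_resolution": 0}
    seen_starts = set()
    previous_start = None
    for start_text, end_text in rows:
        start = parse_timestamp(start_text)
        end = parse_timestamp(end_text)
        if start is None or end is None:
            counts["invalid_format"] += 1
            continue
        if start in seen_starts:
            counts["duplicates"] += 1
            continue
        seen_starts.add(start)
        # Within a row, END minus START must equal the resolution.
        within_ok = (end - start) == step
        # Between rows (assumption): the step must be a positive
        # multiple of the resolution.
        between_ok = True
        if previous_start is not None:
            delta = start - previous_start
            between_ok = delta > timedelta(0) and delta % step == timedelta(0)
        if not (within_ok and between_ok):
            counts["bad_resolution"] += 1
        previous_start = start
    return counts


rows = [
    ("200001010000", "200001010030"),
    ("200001010030", "200001010100"),
    ("200001010030", "200001010100"),        # duplicate TIMESTAMP_START
    ("200001010100.00", "200001010130.00"),  # float-formatted timestamps
    ("200001010130", "200001010230"),        # 60-minute row in an HH file
]
print(check_timestamps(rows))
```

Each of the three problem rows in the example is counted once, under the first problem it hits.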
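The missing-value tests in the table above (invalid missing-value formats, variables with all data missing, and variables suspected of being gap-filled) all come down to counting missing values per variable. The sketch below illustrates that idea under stated assumptions: the function name `summarize_missing` and its input layout (a dict of raw text columns) are hypothetical, and the set of invalid markers is limited to the ones the table names (-6999, NaN, NA, empty).

```python
INVALID_MISSING = {"-6999", "nan", "na", ""}  # markers named in the table


def is_standard_missing(text):
    """True if the value is the standard missing value -9999."""
    try:
        return float(text) == -9999.0
    except ValueError:
        return False


def summarize_missing(columns):
    """Summarize missing values per variable.

    columns: dict mapping variable name -> list of raw string values
             (hypothetical layout; timestamps are excluded).
    Returns (invalid_counts, all_missing, no_missing).
    """
    invalid_counts = {}  # variable -> number of invalid missing markers
    all_missing = []     # variables with only missing values
    no_missing = []      # variables with no missing values at all
    for name, values in columns.items():
        cleaned = [v.strip() for v in values]
        n_invalid = sum(1 for v in cleaned if v.lower() in INVALID_MISSING)
        n_standard = sum(1 for v in cleaned if is_standard_missing(v))
        if n_invalid:
            invalid_counts[name] = n_invalid
        if cleaned and n_invalid + n_standard == len(cleaned):
            all_missing.append(name)
        elif n_invalid + n_standard == 0:
            no_missing.append(name)
    return invalid_counts, all_missing, no_missing


columns = {
    "TA": ["12.1", "NaN", "", "-9999"],
    "FC": ["-9999", "-9999", "-9999", "-9999"],
    "NEE": ["0.1", "0.2", "0.3", "0.4"],
}
invalid_counts, all_missing, no_missing = summarize_missing(columns)
# Format the counts in the same "NAME (count)" style as the example message.
print("; ".join(f"{name} ({n})" for name, n in invalid_counts.items()))
```

In this example `TA` has two invalid markers, `FC` is entirely missing, and `NEE` has no missing values, so it would be the one flagged as possibly gap-filled.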
Data QA/QC Tests
Data QA/QC tests are used in the AmeriFlux BASE QA/QC processing pipeline to assess the quality of data.
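The Format QA/QC test descriptions state that during Data QA/QC, files are combined into a single data record, and that for each timestamp the most recent value that passed Format QA/QC is used, even when that value is -9999. The sketch below shows one way to read that rule. The function name `combine_uploads` and the data layout are hypothetical. It also assumes that overwriting happens per variable, so a variable absent from the newer file keeps its earlier value; the source does not confirm this.

```python
def combine_uploads(uploads):
    """Combine uploads into one record; later uploads win.

    uploads: list ordered oldest to newest. Each upload maps a
             TIMESTAMP_START string to a dict of {variable: value}.
    """
    record = {}
    for upload in uploads:
        for timestamp, values in upload.items():
            # Newer values replace older ones, including -9999.
            record.setdefault(timestamp, {}).update(values)
    return record


older = {"200001010000": {"TA": 12.5, "FC": 1.2}}
newer = {"200001010000": {"TA": -9999}}
combined = combine_uploads([older, newer])
print(combined)
```

Here the newer file's missing `TA` value replaces the earlier 12.5. A re-upload that contains only -9999 for a period therefore erases previously accepted data for that period.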
More information coming soon!