Data QA/QC

Data QA/QC assessment identifies potential data quality issues. It is a secondary quality assessment that complements the primary check performed by the site team. Data QA/QC follows the methodology adopted for processing the FLUXNET2015 dataset (Pastorello et al., 2014; 2020) and includes additional checks based on data user feedback. Read a general overview of the Processing Workflow and Data QA/QC below and in the following paper.

Chu, H., Christianson, D.S., Cheah, YW. et al. AmeriFlux BASE data pipeline to support network growth and data sharing. Sci Data 10, 614 (2023). https://doi.org/10.1038/s41597-023-02531-2

Processing Workflow

After passing Format QA/QC, uploaded files are combined with any previously published BASE file (1 in figure below). The automated Data QA/QC code generates statistics and figures that AMP reviews (2). If the data do not pass Data QA/QC, AMP notifies the site team of any corrections needed (3). Otherwise, the data are queued for BASE Generation and bundled with BADM for publication as the BASE-BADM data product (4).


AMP does not perform any data filtering or correction during Data QA/QC processing. Identified issues must be addressed through resubmission by the site team.

QA/QC Test Modules

Data QA/QC assesses units and sign conventions, timestamp alignments, trends, step changes, outliers based on site-specific historical ranges, multivariate comparisons, diurnal/seasonal patterns, USTAR (i.e., friction velocity) filtering, and variable availability. Read more details for the test modules:
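As a rough illustration of one of these checks, the sketch below flags outliers against a site-specific historical range. It is a minimal example under stated assumptions, not the AMP implementation: the function names, the 1st/99th percentile bounds, and the optional margin are hypothetical choices made for illustration, and -9999 is treated as the missing-value code.

```python
from statistics import quantiles

MISSING = -9999.0  # missing-value code assumed for this sketch


def historical_range(history, lower_pct=1, upper_pct=99):
    """Return (low, high) percentile bounds from previously published values.

    The percentile choices are illustrative assumptions, not AMP settings.
    """
    valid = sorted(v for v in history if v != MISSING)
    # quantiles(..., n=100) returns 99 cut points; index p-1 is the p-th percentile.
    cuts = quantiles(valid, n=100, method="inclusive")
    return cuts[lower_pct - 1], cuts[upper_pct - 1]


def flag_outliers(new_values, low, high, margin=0.0):
    """Return one boolean per value: True if it falls outside the widened range.

    `margin` widens the range by a fraction of its span (hypothetical parameter).
    Missing values are never flagged.
    """
    span = high - low
    lo_bound = low - margin * span
    hi_bound = high + margin * span
    return [v != MISSING and not (lo_bound <= v <= hi_bound) for v in new_values]


# Usage: derive bounds from historical data, then screen a new submission.
history = [float(i) for i in range(101)] + [MISSING]
low, high = historical_range(history)
flags = flag_outliers([50.0, 150.0, MISSING, -5.0], low, high)
```

In this sketch a flagged value is only marked for review. Nothing is filtered or corrected, which matches the role of Data QA/QC as an assessment whose findings are resolved by the site team.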

Report Components

During Data QA/QC, AMP synthesizes the identified issues into a concise and actionable report. The Data QA/QC Summary briefly explains the Data QA/QC results. Data Issues and Explanatory Figures detail the identified issues and potential solutions. All generated figures and the Format QA/QC Report associated with the data are provided in Additional Links. AMP emails the Data QA/QC Report to the site team for clarification or correction.

References

Pastorello, G., et al. (2014), Observational data patterns for time series data quality assessment, paper presented at the 2014 IEEE 10th International Conference on e-Science, Sao Paulo, DOI:10.1109/eScience.2014.45

Pastorello, G., et al. (2020), The FLUXNET2015 dataset and the ONEFlux processing pipeline for eddy covariance data, Scientific Data, 7(1), 225, DOI:10.1038/s41597-020-0534-3