Data shared through the AmeriFlux Network, and globally through FLUXNET, offer a unique tool for answering scientific questions across different spatial and temporal scales. This is only possible through the collective effort of individuals making measurements at their own sites and taking the extra effort to share the data. Enabling answers to these kinds of questions is motivation enough for the existence of the flux networks themselves. However, as an individual site investigator, what are the practical benefits of joining AmeriFlux and sharing data openly with the community? We explore a few of these benefits, and argue that the anticipated downsides might not actually matter.
Why join? Benefits
Sharing scientific data openly has many benefits. We focus on the aspects that are relevant to AmeriFlux, and other regional flux communities in general.
- Data are used (a lot!). It’s no good to put in the hard work of preparing data for sharing if nobody then uses them. Data shared through AmeriFlux and FLUXNET get used! These are among the most used data sets currently available. You can find over 4,000 mentions of AmeriFlux in publications on Google Scholar, and around 13,000 for FLUXNET. This metric is far from ideal for measuring the usage or impact of a data set, but it gives you the general idea.
- Increased visibility. Your research and the research developed at your sites get increased visibility. Work and publications involving your sites and your results can be listed in the shared community archive, but, more importantly, people trying to understand your site will look at (and cite) these publications and results.
- Credit to data work. The work done in data collection still has a limited impact on scientific careers, even though it is highly technical, time consuming, and a high-value contribution to research. Joining AmeriFlux to share your data allows credit to be assigned to your work collecting and processing data. With new mechanisms (like DOIs for individual AmeriFlux sites’ data sets) being adopted by AmeriFlux, this type of scientific work can be incorporated into metrics for career evaluations.
- Reduce “data mishaps.” Sharing your site data through a network that uses standards both for data formatting and processing will help prevent both data misuse and data being scooped, two of the major problems with data sharing. By using standard methods for multiple sites, the data sets become well defined in their scope and limitations, which helps prevent both accidental and malicious misuse. It’s also harder to use a data set and not give proper credit if the data set is publicly available and known to the community.
- New collaborations. When people from the community can look at and play with your data, new ideas inevitably emerge, and possibilities for collaborations soon follow. Combining independent sites into gradients or contrasting new methods across different ecosystems are only the simplest examples of this effect.
- Data management requirements. Since your data are deposited and managed at the network level, demonstrating that your work meets data management requirements from funding agencies becomes straightforward. It will even help with preparing new data management plans for proposals, since you’ll inevitably become aware of how the network carries out data management services.
- Community integration. When applying for funding, it’s much easier to demonstrate that you’re a member of a larger community: the data that you’re generating are not only relevant to your own research, but also to the scientific community at large. This strengthens the proposal, and makes funding agencies happy with the larger impact of the scientific work and efficient use of resources.
Access to AMP services (data and tech)
- Data archives. This primary service offers archives for both flux data and high-frequency data, including tools for automated uploads. Besides all the other benefits of having these data deposited with the network, the archive also serves as a good off-site backup mechanism.
- Data collection QA/QC. Being part of the AmeriFlux network gives you access to technical support and QA/QC resources, such as loans of CO2 standards and calibrated PAR sensors. The AMP Tech team also makes site visits during the summer field season for sites that request visits.
- Data QA/QC. After data files are submitted to the AmeriFlux website, the AMP Data team evaluates data sets systematically, making sure data are consistent with network standards. They work closely with tower teams to resolve any data issues identified.
- Improved data collection. The combined QA/QC work of the Tech and Data teams helps improve measurements and data collection methods at a site. This increases data quality and helps prevent future data collection issues.
- Standard data products. The standard data post-processing pipeline generates uniform and comparable data products across different sites. This helps with synthesis type studies, but also provides a baseline against which new data processing methods and implementations can be tested independently at each site. These data products are standard across AmeriFlux and ICOS in Europe, and are also being used for FLUXNET data releases.
- [SOON] Data set DOI. Assigning DOIs to site data sets enables data users to cite data sets directly in publications. This will help with compiling metrics for the impact of a particular data set or tower site in publications, as is being done by some initiatives already.
The cherry on top: AmeriFlux and FLUXNET are awesome communities! Participating in the community and contributing your data will not only be good professionally, but also a lot of fun!
Why not join? Risks
There are also some common arguments for not joining networks (or waiting to join a network in the future rather than now). We cover a few of these counterpoints, to show why we think they do not apply to AmeriFlux.
Data getting scooped
No one will deny: there is a risk that someone will use your data and not give you credit or get to a finding before you do. However, this risk is likely very low. Crime does not pay in Science: scientific communities are tightly connected, and uncredited use of a data set is now very easy to spot, and leaves a bad mark on a scientist’s career.
The other aspect is someone else getting to a finding before you’re able to. However, no one really knows your data and your site as well as you do, so it’s unlikely someone will discover something specific about your site before you. In studies with multiple sites, this would be more likely to happen, but the network also increases your own chances of making such findings, since you get access to data from others.
Finally, sharing your data within a network actually helps prevent these two types of downsides (see science benefits above). You not only get credit and recognition for your data collection and management work, but this work is actually documented with the network.
Data being misused/misinterpreted
This is one of the stronger and more legitimate arguments against sharing data. However, as was the case with “data scooping,” joining a network actually helps prevent data misuse and misinterpretation by using standard formats, standard data products, and well-documented and widely adopted methods.
Too costly to prepare data for sharing
It’s definitely true that it’s expensive to prepare data to be shared. However, the experience of most, if not all, more senior scientists who have shared data openly and widely in their careers is that what you get in return makes up for the cost many times over. The perks range from being invited to collaborate and publish together on work using your data, to being recognized in the field because other people use your data. Another benefit of putting in the effort is having good documentation of your data collection and processing. This is crucial for maintaining data quality when multiple people handle these tasks, especially students and post-docs who are not part of the team for the whole duration of the tower's operation.
We hope these arguments show that being an active part of a network and sharing data openly with the community benefits individual site teams, and not only the networks and their data users.
What are your thoughts on this? Did we miss reasons why someone should (or should not) join AmeriFlux (or any other research network)? Post your comments, and we will discuss and respond as new opinions and points are raised. Or contact either of us directly.
by Gilberto Pastorello, AmeriFlux Data Team
The AmeriFlux Data Team at Twitchell Island. Patty Oikawa (in white) is showing how they take measurements at this AmeriFlux site. This experience was so that the Data Team could better understand methodology and challenges in the field. Gilberto (far right) is holding the 50-meter measuring tape. From right: Cristina Poindexter, Rachel Hollowgrass, Olaf Menzer, Patty Oikawa, Gilberto Pastorello.