A new era of forest resiliency research
Pushing forward a new frontier of research on forested ecosystems by bringing together data, theory, and diverse scientists and perspectives
Forests are critical ecosystems that cover about 30% of the Earth's land surface. They provide wildlife habitat; filter water and air; and supply building materials, food, fuel, and medicine for humans. Forests also hold the rich histories of Indigenous peoples and profoundly shape the cultures and societies that grow out of and around them. At a time when anthropogenic (human-originating) climate change is driving changes and hazards around the world, forests also help to regulate the global carbon cycle, storing as much as 45% of terrestrial carbon.
However, once-sprawling forests are becoming increasingly threatened. Humans are rapidly removing forests for agricultural land and development at a global scale. At the same time, forests are being impacted by climate-exacerbated disturbances such as wildfires, pests and diseases, and droughts.
A key component of research on forested systems is the study of ecosystem resiliency: the ability of a system to persist in the face of a disturbance. By preserving forests, improving their resiliency, and understanding their mechanisms of change, we can help combat climate change and protect the planet's biodiversity and resources for future generations.
Today, big data and cloud computing are opening doors for a new frontier of forest resiliency research. The question now is how we can capitalize on those changes and ensure that the forest research of the future is inclusive, useful, and groundbreaking. Recently, Earth Lab tried to answer that question in the form of the 2023 Forest Resiliency Data Synthesis Working Group.
The Era of Big Data
For centuries, forest researchers have relied on manual techniques to gather information about forest systems. Scientists wrapped trees with tape measures, painstakingly documented the abundance of insects and animals, and dug pits to examine the composition and layering of forest soils. Popular science books such as Peter Wohlleben’s “The Hidden Life of Trees” demonstrate just how far we have come in our understanding and appreciation of these systems through such methods.
Yet, endless questions remain. For example, we don’t know how forests across large scales are responding to combined disturbances in a rapidly changing climate, or how forest carbon storage trajectories across different ecoregions are affected by wildfires.
The advent of big data gives us a means to answer such questions. Remotely sensed data from satellites, airplanes, and drones now allow us to study large areas of forested land at centimeter-scale resolution, help us create three-dimensional models of forest structure, and provide species-specific identifying information beyond human perception. More data than we can analyze are becoming available every day. At the same time, computational power is increasing and becoming more openly accessible through resources such as CyVerse, a National Science Foundation-funded cyberinfrastructure platform.
In this new research context, the ways in which we choose to integrate perspectives and data will shape a new frontier of research. While forest field data remain essential for assessing the accuracy of remotely sensed data, it is the integration of multiple datasets that now allows researchers to bridge gaps between scales and disciplines. Intentional decision-making in this process provides the opportunity to craft deeper and more complete understandings of complex forested systems.
The Forest Resiliency Data Synthesis Working Group
In February 2023, a group of 32 scientists and forest managers came together at the University of Colorado at Boulder for the Forest Resiliency Data Synthesis Working Group. The working group was a multi-disciplinary, collaborative experiment in training participants in data- and compute-intensive workflows for forest resiliency research while also exploring critical forest-related synthesis questions. Those questions concerned how patterns of resilience or transformation after disturbances vary through time and space in the western United States.
The group was composed of individuals from a range of forest research disciplines, career stages, academic institutions, locations, and industry sectors. The event was supported by 15 Earth Lab staff members and funded by NSF’s Macrosystems program and NSF’s newest data synthesis center, ESIIL (the Environmental Data Science Innovation & Inclusion Lab).
Over the course of four days, participants were introduced to computational workflows for new datasets through educational modules; heard from subject area experts on topics such as hyperspectral remotely sensed data, national observatory data collection, and drone operations; and engaged in discussions about potential new opportunities for collaborative forest research.
Synthesizing a New Frontier
Through collaborative brainstorming and synthesis activities related to linked and compound disturbances, participants at the Forest Resiliency Data Synthesis Working Group identified rich spaces for future research as well as critical data gaps. While the identified topics are just a tiny snapshot of the forest resiliency research frontier, the exercise demonstrates the importance of bringing together a wide range of perspectives in a time of enhanced data availability.
With terabytes of data available to researchers at any moment, the design of research questions is more important than ever before. Multiple perspectives are necessary to identify the most tractable and useful avenues of research within a data-rich space. They are also needed to bring knowledge of available data sources and ensure that varied data sources are used and interpreted correctly.
Data and Computing for All
The new, data- and compute-heavy era of forest research provides an opportunity to champion inclusivity, equity, transparency, reproducibility, and collaboration. In the wider scientific community, this movement is known as “Open Science.”
While Open Science has many proposed definitions and may be best thought of as a spectrum from more to less open, one possible definition is: “the movement to make scientific research (including publications, data, physical samples, and software) and its dissemination accessible to all levels of society, amateur or professional.”
By encouraging open scientific practices, forest resiliency researchers can make more voices heard in discussions of these complex systems, improving the quality of research while working toward a more just society. In order to participate in open forest resiliency research, individuals need three primary tools in addition to a broad, contextual understanding of forest resilience and open science best practices: computational skills (the ability to code), computational power (access to resources and infrastructure), and access to data.
The 2023 Forest Resiliency Data Synthesis Working Group offers a case study of how to provide these tools to participants in a workshop setting, while demonstrating the possibilities of accessible and collaborative research. At the same time, one lesson learned from the event is that participants also need instruction in how to reproduce and use these three tools outside of a working group space, so that they can more easily apply the methods in their own work afterward.
Computational Skills
Development of openly accessible educational resources is an essential pathway toward democratizing access to science. In research environments, this can mean making the effort to ensure that workflows are accessible, understandable, and ultimately reproducible. The forest resiliency working group included four educational data modules in R and Python designed to show participants workflows for processing and analyzing ‘big data.’
Computational Power
Not everyone has the same access to a high-performance computer. For the forest resiliency working group, publicly funded resources in the form of a CyVerse Jetstream allocation were used to create a user-friendly cloud computing environment for participants. Rather than relying on personal computers, each individual was able to log in to a pre-loaded user interface for programming, removing the barrier of setting up specialized coding environments. With only an internet connection, any user could access a high-end computational system.
Access to Data
Without data to analyze, researchers can’t get very far. In the new age of data availability, countless data sources are freely and openly available from governments, nonprofits, and even private companies. A set of useful datasets was pre-processed for easy use during the forest resiliency working group and subsequently loaded onto the prepared cloud environment. As a result, all users had access to the same datasets without needing to acquire them manually.
Don’t Forget the Trees
By leveraging the power of big data, creating accessible spaces for computation and learning, and drawing on established theory from multiple disciplines and perspectives, researchers can gain insights into complex forested systems that were previously unimaginable.
The possibilities are incredible. But whether we’re using field-based data written on paper spreadsheets or integrating terabytes of digital data from multiple remotely sensed sources, it’s also crucial to ensure that this new frontier is grounded in what brought us here: trees and forests themselves. Observations are the basis of questions and hypotheses, and spreadsheets and pixels are no substitute for the experience of walking through a forest and wondering about a shift in the rustling of leaves.
We must strike a balance between experience and data, intuition and analysis. By doing so, we can fully realize the potential of this new frontier.
Are you a forest scientist? Or an aspiring one? If you want to develop skills for a new age of forest research, check out these resources:
- EarthDataScience.org: Earth Lab’s site containing open tutorials and course materials covering topics including data integration, remote sensing, and data-intensive science. Tutorials are primarily in Python, with some in R.
- Geocomputation with R: An online textbook on geographic data analysis, visualization, and modeling in R.
- Remote Sensing with Google Earth Engine: An online textbook for learning to work with Google Earth Engine, a geospatial cloud computing platform that comes with access to a full library of geographic data.