Capstone Project

The Soil Moisture Project

By Kendall Rosenberg

The Soil Moisture Project: What did I do?

In this study, I worked with Imtiaz Rangwala, a water and climate research scientist with the University of Colorado, Boulder, and Gabriel Senay, a research physical scientist from the USGS, to turn raw soil moisture data into a usable format for study, and to standardize this data across any desired timescale. The main focus of this project was to create a tool that can pull in this soil moisture data for analysis and standardize it for use in future research.

With this data broken down, standardized, and ready to go, the next step is a regression analysis to compare trends in soil moisture with trends in drought index models like LERI! Being better able to model drought means being better able to predict it, and with better prediction comes better preparation. Better preparation for drought might help save farmers from decreased agricultural yield, it might save people from starvation and disease, and it could help prevent property damage via wildfire.

Project Goals:

The first goal for this project was to develop a Jupyter Notebook workflow using Python to extract soil moisture data and store it as a CSV file. Second, we wanted to be able to take this CSV file and summarize the soil moisture data it contains at any timescale for use in analysis. Finally, we wanted to develop a workflow that can standardize the soil moisture data at different depths for different timescales. With this new data in hand, we hope to be able to compare soil moisture values to drought index models to see how they compare!

Introduction: Why should we care?

Soil moisture has a few different definitions depending on the context, but generally it is defined as the amount of water held in the spaces between soil particles. Because so much of life on earth is controlled by the accessibility and quality of water, being able to accurately manage all aspects of water is important in many areas. Soil moisture measurements are important to farmers, water resource managers, and weather forecasters, and can be useful in climate research going forward.

What work has been done on this topic already?

Soil moisture measurements have very recently become a hot topic in the scientific world. In November 2013, the National Integrated Drought Resilience Partnership hosted a workshop in Kansas City, Missouri to begin to bring about a national network to measure soil moisture, and to incorporate that network into research and development going forward. Experts from many relevant organizations (SCAN, NOAA, NASA, USDA) gathered at this workshop and began the development of the National Soil Moisture Network (NSMN), which is the network we are using in this study. Many of these experts met again in 2016 in Boulder, Colorado at NOAA's Earth System Research Laboratory to further discuss how to better communicate and coordinate this new soil moisture network on the federal, state, and private sector levels (SOURCE).

Currently, in-situ soil moisture measurements are not well incorporated into the scientific world, because they are a relatively new measurement (the record only goes back 20 years), measuring stations are sparse, and the stations are expensive to maintain. As it stands, the soil moisture values used in many models of drought, water management, etc., come from remote sensing, which uses microwaves to estimate soil moisture based on the emission response of the land surface (SOURCE). While these methods may be sound and provide good spatial resolution of soil moisture values, they cannot penetrate very deep into the ground (only about 10cm at most). In-situ measurements allow us to examine depths of up to 100cm, which can be very useful for many reasons. Very shallow (2cm) measurements can help calibrate/verify satellite measurements of soil moisture, 5cm and 10cm could give insight into available water at the root zone (for plant use; agricultural use), 20cm and 50cm could give insight into local streamflow characteristics, and finally, 100cm depth measurements could be useful as an indication of drought intensity (SOURCE).

Study area and stations used for analysis

I began by going to the National Soil Moisture Network (NSMN) site and examining stations around the US that were used by the Soil Climate Analysis Network (SCAN) to measure soil moisture at depths of 5cm, 10cm, 20cm, 50cm, and 100cm. I selected 9 sites that have more than 15 years of historical soil moisture (SM) data and that did not have big gaps in measurements.

While these are the sites I initially chose for analysis, this workflow will work for any SCAN station data CSV pulled from the SCAN website. The main function of this tool is to be able to look at any SCAN station's data as needed.

Station locations:

  1. Bushland, Texas

  2. Nunn, Colorado

  3. Ft. Assiniboine, Montana

  4. Mandan, North Dakota

  5. Lind, Washington

  6. Beasley Lake, Mississippi

  7. Eastview Farm, Tennessee

  8. Mammoth Cave, Kentucky

  9. Abrams, Kansas


site map of united states
Figure 1: Map showing the location of the stations I pulled soil moisture data from. As you can see, I tried to pick stations that cover a fairly wide variety of landscapes, as that will hopefully create new avenues of exploration using the data and findings of this study. SOURCE: National Soil Moisture Network SCAN stations


With these 9 sites in hand, I began by creating a Python script that holds all of the imported soil moisture data. I could then call this data within my final notebook to work with! With these initial soil moisture dataframes, I began to build a notebook that would be able to take these dataframes and split them down into any timescale the user wished to look at. From there I made sections that would perform a z-score calculation on the dataframes and plot the output (example plot below). With this output the user would be able to examine deviations in soil moisture from the mean on any timescale for any of the 9 sites.

Timescales examined and standardized in the final notebook tool:

  1. Annual mean

  2. Month of Year mean (seasonal)

  3. Monthly mean

  4. Daily mean

  5. Decad (10-day section) mean

  6. Pentad (5-day section) mean
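One way to think about the timescales listed above is as different grouping keys applied to the same record of readings. The following is a minimal sketch of that idea using only the Python standard library; it is not the notebook's actual code. The helper names `period_key` and `group_means` are hypothetical, and counting decads and pentads in fixed blocks from January 1 is an assumption made for illustration.

```python
import math
from collections import defaultdict
from datetime import date
from statistics import mean


def period_key(day, timescale):
    """Return the grouping key for one calendar day (hypothetical helper)."""
    doy = day.timetuple().tm_yday  # day of year, 1..366
    if timescale == "annual":
        return (day.year,)
    if timescale == "month_of_year":
        return (day.month,)  # pools, e.g., every July on record
    if timescale == "monthly":
        return (day.year, day.month)
    if timescale == "daily":
        return (day.year, day.month, day.day)
    if timescale == "decad":
        return (day.year, (doy - 1) // 10)  # assumed: 10-day blocks from Jan 1
    if timescale == "pentad":
        return (day.year, (doy - 1) // 5)  # assumed: 5-day blocks from Jan 1
    raise ValueError(f"unknown timescale: {timescale}")


def group_means(records, timescale):
    """Average (date, value) pairs within each period, skipping NaN values."""
    groups = defaultdict(list)
    for day, value in records:
        if not math.isnan(value):
            groups[period_key(day, timescale)].append(value)
    return {key: mean(values) for key, values in sorted(groups.items())}


# Made-up readings for illustration only (not real SCAN data).
records = [
    (date(2010, 1, 1), 20.0),
    (date(2010, 1, 2), 22.0),
    (date(2010, 1, 7), math.nan),  # missing reading
    (date(2010, 2, 1), 30.0),
    (date(2011, 1, 1), 10.0),
]
monthly = group_means(records, "monthly")
month_of_year = group_means(records, "month_of_year")
```

Note the difference between the two month-based keys: "monthly" keeps January 2010 and January 2011 separate, while "month_of_year" pools every January on record into one seasonal mean.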

With this tool I have developed, a user can examine and use any SCAN station soil moisture data from the NSMN. They can examine the raw data on any timescale very easily, and can use the standardized data for future research.

NaN Analysis:

Many of the stations contain sporadic missing data values, and many have large gaps where no data was collected, sometimes a couple of months out of a year. In order to accurately calculate mean values, we had to do some NaN cleaning of the datasets before calculating the mean. The idea is simple: if a section of time we were averaging (based on timescale) didn't contain enough data, we excluded it from the mean calculation. For annual data, each year had to contain at least 275 days of data to be included in the mean calculation. For monthly data, each month had to contain at least 15 days. For the sub-monthly sections, each had to contain at least 5 days (decad: 10-day averaged sections) or at least 3 days (pentad: 5-day averaged sections). This step was very important both for accurate means and for the z-score calculation below.
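The coverage rule described above can be sketched as follows. This is a minimal illustration rather than the notebook's code: the thresholds are the ones stated in the text, while the names `MIN_DAYS` and `coverage_filtered_mean` and the one-value-per-day input layout are assumptions.

```python
import math
from statistics import mean

# Minimum number of days with data required per period, as stated in the text.
MIN_DAYS = {"annual": 275, "monthly": 15, "decad": 5, "pentad": 3}


def coverage_filtered_mean(daily_values, timescale):
    """Mean of one period's daily values, or NaN if coverage is too low.

    `daily_values` holds one entry per day in the period, with NaN for
    days that have no reading (assumed layout).
    """
    valid = [v for v in daily_values if not math.isnan(v)]
    if len(valid) < MIN_DAYS[timescale]:
        return math.nan  # period is excluded from later calculations
    return mean(valid)


# Made-up example: a 30-day month with only 10 valid days is dropped,
# while one with 20 valid days is kept.
sparse_month = [25.0] * 10 + [math.nan] * 20
fuller_month = [25.0] * 10 + [27.0] * 10 + [math.nan] * 10
```

Returning NaN for an under-sampled period, instead of a mean of whatever values exist, keeps a few stray readings from standing in for a whole year or month.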

Z-Score calculation:

This workflow uses the SciPy stats z-score function (scipy.stats.zscore) to calculate a z-score value for each point in a timescale. This means that each value in a dataset is compared to all the rest of the values, and the z-score shows how many standard deviations from the mean that value is (in other words, how far that value is from the overall average). In this case, a more positive z-score indicates wetter than normal soil, and a more negative z-score indicates drier than normal soil conditions.
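For reference, the z-score of a value x is (x - mean) / standard deviation. The sketch below computes it with the Python standard library so the arithmetic is visible; it is an illustration, not the notebook's code. It uses the population standard deviation, which matches the default (ddof=0) of scipy.stats.zscore; whether the notebook keeps that default, and how it treats excluded periods, are assumptions here. The function name `zscores` is hypothetical.

```python
import math
from statistics import mean, pstdev


def zscores(values):
    """Standardize values, leaving NaN entries (excluded periods) as NaN."""
    valid = [v for v in values if not math.isnan(v)]
    mu = mean(valid)
    sigma = pstdev(valid)  # population standard deviation (ddof=0)
    return [math.nan if math.isnan(v) else (v - mu) / sigma for v in values]


# Made-up annual means for one depth; NaN marks a year without enough data.
annual_means = [10.0, 20.0, math.nan, 30.0]
z = zscores(annual_means)
```

In this made-up series the first year comes out negative (drier than normal), the second sits at zero (exactly average), and the last comes out positive (wetter than normal), while the excluded year stays blank.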

Example outputs of this tool:

standard annual soil mean moisture charts

Figure 2: Figure showing z-score values for Nunn Station, Colorado, for each year on record. Blue (positive) bars indicate the degree (or number of standard deviations) to which that year/depth was wetter than normal, and red (negative) bars indicate the degree to which that year/depth was drier than normal. Blank spaces indicate that there was not enough data for that year to calculate an accurate mean.

Initial Findings:

Initial findings in Nunn, Colorado

Figure 3: Figure showing initial results on how standardized soil moisture values compare to LERI drought index values for the month of July, across all years of data. You can start to see a rough correlation, especially in the upper 10cm of soil. This was what we expected, as remotely sensed soil moisture measurements can only reach about 10cm down into the soil.


This tool has great significance for research going forward, as it makes it easy to pull in and analyze soil moisture data. Because soil moisture is an increasingly important and underutilized parameter in science today, this is a great first step toward its wider incorporation. Being able to readily access this data will allow researchers like Imtiaz and Gabriel to pull it into their research with ease.

A more general audience should care about this because researchers can now use this in-situ soil moisture data to potentially better predict drought or drought-prone areas. It can also shed a lot of light on water balance across the world, which could assist water resource managers and farmers in preparing for or predicting dry/wet conditions. This data may also shed light on soil/plant characteristics of many different landscapes, as this tool allows one to look at soil moisture trends at essentially any SCAN site across the US.

Future Work / Future Studies:

  1. Create a workflow in Python to examine the relationships of the standardized SM data at different depths with select drought indices such as LERI, PDSI, SPEI, EDDI as a time series analysis. This drought index data can be examined easily using a tool from the University of Colorado, Boulder here

  2. Compare soil moisture trends at different depths with other natural factors to examine correlation! One suggestion was that very shallow (2cm) measurements can be useful to help calibrate/verify satellite measurements of soil moisture, 5cm and 10cm could give insight into available water at the root zone (for plant use; agricultural use), 20cm and 50cm could give insight into local streamflow characteristics, and finally, 100cm depth measurements could be useful as an indication of drought intensity. Perhaps this data could be applied/correlated to each of these parameters to see what the relation is using true in-situ soil moisture data!
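As a starting point for the comparisons proposed above, a Pearson correlation between standardized soil moisture and a drought index could be computed along these lines. This is only a sketch under stated assumptions: the function name `pearson_r` is hypothetical, the two series are assumed to be aligned by year, and the numbers are invented for illustration and are not SCAN or LERI data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation over the pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys)
             if not (math.isnan(x) or math.isnan(y))]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


# Invented values for illustration only: soil moisture z-scores at one depth
# and a drought index for the same years. NaN marks a year without enough data.
sm_z = [-1.0, 0.0, math.nan, 1.0, 2.0]
drought_index = [-2.0, 0.0, 5.0, 2.0, 4.0]
r = pearson_r(sm_z, drought_index)
```

Years with a missing value in either series are dropped as a pair, so a gap in the soil moisture record does not shift the alignment between the two series.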

Link to GitHub Repository:

Link to GitHub repository can be found here

  • You can examine the above repository to access all files and functionality of the soil moisture tool! This also includes a README explaining the functionality and outputs of the tool.

Research Collaborators:

  • Gabriel Senay, Ph.D., P.E. Research Physical Scientist, U.S. Geological Survey Earth Resources Observation & Science Center/ North Central Climate Adaptation Science Center and Faculty Affiliate with Ecosystem Science and Sustainability.