Making Teaching Technical Earth Data Science Skills Easier Through Free Open Source Software

Teaching earth data science presents significant challenges. Our students submit all assignments through GitHub so that they immediately and consistently apply the skills learned in our program. However, collecting and managing dozens of GitHub repositories and grading the assignments in them can take a significant amount of time. Free and open source software (FOSS) refers to tools that are free to use and whose source code is openly available to inspect and modify. To address the challenges associated with managing and grading student homework assignments, we have built the following open source software tools.

EarthPy is an open source Python package that helps users to do common tasks with spatial data.
Abc-Classroom is a Python tool that makes it easier to download (clone) and access student submissions, and also return feedback to GitHub. 
MatPlotCheck is a tool that allows teachers to automatically test matplotlib plots for data accuracy, legends, titles, symbology, and more. 

We use these tools together with nbgrader, a part of the Jupyter ecosystem developed by Jess Hamrick, to collect, grade, and return feedback on student assignments, which are completed in Jupyter Notebooks.
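The collection step described above can be sketched with the standard library alone. This is a minimal illustration, not Abc-Classroom's actual API: the function names, the organization name, and the "<assignment>-<username>" repository naming scheme are all assumptions made for the example.

```python
import subprocess
from pathlib import Path


def student_repo_urls(org, assignment, students):
    """Build one clone URL per student for a given assignment.

    Assumes repositories are named "<assignment>-<username>" inside a
    single GitHub organization (a hypothetical naming scheme).
    """
    return [
        f"https://github.com/{org}/{assignment}-{student}.git"
        for student in students
    ]


def repo_name(url):
    """Extract the repository name from a clone URL."""
    name = url.rstrip("/").rsplit("/", 1)[-1]
    return name[:-4] if name.endswith(".git") else name


def clone_submissions(urls, dest_dir):
    """Clone each repository into dest_dir, skipping ones already present.

    Returns the list of directories that were newly cloned.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    cloned = []
    for url in urls:
        target = dest / repo_name(url)
        if target.exists():
            continue  # already collected on a previous run
        subprocess.run(["git", "clone", url, str(target)], check=True)
        cloned.append(target)
    return cloned
```

Skipping directories that already exist means the collection step can be re-run as late submissions arrive without disturbing repositories that were cloned earlier.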

Accessing Teaching Data Sets and Spatial Data Exploration Using EarthPy

Accessing data in a reproducible way that also suits teaching specific data and programming approaches can be challenging. In addition, the code needed to complete some common spatial data workflows is complex and can be simplified using standardized functions and approaches. To address these challenges, we developed and maintain a Python package called EarthPy that makes it easier to work with spatial data and to download teaching data subsets.
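One way to picture reproducible data access is a helper that always stores a dataset in the same place and downloads it only once. The sketch below uses only the Python standard library and is not EarthPy's own API. The function name and its parameters are hypothetical and chosen for illustration.

```python
import urllib.request
from pathlib import Path


def fetch_dataset(url, data_dir, filename):
    """Download url into data_dir/filename unless it is already there.

    Returns the local path, so every student's code can refer to the
    same location regardless of operating system.
    """
    data_path = Path(data_dir)
    data_path.mkdir(parents=True, exist_ok=True)
    local_file = data_path / filename
    if not local_file.exists():
        # Only fetch on the first call; later runs reuse the local copy.
        urllib.request.urlretrieve(url, str(local_file))
    return local_file
```

Because the function returns a path built with pathlib, the same notebook code can run unchanged on Windows, Mac, and Linux once the data directory is agreed upon.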

Example of a Digital Elevation Model created with the help of EarthPy.

A Classroom in the Cloud: JupyterHub, Google Cloud, and the Earth Analytics Python Environment

One of the biggest challenges in teaching data science to scientists is environment setup. For Python in particular, there are numerous ways to set up an environment, some of which can be frustrating to get working. To address this challenge we have developed a cloud-based JupyterHub and a tested conda environment:

Our JupyterHub is a multi-hub setup, which means we can deploy different hubs, backed by different environments and levels of compute power (memory, processing power, etc.), depending on our teaching needs. The hub infrastructure is easily customized through GitHub pull requests. The JupyterHub provides an online environment where students can write code and access data using only a browser with internet access. This environment: (1) eliminates the need for students to install software or save files on their own computers, (2) provides additional compute power for students working with larger datasets, and (3) removes the challenges associated with installation and setup on local computers.
To support the JupyterHub, and also students who may want to set up an environment locally, we created a customized, tested, and maintained conda environment. This environment is tested on Windows, Mac, and Linux to help us identify and address conflicts that may arise when a student attempts to install it locally.
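A student installing the environment locally could confirm that it works with a small check like the one below. This is a sketch only: the package list is a hypothetical stand-in, since the actual contents of the conda environment are not listed here, and the function name is invented for the example.

```python
import importlib.util
import platform

# Hypothetical package list; the real environment's contents are not
# specified in this text.
REQUIRED_PACKAGES = ["numpy", "matplotlib", "geopandas", "earthpy"]


def find_missing(packages):
    """Return the names in packages that cannot be imported here."""
    return [
        name for name in packages
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_PACKAGES)
    print(f"Platform: {platform.system()}")
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Running the same script on each operating system gives a quick, comparable report of which packages failed to install.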

We Contribute to and Support Free and Open Source Software for Science

We use and are committed to Free and Open Source Software (FOSS) in all of our teaching efforts. Free and open source software directly supports open, reproducible science because everyone has the same access to these tools. It also accelerates science by allowing scientists to build quickly and easily upon each other's shared workflows. As such, we have contributed back to several tools, including GeoPandas and nbgrader. We are also leading pyOpenSci, an effort focused on building a diverse community around Python FOSS tools for science.

Project Team

Project Lead

Nathan A. Quarderer

Nate is an educational researcher currently focused on data science education, and on how people come to know about climate change and why they hold a particular set of beliefs. At Earth Lab, Nate helped organize and implement the Earth Data Science Corps program, leading assessment and evaluation efforts.