Teaching Earth Data Science Skills Backed by Market Research
As the availability and volume of earth data continue to grow and data processing workflows become more complex, data science skills are becoming increasingly fundamental to science. In tandem, science and industry have become more collaborative and interdisciplinary. Given this context, there is a growing need for professionals with skills at the intersection of science and data science who can also work effectively in interdisciplinary teams and communicate their work.
To identify program learning goals that are in demand in the job market, we survey hiring managers, academics, and professionals in the earth and environmental sciences about the core skills that they seek in new hires for data science positions. Using this data, we’ve built our curriculum around five learning areas: technical data science, domain science, the ability to use different data types and structures, scientific communication, and interdisciplinary collaboration. These in-demand skills prepare students and professionals for careers in data-intensive science that address a variety of large-scale environmental challenges and are responsive to rapid changes in technology.
Training the Next Generation of Earth Data Scientists
A First-of-Its-Kind Earth Data Analytics Professional Graduate Certificate
In support of our mission to train a data-capable earth and environmental science workforce, we’ve created one of the first professional earth data science programs in the country. The three-course, 10-month Earth Data Analytics - Foundations professional graduate certificate program trains students who are new to earth and environmental data science in the skills required to integrate data-intensive approaches into their careers. In as little as 10 months of online or in-person instruction, the program provides students with a powerful combination of skills at the intersection of earth and data science.
Increasing Diversity in STEM By Building Earth and Environmental Data Science Teaching Capacity at Partner Institutions
Through a grant from the National Science Foundation - Harnessing the Data Revolution, we have designed and are leading a three-year Earth Data Science Corps program that builds sustainable capacity for faculty to teach earth data science at institutions serving students historically underrepresented in STEM. Our focus has specifically been on students at Tribal and Hispanic-Serving Institutions, including:
- Oglala Lakota College in Kyle, South Dakota
- United Tribes Technical Institute in Bismarck, North Dakota
- Metropolitan State University of Denver
The program includes online and in-person training for faculty and students to learn technical data skills, focused training to help faculty embed data-intensive content into their courses, an applied internship where students practice the skills they have learned through real-world, project-based applications, and a full-semester course.
Evaluation is core to the Earth Data Science Corps (EDSC) project. We assess the program’s effect on student skill attainment, self-confidence, career interest, and career persistence in STEM, and we study how students learn best in online environments. In our first year of evaluation data, students reported that the program had a positive impact on their sense of science identity and belonging and that they built skills that prepared them for their future careers. Students also enjoyed the flexibility and convenience of online learning, but generally indicated that they preferred in-person instruction.
Earth and Environmental Data Science Workshops For Students of All Levels
For students and professionals who want a basic introduction to earth and environmental data science or to learn targeted skills, we offer workshops that teach the core programming and open, reproducible science skills needed to work with environmental and earth systems data in collaborative team environments. Workshops are aimed at participants of all skill levels and backgrounds and are generally offered fully online.
Earth and Environmental Data Science Core Skills
Technical Data Science Skills
Our earth analytics education programs teach the scientific programming, version control and command-line skills required to create efficient, open and reproducible workflows to process earth data. We currently focus on Python, Git and GitHub, and bash because these tools were consistently listed as in demand in our industry surveys. Knowledge of these tools is meant to serve as a basis for learning other programming languages and software in today’s evolving data science landscape.
Understand Scientific Applications of Data Science
While data skills can be applied in almost every field, science domain knowledge differentiates students when they enter the job market. We integrate earth and environmental science into our data science lessons by teaching students how to frame a scientific question, identify appropriate data, and produce a useful final product.
Find & Work With Different Types of Data
Students finish our programs with the ability to efficiently find and work with different data types and structures. Our lessons cover how to work with spatial, remote sensing, and time series data that comes in raster, vector, and hierarchical formats. We also teach students how to combine these different types of data, a skill that is critical for harnessing the rapidly growing number of data sources available today.
Communicate & Collaborate in Synchronous & Asynchronous Interdisciplinary Environments
As science and industry become more interdisciplinary and work environments become increasingly flexible, strong communication and collaboration skills are critical to professional success. We teach these skills to students through group work, communication-focused activities, project-based learning and tools like Git/GitHub, Slack, and Discourse.
Promoting Open, Reproducible Workflows that Accelerate Science
Open, reproducible science occurs when a scientist makes their workflow available for others to view, use and run from beginning to end. It involves connecting data inputs, processing methods and outputs with supporting documentation that allows peers to replicate the process. Reproducible scientific workflows allow scientists to build upon one another's methods rather than begin from scratch, boost the visibility of scientific work, allow peers to check for errors or provide feedback, and increase the efficiency of science as a whole. Our program teaches the process of developing open, reproducible workflows using real-world data and common open science tools such as Python programming, Git and GitHub, bash and Jupyter Notebooks.
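As a rough illustration of what connecting data inputs, processing methods and outputs can look like in Python, here is a minimal sketch. It is not taken from the program's course materials: the inlined precipitation values, the column names and the function names (`monthly_mean`, `run_workflow`) are all hypothetical, and it uses only the Python standard library so that it runs from beginning to end without any setup.

```python
import csv
import hashlib
import io
import json
from statistics import mean

# Hypothetical input: a tiny CSV of daily precipitation (mm), inlined so the
# sketch runs without downloading anything. The values are made up.
RAW_CSV = """date,precip_mm
2020-01-01,0.0
2020-01-02,5.0
2020-02-01,1.0
2020-02-02,3.0
"""


def monthly_mean(csv_text):
    """Return {"YYYY-MM": mean precipitation} for the rows in csv_text."""
    by_month = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        month = row["date"][:7]  # "2020-01-02" -> "2020-01"
        by_month.setdefault(month, []).append(float(row["precip_mm"]))
    return {month: mean(values) for month, values in sorted(by_month.items())}


def run_workflow(csv_text):
    """Run the processing step and record how the result was produced."""
    return {
        # A hash of the exact input lets a peer confirm they used the same data.
        "input_sha256": hashlib.sha256(csv_text.encode("utf-8")).hexdigest(),
        # The name of the processing method that produced the output.
        "method": "monthly_mean",
        "output": monthly_mean(csv_text),
    }


result = run_workflow(RAW_CSV)
print(json.dumps(result, indent=2))
```

Because the result carries a fingerprint of its input and the name of the method that produced it, a peer who reruns the script on the same data can check that they get the same output. In practice a script like this might live in a Git repository or a Jupyter Notebook alongside its documentation.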
Our curriculum exposes students to each step required to develop, implement, and communicate the components of a science project. This includes articulating a challenge to a broad audience, developing a reproducible data processing workflow to address the challenge, and communicating the results in both written and verbal formats. Through our courses, students develop the skills and confidence needed to independently define data-intensive projects, find data for them, and complete them to address scientific questions and challenges.