Academic Requirements (56 total units)
15 Core Courses (48 units)
Core curriculum in data science, workflows, evaluation and analysis, and data visualization
Capstone Project (8 units)
MEDS capstone projects are designed to develop professional problem-solving skills.
Summer - Session B: EDS 212, 214, 217, 221
Fall: EDS 220, 222, 223, 242
Winter: EDS 211, 232, 240, 241, 411A
Spring: EDS 213, 230, 231, 411B
Science in general, and data science in particular, increasingly relies on team science approaches to address the most pressing questions. Managing team science projects is therefore becoming an important skill for any scientist. This course will explore the principles and practical tools available for effective and efficient project management.
Review of quantitative methods that are commonly used in environmental science. The course will cover single-variable and multivariable functions and graphing, basic linear algebra, complex numbers, integral calculus, and simple differential equations.
This course will teach students how to store and manage environmental information. The course will focus on relational database structure, schemas and data relationships, and introduce SQL as a means to create and query databases. This course also covers the concept of metadata as well as archiving data products on data repositories to make them available to the broader community.
The generation and analysis of environmental data is often a complex, multi-step process that may involve the collaboration of many people. Increasingly, tools that document and help organize workflows are being used to ensure the reproducibility, shareability, and transparency of results. This course will introduce students to the conceptual organization of workflows (including code, documents, and data) as a way to conduct reproducible analyses. These concepts will be combined with practice in various software tools and collaborative coding techniques to develop and manage multi-step analytical workflows as a team.
This course teaches the fundamentals of programming in Python. Students will learn foundational skills and concepts including data structures, programming basics, and how to clean, subset, aggregate, transform and visualize data. Course materials demonstrate the application of these techniques for environmental data analysis and problem solving.
Introduces students to the broad range of data sets used to monitor and understand human and natural systems. Course will cover field and station data, remote sensing products, and large-scale climate datasets including climate model projections. Skills will include evaluating data collection and quality control methods used in existing datasets, and working with existing databases of time-series and spatial information including cloud computing databases and new repositories of environmental datasets. Students will learn basic workflows for selecting, obtaining, and visualizing datasets, and best practices for reliable data intercomparisons. Students will gain hands-on experience with an environmental dataset of their choice by developing tutorial Jupyter notebook materials for a relevant use case.
This course teaches key scientific programming skills and demonstrates the application of these techniques to environmental data analysis and problem solving. Topics include structured programming and algorithm development, flow control, simple and advanced data input-output and representation, functions and objects, documentation, testing and debugging. The course will be taught using a combination of the R and Python programming languages.
This course teaches a variety of statistical techniques commonly used to analyze environmental data sets and quantitatively address environmental questions with empirical data. The course covers fundamental statistical concepts and tools, including sampling and study design, linear regression, inference, and time series analysis, as well as foundational concepts of spatial and space-time dependency and associated impacts on inference.
This course introduces the spatial modeling and analytic techniques of geographic information science to data science students. The emphasis is on deep understanding of spatial data models and the analytic operations they enable. Recognizing remotely sensed data as a key data type within environmental data science, this course will also introduce fundamental concepts and applications of remote sensing. In addition to this theoretical background, students will become familiar with libraries, packages, and APIs that support spatial analysis in R.
Computer-based modeling and simulation for practical environmental problem solving and environmental research. The course will cover both the selection and application of existing models and best practices for designing new models. Topics include conceptual models, static and dynamic models, and models of diffusion, growth and disturbance. Techniques include sensitivity analysis, calibration and model scenario design.
This course will cover foundations and applications of natural language processing. Problem sets and class projects will leverage common and emerging text-based data sources relevant to environmental problems, including but not limited to social media feeds (e.g., Twitter) and text documents (e.g., agency reports), and will build capacity and experience in common tools, including text processing and classification, semantics, and natural language parsing.
Machine learning can help process large and complex data and extract knowledge from it, and it forms one of the foundations of data science. This course provides a broad introduction to machine learning and statistical pattern recognition. Topics include supervised learning (decision trees, random forests, support vector machines, neural networks) and unsupervised learning (clustering, dimensionality reduction, deep learning). Problems and exercises are framed within environmental science applications. The course will use programming languages such as R and Python to teach the advanced scientific programming needed to solve real environmental problems.
This course will focus on basic principles for effective communication through data visualization. Students will deepen their understanding of how people perceive and interpret graphical representations, and will learn about information visualization frameworks they can apply to design intuitive and impactful data visualizations. Beyond effective visualization design, we will explore "storytelling with data": the integration of visual elements and text in a way that is clear, concise, and engaging. Class time will consist of brief periods of lecture interspersed with small-group and whole-group discussions, peer critiques, and hands-on data visualization activities. Assignments will involve applying these frameworks and concepts to critique existing visualizations and to create data visualizations using popular software packages.
This course will present the state-of-the-art program evaluation techniques needed to evaluate the impact of environmental policies. The methods presented aim to identify and measure the causal effect of policies, regulations, and interventions on environmental outcomes of interest. Students will learn the research designs and methods for estimating causal effects with experimental and non-experimental data. This will prepare students to interpret and conduct high-quality empirical research, with applications in cross-sectional and panel data settings.
This course will focus on ethical considerations in collecting, using, and reporting environmental data, and how to recognize and account for biases in algorithms, training data, and methodologies. Students will also examine the human and societal implications of these issues within environmental data science.
First quarter of a two-quarter group study/analysis of how to apply data science and tools to an environmental problem. In this quarter, students are expected to work with their project client to finalize project plans, assign individual roles and responsibilities, develop a project design plan and deliverables, and make significant headway on implementing those plans.
Second quarter of a two-quarter group study/analysis of how to apply data science and tools to an environmental problem. In this quarter, students are expected to complete all project plans and deliverables, develop and submit a project repository and technical documentation, give an oral defense of the project, and present the research to a general audience.
Workshops to develop professional skills for careers in Environmental Data Science