EVR 528 / 628
Course description
“Introduction to Data Management and Visualization for Environmental Scientists”
is designed for students looking to gain basic data management and visualization
skills relevant to careers in environmental science and policy. The course provides
an introduction to using R and RStudio to interact with environmental data; no
coding experience is required. Students will learn highly marketable skills such as
visualizing tabular and geospatial data, managing data, and making analyses reproducible.
All concepts will be introduced using real-world environmental data sets and
questions. In-class exercises and homework assignments will mimic the types of
tasks and questions that students will encounter in the workforce.
By the end of the course, students will be comfortable working in R.
Resources
Course schedule
Students must complete any assigned readings and software tests before class. Note that the content and its timing are subject to change. Any changes will be clearly communicated via Canvas. (* indicates a week that contains a holiday.)
The latest official academic calendar is here.
Week 1 (Aug 18-22)
Introduction to Data Science and RStudio IDE
- Software installation and testing
- Overview of RStudio IDE
- RStudio Projects and project organization
- R scripts
- R packages: tidyverse, EVR628
Relevant links for the week
Week 2 (Aug 25-29)
Data visualization
- Types of visualization
- Visualization principles
- Colorblindness, IPCC’s visual style guide, and the viridis package
- The grammar of graphics and the ggplot2 package
Relevant links for the week
*Week 3 (Sep 1-5)
Keeping track of your code with Git and GitHub
- Reproducible research
- Introduction to file structure
- Version control with Git and GitHub
- Building your first repository (hello world!)
First assignment: Setting up your portfolio in GitHub
Week 4 (Sep 8-12)
Good coding principles
- Code style and documentation
- File structure and organization
- Classes, objects, variables, values
- Indexing and subsetting vectors and data frames
- Useful functions in base and stats
Week 5 (Sep 15-19)
Scaling up your code and visualizations
- Layers, geometries, and aesthetics in ggplot2
- Themes with ggplot2
- Other plotting packages (cowplot, GGally)
- Creating documents and presentations with Quarto
Second assignment: Data visualization
Week 6 (Sep 22-26)
Data management
- Reading and writing tabular data with here
- Metadata and documentation
- Retrieving environmental datasets from the Internet
- Raw data vs processed data
Week 7 (Sep 29-Oct 3)
Data transformation
- data.frames and tibbles in the tidyverse
- Rows (filter, arrange, distinct) in dplyr
- Columns (select, rename) in dplyr
- The pipe (native and magrittr)
- Grouping and summarizing data
Week 8 (Oct 6-10)
Data tidying and wrangling
- Principles of tidy data
- Lengthening data with tidyr
- Widening data with tidyr
- Combining multiple sources of data (*_join functions)
Week 9 (Oct 13-17)
Dealing with text, dates, and factors
- Managing dates and times with lubridate
- Regular expressions with stringr
- Ordering factors with forcats
Third assignment: Data wrangling
Week 10 (Oct 20-24)
Working with spatial data in R
- Vector data and sf
- Raster data and terra
- Exploratory visualizations with plot and mapview
Week 11 (Oct 27-31)
Visualizing spatial data
- Attribute operations
- Building maps with ggplot2 and tmap
Fourth assignment: Visualizing spatial data
Week 12 (Nov 3-7)
Extensions
- Connecting to external databases with DBI
- Animating with the gganimate package
- Interactive maps with leaflet
Week 13 (Nov 10-14)
Programming
- User-defined functions
- Iteration with loops
- Functional programming with the purrr package
- Background jobs
- Standardizing the environment with .Rprofile
Week 14 (Nov 17-21)
Shiny Apps Framework
- Reactive programming
- UI / UX
- Front-end vs back-end
*Week 15 (Nov 24-28)
Thanksgiving recess
Week 16 (Dec 4-10)
FINAL EXAMS WEEK
Final presentations (egg timers)
What will you learn?
You will learn how to access, work with, and visualize many different types of environmental data. For example:
Reading resources
- Books and manuals:
- Ad hoc peer review literature and blog posts:
- Other relevant sources: