EVR 628- Intro to Environmental Data Science
Rosenstiel School of Marine, Atmospheric, and Earth Science and Institute for Data Science and Computing
By the end of this week, you should be able to:
Learning Git and GitHub may (will) be painful
analysis_final_v2_really_final_USE_THIS_ONE.R
Research that can be repeated by others using the same data and methods
Key Components:
Scenario: You’re analyzing sea surface temperature trends
Without version control:
With version control:

Folder ≠ file
A folder: my_folder/
A file has an .extension, which tells us the type of file
Important
Good file organization is the foundation of reproducible research
Principles:
cool_project/
├── data/
│ ├── raw/ # Original, unmodified data
│ ├── processed/ # Cleaned, transformed data
│ └── output/ # Output from your analyses
├── scripts/
│ ├── 01_processing/
│ ├── 02_analysis/
│ └── 03_contents/
├── results/
│ ├── figures/
│ └── tables/
├── docs/ # Documents (optional)
├── slides/ # Slides (optional)
├── cool_project.Rproj # RStudio project (MANDATORY)
├── README.md # Project overview (MANDATORY)
└── .gitignore # Files to exclude from version control
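
The layout above can be created in one go from a terminal. This is a minimal sketch, assuming a POSIX shell and that you run it from the folder where the project should live; the name cool_project comes from the tree, and the .Rproj file is left out on purpose because RStudio normally creates it when you set up the project.

```shell
#!/bin/sh
set -eu

# Project name taken from the tree above; change it for your own project
project="cool_project"

# -p creates parent folders as needed and does not fail if they already exist
mkdir -p "$project/data/raw" "$project/data/processed" "$project/data/output"
mkdir -p "$project/scripts/01_processing" "$project/scripts/02_analysis" "$project/scripts/03_contents"
mkdir -p "$project/results/figures" "$project/results/tables"
mkdir -p "$project/docs" "$project/slides"

# Empty placeholders for the two text files; fill them in afterwards
touch "$project/README.md" "$project/.gitignore"
```

Running the script twice is harmless: existing folders and files are left as they are.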

What’s missing?
DO:
sea_surface_temp_analysis.R
snake_case, but CamelCase is appropriate too
DON’T:
my file.R
analysis@final.R
stuff.R
analysis_v1.2.R = analysis_final_v2_really_final_USE_THIS_ONE.R
Sometimes: include dates (e.g. data_2024_01_15.csv)
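
Some of these rules can be checked mechanically. The sketch below is one possible check, not a standard tool: is_good_name is a hypothetical helper that accepts names made only of letters, digits, and underscores, followed by a single extension. It flags spaces, special characters, and extra dots, but it cannot tell whether a name is descriptive, so stuff.R would still pass.

```shell
#!/bin/sh

# is_good_name: hypothetical helper. Succeeds when the name contains only
# letters, digits, and underscores, followed by exactly one ".extension".
is_good_name() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9_]+\.[A-Za-z0-9]+$'
}

# Try the helper on names from the DO and DON'T lists above
for f in "sea_surface_temp_analysis.R" "data_2024_01_15.csv" \
         "my file.R" "analysis@final.R" "analysis_v1.2.R"; do
  if is_good_name "$f"; then
    echo "ok:     $f"
  else
    echo "rename: $f"
  fi
done
```

The pattern allows both snake_case and CamelCase, since both are made of letters, digits, and underscores.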
A system that tracks changes to files over time
Think of it as:

Git was created by Linus Torvalds (creator of Linux) in 2005
Key Features:
Repository
Clone
Commit
Push / Pull
Repository (repo): A directory containing your project and its complete history
Think of it as:
Types:
Example
mex_ports is the local repo
Cloning: The process of bringing a repo from the remote to local for the first time
When you clone a repo:
Commit: A snapshot of your project at a specific point in time
Each commit contains:
Think of commits as:

Push / Pull: Uploading (push) and downloading (pull) changes to and from the remote

Push the changes up to GitHub
This cycle creates a timeline of your project’s development
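
On the command line, the change, stage, commit cycle might look like the sketch below. It assumes git is installed and works in a throwaway folder; the file contents, commit messages, and the "Demo Student" identity are made up for illustration. The push step is commented out because this demo repository has no remote.

```shell
#!/bin/sh
set -eu

# Work in a temporary folder so no real project is touched
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Identity for this demo repository only (hypothetical values)
git config user.name "Demo Student"
git config user.email "student@example.com"

# First pass through the cycle: change, stage, commit
echo "# cool_project" > README.md
git add README.md
git commit --quiet -m "Add README with project title"

# Second pass: another change, another snapshot
echo "Sea surface temperature analysis" >> README.md
git add README.md
git commit --quiet -m "Describe project in README"

# The timeline so far, newest commit first
git log --oneline

# Pushing needs a remote such as GitHub, so it is left commented out here
# git push
```

Each pass through the cycle adds one entry to the log, which is the timeline the slide refers to.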
Added color to points in figure 2 is enough (one line in one file)
Refactored code to work with new package is appropriate (many lines in many files)
Good commit messages:
Examples:
DO:
Add sea surface temperature data processing function
Fix bug in monthly trend calculation
Update README with installation instructions
DON’T:
stuff
fixed things
GitHub is a web-based platform that hosts Git repositories (including this course)
Key Features:
mex_ports repository
Git and GitHub:
R and Git:
Interesting reads
Hands-on: We’ll play with Git and GitHub
Install git and create a GitHub account
Instructions are available under Week 3 on the course website
By the end of this week, you should be able to:
Due Sunday, September 14, 2025 11:59 PM via Canvas:
portfolio)
jcvdav
README.md file
.gitignore file
Grading criteria:
Repository:
README.md file with content (name, about you, about the project)
.gitignore file
Remember: Version control is a skill that gets better with practice. Start using Git for everything!
We can do most things in RStudio, but you should know:
git init: Create a new repository
git clone: Copy a repository from remote to local
git status: See what’s changed
git add: Stage changes for commit
git commit: Save changes with a message
git push: Upload changes to remote repository
git pull: Download changes from remote repository
By yourself
In a group