Keeping track of your code with Git and GitHub
By the end of this week, you should be able to:
Learning Git and GitHub may (will) be painful
analysis_final_v2_really_final_USE_THIS_ONE.R
Research that can be repeated by others using the same data and methods
Key Components:
Scenario: You’re analyzing sea surface temperature trends
Without version control:
With version control:
Folder \(\neq\) file
my_folder/
.extension, which tells us the type of file
Important
Good file organization is the foundation of reproducible research
Principles:
cool_project/
├── data/
│   ├── raw/ # Original, unmodified data
│   ├── processed/ # Cleaned, transformed data
│   └── output/ # Output from your analyses
├── scripts/
│   ├── 01_processing/
│   ├── 02_analysis/
│   └── 03_contents/
├── results/
│   ├── figures/
│   └── tables/
├── docs/ # Documents (optional)
├── slides/ # Slides (optional)
├── cool_project.Rproj # RStudio project (MANDATORY)
├── README.md # Project overview (MANDATORY)
└── .gitignore # Files to exclude from version control
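As a rough sketch, the layout above could be scaffolded from a terminal like this. It assumes a POSIX shell and that you run it from the folder where the project should live. The `.Rproj` file is left out on purpose, since RStudio normally writes it when you create the project.

```shell
#!/bin/sh
# Sketch: create the folders from the tree above, plus two placeholder files.
set -eu

project="cool_project"

mkdir -p \
  "$project/data/raw" \
  "$project/data/processed" \
  "$project/data/output" \
  "$project/scripts/01_processing" \
  "$project/scripts/02_analysis" \
  "$project/scripts/03_contents" \
  "$project/results/figures" \
  "$project/results/tables" \
  "$project/docs" \
  "$project/slides"

# Empty placeholders; fill them in as the project grows
touch "$project/README.md" "$project/.gitignore"
```

Because `mkdir -p` skips folders that already exist, running the script twice is harmless.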
What’s missing?
DO:
sea_surface_temp_analysis.R
snake_case, but CamelCase is appropriate too
DON’T:
my file.R
analysis@final.R
stuff.R
analysis_v1.2.R = analysis_final_v2_really_final_USE_THIS_ONE.R
Sometimes: include dates (e.g. data_2024_01_15.csv)
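One way to catch the mechanical problems in the DON'T list (spaces, special characters, extra dots) is a small check like the one below. `is_tidy_name` is a hypothetical helper, not a standard tool, and the pattern is only one possible reading of the rules above. It cannot tell whether a name is descriptive, so `stuff.R` would still pass.

```shell
#!/bin/sh
# is_tidy_name (hypothetical helper): succeeds when a file name uses only
# letters, digits, underscores, or hyphens, followed by a single extension.
is_tidy_name() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9_-]+\.[A-Za-z0-9]+$'
}

# Try it on the names from the DO and DON'T lists
for name in "sea_surface_temp_analysis.R" "data_2024_01_15.csv" \
            "my file.R" "analysis@final.R" "analysis_v1.2.R"; do
  if is_tidy_name "$name"; then
    echo "ok:     $name"
  else
    echo "rename: $name"
  fi
done
```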
A system that tracks changes to files over time
Think of it as:
Git was created by Linus Torvalds (creator of Linux) in 2005
Key Features:
Repository
Clone
Commit
Push / Pull
Repository (repo): A directory containing your project and its complete history
Think of it as:
Types:
Example
mex_ports is the local repo
Cloning: The process of bringing a repo from the remote to local for the first time
When you clone a repo:
Commit: A snapshot of your project at a specific point in time
Each commit contains:
Think of commits as:
Push / Pull: Uploading and downloading changes to and from the remote
Push the changes up to GitHub
This cycle creates a timeline of your project’s development
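The local half of that cycle can be tried without GitHub. The sketch below assumes Git is installed; the repository name, user identity, and file contents are made up for illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch repository in a temporary folder
repo="$(mktemp -d)/sst_trends"
mkdir -p "$repo"
cd "$repo"
git init -q

# Identity for this repository only, so commits work on a fresh machine
git config user.name "Example Student"
git config user.email "student@example.com"

# First pass through the cycle: edit, stage, commit
echo "# Sea surface temperature trends" > README.md
git add README.md
git commit -q -m "Add README with project overview"

# Second pass: another edit, another snapshot
echo "Raw data live in data/raw/" >> README.md
git add README.md
git commit -q -m "Update README with data location"

# The resulting timeline, newest commit first
git log --oneline
```

Each line of the `git log --oneline` output is one snapshot, identified by a short hash and its commit message.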
Added color to points in figure 2 is enough (one line in one file)
Refactored code to work with new package is appropriate (many lines in many files)
Good commit messages:
Examples:
Add sea surface temperature data processing function
Fix bug in monthly trend calculation
Update README with installation instructions
stuff
fixed things
GitHub is a web-based platform that hosts Git repositories (including this course)
Key Features:
mex_ports repository
Git and GitHub:
R and Git:
Interesting reads
Hands-on: We’ll play with Git and GitHub
Install git and create a GitHub account
Instructions are available here and under Week 3 on the course website
By the end of this week, you should be able to:
Due Sunday, September 14, 2025 11:59 PM via Canvas:
portfolio
jcvdav
README.md file
.gitignore file
Grading criteria:
Repository:
README.md file with content (name, about you, about the project)
.gitignore file
Remember: Version control is a skill that gets better with practice. Start using Git for everything!
We can do most things in RStudio, but you should know:
git init: Create a new repository
git clone: Copy a repository from remote to local
git status: See what’s changed
git add: Stage changes for commit
git commit: Save changes with a message
git push: Upload changes to remote repository
git pull: Download changes from remote repository
By yourself
In a group
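To see how the commands listed above fit together, here is a minimal sketch that runs entirely on one machine. A bare repository stands in for the remote (on GitHub you would clone a URL instead), and a second clone plays the role of a collaborator. The folder names, identities, and file contents are assumptions for illustration.

```shell
#!/bin/sh
set -eu

work="$(mktemp -d)"
cd "$work"

# Stand-in for the remote: a bare repository (no working files)
git init -q --bare remote.git

# First clone: make a change, check it, stage it, commit it, push it
git clone -q remote.git laptop
cd laptop
git config user.name "Example Student"
git config user.email "student@example.com"
echo "# portfolio" > README.md
git status --short             # shows README.md as untracked
git add README.md
git commit -q -m "Add README"
git push -q -u origin HEAD     # -u records the remote branch to track
cd ..

# Second clone: a collaborator adds a line and pushes it
git clone -q remote.git collaborator
cd collaborator
git config user.name "Example Collaborator"
git config user.email "collaborator@example.com"
echo "About me" >> README.md
git add README.md
git commit -q -m "Add about section to README"
git push -q origin HEAD
cd ..

# Back in the first clone: download the collaborator's change
cd laptop
git pull -q --ff-only
cat README.md
```

After the final `git pull`, both clones hold the same two commits. The `--ff-only` flag is a cautious choice for the sketch: it makes the pull stop rather than merge if the two histories have diverged.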