Class 18
Git and GitHub
Relevant Reading
There was no preparatory reading for today (you were working on Homework Project 1 instead), but here are some pertinent resources:
Agenda
Today we’ll focus on:
- a bit more ggplot: scales, facets
- Git
- GitHub
- HP1 help time
Checking in about Homework Project 1
Do you need in-class help with Homework Project 1?
- yes
- no
- maybe
Prepping for learning Git/GitHub
Have you ever committed changes to a Git repository?
- yes
- no
Do you regularly push changes to GitHub?
- yes
- no
What is Git?
Git is a software tool for version control. It has emerged as the most popular choice in a field of similar tools that includes RCS, CVS, and SVN. You can use it to track and manage changes in your personal projects, and it is also used to collaborate with others. It’s easiest to get started by working with it solo, and then, if needed, learn the more advanced features that support collaboration.
Repeat: working with Git alone is MUCH easier than working on a complex collaborative project. We’re only going to do the beginner level for this class!
Git is organized around the concept of repositories, which are just collections of files, usually organized together for a particular purpose or project. When working with R, it makes sense to think of each R project as its own repository. The abbreviation repo is commonly used.
Git works locally on your computer to keep track of changes in the files in your repositories. It logs these changes in dedicated files that are saved within the repository. Following Jenny Bryan’s sage advice, we are going to learn by usage and example before getting into too much technical detail.
However, to give a quick preview of what I mean by tracking your changes, one way to see this is to look at a “diff”. In this image, you can see that Git is showing some additions to the version of the file on the right:

This shows the change I made when I added course materials to the dropdown menu of a previous course website: two added lines (plus a change in the position of the dash on line 89).
This is the kind of thing Git is watching out for: changes in lines of text.
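To make the idea of a line-based diff more concrete, here is a minimal sketch in Python (not R, and not Git itself) using the standard-library `difflib` module. The "before" and "after" file contents are made up for illustration and are not the actual website file from the image. Git's own diff machinery is more sophisticated, but the output has the same general shape.

```python
import difflib

# Hypothetical "before" and "after" versions of a small menu file
# (illustrative only; not the real course website file).
before = [
    "menu:",
    "  - text: Syllabus",
    "  - text: Schedule",
]
after = [
    "menu:",
    "  - text: Syllabus",
    "  - text: Schedule",
    "  - text: Course materials",
    "  - text: Homework",
]

# unified_diff yields one string per output line. lineterm="" is used
# because the input lines carry no trailing newlines.
diff_lines = list(
    difflib.unified_diff(before, after, fromfile="before", tofile="after", lineterm="")
)

# Skip the two file-header lines ("--- before" and "+++ after"),
# then collect the added and removed lines without their +/- marker.
body = diff_lines[2:]
added = [line[1:] for line in body if line.startswith("+")]
removed = [line[1:] for line in body if line.startswith("-")]

print("\n".join(diff_lines))
print(f"{len(added)} line(s) added, {len(removed)} line(s) removed")
```

In the printed diff, lines beginning with `+` are additions and lines beginning with `-` are removals; unchanged lines are shown only as surrounding context.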
What is GitHub?
GitHub is essentially cloud hosting for your Git repositories. You can keep your repositories private (visible only to the accounts you grant access) or make them public. You can also view Git information via the web interface in a relatively user-friendly way. You can even perform Git operations (such as “commit”) directly on the website.
Aside from software like R packages, many researchers use GitHub to share their analysis code as well. Here are some public examples that include R analyses:
GitHub also provides a relatively easy, free, and convenient (for Git users) way to publish/host webpages, such as Quarto-based websites like this one! Here are some other sites and projects that I host on GitHub:
- random demo I wanted to share quickly
- another quick simple page
- my “personal” site
- my lab site
- Human Sentence Processing 2024
As you can see, some look like a whole domain, whereas others look like a folder/directory. You can do either depending on the project/needs.
GitHub also adds tools like Issues and Discussions, which are usually used more for software development than for analysis code, but can be used for issues with websites as well.
Setting it all up
We’re going to work through the following steps to get Git and GitHub Desktop set up on your system. If you already have a workflow for these, you don’t need to set up a different one, and you can work on Homework Project 1 or start the work for next class!
- Create a GitHub account - guide
- Install Git - guide with links
- Install GitHub Desktop client
- Authenticate GitHub Desktop (via website login) - guide
After these steps, you should be able to start working with existing repositories and create new ones, which we’ll work on more next class.
HP1 Questions
What do you need help with?