Class 18

Git and GitHub

Materials for class on 2024-10-29

Relevant Reading

There was no preparation reading for today (working on hp1 instead), but these are some pertinent resources:

Agenda

Today we’ll focus on:

  • a bit more ggplot: scales, facets
  • Git
  • GitHub
  • hp1 help time

Checking in about Homework Project 1

Poll

Do you need in-class help with Homework Project 1?

  1. yes
  2. no
  3. maybe

Prepping for learning Git/GitHub

Poll

Have you ever committed changes to a Git repository?

  1. yes
  2. no

Poll

Do you regularly push changes to GitHub?

  1. yes
  2. no

What is Git?

Git is a software tool for version control. It has become the most popular choice among similar tools such as RCS, CVS, and SVN. You can use it to track and manage changes in your personal projects, and it is also used to collaborate with others. It’s easiest to start by working solo and then, if needed, learn the more advanced features that support collaboration.

Repeat: working with Git alone is MUCH easier than working on a complex collaborative project. We’re only going to do the beginner level for this class!

Git is organized around the concept of repositories, which are just collections of files, usually organized together for a particular purpose or project. When working with R, it makes sense to think of each R project as its own repository. The abbreviation repo is commonly used.

Git works locally on your computer to keep track of changes in the files in your repositories. It logs these changes in dedicated files that are saved within the repository. Following Jenny Bryan’s sage advice, we are going to learn by usage and example before getting into too much technical detail.
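For the curious, here is a minimal command-line sketch of that local tracking. In class we will use GitHub Desktop rather than typing commands, so treat this as optional. It assumes Git is installed and that you have a Unix-like shell (Terminal on macOS/Linux, or Git Bash on Windows). The file name, name, and email below are made-up placeholders.

```shell
# Work in a throwaway directory so nothing else on your system is touched.
repo="$(mktemp -d)"
cd "$repo"

# Turn the directory into a repository. Git creates a hidden .git folder
# inside it, which is where the change records live.
git init --quiet

# Identity for this repository only (placeholder values, not global settings).
git config user.name "Example Student"
git config user.email "student@example.com"

# Create a file, stage it, and record a snapshot (a "commit").
echo 'title: "My analysis"' > analysis.qmd
git add analysis.qmd
git commit --quiet -m "Add analysis file"

# List the recorded commits and confirm the .git folder exists.
git log --oneline
ls -d .git
```

Everything here happens on your own computer; nothing is sent to GitHub.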

However, to give a quick preview of what I mean by tracking your changes, one way to see them is by looking at a “diff”. In this image, the version of the file on the right shows some additions that Git has detected:

[Figure: diff for changes made to a Quarto YAML file]

This shows the change I made when I added course materials to the dropdown menu of a previous course website: two added lines (and a change in the position of the dash on line 89).

This is the kind of thing Git is watching out for - changes in lines of text.
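A similar line-by-line view is available from the command line. The sketch below is not the course website's actual file; it uses a small, hypothetical `menu.yml` in a throwaway repository, commits a first version, appends two lines, and then asks Git what changed. It assumes Git is installed and a Unix-like shell is available.

```shell
# Throwaway repository with placeholder identity settings.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example Student"
git config user.email "student@example.com"

# First version of a small, made-up menu file.
printf 'menu:\n  - text: Home\n' > menu.yml
git add menu.yml
git commit --quiet -m "Add menu"

# Edit the file by appending two lines.
printf '  - text: Materials\n    href: materials.qmd\n' >> menu.yml

# Show the uncommitted changes line by line (--no-pager prints directly).
git --no-pager diff

# Summary form: how many lines were added and removed per file.
git --no-pager diff --stat
```

In the diff output, lines that were added are prefixed with `+` and lines that were removed are prefixed with `-`.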

What is GitHub?

GitHub is essentially cloud hosting for your Git repositories. You can keep your repositories private (visible only to accounts you grant access to) or make them public. You can also view Git information via the web interface in a relatively user-friendly way. You can even perform Git operations (such as “commit”) directly on the website.

Aside from software like R packages, many researchers use GitHub to share their analysis code as well. Here are some public examples that include R analyses:

GitHub also provides a relatively easy, free, and convenient (for Git users) way to publish/host webpages! For example, Quarto-based websites. Like this website! Here are some other sites and projects that I host on GitHub:

As you can see, some look like a whole domain, whereas others look like a folder/directory. You can do either depending on the project/needs.

GitHub also adds tools like Issues and Discussions, which are usually used more for software development than for analysis code, but can be used for issues with websites as well.

Setting it all up

We’re going to work through the following steps to get Git and GitHub Desktop set up on your system. If you already have a workflow for these, you don’t need to set up a different one, and you can work on Homework Project 1 or start the work for next class!

  1. Create a GitHub account - guide
  2. Install Git - guide with links
  3. Install GitHub Desktop client
  4. Authenticate GitHub Desktop (via website login) - guide

After these steps, you should be able to start working with existing repositories and create new ones, which we’ll work on more next class.
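If you want to confirm that the Git installation (step 2) worked, one optional check is to ask for its version from a terminal. This is a small sketch assuming a Unix-like shell (Terminal on macOS/Linux, or Git Bash on Windows); the variable name `git_status` is just for illustration.

```shell
# Check whether the git command is available on this system.
if command -v git >/dev/null 2>&1; then
  git_status="found"
  # Print the installed version.
  git --version
else
  git_status="missing"
  echo "git was not found; revisit the installation step"
fi
```

If a version number is printed, Git is installed and on your PATH.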

HP1 Questions

What do you need help with?