9 Version Control, Git and GitHub
- Can I go back to before I made that mistake?
- Can others see changes I have made to the project and can I see theirs?
Version control is essential to long-term project management and collaboration. We primarily use git for this - we recommend it for any project with more than one file of code. It has a steep learning curve but is very powerful.
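As a rough illustration of the first question above ("Can I go back to before I made that mistake?"), the sketch below records two versions of a file and then restores the earlier one. The directory, the file name `notes.txt`, and the identity values are placeholders invented for this example; it is a minimal demonstration, not a recommended day-to-day workflow.

```shell
set -eu
# Work in a throwaway directory so nothing real is touched.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
# Identity for this demo repository only (placeholder values).
git config user.name "Demo User"
git config user.email "demo@example.org"

# First version of a file, recorded as a commit.
echo "first version" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# A change that turns out to be a mistake, also committed.
echo "a mistake" > notes.txt
git commit -q -am "Edit notes"

# Inspect the history, then bring back the file as it was one commit ago.
git log --oneline
git checkout HEAD~1 -- notes.txt
git commit -q -m "Restore notes from before the mistake"
cat notes.txt
```

The mistaken version is not erased: it stays in the history, and the restore is recorded as a third commit, so the full sequence of changes remains visible.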
GitHub is a web service for sharing git-versioned projects that has many great tools for collaboration. We have an organizational GitHub account so we can have private repositories and work in teams with shared projects.
For projects with little code-based work, there are other options, as well:
- Google Docs/Word Track Changes are limited to single documents
- Dropbox can track all files in a shared project/folder
  - Allows one to view/revert to any previous version of a file in the folder
  - Easily sharable
  - Does not travel well - history is lost when project moves elsewhere
  - File histories are independent - does not track interrelated changes.
Avoid filename-based version control, where each revision is saved as a renamed copy of the file.
9.1 Learn
Git has a steep learning curve and we recommend you spend some time learning rather than only trying to pick it up as you go along.
- Work through the online book Happy Git with R, which will help you get git and GitHub setup and working with R and RStudio, and teach you some basic workflows.
- Take the Software Carpentry course Version Control with Git (free) to reinforce some key concepts and learn how to work with versions on a day-to-day basis.
9.2 Install
- Go through the installation steps in Happy Git with R’s “Installation” and “Connect” chapters and Appendix B.
- Note when setting up your GitHub account that one account can have multiple e-mail addresses associated with it, so you can split your work and personal stuff without needing multiple accounts (see here).
- Give Noam (Modeling & Analytics), the Data Librarian, or the Infrastructure Lead your GitHub username so they can make you a member of the organizational EHA account and give you access to the appropriate teams.
- If you are using a GitHub account you previously created with another e-mail, be sure to add your EHA e-mail under Email Settings and set “Custom Routing” under your notification settings so that notifications related to the EHA organization go to your EHA e-mail.
- Install Dropbox on your computer with your EHA account (note you can have separate personal and EHA Dropbox folders)
- Check that your EHA email gives you access to Google Drive. If you prefer it, or your supervisor specifies it, install it locally on your computer
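If you use one GitHub account for both work and personal projects, as described in the steps above, you may also want git to record your work e-mail on commits in work repositories. The sketch below sets the identity for a single repository only; the directory and the name and address values are placeholders, and it assumes the address has been added to your GitHub account so that commits are linked to it.

```shell
set -eu
# Throwaway repository standing in for a work project.
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q
# Without --global, these settings apply to this repository only.
git config user.name "Your Name"
git config user.email "your.name@example.org"
# Print the address stored in this repository's own configuration.
git config --local user.email
```

Repositories without their own setting fall back to whatever is set with `git config --global`.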
Personal Access Tokens
GitHub requires you to use personal access tokens (PATs) when authenticating (signing in). These tokens provide specific levels of access to repositories and a user’s GitHub account. The usethis package has a good overview of managing your PATs and some helpful functions that will be highlighted below.
Good practices for managing GitHub PATs:
- Set an expiration date - GitHub will notify you about expiration
- Give them an informative name
- Give minimal permissions - tokens should be able to do no more than necessary
- Describe the token’s purpose in the notes field
- Save the PAT in an encrypted file storage system - password managers like 1Password or Bitwarden are a good option
9.2.1 Creating Tokens
There are a number of ways to create tokens; see the GitHub docs for a comprehensive review.
You can create tokens using the function usethis::create_github_token(). This function will open the token management page on GitHub and pre-select a sensible set of permissions.
The next easiest way to generate tokens is to navigate to the tokens management page and click generate token >> classic token. There you will be presented with a number of options for permissions. You will probably want to select the “repo” option so that you can read and write to a repository.
Historically, GitHub PATs have been stored in the .Renviron/.env file with the variable GITHUB_PAT. The usethis, gert, and gh packages are moving towards a more secure method for storing credentials. See storing git credentials for more information.
9.2.2 Regenerating Tokens
When your token is about to expire, you will get a notification from GitHub. That’s your cue to go to the tokens management page, select the appropriate token, and then click “regenerate token”. Copy and paste the new token into your secure storage location and then update your git credentials in R.
See PAT maintenance
9.2.3 Useful References
- Atlassian GIT Tutorials - Nice diagrams and plain language descriptions
- GIT CLI Cheatsheet
- GIT Ignore reference
- Github Learning Resources
- Git Tower cheat sheet
- Git basics overview