Using Version Control with RStudio
Version control helps software teams manage changes to source code over time. Version control software keeps track of every modification to the code in a special kind of database. If a mistake is made, developers can turn back the clock and compare earlier versions of the code to help fix the mistake while minimizing disruption to all team members. Version control systems have been around for a long time but continue to increase in popularity with data science workflows.
The RStudio IDE has integrated support for version control. If you are new to version control, check out our book, video tutorial, and explanation:
- RStudio Essentials: Version Control
- Happy Git and GitHub for the useR
- Best practices: Git and GitHub
Version control is an indispensable tool for coordinating the work of teams and also has many benefits for individual work.
RStudio supports the following open source version control systems: Git and Subversion.
To use version control with RStudio, you should first ensure that you have installed Git and/or Subversion tools on your workstation (details below).
Version control is most useful when used with a remote repository. Remote repositories are typically managed by your company or hosted in the cloud (e.g. GitHub). Make sure you have credentials to access these systems. If you only want to learn how to use version control, you can manage a standalone repository on your workstation, but you will not be able to share code with others.
You should also become familiar with using RStudio Projects (which are required for version control features to be enabled).
Prior to using RStudio's version control features you will need to ensure that you have Git and/or Subversion installed on your system. The following describes how to do this for various platforms.
Prior to using Git with RStudio you should install it using the appropriate method for your platform:
- Windows & OS X: http://git-scm.com/downloads
- Debian/Ubuntu: sudo apt-get install git
- Fedora/RedHat: sudo yum install git
Prior to using Subversion with RStudio you should install it using the appropriate method for your platform:
- Windows: SilkSVN (or any of the other packages listed here)
- OS X (v10.7 and earlier): Not required; Subversion is already included
- OS X (v10.8+): Install Xcode's Command Line Tools from Apple's developer downloads (requires free Apple Developer ID)
- Debian/Ubuntu: sudo apt-get install subversion
- Fedora/RedHat: sudo yum install subversion
An excellent resource for learning more about Subversion and how to use it is the Red Bean online book.
Once you've installed your preferred version control system, you'll need to enable it in RStudio by following these steps:
- Go to Global Options (from the Tools menu)
- Click Git/SVN
- Click Enable version control interface for RStudio projects
- If necessary, enter the path for your Git or SVN executable where provided. You can also create or add your RSA key for SSH if necessary.
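If RStudio does not find the executables on its own, a shell can tell you which path to enter. The following is a minimal sketch; `report_tool` is a helper name made up for this example, and it only assumes the tools are reachable through your PATH.

```shell
# Print the path and version of a version control tool found on PATH.
# A tool that is not installed is reported rather than treated as an error.
# report_tool is a hypothetical helper, not part of Git, Subversion, or RStudio.
report_tool() {
  tool="$1"
  if tool_path="$(command -v "$tool")"; then
    printf '%s: %s\n' "$tool" "$tool_path"
    # Both git and svn accept --version; keep only the first line of output.
    "$tool" --version | head -n 1
  else
    printf '%s: not found on PATH\n' "$tool"
  fi
}

report_tool git
report_tool svn
```

The path printed for each tool is the value you would enter in the Git/SVN options if it is not filled in already.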
RStudio's version control features are tied to the use of Projects (which are a way of dividing work into multiple contexts, each with its own working directory). The steps required to use version control with a project vary depending on whether the project is new or existing, as well as whether it is already under version control.
Using a directory already under version control
If you have an existing directory which is already under Git or Subversion version control then you simply need to create a new RStudio project for that directory and then version control features will be automatically enabled. To do this:
- Execute the New Project command (from the Project menu)
- Choose to create a new project from an Existing Directory
- Select the appropriate directory and then click Create Project
A new project will be created for the directory and RStudio's version control features will then be available for that directory.
Creating a new project based on a remote Git or Subversion repository
If you have an existing remote Git or Subversion repository that you want to use as the basis for an RStudio project you should:
- Execute the New Project command (from the Project menu)
- Choose to create a new project from Version Control
- Choose Git or Subversion as appropriate
- Provide the repository URL (and other appropriate options) and then click Create Project
The remote repository will be cloned into the specified directory and RStudio's version control features will then be available for that directory.
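For comparison, the same clone can be made from a shell, after which the directory can be opened as an existing, already versioned directory. This sketch is for Git only (Subversion's counterpart is `svn checkout`) and uses a temporary local bare repository as a stand-in for the remote so that it runs offline. The directory names and the placeholder URL are assumptions for illustration.

```shell
# Stand-in for a hosted repository so the sketch runs without network access.
workdir="$(mktemp -d)"
git init --quiet --bare "$workdir/remote.git"

# In real use, set this to your repository URL, for example
# https://github.com/<user>/<repo>.git (placeholder, not a real address).
repo_url="$workdir/remote.git"

# Clone into a new directory named my-project (hypothetical name).
git clone --quiet "$repo_url" "$workdir/my-project"

# The clone records where it came from as the remote named "origin".
git -C "$workdir/my-project" remote get-url origin
```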
Adding version control to an existing project
Directions for remote repositories
Subversion directories are always paired with an external repository, so they cannot be configured for version control without also configuring the external connection. In addition, we recommend that Git repositories always be configured with a remote repository in order to protect your data and maintain a separate backup.
If you have an existing directory that you want to add version control to, you should consult the documentation for Git or Subversion concerning how to initialize a repository (both local commands as well as commands required to connect it to a remote server). See the resources linked above for more on connecting your project to a remote repository.
Once you've configured your project with your repository, RStudio will detect that the project has been added and RStudio's version control features will then be available for that directory.
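As a rough illustration of the Git side of this, the sketch below initializes a repository in an existing directory, connects it to a remote, and pushes the first commit. A temporary bare repository stands in for the remote server, and the file name, user name, and email address are made up for the example. Your hosting provider's instructions take precedence over this outline.

```shell
# Stand-in remote and a small existing project directory (all hypothetical).
workdir="$(mktemp -d)"
git init --quiet --bare "$workdir/remote.git"
mkdir "$workdir/my-project"
cd "$workdir/my-project"
echo 'x <- 1' > analysis.R

# Put the existing directory under Git version control.
git init --quiet
git config user.name "Example User"          # identity recorded in commits
git config user.email "user@example.com"
git add analysis.R
git commit --quiet -m "Initial commit"

# Connect the local repository to the remote; in real use this is your repository URL.
git remote add origin "$workdir/remote.git"

# Push the current branch and remember origin as its upstream.
git push --quiet --set-upstream origin HEAD
```

Pushing `HEAD` rather than a named branch avoids assuming whether your default branch is called `master` or `main`.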
Directions for Git (local)
Git repositories can be created in purely local mode (for example, if you want to track changes locally but aren't concerned with collaborating or syncing between multiple workstations). To add a Git repository to an existing project:
- Execute the Project Options command (from the Project menu)
- Choose Version Control options
- Change the version control system from (None) to Git
- Confirm that you wish to initialize a new Git repository
A Git repository will be created for the project and you'll be prompted to restart RStudio to enable version control features for the project.
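From a shell, a comparable local-only setup might look like the sketch below. It is not a transcript of what RStudio runs. The project directory and file name are hypothetical, and the ignore list simply names files that RStudio and R commonly write into a project directory.

```shell
# Hypothetical project directory containing one script.
project_dir="$(mktemp -d)"
cd "$project_dir"
echo 'x <- 1' > analysis.R

# Create an empty local repository; no remote is involved.
git init --quiet

# Keep RStudio's per-user state and R session files out of the repository.
printf '%s\n' .Rproj.user .Rhistory .RData > .gitignore

# List files Git now sees as untracked and ready to be added.
git status --short
```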
Using the Shell
RStudio provides an interface to the most common version control operations including managing changelists, diffing files, committing, and viewing history. While these features cover basic everyday use of Git and Subversion, you may also occasionally need to use the system shell to access all of their underlying functionality.
RStudio includes functionality to make it more straightforward to use the shell with projects under version control. This includes:
- On all platforms, you can use the Terminal to open a new system shell with the working directory already initialized to your project's root directory.
- On Windows when using Git, the Shell command will open Git Bash, which is a port of the bash shell to Windows specially configured for use with Msys Git (note you can disable this behavior and use the standard Windows command prompt instead using Options -> Version Control).
- On Windows when using Subversion, RStudio opens a shell with a PATH configured to use a version of ssh.exe which ships with RStudio (required for svn+ssh connections, see below).
- When running over the web, RStudio provides a web-based shell dialog.
Version control repositories can typically be accessed using a variety of protocols (including HTTP and HTTPS). Many repositories can also be accessed using SSH (this is the mode of connection for many hosting services including GitHub and R-Forge).
In many cases the authentication for an SSH connection is done using public/private RSA key pairs. This type of authentication requires two steps:
- Generate a public/private key pair
- Provide the public key to the hosting provider (e.g. GitHub or R-Forge)
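The two steps can also be carried out from a shell with OpenSSH's `ssh-keygen`. The sketch below writes the key pair to a temporary directory and uses an empty passphrase so that it runs unattended. Both choices are assumptions for demonstration only, and the email label is made up. For a real key you would normally write to `~/.ssh` and set a passphrase.

```shell
# Demo location for the key pair; a real key usually lives in ~/.ssh.
key_dir="$(mktemp -d)"
key_file="$key_dir/id_rsa"

# Step 1: generate a 4096-bit RSA key pair.
#   -C sets a comment that labels the key, -N "" sets an empty passphrase (demo only),
#   -f names the private key file, -q suppresses the progress output.
ssh-keygen -q -t rsa -b 4096 -C "user@example.com" -N "" -f "$key_file"

# Step 2: print the public key so it can be pasted into the hosting provider's settings.
cat "$key_file.pub"
```

Only the `.pub` file is shared with the provider; the private key stays on your workstation.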
To make working with RSA key pairs more straightforward the RStudio Version Control options panel can be used to both create new RSA public/private key pairs as well as view and copy the current RSA public key.
While Linux and Mac OS X both include SSH as part of the base system, Windows does not. As a result, the standard Windows distribution of Git (Msysgit, referenced above) also includes an SSH client.
Subversion for Windows however does not include an SSH client. To overcome this limitation, RStudio includes a version of the Msys SSH client within the RStudio\bin\msys_ssh directory. This directory is automatically added to the PATH (for RStudio only rather than system-wide) and is also available on the PATH for command prompt windows opened using the Tools -> Shell command. A Windows shortcut (SSH Command Prompt) is also provided within the RStudio\bin\msys_ssh directory if you wish to launch a console from the Desktop that supports the svn+ssh protocol.
RStudio Pro customers may open a discussion with RStudio Support at any time.
You may also ask for help from R and RStudio users on community.rstudio.com. Be sure to include a reproducible example of your issue.