If you are administering RStudio Connect in an offline environment, you’ll need to follow certain steps to ensure the R packages used by your team are available in Connect. These steps are different from the process you might have used in the past to provide packages to RStudio.
Package Library vs Package Repository
A repository is a directory containing uninstalled R source files or platform-specific binaries. A repository contains a PACKAGES file with important information about the repository’s content.
A library is a directory containing installed R packages.
Though a repository and a library look very similar, they are two distinct entities.
Usually, two steps are required to use an R package:
- The package is installed from a repository into a library using install.packages().
- The package is loaded from the library for an analysis using library().
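To make the distinction concrete, the following minimal sketch uses only base R to show where a session looks for each: .libPaths() lists the libraries and the repos option lists the repositories. The variable names and the placeholder package name are illustrative.

```r
# Libraries: directories of *installed* packages that library() searches.
libs <- .libPaths()
print(libs)

# Repositories: named URLs that install.packages() downloads from.
repos <- getOption("repos")
print(repos)

# The two-step workflow, left commented out because it needs a reachable
# repository ("somePackage" is a placeholder name):
# install.packages("somePackage")  # repository -> library
# library(somePackage)             # library -> R session
```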
In offline RStudio environments, it is common for administrators to skip setting up a local package repository and instead install packages directly from an online CRAN repository into a system library. The system library, a set of folders, is then moved to the offline environment or placed on shared storage. R users access packages from the system library using the library() function. The administrator tells R to look in the correct directory by defining R_LIBS_SITE in the Renviron.site file or by calling the R function .libPaths() in the Rprofile.site file.
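As a sketch of the .libPaths() approach, the snippet below prepends a site library to the search path. The directory used here is a temporary stand-in created only for illustration; in practice it would be the shared system library, and the call would typically live in a site-wide startup file.

```r
# Hypothetical stand-in for the shared system library location.
site_lib <- file.path(tempdir(), "site-library")
dir.create(site_lib, recursive = TRUE, showWarnings = FALSE)

# Prepend it so library() searches it first; existing paths are kept.
# (The Renviron.site alternative is a line of the form
#  R_LIBS_SITE=/path/to/site-library.)
.libPaths(c(site_lib, .libPaths()))

# The new directory should now be the first entry on the search path.
print(.libPaths()[1])
```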
Offline use of RStudio Connect, however, requires administrators to set up a package repository. The repository enables Connect to manage and isolate package dependencies for deployed content. In brief, Connect installs packages from the local repository into a private library for each piece of deployed content. These individual libraries guarantee that the content has the correct packages, even if other content on the server requires a different version of the same package. (In practice, the process is optimized to cache packages while guaranteeing that the correct versions are always available.)
In environments with both RStudio and RStudio Connect, a local package repository should be used for both. Specifically, users or administrators should install packages in RStudio Server’s system library from the local repository. Installing packages into the system library from a different repository can result in a mismatch between Connect and RStudio that will cause deployment failures.
Setting up a local repository
A package repository is a set of source files arranged inside a specific folder hierarchy. It is possible to set up the scaffolding for a repository manually, but an easier approach is to use the miniCRAN and packrat R packages.
From within an online environment:
Install miniCRAN by running the R command:
Create a package repository by running:
Create a list of R packages to add to the local repository:
pkgs <- c('<YOUR_PACKAGES_HERE>', '<EACH_IN_QUOTES>', '<SEPARATED_BY_COMMAS>')
Add the packages to the repository and update the PACKAGES file:
miniCRAN::addPackage(pkgs, path = 'path/to/directory/repoName')
Step 5 will copy all the packages, and their dependencies, into the local repository. This can take some time. At the end, path/to/directory/repoName should include a tar file for each R package and a PACKAGES file listing information on each package. The tar files will be located in:
Next, the entire directory should be copied to a location accessible by the offline R environment (RStudio and RStudio Connect).
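The online-side steps can be sketched as a single helper. This is a hedged outline rather than the exact commands from this guide: it assumes miniCRAN's pkgDep and makeRepo functions, a source-only repository, and a reachable CRAN mirror. The helper name build_local_repo, the mirror URL, and the example package names are illustrative.

```r
# Hypothetical helper: build a source repository for `pkgs` plus dependencies.
build_local_repo <- function(pkgs,
                             repo_path,
                             cran = c(CRAN = "https://cloud.r-project.org")) {
  if (!requireNamespace("miniCRAN", quietly = TRUE)) {
    stop("Install miniCRAN first, e.g. install.packages('miniCRAN')")
  }
  dir.create(repo_path, recursive = TRUE, showWarnings = FALSE)

  # Resolve the requested packages and their dependencies.
  all_pkgs <- miniCRAN::pkgDep(pkgs, repos = cran, type = "source",
                               suggests = FALSE)

  # Download the source tarballs and write the PACKAGES index.
  miniCRAN::makeRepo(all_pkgs, path = repo_path, repos = cran,
                     type = "source")

  # Source tarballs follow R's standard layout: <repo>/src/contrib.
  list.files(file.path(repo_path, "src", "contrib"))
}

# Example call (requires network access, so it is left commented out):
# build_local_repo(c("ggplot2", "dplyr"), "path/to/directory/repoName")
```

Wrapping the calls in a function keeps the mirror and target path explicit, which makes it easier to rerun the same build when the repository is refreshed.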
What if I already have a system library?
If you currently have a system library, it is important to ensure compatibility between the local repository and the packages currently in use by RStudio. To do so:
Follow steps 1-3 above.
Enumerate the packages currently available by using:
pkgs <- as.data.frame(installed.packages())
Add the version of each package available in the system library to the repository:
miniCRAN::addOldPackage(pkgs$Package, path = 'path/to/directory/repoName', vers = pkgs$Version, deps = FALSE)
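After adding the system library's versions, it can be worth confirming that each installed version has a matching tarball in the repository. The base-R sketch below assumes source tarballs named <package>_<version>.tar.gz under src/contrib, which is R's standard source layout; the helper name missing_tarballs is illustrative. Base packages that ship with R itself are not distributed as separate tarballs, so they will be reported as missing.

```r
# Hypothetical helper: which installed package versions lack a tarball?
missing_tarballs <- function(installed, contrib_dir) {
  expected <- paste0(installed$Package, "_", installed$Version, ".tar.gz")
  present <- list.files(contrib_dir, pattern = "\\.tar\\.gz$")
  installed[!(expected %in% present), c("Package", "Version"), drop = FALSE]
}

# Self-contained demonstration with a fake repository in a temp directory.
contrib_dir <- file.path(tempfile("repo"), "src", "contrib")
dir.create(contrib_dir, recursive = TRUE)
file.create(file.path(contrib_dir, "foo_1.0.tar.gz"))

installed <- data.frame(Package = c("foo", "bar"),
                        Version = c("1.0", "2.0"),
                        stringsAsFactors = FALSE)
gaps <- missing_tarballs(installed, contrib_dir)
print(gaps)  # lists bar 2.0, the only version without a tarball
```

With a real library, installed would be as.data.frame(installed.packages()) and contrib_dir the repository's src/contrib folder.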
Maintaining a Repository
To add a package to the local repository, use the same miniCRAN function, addPackage. After each addition, the repository should be copied to the offline location.
The addPackage function automatically handles package versions. For example, say you add version 1.0 of a package to the repository. Later, version 2.0 of the package is released. Running addPackage again adds version 2.0 alongside version 1.0. By default, install.packages uses the latest version. Analysts can manually install older versions into RStudio using devtools::install_version. RStudio Connect automatically installs the appropriate version based on the version in use by RStudio during deployment.
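To see which versions of each package the repository currently holds, the tarball names can be parsed with base R. This sketch assumes source tarballs named <package>_<version>.tar.gz under src/contrib; the helper name repo_versions is illustrative.

```r
# Hypothetical helper: list package/version pairs found as source tarballs.
repo_versions <- function(contrib_dir) {
  files <- list.files(contrib_dir, pattern = "\\.tar\\.gz$")
  stem <- sub("\\.tar\\.gz$", "", files)
  # Package names cannot contain "_", so the first "_" separates the version.
  data.frame(Package = sub("_.*$", "", stem),
             Version = sub("^[^_]*_", "", stem),
             stringsAsFactors = FALSE)
}

# Self-contained demonstration with a fake repository in a temp directory.
contrib_dir <- file.path(tempfile("repo"), "src", "contrib")
dir.create(contrib_dir, recursive = TRUE)
file.create(file.path(contrib_dir,
                      c("foo_1.0.tar.gz", "foo_2.0.tar.gz", "PACKAGES")))

versions <- repo_versions(contrib_dir)
print(versions)
```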
Telling R about the Local Repository (Required Step!)
To use the local package repository, you have to tell R where the repository lives. This declaration is similar to setting R_LIBS_SITE or modifying .libPaths(). To do so, run the R code:
packrat::repos_add(repoName = "file://path/to/directory/repoName/")
If the server is offline, it is often useful to remove the default CRAN repository with packrat::repos_remove('CRAN').
The packrat line of code is a wrapper around R’s repos option:
r <- getOption("repos")
r["repoName"] <- "file://path/to/directory/repoName/"
options(repos = r)
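A base-R sketch of the same idea, extended to drop the CRAN entry, is shown below. It reuses the placeholder URL from this guide and restores the previous setting at the end so that the example leaves the session unchanged; in a startup file the restore step would be omitted.

```r
# Placeholder URL taken from this guide; replace with the real location.
repo_url <- "file://path/to/directory/repoName/"

# Add the local repository and drop the default CRAN entry, if present.
r <- getOption("repos")
r["repoName"] <- repo_url
r <- r[names(r) != "CRAN"]
old <- options(repos = r)

# install.packages() would now consult only the repositories listed here.
active <- getOption("repos")
print(active)

# Restore the previous setting (for demonstration purposes only).
options(old)
```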
In short, add the following lines of code to the Rprofile.site file:
packrat::repos_add(repoName = "file://path/to/directory/repoName/")
packrat::repos_remove('CRAN')