Process management in RStudio Connect

This article is based on this section of the administrator's guide.


RStudio Connect launches R to perform a variety of tasks. These include:

  • Installing R packages
  • Rendering R Markdown documents
  • Running Shiny applications
  • Running a Shiny application to customize a parameterized R Markdown document

By default, RStudio Connect uses whichever R installation is found on the PATH. Customize the Server.RVersion setting to use a specific R installation. See this section of the admin guide for details.

Sandboxing

The RStudio Connect process runs as the root user. It needs escalated privileges to allow binding to protected ports and to create “unshare” environments that contain the R processes.

RStudio Connect runs its R processes as an unprivileged user; both a system default and content-specific overrides are supported. See this section of the admin guide for details.

The “unshare” environment created for R execution involves first establishing a number of bind mounts and then switching to the target unprivileged user. RStudio Connect uses unshare to alter the execution context available to R processes. Within this newly established environment, a number of mount calls are made in order to hide or isolate parts of the filesystem.

You can learn more about unshare here. The mount call is detailed here. Your local man pages will document their behavior specific to your system.

The following locations are masked during R execution:

  • The Server.DataDir directory containing all variable data used by RStudio Connect.
  • The Database.Dir directory, which can optionally be placed outside the data directory.
  • Configuration directories, including /etc/rstudio-connect.
  • The /tmp and /var/tmp directories.
  • The /home hierarchy.

The following information is exposed during R execution:

  • The packrat data directory (read-only except when installing packages).
  • The R data directory (only when installing packages).
  • The directory containing the unpackaged R code (Shiny and R Markdown).
  • The document rendering destination directory (only for R Markdown).
  • A per-process temporary directory (exposed over the original /tmp and /var/tmp).
  • The home directory for the Applications.RunAs user, should it exist (exposed as /home).

Shiny applications have write access to the directory containing the unpackaged R code. This application directory is the working directory when launching Shiny. Data written here is visible to all processes associated with that Shiny application but is not visible to other R processes. Application directory data remains available until that application is next deployed to RStudio Connect. A deployment creates a new application directory containing only the deployed content.

Note: RStudio Connect may launch multiple processes to service requests for an application. There is no coordination between these processes. Shiny applications that write to local files could experience problems when different processes attempt to write to a single file. We recommend against using the file system for data persistence.

R Markdown documents have write access to the rendering destination directory and read access to a directory containing the unpackaged R code. The source directory is the working directory when calling rmarkdown::render. The destination directory is passed as the output_dir while a temporary directory is passed as the intermediates_dir. The intermediate directory is transient and not available after rendering completes. A new output directory is created whenever the document is rendered. Data created during one rendering is not visible to another.

R Markdown multi-document sites have a slightly different rendering pipeline than standalone documents. RStudio Connect uses the rmarkdown::render_site function, which does its rendering in-place. The content from the source directory is copied into the rendering destination directory in preparation for rendering. Site rendering has write access to the destination directory. Access to the original source directory is not provided because the source content is duplicated in the destination directory.

The rmarkdown::render_site call usually places its output into a subdirectory (typically _site). The contents of this output subdirectory are moved to the root of the rendering destination directory, replacing any other content. No post-rendering file movement occurs if rmarkdown::render_site is instructed to render into the current directory instead of a subdirectory. In that case, both source and output files will be available for serving.

Note: We recommend against configuring rmarkdown::render_site to write its output into the current directory. Rendering the site into a subdirectory (the default) allows RStudio Connect to remove source from the output directory.
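The file movement described above can be pictured with a small Python sketch. This is an illustration only, not RStudio Connect's implementation: the helper name promote_site_output and its parameters are hypothetical, and the real process may differ in its details. It shows why rendering into a subdirectory leaves only rendered output in the destination directory, while rendering into the current directory leaves source and output side by side.

```python
import shutil
from pathlib import Path


def promote_site_output(dest_dir, output_subdir="_site"):
    """Replace the contents of dest_dir with the contents of its output subdirectory.

    Hypothetical helper for illustration. Returns False and leaves every file
    in place when the output subdirectory does not exist, which models a site
    rendered into the current directory.
    """
    dest = Path(dest_dir)
    site = dest / output_subdir
    if not site.is_dir():
        return False

    # Remove everything except the output subdirectory; these are the
    # source files that were copied in before rendering.
    for entry in list(dest.iterdir()):
        if entry == site:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()

    # Move the rendered files up to the root of the destination directory.
    for entry in list(site.iterdir()):
        shutil.move(str(entry), str(dest / entry.name))
    site.rmdir()
    return True
```

Calling the helper on a directory that holds index.Rmd and _site/index.html leaves only index.html at the root. Calling it on a directory with no _site subdirectory changes nothing, so the source files stay next to the rendered ones.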

RStudio Connect serves rendered content from the document output directory. This content remains available until a subsequent rendering is successful and activated (if requested). Neither incomplete nor unsuccessful document renderings affect the availability of previously rendered content.

Path Rewriting

The sandboxing used by RStudio Connect involves bind mounts, which map physical locations on disk onto different directory structures at runtime. Your R code sees only these sandboxed locations. If you need to find the physical file on disk, you will need to undo the path transformation.

This section gives some examples of path rewriting and offers some ways of finding the file you need.

Let’s start with an app.R file that describes a Shiny application. This file will be in the apps/XX/YY/ directory underneath the Server.DataDir location. The XX and YY path components correspond to the application ID and bundle (or deployment) ID for this version of your application. This directory is available at runtime as /opt/rstudio-connect/mnt/app/.

The directory structure of /opt/rstudio-connect/mnt/ is just a number of empty directories. The “unshare” environment created during sandboxing allows RStudio Connect to associate different application directories with these mount directories.

Here are some common path transformations that may be helpful. All of the physical paths are beneath the Server.DataDir hierarchy, which defaults to /var/lib/rstudio-connect. All of the sandbox paths are beneath the mount directory /opt/rstudio-connect/mnt/. The mount directory location is not customizable.

Physical path            Sandbox path

DataDir/apps/XX/YY/      MountDir/app/
DataDir/reports/XX.ZZ    MountDir/report/
DataDir/R                MountDir/R
DataDir/packrat          MountDir/packrat
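Undoing the transformation can be sketched in Python. This is a minimal sketch, not a tool shipped with RStudio Connect: the function sandbox_to_physical and its parameters are hypothetical, and DATA_DIR assumes the default Server.DataDir. Because a sandbox path under MountDir/app/ or MountDir/report/ does not carry the application ID, bundle ID, or report directory name, the caller has to supply them.

```python
from pathlib import PurePosixPath

# Assumed defaults from the text; change DATA_DIR if Server.DataDir is customized.
DATA_DIR = PurePosixPath("/var/lib/rstudio-connect")
MOUNT_DIR = PurePosixPath("/opt/rstudio-connect/mnt")


def sandbox_to_physical(sandbox_path, app_id=None, bundle_id=None, report_dir=None):
    """Map a path seen inside the sandbox to its physical location on disk.

    app_id and bundle_id (the XX and YY components) are required for paths
    under MountDir/app/. report_dir (the XX.ZZ directory name) is required
    for paths under MountDir/report/.
    """
    # relative_to raises ValueError when the path is not under MOUNT_DIR.
    rel = PurePosixPath(sandbox_path).relative_to(MOUNT_DIR)
    if not rel.parts:
        raise ValueError("path must point below the mount directory")
    top, rest = rel.parts[0], rel.parts[1:]

    if top == "app":
        if app_id is None or bundle_id is None:
            raise ValueError("app paths need app_id and bundle_id")
        base = DATA_DIR / "apps" / str(app_id) / str(bundle_id)
    elif top == "report":
        if report_dir is None:
            raise ValueError("report paths need report_dir")
        base = DATA_DIR / "reports" / str(report_dir)
    elif top in ("R", "packrat"):
        base = DATA_DIR / top
    else:
        raise ValueError(f"unrecognized sandbox location: {top}")
    return str(base.joinpath(*rest))
```

For example, sandbox_to_physical("/opt/rstudio-connect/mnt/app/app.R", app_id=4, bundle_id=7) yields a path under the apps/4/7/ directory of the data directory.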

Here are some actual path transformations using the default Server.DataDir location:

# A source Shiny application
/var/lib/rstudio-connect/apps/4/7/app.R
    => /opt/rstudio-connect/mnt/app/app.R

# A source R Markdown document
/var/lib/rstudio-connect/apps/8/12/index.Rmd
    => /opt/rstudio-connect/mnt/app/index.Rmd

# An HTML document rendered from that R Markdown document
/var/lib/rstudio-connect/reports/8.2/index.html
    => /opt/rstudio-connect/mnt/report/index.html

# A statically deployed document
/var/lib/rstudio-connect/apps/17/21/index.html
    => /opt/rstudio-connect/mnt/app/index.html

# The Shiny package inside the packrat cache
/var/lib/rstudio-connect/packrat/3.2.5/v2/library/shiny/
  28d6903a44dc53bd4823fa43ccdc08e5/shiny
    => /opt/rstudio-connect/mnt/packrat/3.2.5/v2/library/shiny/
         28d6903a44dc53bd4823fa43ccdc08e5/shiny
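
The transformations listed above follow a regular pattern, which the following Python sketch reproduces in the physical-to-sandbox direction. The function name physical_to_sandbox is hypothetical, and the constants assume the default Server.DataDir; treat it as an aid for reasoning about paths rather than as part of RStudio Connect.

```python
from pathlib import PurePosixPath

# Assumed defaults from the text; change DATA_DIR if Server.DataDir is customized.
DATA_DIR = PurePosixPath("/var/lib/rstudio-connect")
MOUNT_DIR = PurePosixPath("/opt/rstudio-connect/mnt")


def physical_to_sandbox(physical_path):
    """Map a physical path under the data directory to the path R sees."""
    # relative_to raises ValueError when the path is not under DATA_DIR.
    parts = PurePosixPath(physical_path).relative_to(DATA_DIR).parts

    if len(parts) >= 3 and parts[0] == "apps":
        # apps/XX/YY/... becomes app/...
        return str(MOUNT_DIR.joinpath("app", *parts[3:]))
    if len(parts) >= 2 and parts[0] == "reports":
        # reports/XX.ZZ/... becomes report/...
        return str(MOUNT_DIR.joinpath("report", *parts[2:]))
    if parts and parts[0] in ("R", "packrat"):
        # R and packrat keep their relative layout.
        return str(MOUNT_DIR.joinpath(*parts))
    raise ValueError(f"no sandbox mapping for: {physical_path}")
```

Applied to /var/lib/rstudio-connect/apps/4/7/app.R, the function returns /opt/rstudio-connect/mnt/app/app.R, matching the first example in the listing.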
