Using Python with RStudio Connect

Use Cases for Python with RStudio Connect

RStudio Connect can publish Python content, which supports teams and individuals who work in both R and Python:

  • Data scientists who use a combination of R and Python
  • RStudio users working together with Jupyter Notebook users

You can publish Shiny apps, R Markdown documents, and Plumber APIs that use Python scripts and libraries through the reticulate package, and you can publish Jupyter Notebooks through the rsconnect-jupyter notebook extension.

For example, you can publish content to RStudio Connect that uses Python for interactive data exploration and data loading (pandas), visualization (matplotlib, seaborn), natural language processing (spacy, gensim), and machine learning (pytorch, scikit-learn, statsmodels).

Common use cases with Python and RStudio Connect include:

  • Push-button publishing of Jupyter Notebooks to RStudio Connect
  • Building interactive Shiny applications and dashboards on top of existing Python code and libraries
  • Using mixed Python and R content in R Markdown documents and reports
  • Scheduling Python-based data processing / ETL scripts or model training jobs
  • Using Python libraries with R Markdown notebooks for interactive and exploratory analyses
  • Deploying Plumber APIs that execute Python scripts or use Python libraries when an API is queried

 

Configuring Python with RStudio Connect

What are the requirements for using Python with RStudio Connect?

RStudio Connect (version 1.7.0 and higher) can be configured to point to one or more versions of Python on the server that will be used when Python content is published. Python support and the locations of the Python environments are specified in the RStudio Connect configuration file.

Each Python installation is required to have the pip and virtualenv Python packages installed. virtualenv is used to create project-specific environments, and pip is used to install Python packages.
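As a quick check before pointing RStudio Connect at a Python installation, you could run something like the following with that installation's interpreter. This is a minimal sketch using only the Python standard library, not part of RStudio Connect; the function name missing_packages is hypothetical.

```python
import importlib.util


def missing_packages(required=("pip", "virtualenv")):
    """Return the names in `required` that this interpreter cannot import."""
    return [name for name in required if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing required packages:", ", ".join(missing))
else:
    print("pip and virtualenv are both available")
```

Run the script once per Python installation that you plan to list in the RStudio Connect configuration file, since each installation has its own set of packages.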

The recommended approach for installing Python is to build from source, but you can point to any version or distribution of Python that meets the version and package requirements.

For more details, refer to the Python section in the RStudio Connect documentation.

The client machine that is publishing Python content should be using reticulate version 0.8.13 or newer.

 

Which versions of Python are compatible with RStudio Connect?

The minimum version of Python 2 supported in RStudio Connect is 2.7.9, and the minimum version of Python 3 supported is 3.4.0.
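The minimums above can be checked programmatically. The following is a minimal sketch that compares version tuples against the values stated in this article; the names MINIMUM_VERSIONS and meets_minimum are illustrative and do not come from RStudio Connect.

```python
import sys

# Minimum versions stated above for RStudio Connect, keyed by major version.
MINIMUM_VERSIONS = {2: (2, 7, 9), 3: (3, 4, 0)}


def meets_minimum(version):
    """Return True if a (major, minor, micro) tuple meets the stated minimum."""
    minimum = MINIMUM_VERSIONS.get(version[0])
    return minimum is not None and tuple(version[:3]) >= minimum


# Check the interpreter that is running this script.
print(meets_minimum(sys.version_info[:3]))
```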

The reticulate package is compatible with all versions of Python >= 2.7. Integration with NumPy is optional and requires NumPy >= 1.6.

 

Can I use multiple versions of Python with RStudio Connect?

Yes, RStudio Connect supports multiple installed versions of Python on the server.

Similar to the approach for installing multiple versions of R, the recommended approach for installing multiple versions of Python is to build and install Python from source.

When a user publishes a project that uses Python, RStudio Connect will attempt to find a best match for the requested version of Python.

For more details, refer to the Python section in the RStudio Connect documentation.

 

Publishing Python Content to RStudio Connect

How do I include Python code or scripts within an R application?

The reticulate package provides a comprehensive set of tools for interoperability between Python and R. RStudio Connect supports the use of reticulate and will recreate both the R and Python packages in the environment on the RStudio Connect server when a project is deployed.

 

Where can I find examples of using Python with RStudio Connect?

Examples of Shiny apps, R Markdown documents, and Plumber APIs with Python can be found on this examples page and in the python-examples repository on GitHub.

 

How can I configure reticulate to point to a specific Python environment?

The recommended approach for configuring reticulate for use with the RStudio IDE and publishing Python content to RStudio Connect is to set the RETICULATE_PYTHON environment variable to point to the desired Python executable.

This approach is supported in reticulate 0.8.13 and newer.

For example, you can set the following in your .Rprofile:

Sys.setenv(RETICULATE_PYTHON = "/usr/local/bin/python3")

This allows you to remove any hard-coded references to Python environments from your application code and avoids the need to switch use_python or other configurations in reticulate between developing projects in RStudio and publishing projects to RStudio Connect.
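To find the value to use for RETICULATE_PYTHON, one option is to start the desired Python environment and print the path of its interpreter. The snippet below is a small sketch using only the Python standard library; the variable name python_path is illustrative.

```python
import sys

# Path of the interpreter running this script; this is the value you would
# assign to RETICULATE_PYTHON in your .Rprofile.
python_path = sys.executable
print(python_path)
```

Running this inside an activated virtualenv or conda environment prints that environment's interpreter rather than the system Python.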

For more details, refer to the Python Version Configuration section in the reticulate documentation.

 

How does RStudio Connect determine the Python packages that are used in my project?

When you publish a project that includes Python content, the RStudio IDE uses the rsconnect package, as part of the standard push-button publishing process, to generate a list of the Python packages installed in the currently configured Python environment. This works for environments created with virtualenv or conda, provided that all of the packages are available on PyPI and can be installed via pip.

The list of Python packages is included in the project bundle and sent to the RStudio Connect server, where the Python environment is reconstructed using virtualenv and pip. The details of the reconstruction process can be viewed in the deployment logs for the project.
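The generated package list is conceptually similar to the output of pip freeze. The sketch below approximates such a list with the standard library's importlib.metadata module (Python 3.8 or newer). It is an illustration only and is not necessarily how the rsconnect package builds its list; the function name installed_requirements is hypothetical.

```python
from importlib import metadata


def installed_requirements():
    """Return sorted 'name==version' lines for packages in this environment."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip distributions with incomplete metadata
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


for line in installed_requirements():
    print(line)
```

Comparing this output with the packages your project actually imports is one way to spot packages that are installed locally but unavailable on PyPI.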

The reticulate::py_config() function can be used to verify which Python executable and library paths are being used by the rsconnect package in the RStudio IDE to generate the list of Python packages.

 

Can I publish standalone Python applications such as Flask APIs to RStudio Connect?

Not at this time. Projects that include Python content must be published either as a Shiny app, R Markdown document, or Plumber API that uses the reticulate package, or as a Jupyter Notebook using the rsconnect-jupyter notebook extension. If you are interested in functionality for publishing standalone Python applications to RStudio Connect, please reach out to sales@rstudio.com.

 

Publishing Jupyter Notebooks to RStudio Connect

How do I publish Jupyter Notebooks to RStudio Connect?

The functionality to publish Jupyter Notebooks to RStudio Connect is provided by a notebook extension.

Follow the steps in the rsconnect-jupyter documentation to install and configure Jupyter Notebook with the ability to publish to RStudio Connect:

  • Download and install the rsconnect-jupyter package in your Python environment
  • Enable the rsconnect-jupyter extension
  • Generate an API token from RStudio Connect
  • Use push-button publishing to publish your Jupyter Notebooks to RStudio Connect

 

Which languages / kernels can I use when publishing Jupyter Notebooks?

There are two options when publishing Jupyter Notebooks to RStudio Connect:

  • Publish document with source code
  • Publish finished document only

These options are similar to the publishing functionality of R Markdown documents.

Jupyter Notebooks that use the Python kernel can be published using either option. If you choose the option to publish the Jupyter Notebook with source code, then the Python environment will be recreated on the RStudio Connect server, including all of the Python packages, and the Jupyter Notebook will be executed on the server.

Jupyter Notebooks that use languages or kernels other than the Python kernel can be published only with the "Publish finished document only" option.
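The rule above can be expressed as a small check against a notebook's metadata. The sketch below reads the kernel language from a parsed .ipynb file (metadata.kernelspec.language, falling back to metadata.language_info.name) and returns the publishing options described in this article. The function name and constants are illustrative and are not part of rsconnect-jupyter.

```python
SOURCE = "Publish document with source code"
FINISHED = "Publish finished document only"


def publishing_options(notebook):
    """Return the publishing options available for a parsed .ipynb dict."""
    meta = notebook.get("metadata", {})
    language = (
        meta.get("kernelspec", {}).get("language")
        or meta.get("language_info", {}).get("name")
        or ""
    )
    if language.lower() == "python":
        return [SOURCE, FINISHED]
    # Any other kernel (or unknown language) can only be published as rendered.
    return [FINISHED]


# Example: a notebook that uses an R kernel.
example = {"metadata": {"kernelspec": {"name": "ir", "language": "R"}}}
print(publishing_options(example))
```

To apply this to a real notebook, load the .ipynb file with json.load and pass the resulting dictionary to the function.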

For more details, refer to the rsconnect-jupyter documentation.

 

Can I schedule and email published Jupyter Notebooks?

Yes, Jupyter Notebooks that use the Python kernel can be published as a "document with source code", which will recreate the Python environment on the RStudio Connect server, including all of the Python packages.

These published Jupyter Notebooks can optionally be scheduled and emailed to users in RStudio Connect, similar to R Markdown reports.

 
