
Understanding the RStudio Connect Database


RStudio Connect relies on a database to store metadata. Out of the box, RStudio Connect comes with a SQLite database. If you are running RStudio Connect on a single server, you don't need anything else.

[Figure: RSC-Data-Storage-Requirements.001.png]

However, if you are running RStudio Connect on multiple servers, you will need to provide an external Postgres installation. RStudio Connect will manage all of the data and tables inside the Postgres installation.


[Figure: RSC-Data-Storage-Requirements.002.png]

Common Questions:

Can I use an existing Postgres installation for RStudio Connect?

Yes! RStudio Connect requires read/write access to a dedicated Postgres schema, but the schema can live in a Postgres installation that houses other schemas as well.

Can I use a different database provider like Oracle, MySQL, or SQL Server?

No, not at this time.

Do I need a dedicated DBA for the RStudio Connect database?

No. RStudio Connect manages all of the data inside the database, including data permissions. A DBA can assist with the initial setup and potentially with data backups, but consider this database an application requirement, not a part of your data organization.

What is stored in the database?

The RStudio Connect database stores metadata about RStudio Connect's content, users, and settings. The RStudio Connect database also stores usage metrics, application logs, and schedules.

The RStudio Connect database does not store the data used by the applications or reports hosted on RStudio Connect. For example, if you have a dashboard that shows sales forecasting data, the application code reads that data from your company data warehouse. The RStudio Connect database would not contain any sales data.


How big is the database?

The size of the database will depend on the amount of content and activity on the server. A good rule of thumb is to start with 1 GB of storage for a Postgres installation or 1 GB of disk space for the SQLite database (located at /var/lib/rstudio-connect/db by default).
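To see how close a SQLite installation is to that 1 GB starting point, the size of the database directory can be measured directly. The following is a minimal sketch, not part of RStudio Connect; the function names `directory_size_bytes` and `exceeds_budget` are hypothetical, and the 1 GB budget is simply the rule of thumb above.

```python
import os


def directory_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so linked files are not counted twice.
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def exceeds_budget(path, budget_bytes=1024 ** 3):
    """Return True if the directory is larger than the budget (default 1 GB)."""
    return directory_size_bytes(path) > budget_bytes


if __name__ == "__main__":
    # Default SQLite location named in this article; adjust if yours differs.
    db_dir = "/var/lib/rstudio-connect/db"
    size = directory_size_bytes(db_dir)
    print(f"{db_dir}: {size / 1024 ** 2:.1f} MB")
    if exceeds_budget(db_dir):
        print("Database directory is larger than the 1 GB rule of thumb.")
```

Reading the directory usually requires running the script as a user with access to the RStudio Connect data directory.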

 

Can I migrate from a single-node RStudio Connect server using SQLite to a multi-node RStudio Connect configuration with a Postgres installation?

Yes. See the admin guide chapter on migrations.

How should I handle data backups for the RStudio Connect database?

SQLite: RStudio Connect has built-in support for backing up the SQLite database while the RStudio Connect service is running; see the admin guide.

Postgres: Postgres has native support for backups. For example, a cron job can run the pg_dump command to create backups on a schedule.

RStudio Connect also relies on disk storage. The RStudio Connect database and disk storage should be backed up together so that they stay in sync.
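As a rough illustration of a scheduled Postgres backup, the sketch below builds a timestamped pg_dump command that a cron job could run. The host, user, database name, and output directory are placeholder assumptions, and the helper names `build_pg_dump_command` and `run_backup` are hypothetical. The password is assumed to come from a `~/.pgpass` file or the `PGPASSWORD` environment variable rather than from the command line.

```python
import datetime
import subprocess


def build_pg_dump_command(host, port, user, dbname, out_dir, now=None):
    """Build a pg_dump argument list that writes a timestamped custom-format dump."""
    now = now or datetime.datetime.now()
    stamp = now.strftime("%Y%m%d-%H%M%S")
    out_file = f"{out_dir}/connect-{stamp}.dump"
    return [
        "pg_dump",
        f"--host={host}",
        f"--port={port}",
        f"--username={user}",
        "--format=custom",      # compressed archive, restorable with pg_restore
        f"--file={out_file}",
        dbname,                 # database to dump (positional argument)
    ]


def run_backup(cmd):
    """Run the backup and raise CalledProcessError if pg_dump fails."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder connection details; replace with your own installation's values.
    cmd = build_pg_dump_command(
        "db.example.com", 5432, "connect", "connect", "/var/backups/connect"
    )
    print(" ".join(cmd))
    # Uncomment to actually run the backup:
    # run_backup(cmd)
```

A cron entry that invokes this script nightly would produce one dump file per run; pruning old dumps is left out of the sketch.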

What about the file storage requirements?

In addition to the database, RStudio Connect requires file storage that includes the source code deployed to RStudio Connect, rendered reports, log files, and R packages. The admin guide outlines the on-disk storage requirements.

For a comprehensive overview, please see the admin guide chapter on databases.
