Support

Install and Configure PostgreSQL for RStudio Server Pro

Follow

Who is this article for?

System Administrators who are unfamiliar with basic installation and configuration of the PostgreSQL database engine for RStudio Server Pro.

Note: The official name is "PostgreSQL" but often shortened to "Postgres".

This article provides a primer for getting started but is not intended to be a comprehensive explanation of best practices for configuring and operating Postgres in a production environment.

When is this needed?

RStudio Server Pro 1.4 (and above) uses a database for internal purposes. By default, RStudio uses SQLite, installing the necessary dependencies and automatically creating the database. 

When using RStudio's internal load balancer, however, a PostgreSQL database must be created and configured manually by the administrator as mentioned in chapter 15 of the Admin Guide.

Note: RStudio can use the PostgreSQL option without load-balancing enabled, which is helpful for configuration experimentation as described in this article.

Check for PostgreSQL

From a terminal, check if PostgreSQL is already installed and display the version.

psql --version

RStudio requires version 9.5 or above, so if an older version is found you'll need to upgrade it or install multiple versions (both beyond the scope of this article). If the psql command isn't found, then you'll need to install PostgreSQL.

Install PostgreSQL

Although many Linux distributions include a version of PostgreSQL in their package libraries, for this article we will use the instructions on the PostgreSQL website to download the most recent version: https://www.postgresql.org/download/. Select your platform, OS version, and desired PostgreSQL version, and matching installation commands will be provided.

Note: Steps in this article were tested using PostgreSQL version 13, the most recent at time of writing, on Red Hat/CentOS 7. Some details will vary on other database versions and Linux distributions.

Database Command Line Basics

After installation using recommended commands from the download page, only the "postgres" user can connect to the database engine. Connect as follows from a terminal on the same machine where you installed PostgreSQL:

sudo -i -u postgres psql

This puts you at the psql prompt. From here you can execute SQL commands against the active database and see the results, or use internal commands (often starting with a backslash).

For example, to see a list of "roles" (database user accounts in Postgres terminology), use the "\du" command:

\du

To view all databases:

\l

To exit back to your shell, use Ctrl+D, or "\q".

\q

Create Database Role (aka User) and Database

Using the default postgres role for RStudio's internal database is not recommended.

The first command below creates a new role named "rstudio", password "test", with ability to login and create databases.

Second command creates an empty database named "rstudiodb" owned by "rstudio". Punctuation and capitalization are significant in both commands.

sudo -i -u postgres psql -c "CREATE ROLE rstudio CREATEDB LOGIN PASSWORD 'test';"
sudo -i -u postgres psql -c "CREATE DATABASE rstudiodb WITH OWNER = rstudio;"

To confirm successful creation of database "rstudiodb" owned by "rstudio", list the databases and look for the one just created.

sudo -i -u postgres psql -c "\l"

Configure Connectivity

PostgreSQL's default security settings may require adjustment before the account just created can be used to connect to PostgreSQL on localhost. To see if this is the case, attempt to connect (enter the password used above, e.g. "test", when prompted):

psql -U rstudio -W rstudiodb

If this works and you are placed at the PostgreSQL prompt, proceed to next section. If you receive an error such as below, then continue this section to fix it.

psql: error: FATAL: Peer authentication failed for user "rstudio"

Connectivity is configured via the pg_hba.conf file whose location is dependent on the PostgreSQL version, and locked down and only visible and editable for the user "postgres". Open pg_hba.conf in a text editor, for example:

sudo vim /var/lib/pgsql/13/data/pg_hba.conf

For our scenario, where we are going to configure RStudio to communicate with PostgreSQL on the same machine, the relevant line is the entry for "local all". Change it from using the method "peer" to either scram-sha-256 or md5 (depends on version, use same value as other existing entries). For PostgreSQL 13:

# TYPE  DATABASE        USER   ADDRESS        METHOD
local all all scram-sha-256
host all all 127.0.0.1/32 scram-sha-256
host all all ::1/128 scram-sha-256
local replication all peer
host replication all 127.0.0.1/32 scram-sha-256
host replication all ::1/128 scram-sha-256

Save changes then restart the database (version and OS-specific):

sudo systemctl restart postgresql-13

Try the connection again, it should now work.

psql -U rstudio -W rstudiodb

Configure RStudio

After installing RStudio Server Pro, edit the /etc/rstudio/database.conf file as follows:

provider=postgresql
database=rstudiodb
username=rstudio
password=test

Save, and then stop RStudio Server and verify the configuration:

sudo rstudio-server stop
sudo rstudio-server verify-installation

You should receive a warning about using a plain-text password (as expected); see the Admin Guide for details on how to eliminate this warning via password encryption.

Next Steps

  • More background information on RStudio product databases is available in Understanding the RStudio Product Databases.
  • Consult the RStudio Admin Guide for additional configuration choices and considerations such as password encryption.
  • Consult the PostgreSQL documentation for details on configuring connectivity to PostgreSQL when it is running on a separate machine, as will be the case in a real load-balanced RStudio Server Pro configuration.

Conclusion

In this article we learned basic installation and configuration of PostgreSQL for use by RStudio Server Pro's internal database storage.

Comments