Who is this article for?
System Administrators who are unfamiliar with basic installation and configuration of the PostgreSQL database engine for RStudio Workbench (previously RStudio Server Pro).
Note: The official name is "PostgreSQL" but often shortened to "Postgres".
This article provides a primer for getting started but is not intended to be a comprehensive explanation of best practices for configuring and operating Postgres in a production environment.
When is this needed?
RStudio Workbench 1.4 (and above) uses a database for internal purposes. By default, RStudio uses SQLite, installing the necessary dependencies and automatically creating the database.
When using RStudio's internal load balancer, however, a PostgreSQL database must be created and configured manually by the administrator as mentioned in the Database section of the Admin Guide.
Note: RStudio can use the PostgreSQL option without load-balancing enabled, which is helpful for configuration experimentation as described in this article.
Check for PostgreSQL
From a terminal, check if PostgreSQL is already installed and display the version.
RStudio requires version 9.5 or above, so if an older version is found you'll need to upgrade it or install multiple versions (both beyond the scope of this article). If the psql command isn't found, then you'll need to install PostgreSQL.
Although many Linux distributions include a version of PostgreSQL in their package libraries, for this article we will use the instructions on the PostgreSQL website to download the most recent version: https://www.postgresql.org/download/. Select your platform, OS version, and desired PostgreSQL version, and matching installation commands will be provided.
Note: Steps in this article were tested using PostgreSQL version 13, the most recent at time of writing, on Red Hat/CentOS 7. Some details will vary on other database versions and Linux distributions.
Database Command Line Basics
After installation using recommended commands from the download page, only the "postgres" user can connect to the database engine. Connect as follows from a terminal on the same machine where you installed PostgreSQL:
sudo -i -u postgres psql
This puts you at the psql prompt. From here you can execute SQL commands against the active database and see the results, or use internal commands (often starting with a backslash).
For example, to see a list of "roles" (database user accounts in Postgres terminology), use the "\du" command:
To view all databases:
To exit back to your shell, use Ctrl+D, or "\q".
Create Database Role (aka User) and Database
Using the default postgres role for RStudio's internal database is not recommended.
The first command below creates a new role named "rstudio", password "test", with ability to login and create databases.
Second command creates an empty database named "rstudio" owned by "rstudio". Punctuation and capitalization are significant in both commands.
sudo -i -u postgres psql -c "CREATE ROLE rstudio CREATEDB LOGIN PASSWORD 'test';"
sudo -i -u postgres psql -c "CREATE DATABASE rstudio WITH OWNER = rstudio;"
To confirm successful creation of database "rstudio" owned by "rstudio", list the databases and look for the one just created.
sudo -i -u postgres psql -c "\l"
PostgreSQL's default security settings may require adjustment before the account just created can be used to connect to PostgreSQL on localhost. To see if this is the case, attempt to connect (enter the password used above, e.g. "test", when prompted):
psql -U rstudio -W rstudio
If this works and you are placed at the PostgreSQL prompt, proceed to next section. If you receive an error such as below, then continue this section to fix it.
psql: error: FATAL: Peer authentication failed for user "rstudio"
Connectivity is configured via the pg_hba.conf file whose location is dependent on the PostgreSQL version, and locked down and only visible and editable for the user "postgres".
To locate pg_hba.conf, you can ask the database engine.
sudo -i -u postgres psql -c 'show hba_file';
Open pg_hba.conf in a text editor, for example:
sudo vim /var/lib/pgsql/13/data/pg_hba.conf
For our scenario, where we are going to configure RStudio to communicate with PostgreSQL on the same machine, the relevant line is the entry for "local all". Change it from using the method "peer" to either scram-sha-256 or md5 (depends on version, use same value as other existing entries). For PostgreSQL 13:
# TYPE DATABASE USER ADDRESS METHOD
local all all scram-sha-256
host all all 127.0.0.1/32 scram-sha-256
host all all ::1/128 scram-sha-256
local replication all peer
host replication all 127.0.0.1/32 scram-sha-256
host replication all ::1/128 scram-sha-256
Save changes then restart the database (version and OS-specific):
sudo systemctl restart postgresql-13
Try the connection again, it should now work.
psql -U rstudio -W rstudio
Note: In a production environment, you will likely want to install RStudio Workbench on a different server than PostgreSQL. For this example, however, we will install them on the same server.
Install RStudio Workbench on the same server as PostgreSQL, and edit the /etc/rstudio/database.conf file as follows:
Save, and then stop RStudio Workbench and verify the configuration:
sudo rstudio-server stop
sudo rstudio-server verify-installation
You should receive a warning about using a plain-text password (as expected); see the Admin Guide for details on how to eliminate this warning via password encryption.
- More background information on RStudio product databases is available in Understanding the RStudio Product Databases.
- Consult the RStudio Admin Guide for additional configuration choices and considerations such as password encryption.
- Consult the PostgreSQL documentation for details on configuring connectivity to PostgreSQL when it is running on a separate machine, as will be the case in a real load-balanced RStudio Workbench configuration. There will be additional configuration needed to enable connection to a Postgres database, via password-based authentication, from a separate server.
In this article we learned basic installation and configuration of PostgreSQL for use by RStudio Workbench's internal database storage.