8 Auditing and Monitoring
8.1 Auditing Configuration
8.1.1 R Console Auditing
RStudio Workbench can be optionally configured to audit all R console activity by writing console input and output to a central location (the /var/lib/rstudio-server/audit/r-console directory by default). This feature can be enabled using the audit-r-console setting. For example:
/etc/rstudio/rserver.conf
audit-r-console=input
This will audit all R console input. If you wish to record both console input and output then you can use the all setting. For example:
/etc/rstudio/rserver.conf
audit-r-console=all
Note that if you choose to record both input and output you’ll need considerably more storage available than if you record input only. See the Storage Options section below for additional discussion of storage requirements and configuration.
8.1.1.1 Data Format
The R console activity for each user is written into individual files within the r-console data directory (by default /var/lib/rstudio-server/audit/r-console). The following fields are included:
session_id | Unique identifier for R session where this action occurred. |
project | Path to RStudio project directory if the action occurred within a project. |
pid | Unix process ID where this console action occurred. |
username | Unix user which executed this console action. |
timestamp | Timestamp of action in milliseconds since the epoch. |
type | Console action type (prompt, input, output, or error). |
data | Console data associated with this action (e.g. output text). |
The session_id field refers to a concurrent R session as described in the section on Multiple R Sessions (i.e. it can span multiple projects and/or pids).
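Because the timestamp field is recorded in milliseconds since the epoch, it usually has to be converted before it is readable. The following is a minimal Python sketch of that conversion; the function name and the sample value are made up for illustration and are not part of RStudio Workbench.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def audit_timestamp_to_utc(ms):
    """Convert an audit timestamp (milliseconds since the epoch) to a UTC datetime."""
    # timedelta arithmetic keeps millisecond precision without float rounding.
    return EPOCH + timedelta(milliseconds=int(ms))

# Made-up sample value, not taken from a real audit log.
print(audit_timestamp_to_utc(1609459200000).isoformat())
```

The value is parsed with int() so that it also works when the field has been read as text, for example from a CSV file.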
The default format for the log file is CSV (Comma Separated Values). It’s also possible to write the data to Newline Delimited JSON by using the audit-r-console-format option. For example:
audit-r-console-format=json
Note that when using the JSON format the entire file is not a valid JSON object but rather each individual line is one. This follows the Newline Delimited JSON specification supported by several libraries including the R jsonlite package.
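As an illustration of reading such a file line by line, here is a minimal Python sketch. It assumes that the JSON keys match the field names listed above; the sample records, their value types, and the helper names are hypothetical and only stand in for a real log file.

```python
import json

# Hypothetical sample records; a real log is written by RStudio Workbench.
SAMPLE_LINES = [
    '{"session_id": "a1b2", "project": "", "pid": 4242, "username": "alice", '
    '"timestamp": 1609459200000, "type": "input", "data": "summary(cars)"}',
    '{"session_id": "a1b2", "project": "", "pid": 4242, "username": "alice", '
    '"timestamp": 1609459200500, "type": "output", "data": "speed dist"}',
]

def read_ndjson(lines):
    """Parse newline-delimited JSON: one object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]

def console_inputs(records):
    """Return the data of every record whose type is 'input'."""
    return [r["data"] for r in records if r.get("type") == "input"]

records = read_ndjson(SAMPLE_LINES)
print(console_inputs(records))
```

To process a real log, pass an open file object to read_ndjson instead of the sample list, since iterating over a file yields one line at a time.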
8.1.1.2 Storage Options
You can customize both the location where audit data is written as well as the maximum amount of data to log per-user (by default this is 50 MB). To specify the root directory for audit data you use the audit-data-path setting. For example:
/etc/rstudio/rserver.conf
audit-data-path=/audit-data
Note that this path affects the location of both R console auditing and R session auditing data.
To specify the maximum amount of data to write to an individual user’s R console log file you use the audit-r-console-user-limit-mb setting. For example:
/etc/rstudio/rserver.conf
audit-r-console-user-limit-mb=100
The default maximum R console log file size is 50 megabytes per-user. To configure no limit to the size of files which can be written you set the value to 0, for example:
/etc/rstudio/rserver.conf
audit-r-console-user-limit-mb=0
If you wish for RStudio to automatically roll the log files once the maximum size is reached, set the audit-r-console-user-limit-months setting. For example:
/etc/rstudio/rserver.conf
audit-r-console-user-limit-months=2
This will cause log files to be rolled over once the maximum size is reached, and only two months of data will be kept. Note that this setting is not set by default.
Note that if the month limit is not set, then log files will not be rolled automatically. Depending on the number of users and their activity level this means that you should either create a scheduled (e.g. cron) job to periodically move the files off the server onto auxiliary storage and/or ensure that the volume they are stored on has sufficient capacity.
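One way to approach the scheduled job mentioned above is a small script that moves files that have not been modified recently to another directory. The following Python sketch is only an illustration: the function name, the 30-day threshold, and the archive path in the comment are assumptions, and it deliberately skips recently modified files so that logs still being written are left alone.

```python
import shutil
import time
from pathlib import Path

def archive_old_logs(src_dir, dest_dir, min_age_days=30, now=None):
    """Move regular files in src_dir not modified for min_age_days into dest_dir.

    Returns the sorted names of the files that were moved.
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in Path(src_dir).iterdir():
        # Skip directories and anything modified after the cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            shutil.move(str(path), str(dest / path.name))
            moved.append(path.name)
    return sorted(moved)

# Example call (the destination path is hypothetical):
# archive_old_logs("/var/lib/rstudio-server/audit/r-console", "/mnt/archive/r-console")
```

A script like this could be run from cron by a user with write access to both directories.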
8.1.2 R Session Auditing
RStudio Workbench can be optionally configured to write an audit log of session related events (e.g. login/logout, session start/suspend/exit) to a central location (the /var/lib/rstudio-server/audit/r-sessions directory by default). This feature can be enabled using the audit-r-sessions setting. For example:
/etc/rstudio/rserver.conf
audit-r-sessions=1
Note that this is enabled by default if you are using named user licenses.
Note: Session auditing is only supported for RStudio IDE R Sessions and is not currently supported for Jupyter or VS Code sessions.
8.1.2.1 Data Format
The R session event log is written by default to the file at /var/lib/rstudio-server/audit/r-sessions/r-sessions.csv. The following fields are included:
pid | Unix process ID the event is associated with (for auth events this will be the main rserver process, for session events the rsession process). |
username | Unix user that the event is associated with. |
timestamp | Timestamp of event in milliseconds since the epoch. |
type | Event type (see documentation on event types below). |
data | Administrative user that initiated event (only applies to admin events and auth_login for login-as-user by admin). |
The following values are valid for the event type field:
auth_login | User logged in to RStudio Workbench |
auth_throttled | User temporarily blocked due to multiple login attempts (as defined by the option auth-sign-in-throttle-seconds) |
auth_unlicensed | User is locked or there is no license available |
auth_license_failed | User blocked due to a failure in obtaining a license |
auth_logout | User logged out of RStudio Workbench |
auth_login_failed | User login attempt failed because a local account may not exist |
session_start | R session started |
session_suicide | R session exiting due to suicide (internal error) |
session_suspend | R session exiting due to suspend |
session_quit | R session exiting due to user quit |
session_exit | R session exited |
session_admin_suspend | Administrator attempt to suspend R session |
session_admin_terminate | Administrator attempt to terminate R session |
The default format for the log file is CSV (Comma Separated Values). It’s also possible to write the data to Newline Delimited JSON by using the audit-r-sessions-format option. For example:
audit-r-sessions-format=json
Note that when using the JSON format the entire file is not a valid JSON object but rather each individual line is one. This follows the Newline Delimited JSON specification supported by several libraries including the R jsonlite package.
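A JSON-formatted session log lends itself to simple per-event summaries. The Python sketch below counts auth_login_failed events per user. It assumes that the JSON keys match the field names listed above; the sample records and the function name are hypothetical.

```python
import json
from collections import Counter

# Hypothetical sample records; a real log is written by RStudio Workbench.
SAMPLE_LINES = [
    '{"pid": 1201, "username": "alice", "timestamp": 1609459200000, "type": "auth_login_failed", "data": ""}',
    '{"pid": 1201, "username": "alice", "timestamp": 1609459205000, "type": "auth_login_failed", "data": ""}',
    '{"pid": 1201, "username": "bob", "timestamp": 1609459210000, "type": "auth_login", "data": ""}',
]

def failed_logins_by_user(lines):
    """Count auth_login_failed events per username in newline-delimited JSON."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        event = json.loads(line)
        if event.get("type") == "auth_login_failed":
            counts[event.get("username")] += 1
    return counts

print(failed_logins_by_user(SAMPLE_LINES).most_common())
```

The same loop can be adapted to any of the event types listed above by changing the type that is compared.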
8.1.2.2 Storage Options
You can customize both the location where audit data is written as well as the maximum amount of R session event data to log (by default this is 1 GB). To specify the root directory for audit data you use the audit-data-path setting. For example:
/etc/rstudio/rserver.conf
audit-data-path=/audit-data
Note that this path affects the location of both R console auditing and R session auditing data.
To specify the maximum amount of R session event data to log you use the audit-r-sessions-limit-mb setting. For example:
/etc/rstudio/rserver.conf
audit-r-sessions-limit-mb=2048
The default maximum R session event log file size is 1 GB (1024 MB). To configure no limit to the size of files which can be written you set the value to 0, for example:
/etc/rstudio/rserver.conf
audit-r-sessions-limit-mb=0
If you wish for RStudio to automatically roll the log files once the maximum size is reached, set the audit-r-sessions-limit-months setting. The default is 13 months. To set it explicitly, for example:
/etc/rstudio/rserver.conf
audit-r-sessions-limit-months=13
This will cause log files to be rolled over once the maximum size is reached, and only thirteen months of data will be kept. We do not recommend you change this setting if using named user licenses.
Note that if the month limit is not set, then log files will not be rolled automatically. This means that you should either create a scheduled (e.g. cron) job to periodically move the file off the server onto auxiliary storage and/or ensure that the volume that it is stored on has sufficient capacity.
In any case, the amount of data written to the R session event log file is not large (less than 1 KB per session) so a large number of session events can be stored within the default 1 GB maximum log file size.
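The capacity claim can be checked with a short calculation. This Python sketch assumes 1 GB = 1024 MB (as stated above) and 1 MB = 1024 KB, and treats 1 KB per session as an upper bound, so the result is a lower bound on the number of sessions that fit.

```python
LIMIT_MB = 1024          # default audit-r-sessions-limit-mb (1 GB)
MAX_KB_PER_SESSION = 1   # upper bound on log data per session, from the text

def min_sessions_before_limit(limit_mb=LIMIT_MB, kb_per_session=MAX_KB_PER_SESSION):
    """Lower bound on the number of sessions logged before the size limit is hit."""
    return (limit_mb * 1024) // kb_per_session

print(min_sessions_before_limit())  # 1048576
```

In other words, the default limit leaves room for at least about one million sessions.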
8.2 Monitoring Configuration
8.2.1 System and Per-User Resources
RStudio Workbench monitors the use of resources (CPU, memory, etc.) on both a per-user and system wide basis. By default, monitoring data is written to a set of RRD (http://oss.oetiker.ch/rrdtool/) files and can be viewed using the Administrative Dashboard.
The storage of system monitoring data requires about 20 MB of disk space and the storage of user monitoring data requires about 3.5 MB per user. This data is stored by default at /var/lib/rstudio-server/monitor. If you have a large number of users you may wish to specify an alternate volume for monitoring data. You can do this using the monitor-data-path setting. For example:
/etc/rstudio/rserver.conf
monitor-data-path=/monitor-data
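To judge whether an alternate volume is needed, the approximate figures above (20 MB for system data plus 3.5 MB per user) can be turned into a rough estimate. The user counts in this Python sketch are arbitrary examples.

```python
SYSTEM_MB = 20.0     # system monitoring data, approximate figure from the text
PER_USER_MB = 3.5    # user monitoring data per user, approximate figure from the text

def monitor_storage_mb(users):
    """Approximate disk space in MB needed for RRD monitoring data."""
    return SYSTEM_MB + PER_USER_MB * users

for users in (50, 500, 5000):
    print(f"{users:>5} users: about {monitor_storage_mb(users):,.0f} MB")
```

For 500 users this gives roughly 1,770 MB, so user data dominates the total once there are more than a handful of users.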
You also might wish to disable monitoring with RRD entirely. You can do this using the monitor-rrd-enabled setting. For example:
/etc/rstudio/rserver.conf
monitor-rrd-enabled=0
Note that changes to the configuration will not take effect until the server is restarted.
8.2.1.1 Analyzing RRD files
The RRD files powering RStudio’s Administrative Dashboard are available for your own analysis, too. You can find them in /var/lib/rstudio-server/monitor/rrd (unless you’ve changed monitor-data-path as described above); they store all the metrics you can see on the dashboard, so you can use the information for your own reports and insights.
More information on how to read and visualize RRD data from R is available in the following blog post:
8.2.2 Using Graphite
If you are managing several servers it might be convenient to send server monitoring data to a centralized database and graphing facility as opposed to local RRD files. You can do this by configuring the server to send monitoring data to Graphite (or any other engine compatible with the Carbon protocol). This can be done in addition to or entirely in place of RRD.
There are four settings that control interaction with Graphite:
monitor-graphite-enabled | Write monitoring data to Graphite (defaults to 0) |
monitor-graphite-host | Host running Graphite (defaults to 127.0.0.1) |
monitor-graphite-port | Port Graphite is listening on (defaults to 2003) |
monitor-graphite-client-id | Optional client ID for sender |
For example, to enable Graphite monitoring on a remote host with the default Graphite port you would use these settings:
/etc/rstudio/rserver.conf
monitor-graphite-enabled=1
monitor-graphite-host=134.47.22.6
If you are using a service like hostedgraphite.com that requires that you provide an API key as part of reporting metrics you can use the monitor-graphite-client-id setting. For example:
/etc/rstudio/rserver.conf
monitor-graphite-enabled=1
monitor-graphite-host=carbon.hostedgraphite.com
monitor-graphite-client-id=490662a4-1d8c-11e5-b06d-000c298f3d04
Note that changes to the configuration will not take effect until the server is restarted.
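Before pointing the server at a Graphite host, it can help to confirm that the host accepts data on the configured port. Carbon's plaintext protocol expects one line per data point in the form "path value timestamp", with the timestamp in seconds since the epoch. The Python sketch below builds such a line and can optionally send it. The metric path is made up, and prefixing the client ID to the path is an assumption about how a hosted service identifies the sender; neither is taken from RStudio Workbench, whose own metric names are not described here.

```python
import socket
import time

def carbon_line(path, value, timestamp=None, client_id=None):
    """Format one data point for Carbon's plaintext protocol."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    # Assumption: a client ID, when given, is sent as a prefix of the metric path.
    full_path = f"{client_id}.{path}" if client_id else path
    return f"{full_path} {value} {ts}\n"

def send_test_metric(host, port=2003, client_id=None):
    """Send a single made-up data point to check connectivity."""
    line = carbon_line("rstudio.test.connectivity", 1, client_id=client_id)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))

# Only format a line here; nothing is sent unless send_test_metric() is called.
print(carbon_line("rstudio.test.connectivity", 1, timestamp=1609459200), end="")
```

If send_test_metric raises a connection error, the host or port settings (or a firewall between the two machines) are the first things to check.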
8.3 Server Health Checks
8.3.1 Enabling Health Checks
You may wish to periodically poll RStudio Workbench to ensure that it’s still responding to requests as well as to examine various indicators of server load. You can enable a health check endpoint using the server-health-check-enabled setting. For example:
/etc/rstudio/rserver.conf
server-health-check-enabled=1
After restarting the server, the following health-check endpoint will be available:
http://<server-address-and-port>/health-check
By default, the output of the health check will appear as follows:
active-sessions: 1
idle-seconds: 0
cpu-percent: 0.0
memory-percent: 64.2
swap-percent: 0.0
load-average: 4.1
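The default output is a list of "name: value" lines, which is straightforward to parse from a monitoring script. The following Python sketch parses the sample output shown above; the function name and the 90 percent memory threshold are illustrative choices, not RStudio Workbench settings.

```python
SAMPLE_RESPONSE = """active-sessions: 1
idle-seconds: 0
cpu-percent: 0.0
memory-percent: 64.2
swap-percent: 0.0
load-average: 4.1
"""

def parse_health_check(body):
    """Parse 'name: value' lines into a dict mapping names to floats."""
    metrics = {}
    for line in body.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # ignore lines without a colon, such as blank lines
            metrics[name.strip()] = float(value)
    return metrics

metrics = parse_health_check(SAMPLE_RESPONSE)
# The threshold is an arbitrary example value.
if metrics["memory-percent"] > 90:
    print("memory usage is high")
print(metrics)
```

This parser only applies to the default response format; a customized template needs a parser that matches its own format.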
8.3.2 Customizing Responses
The response to the health check is determined by processing a template that includes several variables. The default template is:
active-sessions: #active-sessions#
idle-seconds: #idle-seconds#
cpu-percent: #cpu-percent#
memory-percent: #memory-percent#
swap-percent: #swap-percent#
load-average: #load-average#
You can customize this template to return an alternate format (e.g. XML or JSON) that is parse-able by an external monitoring system. To do this you simply create a template and copy it to /etc/rstudio/health-check. For example, an XML format:
/etc/rstudio/health-check
<?xml version="1.0" encoding="UTF-8"?>
<health-check>
<active-sessions>#active-sessions#</active-sessions>
<idle-seconds>#idle-seconds#</idle-seconds>
<cpu-percent>#cpu-percent#</cpu-percent>
<memory-percent>#memory-percent#</memory-percent>
<swap-percent>#swap-percent#</swap-percent>
<load-average>#load-average#</load-average>
</health-check>
Or a template for a Prometheus endpoint. Prometheus is an open-source systems monitoring and alerting toolkit with its own text-based input format:
/etc/rstudio/health-check
# HELP active_sessions health_check metric Active RStudio sessions
# TYPE active_sessions gauge
active_sessions #active-sessions#
# HELP idle_seconds health_check metric Time since active RStudio sessions
# TYPE idle_seconds gauge
idle_seconds #idle-seconds#
# HELP cpu_percent health_check metric cpu (percentage)
# TYPE cpu_percent gauge
cpu_percent #cpu-percent#
# HELP memory_percent health_check metric memory used (percentage)
# TYPE memory_percent gauge
memory_percent #memory-percent#
# HELP swap_percent health_check metric swap used (percentage)
# TYPE swap_percent gauge
swap_percent #swap-percent#
# HELP load_average health_check metric cpu load average
# TYPE load_average gauge
load_average #load-average#
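When writing a custom template it can be useful to preview the result before installing it. The Python sketch below imitates the substitution of #name# variables with sample values. It is a local approximation for previewing only and is not the server's own template processing; the regular expression and function name are assumptions.

```python
import re

# Matches #name# where name is lowercase letters and hyphens, e.g. #cpu-percent#.
# Comment lines such as "# HELP ..." do not match because "#" is followed by a space.
PLACEHOLDER = re.compile(r"#([a-z][a-z-]*)#")

def render_template(template, values):
    """Replace each #name# with values[name]; unknown names are left unchanged."""
    def substitute(match):
        name = match.group(1)
        return str(values[name]) if name in values else match.group(0)
    return PLACEHOLDER.sub(substitute, template)

TEMPLATE = (
    "# HELP active_sessions health_check metric Active RStudio sessions\n"
    "# TYPE active_sessions gauge\n"
    "active_sessions #active-sessions#\n"
)

print(render_template(TEMPLATE, {"active-sessions": 1}), end="")
```

Leaving unknown names untouched makes typos in variable names easy to spot in the preview.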
8.3.3 Changing the URL
It’s also possible to customize the URL used for health checks. RStudio Workbench will use the first file whose name begins with health-check in the /etc/rstudio directory as the template, and require that the full file name be specified in the URL. For example, a health check template located at the following path:
/etc/rstudio/health-check-B64C900E
would be accessed using this URL:
http://<server-address-and-port>/health-check-B64C900E
Note that changes to the health check template will not take effect until the server is restarted.
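The mapping from template file name to URL can be expressed in a few lines, which is convenient when a monitoring script has to stay in sync with the template path. In this Python sketch the server address is a placeholder and the function name is hypothetical.

```python
from pathlib import PurePosixPath

def health_check_url(server, template_path="/etc/rstudio/health-check"):
    """Build the health check URL from the server address and template file path."""
    name = PurePosixPath(template_path).name
    if not name.startswith("health-check"):
        raise ValueError("template file name must begin with 'health-check'")
    return f"{server.rstrip('/')}/{name}"

# Placeholder server address; substitute your own server address and port.
print(health_check_url("http://localhost:8787", "/etc/rstudio/health-check-B64C900E"))
```

The check on the file name mirrors the rule above that only files whose names begin with health-check are considered as templates.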