Posit Workbench AWS Single Server - Robust

You can configure Posit Workbench to run on a single server using an external database and a networked file system. Both Workbench and user sessions are hosted on the same server. Using external components is optional for a single server implementation, but this approach provides a more robust architecture that is in line with the best practices and existing processes of many organizations. This architecture also offers flexibility to more easily scale up in the future without needing to migrate the database or application files.

Architectural overview

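This architecture uses the following components:

  • An EC2 instance that hosts both Workbench and user sessions.
  • An RDS PostgreSQL database for the Workbench product database.
  • An FSx file system (FSx for OpenZFS or FSx for Lustre) for shared storage.
  • An Application Load Balancer (ALB) that provides public ingress to the Workbench instance.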

Architecture diagram

Sizing and performance

The performance of this architecture is determined by the EC2 instance type, specifically its CPU and RAM characteristics, and by the performance of the attached RDS database and FSx file system.

A more detailed overview of node sizing is available in the Architecture considerations section.

Node

Ensure the EC2 instance is large enough to handle peak usage. The size and type of instance depend on the needs and workloads of end users. For the best experience, estimate the number of concurrent sessions expected and the compute resources required for each session, focusing on CPU and memory. Choose an instance type that supports the maximum number of concurrent sessions at any given time. In practice, a single-server implementation works best for small teams of approximately 10 users.

CPU

Workbench alone consumes only a small amount of system resources. However, when estimating needs, reserve two to six cores for Workbench itself, in addition to accounting for the Python and R processes running in user sessions or Workbench Jobs. Review the considerations in the Number of users section to estimate the total number of CPU cores needed.

RAM

User sessions and Workbench Jobs consume memory. The number of users and how they typically work with their data dictate the amount of memory needed. Review the Memory and disk requirements section to estimate the amount of memory needed.
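The CPU and RAM guidance above can be turned into a rough back-of-the-envelope calculation. The following is a minimal sketch, not an official sizing tool: the `SessionProfile` class, the `estimate_node` function, and the `os_memory_gb` default are hypothetical names and values introduced for illustration. Only the reservation of two to six cores for Workbench comes from the text above; the per-session figures are assumptions you should replace with measurements from your own users.

```python
import math
from dataclasses import dataclass


@dataclass
class SessionProfile:
    """Hypothetical per-session resource needs; measure these for your own team."""
    cores: float      # CPU cores a typical session or Workbench Job uses
    memory_gb: float  # RAM a typical session or Workbench Job uses


def estimate_node(peak_sessions: int, profile: SessionProfile,
                  workbench_cores: int = 2,
                  os_memory_gb: float = 4.0) -> tuple[int, int]:
    """Return (cores, memory_gb) needed to host peak_sessions concurrently.

    workbench_cores: cores reserved for Workbench itself (two to six per the text).
    os_memory_gb: assumed allowance for the OS and Workbench services.
    """
    if peak_sessions < 0:
        raise ValueError("peak_sessions must be non-negative")
    cores = workbench_cores + math.ceil(peak_sessions * profile.cores)
    memory_gb = math.ceil(os_memory_gb + peak_sessions * profile.memory_gb)
    return cores, memory_gb


# Example: 10 concurrent sessions, each assumed to need 1 core and 4 GB of RAM.
cores, memory_gb = estimate_node(10, SessionProfile(cores=1.0, memory_gb=4.0))
print(f"cores={cores}, memory_gb={memory_gb}")
```

Compare the result against the available EC2 instance types and choose one that meets or exceeds both numbers at peak usage.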

Database

This architecture requires a PostgreSQL database for the Workbench product database. Workbench places little load on the database: tests with up to 5,000 concurrent sessions on a db.t3.micro instance showed no notable performance degradation.

Storage

For this architecture, we recommend FSx for OpenZFS or FSx for Lustre. We do not recommend Regional EFS or Single Zone EFS, as these file systems may encounter performance issues.

When considering which FSx file system to use:

  • FSx for OpenZFS offers lower latency and can be deployed in multiple Availability Zones to increase system resiliency. Because it does not support extended POSIX ACLs, FSx for OpenZFS does not support Workbench Project Sharing.
  • FSx for Lustre offers higher maximum capacity and potentially higher bandwidth, but it is not available in multiple Availability Zones. Choose FSx for Lustre if you want to use Workbench Project Sharing.
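The trade-off described in the two points above can be written down as a small decision rule. This is only an illustrative sketch: `choose_fsx` is a hypothetical helper, and defaulting to FSx for OpenZFS when neither requirement applies is an assumption based on its lower latency, not a recommendation from the source.

```python
def choose_fsx(needs_project_sharing: bool, needs_multi_az: bool) -> str:
    """Pick an FSx file system from the two requirements discussed above."""
    if needs_project_sharing and needs_multi_az:
        # Per the text, Project Sharing requires FSx for Lustre,
        # which is not available in multiple Availability Zones.
        raise ValueError("No listed option supports both Project Sharing and Multi-AZ")
    if needs_project_sharing:
        return "FSx for Lustre"
    # Assumption: prefer OpenZFS otherwise because of its lower latency.
    return "FSx for OpenZFS"


print(choose_fsx(needs_project_sharing=True, needs_multi_az=False))
```

Capacity and bandwidth requirements are not modeled here and may still favor FSx for Lustre for some teams.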

Workbench’s file system usage for configuration and state storage is modest, likely less than a few GB for a production system. Therefore, the size of your FSx volume depends almost entirely on end-user usage patterns. Some data science teams store very little in their home directories and only need a few GB per person. Other teams may download large files into their home directories and need much more. Administrators should consult with their user groups to determine the appropriate size.
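As a starting point for that conversation, the estimate can be sketched as a simple calculation. The function below is hypothetical, and its defaults (5 GB for Workbench configuration and state, 25% headroom) are illustrative assumptions rather than documented values.

```python
import math


def estimate_storage_gb(users: int, gb_per_user: float,
                        workbench_gb: float = 5.0,
                        headroom: float = 0.25) -> int:
    """Estimate shared storage in GB for home directories plus Workbench data.

    workbench_gb: assumed allowance for Workbench configuration and state.
    headroom: assumed fractional buffer for growth (0.25 means 25%).
    """
    if users < 0 or gb_per_user < 0 or workbench_gb < 0 or headroom < 0:
        raise ValueError("inputs must be non-negative")
    raw_gb = workbench_gb + users * gb_per_user
    return math.ceil(raw_gb * (1 + headroom))


# Example: 10 users who each keep roughly 20 GB in their home directory.
print(estimate_storage_gb(users=10, gb_per_user=20))
```

Teams that download large datasets into their home directories should raise `gb_per_user` accordingly.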

FSx for OpenZFS volumes are provisioned in 1.2 TB chunks. This amount can be shared across all user home directories, Workbench configuration, and Workbench state storage.

For more information on the differences between file systems, refer to the Cloud-specific recommendations in the Posit Team Storage documentation.

Load balancer

This architecture uses an AWS Application Load Balancer (ALB) to provide public ingress to the Workbench instance. A single-server architecture does not provide redundancy, but using a load balancer allows additional nodes to be added later.

Configuration details

Networking

The architecture is implemented in a virtual private cloud (VPC) with both public and private subnets. The RDS database instance, the FSx file system, and the EC2 instance are located in the private subnets, and ingress to the EC2 instance is managed through the ALB.

Resiliency and availability

This architecture does not provide high availability or fault tolerance by itself. If the EC2 instance fails or there are outages within the Availability Zone, Workbench is unavailable until service is restored. Because this architecture uses an ALB, additional EC2 instances can be added to improve availability.

Unlike the standard AWS Single Server architecture, this architecture uses a PostgreSQL database for the product metadata and a shared file system for storage, which makes it more robust. It positions Workbench for easier scaling and allows for more robust backup and recovery strategies. Review the Database and Mount shared storage documentation for more information.
