Load Balanced
Posit Connect can be configured to run on Azure in a load-balanced, high availability (HA) cluster. This architecture provides fault tolerance for Connect, ensuring that the service remains available even if an individual component fails.
This architecture is best used when at least one of the following applies:
- There are internal high availability requirements
- Cost optimization is a priority, while still maintaining a resilient architecture
Architectural overview
This implementation of Posit Connect uses a high availability configuration and includes the following components:
- Azure Application Gateway to route requests to the Connect instances.
- Two Azure virtual machine instances running Connect in a High Availability configuration.
- Azure Database for PostgreSQL, serving as the application database for Connect.
- Azure NetApp Files, a networked file system used to store file data, which is mounted across the Connect services.
Architecture diagram
Nodes
This architecture utilizes a high availability configuration with two virtual machine instances running Posit Connect. During our performance tests, we used two Standard_D8_v5 instances running Ubuntu 22.04.
The virtual machine instances in an HA configuration require the following:
- Matching versions of Posit Connect.
- Shared configuration file for every node.
- All the necessary versions of Python, R, and Quarto.
For detailed instructions on setting up this configuration, refer to the HA checklist in the Connect Admin Guide: HA Checklist.
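As an illustration only (not part of the official checklist), the following Python sketch compares the installed Connect package version across the nodes before they are placed behind the load balancer. The hostnames, SSH access, and the `rstudio-connect` package name are assumptions for this sketch; adapt them to your environment.

```python
# Hypothetical consistency check across HA nodes: confirms that each VM reports
# the same Posit Connect package version before it joins the cluster.
# Hostnames and SSH access are assumptions for this sketch.
import subprocess

NODES = ["connect-node-1.internal", "connect-node-2.internal"]  # hypothetical hostnames


def connect_version(host: str) -> str:
    """Return the installed rstudio-connect package version reported by a node."""
    result = subprocess.run(
        ["ssh", host, "dpkg-query -W -f='${Version}' rstudio-connect"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


versions = {host: connect_version(host) for host in NODES}
print(versions)
if len(set(versions.values())) != 1:
    raise SystemExit(f"HA nodes are running different Connect versions: {versions}")
```

A similar check can be repeated for the Python, R, and Quarto installations listed above.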
Database
This architecture utilizes Azure Database for PostgreSQL running on a Standard_D4ds_v4 instance, provisioned with a minimum of 64 GB of storage and running the latest minor version of PostgreSQL 16 (see supported versions). Both the instance type and the storage can be scaled up for more demanding workloads.
- The database instance should be configured with an empty PostgreSQL database for the Connect metadata, as sketched below.
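As a starting point, the following Python sketch creates that empty metadata database if it does not already exist. It is a minimal example rather than part of the official setup steps; the server name, credentials, and database name are placeholders, and the same statements can be run directly with psql instead.

```python
# Minimal sketch: verify that the Azure Database for PostgreSQL instance is reachable
# and create the empty database that Connect will use for its metadata.
# The hostname, credentials, and database name below are placeholders.
import psycopg2

conn = psycopg2.connect(
    host="connect-db.postgres.database.azure.com",  # hypothetical server name
    user="connectadmin",
    password="<admin-password>",
    dbname="postgres",  # connect to the default maintenance database
)
conn.autocommit = True  # CREATE DATABASE cannot run inside a transaction
with conn.cursor() as cur:
    cur.execute("SELECT 1 FROM pg_database WHERE datname = 'connect'")
    if cur.fetchone() is None:
        cur.execute("CREATE DATABASE connect")  # empty database for Connect metadata
conn.close()
```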
Storage
This architecture utilizes encrypted Azure NetApp Files for shared storage. NetApp Files is configured with a 2 TiB capacity pool and a 100 GiB volume.
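If you want a quick way to confirm that the volume is actually shared between the nodes, a rough sketch like the one below can help: run it with `--write` on one node, then without arguments on the other. The mount point shown is an assumption; substitute whatever directory you mount the NetApp Files volume at.

```python
# Rough sanity check that the NetApp Files volume is mounted and shared.
# Run with --write on one node to create a marker file, then run without
# arguments on the other node(s) to confirm the same file is visible.
import pathlib
import socket
import sys
import time

MOUNT = pathlib.Path("/var/lib/rstudio-connect")  # hypothetical shared data directory
marker = MOUNT / ".ha-storage-check"

if "--write" in sys.argv:
    marker.write_text(f"written by {socket.gethostname()} at {time.ctime()}\n")
    print(f"Wrote {marker}")
elif marker.exists():
    print(f"Shared storage OK: {marker.read_text().strip()}")
else:
    raise SystemExit("Marker file not found -- volume may not be shared or mounted")
```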
Load balancer
This architecture utilizes an Azure Application Gateway to provide public ingress and load balancing for the Connect instances.
- The Application Gateway must be configured with cookie-based affinity.
- You must configure health checks to ensure that the Application Gateway routes traffic only to healthy nodes. This can be done by following the instructions in the Posit Connect Admin Guide; a rough sketch of this kind of check appears below.
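For illustration only, the sketch below polls each backend node the way a health probe would and reports which nodes answer with HTTP 200. The private IP addresses, port, and the `/__ping__` path are assumptions for this sketch; use the health-check endpoint recommended in the Admin Guide for your Connect version.

```python
# Hedged sketch of the kind of check an Application Gateway health probe performs:
# poll each backend node and report whether it responds successfully.
import urllib.request

NODES = ["http://10.0.1.10:3939", "http://10.0.1.11:3939"]  # hypothetical private addresses
HEALTH_PATH = "/__ping__"  # assumed health-check path; confirm against the Admin Guide

for node in NODES:
    url = node + HEALTH_PATH
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            status = resp.status
    except Exception as exc:  # connection refused, timeout, DNS failure, etc.
        status = f"unreachable ({exc})"
    print(f"{url}: {status}")
```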
Networking
The architecture is implemented in a virtual network, utilizing both public and private subnets across multiple availability zones. This setup ensures high availability and fault tolerance for all deployed resources. The database instance, NetApp Files, and the virtual machine instances are located within the private subnets, and ingress to the virtual machines is managed through the Application Gateway.
Configuration details
The required configuration details are outlined in the multi-server installation steps. More information on running Posit Connect behind a proxy can be found in the Running with a Proxy page.
Resiliency and availability
This implementation of Connect is resilient to within-AZ failures. With two Connect nodes, a failure of either node disrupts user sessions on that node but does not cause overall service downtime.
We recommend establishing backup and disaster recovery procedures for the database and Azure NetApp Files components of the cluster.
Performance
The Connect team conducts performance testing on this architecture using the Grafana k6 tool. The workload consists of one virtual user (VU) publishing an R-based Plumber application repeatedly, while other VUs are making API fetch requests to a Python-based Flask application (using jumpstart examples included in the product).
The first test is a scalability test, where the number of VUs fetching the Flask app is increased steadily until the throughput is maximized. After noting the number of VUs needed to saturate the server, a second “load” test is run with that same number of VUs for 30 minutes, to accurately measure request latency when the server is fully utilized.
Below are the results for the load test:
- Average requests per second: 2716 rps
- Average request latency (fetch): 134 ms
- Number of VUs: 400
- Error rate: 0%
Note that k6 VUs are not equivalent to real-world users, as they were run without sleeps to maximize throughput. To approximate the number of real-world users, you could multiply the RPS by 10.
Applications that perform complex processing will likely require nodes with more CPU and RAM to achieve the same throughput and latency results shown above. We suggest running performance tests against your own applications to accurately determine hardware requirements; a minimal sketch of such a test follows.
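As a rough illustration only (this is not the k6 script used by the Connect team), the sketch below drives a single deployed API with a pool of workers and reports throughput and average latency. The content URL, worker count, and duration are placeholders you would replace for your own testing.

```python
# Minimal Python analogue of the "fetch" portion of the workload: a pool of
# workers repeatedly requests a deployed API and reports throughput and latency.
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

CONTENT_URL = "https://connect.example.com/content/flask-api/"  # hypothetical content URL
WORKERS = 50        # analogous to VUs; the documented test used 400
DURATION_S = 60


def worker(_: int) -> list[float]:
    """Fetch the content URL in a loop for DURATION_S seconds, recording latencies."""
    latencies = []
    deadline = time.monotonic() + DURATION_S
    while time.monotonic() < deadline:
        start = time.monotonic()
        with urllib.request.urlopen(CONTENT_URL, timeout=30) as resp:
            resp.read()
        latencies.append(time.monotonic() - start)
    return latencies


with ThreadPoolExecutor(max_workers=WORKERS) as pool:
    all_latencies = [lat for result in pool.map(worker, range(WORKERS)) for lat in result]

print(f"requests:    {len(all_latencies)}")
print(f"rps:         {len(all_latencies) / DURATION_S:.0f}")
print(f"avg latency: {sum(all_latencies) / len(all_latencies) * 1000:.0f} ms")
```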
FAQ
See the Architecture FAQs page for the general FAQ.