Storage
Choosing the right storage solution for Posit Team helps data scientists achieve the performance and reliability they need to support their critical research, development, and insight generation. Using the most appropriate storage solution can save your organization time, money, and frustration by providing users with a desktop-like experience and a secure, server-sized environment. Because your storage choice is essential for successfully deploying Posit Workbench, Connect, and Package Manager, Posit has extensively tested and troubleshot various cloud storage solutions. We test solutions using a combination of artificial and real-world storage loading scenarios to capture the characteristics of the underlying cloud storage.
Relative storage requirements
The networked and local storage requirements for Workbench, Connect, and Package Manager vary by product. Workbench’s file store hosts and manages interactive user sessions that must be saved, cached, and run. Users expect performance to be similar to running in a desktop environment. Choosing an inappropriate file system negatively impacts performance and user experience. Additionally, Workbench stores session state, including suspended session data, in the user’s home directory. This data can grow quite large, depending on the user’s work.
Connect generally requires less demanding storage. While it still needs fast, responsive storage, most intensive tasks occur asynchronously during publishing. As a result, unless the application or content item performs heavy read/write operations at runtime, storage speed only slightly affects startup and loading times.
Package Manager has the least demanding storage requirements within Posit Team. The primary use of Package Manager’s storage is to locally cache packages downloaded from the Posit Sync Service and expose them for consumption by your internal data science teams. The fact that cached package data can be served directly from blob storage services, like AWS S3, further demonstrates the low threshold for storage performance.
General storage recommendations
The following sections discuss the two primary factors in providing an excellent user experience with Posit products: appropriate storage throughput and storage latency.
Although input/output operations per second (IOPS) is typically a key storage metric, it rarely becomes a limiting factor because cloud storage providers offer sufficient baseline IOPS for SSD/flash storage arrays. As a result, IOPS rarely impacts Posit product performance, even though the products read and write large files and need to complete many small file operations quickly.
Storage throughput
Storage throughput, typically measured in MB/s, is the maximum rate at which data can move between server memory and disk, whether that disk is a local SSD or an NFS share hosted in another country. Data science workloads consume throughput when reading or writing large data files and when Workbench suspends or resumes a user session holding large amounts of data in memory. Low throughput, or slow storage, can significantly increase code run time and the time it takes to suspend and resume user sessions. If the network throughput of a remote NFS server hosting user home directories is exhausted, the impact can cascade across other users' Workbench sessions.
To assess general throughput needs, we recommend using this formula:
Throughput-of-1-node + (Throughput-of-1-node*Number-of-Remaining-Nodes*0.1)
For example, if you are running a load-balanced Workbench cluster with three nodes, and each node is capable of 1 Gb/s of throughput, your storage solution should be able to handle 1.2 Gb/s:
1 Gb/s + (1 Gb/s * 2 * 0.1) = 1.2 Gb/s
We expect this configuration to provide sufficient throughput for a heavy workload on a single node while ensuring the remaining nodes have the necessary throughput to handle general interactive sessions.
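This calculation is easy to script when comparing cluster sizes. The following Python sketch is a minimal illustration of the formula above; the node count and per-node throughput values are placeholders for your own environment.

```python
def recommended_storage_throughput(node_throughput_gbps: float, total_nodes: int) -> float:
    """Apply the sizing formula: one node at full throughput plus 10%
    headroom for each remaining node in the cluster."""
    remaining_nodes = total_nodes - 1
    return node_throughput_gbps + (node_throughput_gbps * remaining_nodes * 0.1)

# Example: three load-balanced Workbench nodes, each capable of 1 Gb/s.
# 1 + (1 * 2 * 0.1) = 1.2 Gb/s, matching the worked example above.
print(recommended_storage_throughput(node_throughput_gbps=1.0, total_nodes=3))  # 1.2
```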
Storage latency
Storage latency, measured in milliseconds, represents the time it takes to complete a single storage transaction. This metric is particularly important because the system rapidly caches session state to disk to ensure that no data or inputs are lost if the browser disconnects. Slow disk latency can cause the RStudio IDE to lag and respond slowly to user input as it tries to confirm that the input is cached.
For optimal user experience, we recommend an average latency under 1 ms, as measured by ioping. Latency in the 2 to 2.5 ms range may feel slow but can still work, depending on the storage solution. Latency above 2.5 ms on a simple ioping test can cause noticeable performance issues and make Workbench seem unresponsive. Administrators should avoid cloud storage solutions that fall into this category, such as Azure Files (Storage account), Azure Elastic SAN, and AWS EFS Regional.
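ioping remains the recommended measurement tool, but you can approximate the same single-transaction check with a short script when ioping isn't available. The sketch below is an illustration only, assuming a hypothetical /mnt/home mount for Workbench home directories: it times small synchronous writes and compares the average against the thresholds above.

```python
import os
import tempfile
import time

def average_write_latency_ms(directory: str, iterations: int = 100) -> float:
    """Time small synchronous writes (write + fsync) on the target file
    system and return the average latency in milliseconds."""
    samples = []
    with tempfile.NamedTemporaryFile(dir=directory) as tmp:
        payload = os.urandom(4096)  # one 4 KiB block per operation
        for _ in range(iterations):
            start = time.perf_counter()
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())
            samples.append((time.perf_counter() - start) * 1000)
    return sum(samples) / len(samples)

# /mnt/home is a hypothetical NFS mount hosting Workbench home directories.
latency = average_write_latency_ms("/mnt/home")
if latency < 1:
    print(f"{latency:.2f} ms: within the recommended < 1 ms target")
elif latency <= 2.5:
    print(f"{latency:.2f} ms: usable, but users may notice sluggishness")
else:
    print(f"{latency:.2f} ms: likely to make Workbench feel unresponsive")
```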
Cloud-specific recommendations
AWS
| Storage Service | Workbench Home Directory | Connect Data Directory | Package Manager Cache |
|---|---|---|---|
| EBS Local Storage | ✅ | ✅ | ✅ |
| AWS FSx for Lustre | ✅ | ✅ | ✅ |
| AWS FSx for NetApp ONTAP | ✅ * | ✅ | ✅ |
| AWS FSx for OpenZFS | ✅ * | ✅ | ✅ |
| EFS One Zone | 🟡 * | ✅ | ✅ |
| EFS Regional | 🟡 * | ✅ | ✅ |
| S3 Bucket | ❌ | ❌ | ✅ |
* Doesn’t support Workbench Project Sharing
✅ Supported and satisfactory performance
🟡 Not recommended/Potential performance issues
❌ Not supported/Unacceptable performance
EBS Local Storage:
- This standard block storage offering provides the fastest storage experience for most Posit workloads. Unfortunately, it doesn't support high availability or load-balanced configurations for Posit products. Additionally, backups and point-in-time recovery for EBS storage can result in lost data and/or work due to the generally unreplicated nature of the storage.
AWS FSx for Lustre:
- FSx for Lustre provides a premium experience for data science users and supports the extended POSIX ACLs that Posit Workbench Project Sharing requires. Unfortunately, it doesn't include a multi-AZ redundancy option that's easy to configure. FSx for Lustre's replication setup is a potentially complicated process built on S3 bucket replication. Review the AWS Linking your file system to an Amazon S3 bucket documentation for more information.
AWS FSx for NetApp ONTAP and AWS FSx for OpenZFS:
- Both file systems provide excellent performance characteristics for most Posit Team use cases when correctly configured. They don’t support Workbench Project Sharing because they lack support for extended POSIX ACLs. When you use them across multiple Availability Zones, be aware that multi-AZ replication can introduce significant file system latency and slow down simple tasks for data science users, such as installing R packages and creating Python virtual environments. For maximum performance, configure OpenZFS in a single AZ.
EFS One Zone and EFS Regional:
- EFS storage deployments provide a relatively inexpensive solution for Posit Team. Since they’re generally slower than other supported options, and much slower in the case of EFS Regional, the primary benefit is cost reduction. EFS doesn’t support Workbench Project Sharing because it lacks support for extended POSIX ACLs. Don’t consider EFS Regional for home directory storage for Workbench, though EFS One Zone is serviceable. The speed difference between EFS One Zone and EBS/FSx solutions directly affects user workloads and operations in Workbench, such as installing packages and creating Python virtual environments. EFS deployments generally perform well for both Posit Connect and Package Manager, provided your Connect applications are tuned to handle potentially longer startup times. Package Manager can work with slower shared file systems like EFS, but S3 is typically a better choice.
S3 Bucket:
- S3 isn't supported for Workbench home directories or Connect content data storage. You can use it as a data source for developers and deployed content, and it's a good option for Package Manager cache data.
Azure
| Storage Service | Workbench Home Directory | Connect Data Directory | Package Manager Cache |
|---|---|---|---|
| Managed Premium SSD LRS | ✅ | ✅ | ✅ |
| Azure NetApp Files - Premium | ✅ | ✅ | ✅ |
| Azure NetApp Files - Ultra | ✅ | ✅ | ✅ |
| Azure NetApp Files - Standard | 🟡 | 🟡 | ✅ |
| Azure Files NFS | ❌ | 🟡 | ✅ |
| Azure Elastic SAN | ❌ | ❌ | ❌ |
| Azure Blob Storage | ❌ | ❌ | ❌ |
✅ Supported and satisfactory performance
🟡 Not recommended/Potential performance issues
❌ Not supported/Unacceptable performance
Managed Premium SSD LRS:
- This standard block storage offering provides the fastest storage experience for most Posit workloads. Unfortunately, it doesn't support high availability or load-balanced configurations for Posit products. Additionally, backups and point-in-time recovery for Managed Disk storage can result in lost data and/or work due to the generally unreplicated nature of the storage. When data access speed is the most important factor in your storage selection process, Posit recommends SSD block storage for all Posit workloads.
Azure NetApp Files - Standard/Premium/Ultra:
- Azure NetApp Files provides an excellent experience when used as Posit storage. Posit recommends locating the NetApp storage volume in the same availability zone as your Azure VM. Azure NetApp Files scales throughput with provisioned capacity and storage tier (Standard, Premium, and Ultra). Generally, Posit recommends against Azure NetApp Files - Standard unless your organization needs a large amount of Azure NetApp storage, because the Standard tier provides relatively little throughput per TB of provisioned capacity. Compare the throughput figures in the Microsoft Azure NetApp Files documentation against the suggested throughput formula above to determine the best capacity/throughput/cost configuration for your organization.
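To make the capacity/throughput trade-off concrete, the sketch below estimates the minimum Azure NetApp Files capacity needed to meet a throughput target. The per-TiB throughput figures are illustrative assumptions, not authoritative limits; confirm the current numbers in Microsoft's documentation before sizing a real deployment.

```python
import math

# Illustrative throughput-per-capacity figures (MiB/s per provisioned TiB).
# Verify these against Microsoft's current Azure NetApp Files documentation.
TIER_THROUGHPUT_MIB_S_PER_TIB = {
    "Standard": 16,
    "Premium": 64,
    "Ultra": 128,
}

def min_capacity_tib(target_throughput_mib_s: float, tier: str) -> int:
    """Return the smallest whole number of TiB that meets the target
    throughput for a given Azure NetApp Files service tier."""
    per_tib = TIER_THROUGHPUT_MIB_S_PER_TIB[tier]
    return math.ceil(target_throughput_mib_s / per_tib)

# Example: the 1.2 Gb/s target from the throughput formula is roughly 150 MiB/s.
for tier in TIER_THROUGHPUT_MIB_S_PER_TIB:
    print(tier, min_capacity_tib(150, tier), "TiB")
# Standard needs ~10 TiB to reach the throughput Premium reaches at ~3 TiB,
# which is why Standard is only cost-effective at large capacities.
```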
Azure Files NFS:
- Azure Files is a low-cost, scalable NFS file storage option in Azure. Unfortunately, it has several significant drawbacks when used with Posit Workbench and Posit Connect. First, it introduces high file system latency, resulting in a poor experience for most Posit data science workloads. This high latency can also result in a higher-than-expected occurrence of orphaned .nfs12345678 files, which can lock or abort application workloads. These leftover files result from the NFS silly rename process. Additionally, Azure Files suffers from known file system latency issues on metadata operations, further compounding its unsuitability for workloads that involve many small files, such as R and Python package installation and restoration. Reference Azure's Troubleshoot Azure Files performance issues documentation for additional information.
Azure Elastic SAN:
- Our testing revealed inconsistent performance with Azure Elastic SAN. While it can potentially be tuned into an excellent storage solution, we generally don't recommend it unless your organization already uses Elastic SAN and can't switch to other options. The complexity and effort required, combined with the availability of more suitable and easier-to-support options, such as Managed Disk for local storage and Azure NetApp Files for networked storage, keep Elastic SAN from being a recommended solution.
Azure Blob Storage:
- Azure Blob Storage isn't supported for Workbench home directories, Connect content data, or Package Manager cache data. It can be used as a data source for developers and deployed content.
Google Cloud Platform (GCP)
| Storage Service | Workbench Home Directory | Connect Data Directory | Package Manager Cache |
|---|---|---|---|
| SSD Persistent Disk | ✅ | ✅ | ✅ |
| Google Filestore - Basic SSD | ✅ | ✅ | ✅ |
| Google Filestore - Zonal SSD | 🟡 | ✅ | ✅ |
| Google Filestore - Enterprise SSD | 🟡 | ✅ | ✅ |
| Google Cloud Storage | ❌ | ❌ | ❌ |
✅ Supported and satisfactory performance
🟡 Not recommended/Potential performance issues
❌ Not supported/Unacceptable performance
SSD Persistent Disk:
- This standard block storage offering provides the fastest storage experience for most Posit workloads. Unfortunately, it doesn't support high availability or load-balanced configurations for Posit products. Additionally, backups and point-in-time recovery for Persistent Disk storage can result in lost data or work due to the generally unreplicated nature of the storage. When data access speed is the most important factor in your storage selection, Posit recommends using SSD block storage for all Posit workloads.
Google Filestore:
- Google Filestore Basic is one of the core NFS file server options in GCP and the primary recommendation for Posit products in GCP. However, our testing shows that it's slower than comparable AWS and Azure cloud storage solutions, especially for single-VM workloads. The Filestore Zonal and Enterprise tiers offer higher throughput but also higher latency, likely due to their HA/replicated configuration. Basic SSD provides faster write times for small files, such as installing R and Python packages, but slower performance for large file operations due to its lower throughput. Posit recommends sizing your Filestore deployment to balance throughput and latency for optimal user performance.
Google Cloud Storage:
- Google Cloud Storage isn't supported for Posit application-specific workloads, but it can be used as a data source for developers and deployed content.
Frequently asked questions
Can I standardize the type of storage used for Workbench, Connect, and Package Manager?
Yes. Generally, if your storage choice is viable for Workbench, it's a good fit for the rest of Posit Team. The trade-off for the simplicity of managing a single storage solution is the potential to overspend on storage that's more performant, and more costly, than Connect and Package Manager require.