Using Amazon Elastic File System (EFS) with Posit Team
The Amazon Elastic File System (EFS) has unique design characteristics that can make it challenging to use with Posit Team. To be successful with EFS, please be sure to read through this full document and adhere to its guidance.
Overview
This document explains how to set up Amazon EFS with Posit Team and details best practices for ongoing usage of EFS. Additionally, this document covers how we came up with these recommendations.
Amazon EFS is a managed shared file system that scales elastically with the amount of storage you use. Since it supports Network File System (NFS), it’s possible to use EFS as the shared file system with Posit products. However, in some situations relevant to Posit Professional Products, EFS can suffer from slower performance relative to Elastic Block Store (EBS).
This slower performance is particularly prevalent in workloads that are sensitive to latency (as opposed to throughput). Specifically, EFS isn’t performant when reading and writing thousands of small files. When managing R workloads on a server, this can be problematic because some R packages that contain C++ code can contain a great many C++ header files. For example, the BH package on CRAN contains ~12K header files, so installation of BH can be slow on EFS.
On some rare occasions, this can also affect direct data science work in cases where the workflow requires reading many files. For example, performance could be poor when training a deep neural network that processes image files in bulk.
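The latency effect described above can be observed with a rough timing comparison. The following is a minimal sketch, not Posit's benchmarking tool: the function names, file counts, and sizes are all illustrative assumptions. It writes the same number of bytes once as many small files and once as a single large file.

```python
import os
import tempfile
import time


def time_small_files(directory, count=2000, size=1024):
    """Write `count` files of `size` bytes each; return elapsed seconds."""
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        with open(os.path.join(directory, f"part-{i:05d}.bin"), "wb") as fh:
            fh.write(payload)
    return time.perf_counter() - start


def time_single_file(directory, total_bytes):
    """Write one file of `total_bytes` bytes; return elapsed seconds."""
    start = time.perf_counter()
    with open(os.path.join(directory, "combined.bin"), "wb") as fh:
        fh.write(b"x" * total_bytes)
    return time.perf_counter() - start


def compare(target_dir, count=2000, size=1024):
    """Return (seconds for many small files, seconds for one large file).

    Scratch directories are created inside `target_dir` and removed afterwards.
    """
    with tempfile.TemporaryDirectory(dir=target_dir) as small_dir:
        small = time_small_files(small_dir, count, size)
    with tempfile.TemporaryDirectory(dir=target_dir) as large_dir:
        large = time_single_file(large_dir, count * size)
    return small, large
```

Calling `compare()` once with a directory on the EFS mount and once with a directory on local EBS storage gives a first impression of the gap. Results depend heavily on caching and instance type, so treat them as indicative only.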
- Users interacting with EFS could experience slowness in certain operations. For additional information, see the Testing and benchmarking your configuration with fsbench section below.
- Configure EFS with specific settings, as described below.
Product-specific limitations and recommendations
This document is relevant to Workbench, Connect, and Package Manager usage. We only have a few product-specific limitations and recommendations.
For Workbench:
- The default lock type of link-based won’t work; use the advisory type instead.
- Use RStudio Workbench 1.4 or greater (including new versions of Posit Workbench).
- We highly recommend upgrading to version 2021-09.0 because this release includes several performance improvements for EFS.
For Connect:
- If you choose to configure Database.Dir, this also must point to the same shared location.
For Package Manager:
- Use the lookupcache=pos mount option to prevent long service delays due to attribute caching. See the NFS documentation for more information.
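One way to confirm that the option took effect is to inspect the mount table. The sketch below parses text in the format of /proc/mounts and lists NFS mount points that lack a given option. The function name is hypothetical, and it assumes the kernel reports lookupcache=pos among the mount options, which is worth verifying on your own system.

```python
def nfs_mounts_missing_option(mounts_text, option="lookupcache=pos"):
    """Return mount points of NFS file systems that lack `option`.

    `mounts_text` is the content of /proc/mounts, where each line has the
    form: device mount_point fs_type options dump pass
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        if fs_type in ("nfs", "nfs4") and option not in options.split(","):
            missing.append(mount_point)
    return missing
```

On a live server this could be called as `nfs_mounts_missing_option(open("/proc/mounts").read())`; an empty list means every NFS mount carries the option.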
Recommended EFS configuration settings
The recommendations for the optimal EFS configuration are summarized below. Details on each setting follow later in this document.
When using and configuring EFS:
- Use the general purpose performance mode, rather than Max I/O mode.
- For most cases, use bursting behavior rather than provisioned throughput.
- Single zone EFS performs much faster than multiple availability zone EFS, but choosing it should be a careful design decision.
- Install and use the Amazon package efs-utils to mount the file system, since this can sometimes have a significant positive impact. If mounting with EFS Access Points, be cautious as this could mount all files with the same UID/GID, which is typically not desired.
- Your choice of EC2 instance type can also affect performance. We recommend provisioning instances that are memory or compute optimized (don’t choose general purpose instances), and choosing the network enhanced options, e.g., r5n.2xlarge.
- Use CloudWatch to monitor usage and identify system performance bottlenecks. In particular, watch out for situations where your burst credits run out, since performance is dramatically worse without bursting available.
Special considerations
Operations that write many small files (thousands or more) don’t perform well in most EFS settings.
To prevent users from having to repeatedly install R packages, Posit recommends preinstalling them.
Consider adopting code patterns that prefer reading large files over splitting data between many small files.
When using an EFS file system for many users, we recommend segmenting data files into user-specific directories as much as is possible. Since writing large files blocks metadata operations in the same directory until the write operation is complete, keeping the users’ data isolated in separate directories minimizes the impact of large file operations on other users.
Since EFS performance is largely based on individual usage patterns, this document serves as a starting point rather than an absolute directive. Be aware that you need to tune your EFS configuration after monitoring user behavior, and it could require adjustments over time to ensure long-term performance.
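As an illustration of the advice above to prefer large files over many small ones, the following sketch packs many small records into a single uncompressed tar archive and reads them back in one sequential pass. The function names are hypothetical, and a tar archive is only one possible container; a columnar or database format may suit your data better.

```python
import io
import tarfile


def pack_records(records, archive_path):
    """Write a mapping of {name: bytes} into one uncompressed tar archive."""
    with tarfile.open(archive_path, "w") as archive:
        for name, data in records.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))


def read_records(archive_path):
    """Read every member back with a single sequential pass over one file."""
    records = {}
    with tarfile.open(archive_path, "r") as archive:
        for member in archive:
            if member.isfile():
                records[member.name] = archive.extractfile(member).read()
    return records
```

With this pattern the file system handles one open and one close for the archive, instead of one pair per record.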
Details on recommended EFS configuration settings
Max I/O vs. general purpose
When creating a file system, you must choose a performance mode, and this choice is permanent.
We strongly recommend using the GeneralPurpose mode.
File systems in the Max I/O mode can scale to higher levels of aggregate throughput and operations per second. This scaling comes with a tradeoff of slightly higher latencies for file metadata operations. Highly parallelized applications and workloads, such as big data analysis, media processing, and genomic analysis, can benefit from this mode.
However, in Posit’s testing, we’ve found Max I/O mode to offer significantly worse performance because of the increased latency, especially in the “many small files” scenario.
Bursting vs. provisioned throughput
EFS supports two throughput modes:
- Bursting
- Provisioned
From the AWS documentation page:
With Bursting Throughput mode, throughput on Amazon EFS scales as the size of your file system in the EFS Standard or One Zone storage class grows. With Provisioned Throughput mode, you can instantly provision the throughput of your file system (in MiB/s) independent of the amount of data stored.
Most Posit customers should start with the default bursting behavior until they understand the characteristics of their file access patterns and the costs they incur.
Monitor your Burst Credit balance and permitted throughput via CloudWatch to ensure you aren’t surprised by throttling if you run out of burst credits. We highly recommend setting alarms based on these metrics.
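An alarm on the burst credit balance could be defined along the following lines. This is a sketch under assumptions: the alarm name, threshold, evaluation settings, and helper function names are illustrative choices. `BurstCreditBalance`, the `AWS/EFS` namespace, and the `FileSystemId` dimension are the CloudWatch names for EFS. The first function only builds the parameters; the second needs boto3 and AWS credentials.

```python
def burst_credit_alarm_params(file_system_id, threshold_bytes, period_seconds=300):
    """Build keyword arguments for the CloudWatch put_metric_alarm call."""
    return {
        # Hypothetical naming scheme; use whatever fits your conventions.
        "AlarmName": f"efs-{file_system_id}-low-burst-credits",
        "Namespace": "AWS/EFS",
        "MetricName": "BurstCreditBalance",
        "Dimensions": [{"Name": "FileSystemId", "Value": file_system_id}],
        "Statistic": "Minimum",
        "Period": period_seconds,
        "EvaluationPeriods": 3,  # assumed: alarm after three low periods in a row
        "Threshold": float(threshold_bytes),
        "ComparisonOperator": "LessThanThreshold",
    }


def create_alarm(params, region_name):
    """Create the alarm in CloudWatch (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above has no dependencies

    client = boto3.client("cloudwatch", region_name=region_name)
    client.put_metric_alarm(**params)
```

For example, `create_alarm(burst_credit_alarm_params("fs-0123456789abcdef0", 1024**4), "us-east-1")` would alarm when the balance stays below 1 TiB of credits. Both the file system ID and the threshold are placeholders to adapt.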
Remedy throttling by either generating more Burst Credits (writing files to the file system or waiting for the Burst Credits to refresh) or converting to Provisioned Throughput mode. Large file systems (> 1TB) should theoretically be able to burst for 50% of the time. For smaller file systems, set Provisioned Throughput to maintain a constant performance level.
Generating large files to bump into a larger tier of burst performance is both time consuming and expensive, so weigh these costs before choosing this option. For example, creating 1TB of data could cost hundreds of dollars per month in storage costs.
If migrating to EFS, Provisioned Throughput can reduce the time needed to move a large amount of data. In testing, moving from Bursting to 500 MiB/s Provisioned improved speed by 5x and preserved Burst Credits.
With One Zone storage (see below), the differences between bursting and provisioned performance appeared to be minimal in testing.
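The relationship between file system size and burst behavior can be worked through numerically. The sketch below assumes the figures AWS has published for Bursting Throughput mode (a baseline of 50 MiB/s per TiB stored, a burst rate of 100 MiB/s per TiB, and a burst floor of 100 MiB/s). These constants are assumptions to check against current AWS documentation, and the function name is hypothetical.

```python
BASELINE_MIBPS_PER_TIB = 50.0  # assumed baseline rate per TiB stored
BURST_MIBPS_PER_TIB = 100.0    # assumed burst rate per TiB stored
MIN_BURST_MIBPS = 100.0        # assumed burst floor for small file systems


def bursting_profile(stored_gib):
    """Estimate baseline rate, burst rate, and sustainable burst fraction.

    Credits accrue at the baseline rate and are spent at the burst rate, so
    the share of time a file system can burst is baseline / burst.
    """
    tib = stored_gib / 1024
    baseline = BASELINE_MIBPS_PER_TIB * tib
    burst = max(MIN_BURST_MIBPS, BURST_MIBPS_PER_TIB * tib)
    return {
        "baseline_mibps": baseline,
        "burst_mibps": burst,
        "burst_fraction": baseline / burst,
    }
```

Under these assumptions, a 1 TiB file system can burst about half the time, while a 100 GiB file system can burst only about 5% of the time. This is why smaller file systems are the ones most likely to need Provisioned Throughput.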
Multiple Availability Zone (default) vs. One Zone storage classes
AWS recently introduced Single Availability Zone (AZ) EFS with a different SLA than Multiple AZ instances. As of June 2021, Multiple AZ EFS instances support 99.99% uptime and Single AZ instances support 99.9% uptime. The Single AZ EFS is still durable, but there is no failover if the entire availability zone goes down. Most organizations adopting EFS to load balance Posit professional products prefer the Multiple AZ SLA. However, the Single AZ EFS performance is substantially better than Multiple AZ EFS, such that in testing, its performance was comparable to a custom-configured NFS server.
This makes the Single AZ EFS server a strong option when configuring development environments.
Read more about storage classes.
Use efs-utils when mounting the file system
We strongly recommend using efs-utils to mount the EFS file system. If this isn’t feasible, standard NFS client connections are possible, but there are separate mounting instructions and additional considerations to review.
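To make the two mounting routes concrete, the sketch below builds the argument list for each `mount` invocation without running it. The helper names are hypothetical. The NFS option string is the set AWS has documented for mounting EFS with a standard NFS client, reproduced here as an assumption to check against the current AWS mounting instructions.

```python
# Assumed from AWS guidance for mounting EFS without efs-utils; verify before use.
NFS_OPTIONS = "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"


def efs_utils_mount_argv(file_system_id, mount_point, tls=True):
    """Arguments for mounting through the efs-utils mount helper."""
    argv = ["mount", "-t", "efs"]
    if tls:
        argv += ["-o", "tls"]  # encrypt traffic in transit
    return argv + [f"{file_system_id}:/", mount_point]


def nfs_mount_argv(file_system_id, region, mount_point, extra_options=()):
    """Arguments for mounting with a standard NFS client instead."""
    options = ",".join([NFS_OPTIONS, *extra_options])
    source = f"{file_system_id}.efs.{region}.amazonaws.com:/"
    return ["mount", "-t", "nfs4", "-o", options, source, mount_point]
```

Either list can be passed to `subprocess.run(argv, check=True)` by a process with root privileges. Extra options such as lookupcache=pos go in `extra_options`.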
Setting the read_ahead_kb size to 15 MB
From the AWS performance tips page:
Linux kernels (5.4.*) use a read_ahead_kb of 128 KB; however, the AWS docs recommend 15 MB.
The efs-utils package should set this correctly, but we recommend checking this value regardless to ensure that it’s set to 15 MB as expected. Customers who wish to use only standard NFS utilities need to set this value manually.
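The check described above can be scripted. The sketch below locates the `read_ahead_kb` file for a mount point under /sys/class/bdi using the mount's device major and minor numbers, then compares the value to an expected minimum. The function names are hypothetical. The default of 15000 KB is an assumed way of expressing the 15 MB recommendation, and the `sysfs_root` parameter exists only so that the logic can be exercised without a real mount.

```python
import os


def bdi_read_ahead_path(mount_point, sysfs_root="/sys/class/bdi"):
    """Return the read_ahead_kb path for the device backing `mount_point`."""
    device = os.stat(mount_point).st_dev
    device_dir = f"{os.major(device)}:{os.minor(device)}"
    return os.path.join(sysfs_root, device_dir, "read_ahead_kb")


def read_ahead_kb(mount_point, sysfs_root="/sys/class/bdi"):
    """Read the current read-ahead size, in KB, for `mount_point`."""
    with open(bdi_read_ahead_path(mount_point, sysfs_root)) as fh:
        return int(fh.read().strip())


def read_ahead_ok(mount_point, expected_kb=15000, sysfs_root="/sys/class/bdi"):
    """True if the read-ahead size is at least `expected_kb` (assumed 15 MB)."""
    return read_ahead_kb(mount_point, sysfs_root) >= expected_kb
```

For example, `read_ahead_ok("/mnt/efs")` returns False on a mount still using the 128 KB kernel default; the mount path here is a placeholder.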
EC2 instance types
In general for EFS, AWS recommends preferring instance types with more CPU or memory depending on the workload. Prefer memory-optimized or compute-optimized over general purpose instance types.
In Posit benchmarking, we’ve observed performance gains by using memory-optimized instance types, e.g., r5. For UI-related tasks like installing the BH package, this could provide a better user experience.
For server installations that use many NFS client connections (e.g., Launcher), the enhanced networking might prove to be noticeably better. Consider using the n variants, e.g., r5n.
EC2 instance sizes
We’ve observed significant gains in going from large to xlarge instance sizes, primarily in parallelized load. For servers with many users, Posit recommends increasing the instance size.
Don’t attempt to use smaller instance types, e.g., c5.large with 4GB memory.
Monitoring usage
The best way to monitor EFS performance is to configure AWS CloudWatch. Please refer to the available CloudWatch metrics and metric math for EFS resources for more information.
If using Bursting mode:
- Be sure to monitor the BurstCreditBalance metric. If this begins to decrease substantially over time, consider adding data to bump the file system size into a larger tier with more burst credits, or moving to Provisioned Throughput to establish a consistent baseline. This is likely to incur extra cost, so please consider the tradeoffs before proceeding.
- Using metric math, you can compare MeteredIOBytes to PermittedThroughput to know if you are using all your available throughput. If you are, it might be an indication that you should move to Provisioned Throughput.
If using Provisioned Throughput:
- Use PermittedThroughput to determine whether your storage volume has bumped you above your designated throughput setting.
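The comparison of MeteredIOBytes to PermittedThroughput mentioned above comes down to simple arithmetic. The sketch below assumes the inputs are the Sum statistic of MeteredIOBytes over one CloudWatch period and a PermittedThroughput value in bytes per second; the function name is hypothetical.

```python
def throughput_utilization_percent(metered_io_bytes_sum, period_seconds,
                                   permitted_throughput_bps):
    """Percentage of permitted throughput used during one period.

    metered_io_bytes_sum: Sum of MeteredIOBytes over the period, in bytes.
    period_seconds: length of the CloudWatch period, in seconds.
    permitted_throughput_bps: PermittedThroughput, in bytes per second.
    """
    if period_seconds <= 0 or permitted_throughput_bps <= 0:
        raise ValueError("period and permitted throughput must be positive")
    metered_bps = metered_io_bytes_sum / period_seconds
    return 100.0 * metered_bps / permitted_throughput_bps
```

Values that stay close to 100 for sustained periods suggest that the file system is using all of its available throughput.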
Testing and benchmarking your configuration with fsbench
If you want to collect data about file system performance on your own EFS installation, use Posit’s benchmarking tool. The benchmarking tool runs a suite of file operations to help characterize system behavior and compare it against other known configurations.
For information on how to configure and run benchmark testing, please refer to the fsbench package documentation.