Posit Team
Overview
Posit Team is a sales bundle of Posit Workbench, Posit Connect, and Posit Package Manager software for developing data science projects, publishing data products, and managing packages.
This page includes instructions for downloading Posit professional products. Download and/or use of these products is governed under the terms of the Posit End User License Agreement. By downloading you agree to the terms posted there.
Evaluate Posit professional products
Try Posit Team
If you’re interested in trying the entire toolchain, please review the Posit Team page.
Additionally, you can install any of the Posit Professional Products in your own on-premises or cloud-based environment. For either trial or production, use any of the methods listed on the Installation of Posit Professional Products page.
Most of the installation methods start the products with a 45-day evaluation period in which you can use their full functionality. During or after the evaluation period, you can activate your installation of the products with a commercial license without having to reinstall or reconfigure the products.
Install Posit Team
Requirements
- Review and implement the product requirements.
- We recommend:
- Installing each product on a separate/dedicated server
- That each server has the required resources in place for each product
The exact order of the steps listed depends on your needs and specific environment. However, Posit customers often find it easiest to perform the installation, configuration, and integration steps in the order below.
After completing the initial installation for each product, you can perform additional configurations and integrations to meet your specific needs related to security, authentication, data access, version control, and more.
Install on servers or virtual machines
Step 1. Install Posit professional products
- Install Workbench for data science development workflows
- Install Connect for publishing data science assets
- Install Package Manager for reproducibility and governance
Step 2. Perform additional configurations and integrations
- These configurations and integrations (e.g., TLS/SSL certificates, authentication, high availability, version control, data sources, resource managers, Python/Jupyter) are optional depending on your needs and your unique environment.
Step 3. Configure Posit professional products to work together
- Create a package repository in Package Manager
- Configure Workbench to use a Package Manager repository
- Configure Connect to use a Package Manager repository
- Configure Workbench to deploy to Connect by default
Install in a Kubernetes cluster
Configure and install using Helm charts
- Configure and install Workbench for data science development workflows
- Configure and install Connect for publishing data science assets
- Configure and install Package Manager for reproducibility and governance
Next steps
Learn about user workflows in Posit
Marketplace offerings
The Posit Cloud marketplace offerings allow you to purchase and use Posit professional products through your preferred cloud provider.
The Posit marketplace products run the same software as the on-premises versions of the products. Unlike the on-premises versions, however, they ship as preconfigured servers with Python, R, Posit Professional Drivers, and important packages already set up for a turn-key experience.
AWS
The main AWS offering allows you to purchase Posit Team for usage in AWS:
The Posit Team offering is a way to purchase licenses on AWS. It allows you to use the other Bring Your Own License (BYOL) offerings below and Posit Workbench on Amazon SageMaker. When you purchase through this offering, it puts you in contact with Posit, who provides the licenses you need for use.
We also support custom purchases on AWS. Contact Posit sales to learn more.
The BYOL license offerings allow you to use existing licenses that you have purchased from Posit or from the Posit Team offering.
Azure
Multiple Azure offerings are all listed together on a page for each product. Select a software plan on the page to see your options.
Posit Workbench and Posit Connect have offerings that allow you to purchase specific numbers of users for different editions. Only the specific combinations of users and editions listed are available for purchase on Azure. Contact Posit sales for plan options specific to your organization’s needs. Once you purchase through Azure, you can use the image from the offering in your account without any more licensing.
The offerings also include BYOL offerings that you can learn how to use.
Google Cloud Platform (GCP)
Licensed versions of Posit Workbench and Posit Connect are available for purchase on the GCP Marketplace.
- Posit Workbench (Basic Edition)
- Posit Connect (Basic Edition)
- Posit Connect (Enhanced Edition)
- Posit Connect (Advanced Edition)
Posit Workbench and Posit Connect have offerings that allow you to purchase specific numbers of users for different editions. You can only purchase the listed specific combinations of users and editions on GCP. Contact Posit sales for plan options specific to your organization’s needs. Once you purchase through GCP, you can use the image from the offering in your account without any more licensing.
The Bring Your Own License (BYOL) offerings allow you to use existing licenses that you have purchased from Posit or from the Posit Team offering.
Bring your own license offerings
The Bring Your Own License (BYOL) offerings are available on AWS, Azure, and GCP for Posit Workbench, Posit Connect, and Posit Package Manager. These offerings don’t cost anything to use outside of normal infrastructure costs and expect you to have an existing license. Once you have created a server from one of these offerings, you can add your license the same way you would for on-premises servers. See the licensing documentation for more details.
Technical details on offerings
All images within the offerings are based on Ubuntu 22.04 LTS. They include multiple versions of Python and R to allow for different needs. The installations of the Posit Team products and Posit Professional Drivers use the standard installations for Ubuntu that are available to customers. New images are created after product releases that include a variety of upgrades, but you can also upgrade anything you need in place as well.
Removed limitations
Limitations on upgrading servers for marketplace offerings no longer exist.
Additionally, AWS offerings used to require the custom Identity and Access Management (IAM) settings and custom license manager upgrades, both of which are no longer needed.
Integrations
Before continuing, review the Storage documentation as it likely supersedes the content listed in this section.
This section explains how to set up Amazon EFS with Posit Team and details best practices for ongoing usage of EFS. Additionally, this document covers how we came up with these recommendations.
The Amazon Elastic File System (EFS) has unique design characteristics that can make it challenging to use with Posit Team. To be successful with EFS, please be sure to read this entire section and adhere to its guidance.
Amazon EFS is a managed shared file system that scales elastically with the amount of storage you use. Since it supports Network File System (NFS), it’s possible to use EFS as the shared file system with Posit products. However, in some situations relevant to Posit Professional Products, EFS can suffer from slower performance relative to Elastic Block Store (EBS).
This slower performance is particularly prevalent in workloads that are sensitive to latency (as opposed to throughput). Specifically, EFS isn’t performant when reading and writing thousands of small files. When managing R workloads on a server, this can be problematic because some R packages that contain C++ code can contain a great many C++ header files. For example, the BH package on CRAN contains ~12K header files, so installation of BH can be slow on EFS.
On some rare occasions, this can also affect direct data science work in cases where the workflow requires reading many files. For example, performance could be poor when training a deep neural network that processes image files in bulk.
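To make the latency-versus-throughput distinction concrete, the following is a minimal sketch that writes the same number of bytes twice: once as many small files and once as a single large file. It is not Posit's benchmarking tool; the function name and default sizes are illustrative assumptions.

```python
import os
import tempfile
import time


def time_small_vs_large(directory, n_files=2000, file_size=512):
    """Write identical bytes as many small files, then as one large file.

    Returns (small_seconds, large_seconds). The names and defaults here are
    illustrative assumptions, not values taken from any Posit tool.
    """
    payload = b"x" * file_size

    # Many small files: each one pays the per-file create/close latency.
    start = time.perf_counter()
    for i in range(n_files):
        with open(os.path.join(directory, f"small_{i}.bin"), "wb") as fh:
            fh.write(payload)
    small_seconds = time.perf_counter() - start

    # One large file: the same bytes with a single create/close.
    start = time.perf_counter()
    with open(os.path.join(directory, "large.bin"), "wb") as fh:
        for _ in range(n_files):
            fh.write(payload)
    large_seconds = time.perf_counter() - start

    return small_seconds, large_seconds


if __name__ == "__main__":
    # Pass dir="<a directory on the EFS mount>" to TemporaryDirectory
    # to measure that file system instead of the default temp location.
    with tempfile.TemporaryDirectory() as scratch:
        small, large = time_small_vs_large(scratch)
        print(f"many small files: {small:.3f}s, one large file: {large:.3f}s")
```

On a latency-bound file system, the first number would be expected to grow much faster than the second as n_files increases.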
- Users interacting with EFS could experience slowness in certain operations. For additional information, see the Testing and benchmarking your configuration with fsbench section below.
- Configure EFS with specific settings, as described below.
Product-specific limitations and recommendations
This section is relevant to Workbench, Connect, and Package Manager usage. We only have a few product-specific limitations and recommendations.
The default lock type of link-based won’t work; use the advisory type instead.
RStudio Workbench 1.4, or greater (including new versions of Posit Workbench)
Upgrade to version 2021-09.0
We highly recommend upgrading to version 2021-09.0 because this release includes several performance improvements for EFS.
- If you choose to configure Database.Dir, this also must point to the same shared location.
- Use the lookupcache=pos mount option to prevent long service delays due to attribute caching. See the NFS documentation for more information.
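One way to confirm the mount option took effect is to inspect the mount table. The sketch below parses text in the /proc/mounts format and reports NFS mount points that lack a given option. The function name is hypothetical, and it assumes the kernel lists lookupcache=pos among the mount options when it is set.

```python
def nfs_mounts_missing_option(mounts_text, option="lookupcache=pos"):
    """Return mount points of NFS file systems whose options lack `option`.

    `mounts_text` is expected in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        # Matches both "nfs" and "nfs4" file system types.
        if fs_type.startswith("nfs") and option not in options.split(","):
            missing.append(mount_point)
    return missing


# Example usage on a Linux host (left as comments so the sketch has no side effects):
# with open("/proc/mounts") as fh:
#     print(nfs_mounts_missing_option(fh.read()))
```

An empty list means every NFS mount found carries the option.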
Recommended EFS configuration settings
The recommendations for the optimal EFS configuration are summarized here; details on each setting follow later in this section.
When using and configuring EFS:
- Use the general purpose performance mode, rather than Max I/O mode.
- For most cases, use bursting behavior rather than provisioned throughput.
- Single zone EFS performs much faster than multiple availability zone EFS, but it offers no failover if the zone goes down, so treat this as a deliberate design choice.
- Install and use the Amazon package efs-utils to mount the file system, since this can sometimes have a significant positive impact.
- If mounting with EFS Access Points, be cautious, as this can mount all files with the same UID/GID, which is typically not desired.
- Your choice of EC2 instance type can also affect performance. We recommend provisioning instances that are memory or compute optimized (don’t choose general purpose instances), and choosing the network enhanced options, e.g., r5n.2xlarge.
- Use CloudWatch to monitor usage and identify system performance bottlenecks. In particular, watch out for situations where your burst credits run out, since performance is dramatically worse without bursting available.
Special considerations
- Operations that write many small files (thousands or more) don’t perform well in most EFS settings.
- To prevent users from having to repeatedly install R packages, Posit recommends preinstalling the R packages.
- Consider adopting code patterns that prefer reading large files over splitting data between many small files.
- When using an EFS file system for many users, we recommend segmenting data files into user-specific directories as much as is possible. Since writing large files blocks metadata operations in the same directory until the write operation is complete, keeping the users’ data isolated in separate directories minimizes the impact of large file operations on other users.
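The suggestion above to prefer large files over many small ones can be sketched as follows: pack a directory of small files into one uncompressed tar archive, then stream the members back with a single sequential read. The helper names are hypothetical, and whether this pattern fits depends on the workload.

```python
import os
import tarfile


def bundle_directory(src_dir, archive_path):
    """Pack every regular file directly under src_dir into one tar archive.

    archive_path should live outside src_dir so the archive does not
    try to include itself.
    """
    with tarfile.open(archive_path, "w") as archive:
        for name in sorted(os.listdir(src_dir)):
            path = os.path.join(src_dir, name)
            if os.path.isfile(path):
                archive.add(path, arcname=name)


def iter_bundle(archive_path):
    """Yield (name, content_bytes) for each file, reading the archive in order."""
    with tarfile.open(archive_path, "r") as archive:
        for member in archive:
            if member.isfile():
                yield member.name, archive.extractfile(member).read()
```

Reading through iter_bundle opens one file on the shared file system instead of one per item, which trades many metadata operations for a single larger read.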
Since EFS performance is largely based on individual usage patterns, this document serves as a starting point rather than an absolute directive. Be aware that you need to tune your EFS configuration after monitoring user behavior, and it could require adjustments over time to ensure long-term performance.
Details on recommended EFS configuration settings
Max I/O vs. general purpose
When creating a file system, you must choose a performance mode, and this choice is permanent.
We strongly recommend using the GeneralPurpose mode.
File systems in the Max I/O mode can scale to higher levels of aggregate throughput and operations per second. This scaling comes with a tradeoff of slightly higher latencies for file metadata operations. Highly parallelized applications and workloads, such as big data analysis, media processing, and genomic analysis, can benefit from this mode.
However, in Posit’s testing, we’ve found Max I/O mode to offer significantly worse performance because of the increased latency, especially in the “many small files” scenario.
Bursting vs. provisioned throughput
EFS supports two throughput modes:
- Bursting
- Provisioned
From the AWS documentation page:
With Bursting Throughput mode, throughput on Amazon EFS scales as the size of your file system in the EFS Standard or One Zone storage class grows. With Provisioned Throughput mode, you can instantly provision the throughput of your file system (in MiB/s) independent of the amount of data stored.
The default bursting behavior is how most Posit customers should start using EFS, until you understand the characteristics of file access patterns and the costs they incur.
Monitor your Burst Credit balance and permitted throughput via CloudWatch to ensure you aren’t surprised by throttling if you run out of burst credits. We highly recommend setting alarms based on these metrics.
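As a starting point for such an alarm, the sketch below builds the parameters for a CloudWatch alarm on the BurstCreditBalance metric in the AWS/EFS namespace. The function name, alarm name, period, evaluation count, and threshold are all illustrative assumptions to adjust for your environment; the function only builds a dictionary and makes no AWS calls.

```python
def burst_credit_alarm_params(file_system_id, threshold_bytes, sns_topic_arn=None):
    """Build keyword arguments for a CloudWatch alarm on low EFS burst credits.

    All numeric choices below are placeholders, not Posit or AWS recommendations.
    """
    params = {
        "AlarmName": f"efs-{file_system_id}-low-burst-credits",  # hypothetical naming scheme
        "Namespace": "AWS/EFS",
        "MetricName": "BurstCreditBalance",
        "Dimensions": [{"Name": "FileSystemId", "Value": file_system_id}],
        "Statistic": "Minimum",
        "Period": 300,             # seconds per data point (assumption)
        "EvaluationPeriods": 3,    # consecutive breaching points before alarming (assumption)
        "Threshold": float(threshold_bytes),  # the metric is reported in bytes
        "ComparisonOperator": "LessThanThreshold",
    }
    if sns_topic_arn:
        # Optional notification target, e.g. an SNS topic.
        params["AlarmActions"] = [sns_topic_arn]
    return params
```

If boto3 is installed and credentials are configured, the result could be passed along as `boto3.client("cloudwatch").put_metric_alarm(**params)`.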
Remedy throttling by either generating more Burst Credits (writing files to the file system or waiting for the Burst Credits to refresh) or converting to Provisioned Throughput mode. Large file systems (> 1TB) should theoretically be able to burst for 50% of the time. For smaller file systems, set Provisioned Throughput to maintain a constant performance level.
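The "50% of the time" figure follows from the ratio of baseline to burst throughput. The sketch below works through that arithmetic using rates that are assumptions based on AWS's published bursting model at the time of writing (50 MiB/s baseline per TiB stored, bursting to 100 MiB/s per TiB with a 100 MiB/s floor); check the current AWS documentation before relying on them.

```python
BASELINE_MIB_S_PER_TIB = 50.0  # assumed baseline rate per TiB stored
BURST_MIB_S_PER_TIB = 100.0    # assumed burst rate per TiB stored
MIN_BURST_MIB_S = 100.0        # assumed burst floor for small file systems


def burst_profile(size_tib):
    """Return (baseline MiB/s, burst MiB/s, sustainable fraction of time bursting)."""
    baseline = BASELINE_MIB_S_PER_TIB * size_tib
    burst = max(MIN_BURST_MIB_S, BURST_MIB_S_PER_TIB * size_tib)
    # Credits accrue at the baseline rate and drain at the burst rate,
    # so the long-run share of time spent bursting is their ratio.
    return baseline, burst, baseline / burst


if __name__ == "__main__":
    for size in (0.1, 1.0, 4.0):
        baseline, burst, fraction = burst_profile(size)
        print(f"{size} TiB: baseline {baseline:.1f} MiB/s, "
              f"burst {burst:.1f} MiB/s, bursting {fraction:.0%} of the time")
```

Under these assumptions, file systems at or above 1 TiB sustain bursting half the time, while smaller ones fall well below that, which is why Provisioned Throughput is suggested for them.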
Generating large files to bump into a larger tier of burst performance is both time consuming and expensive, so weigh that cost before choosing this approach. For example, creating 1TB of data could cost hundreds of dollars per month in storage costs.
If migrating to EFS, Provisioned Throughput can help save time to move a lot of data. In testing, moving from Bursting to 500MiB Provisioned improved speed by 5x and preserved Burst Credits.
If you choose One Zone Storage (see below), the differences between bursting and provisioned performance appeared to be minimal in testing.
Multiple Availability Zone (default) vs. One Zone storage classes
AWS recently introduced Single Availability Zone (AZ) EFS with a different SLA than Multiple AZ instances. As of June 2021, Multiple AZ EFS instances support 99.99% uptime and Single AZ instances support 99.9% uptime. The Single AZ EFS is still durable, but there is no failover if the entire availability zone goes down. Most organizations adopting EFS to load balance Posit professional products prefer the Multiple AZ SLA. However, the Single AZ EFS performance is substantially better than Multiple AZ EFS, such that in testing, its performance was comparable to a custom-configured NFS server.
This makes the Single AZ EFS server a strong option when configuring development environments.
Read more about storage classes.
Use efs-utils when mounting the file system
We strongly recommend using efs-utils to mount the EFS file system. If this isn’t feasible, standard NFS client connections are possible, but review the mounting instructions and additional considerations first.
Setting the read_ahead_kb size to 15 MB
From the AWS performance tips page:
Linux kernels (5.4.*) use a read_ahead_kb of 128 KB; however, the AWS docs recommend 15 MB.
The efs-utils package should set this correctly, but we recommend checking this value regardless to ensure that it’s set to 15 MB as expected. Customers who wish to use only standard NFS utilities need to set this value manually.
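A minimal sketch of that check follows. It assumes the value is exposed under /sys/class/bdi/<major>:<minor>/read_ahead_kb for the device backing the mount, which is the location used for NFS mounts on Linux, and it assumes 15000 KB as the 15 MB target. The function names and the sysfs_root parameter are hypothetical.

```python
import os

EXPECTED_READ_AHEAD_KB = 15000  # assumed KB value for the 15 MB recommendation


def read_ahead_path(mount_point, sysfs_root="/sys"):
    """Return the bdi read_ahead_kb path for the file system at mount_point."""
    dev = os.stat(mount_point).st_dev
    bdi_name = f"{os.major(dev)}:{os.minor(dev)}"
    return os.path.join(sysfs_root, "class", "bdi", bdi_name, "read_ahead_kb")


def read_ahead_kb(mount_point, sysfs_root="/sys"):
    """Read the current read-ahead size in KB for the given mount point."""
    with open(read_ahead_path(mount_point, sysfs_root)) as fh:
        return int(fh.read().strip())


# Example usage on a host with EFS mounted at a path of your choosing:
# current = read_ahead_kb("/path/to/efs/mount")
# if current != EXPECTED_READ_AHEAD_KB:
#     print(f"read_ahead_kb is {current}, expected {EXPECTED_READ_AHEAD_KB}")
```

The sysfs_root parameter exists only so the lookup can be pointed at a test directory; on a real host the default applies.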
EC2 instance types
In general for EFS, AWS recommends preferring instance types with more CPU or memory depending on the workload. Prefer memory-optimized or compute-optimized over general purpose instance types.
In Posit benchmarking, we’ve observed performance gains by using memory-optimized instance types, e.g., r5. For UI-related tasks like installing the BH package, this could provide a better user experience.
For server installations that use many NFS client connections (e.g., Launcher), the enhanced networking might prove to be noticeably better. Consider using the n variants, e.g., r5n.
EC2 instance sizes
We’ve observed significant gains in going from large to xlarge instance sizes, primarily in parallelized load. For servers with many users, Posit recommends increasing the instance size.
Don’t attempt to use smaller instances, e.g., c5.large with 4 GB of memory.
Monitoring usage
The best way to monitor EFS performance is to configure AWS CloudWatch. Please refer to the available CloudWatch metrics and metric math for EFS resources for more information.
If using Bursting mode:
- Be sure to monitor the BurstCreditBalance metric. If this begins to decrease substantially over time, consider adding data to bump the file system size into a larger tier with more burst credits, or moving to Provisioned Throughput to establish a consistent baseline. This is likely to incur extra cost, so please consider the tradeoffs before proceeding.
- Using metric math, you can compare MeteredIOBytes to PermittedThroughput to know if you are using all your available throughput. If you are, it might be an indication that you should move to Provisioned Throughput.
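The comparison in the last bullet reduces to simple arithmetic once the two metrics are in the same units. The sketch below assumes MeteredIOBytes is taken as a Sum over the period (bytes) and PermittedThroughput as an average (bytes per second); the function name is hypothetical.

```python
def throughput_utilization_percent(metered_io_bytes_sum, period_seconds,
                                   permitted_bytes_per_second):
    """Return the share of permitted throughput actually used, as a percentage."""
    if period_seconds <= 0 or permitted_bytes_per_second <= 0:
        raise ValueError("period and permitted throughput must be positive")
    # Convert the summed bytes into an average rate over the period.
    used_bytes_per_second = metered_io_bytes_sum / period_seconds
    return 100.0 * used_bytes_per_second / permitted_bytes_per_second


if __name__ == "__main__":
    mib = 1024 * 1024
    # 3000 MiB metered over 60 s against a 100 MiB/s allowance.
    print(throughput_utilization_percent(3000 * mib, 60, 100 * mib))
```

Values that stay close to 100 over sustained periods suggest the file system is throughput-bound.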
If using Provisioned Throughput:
- Use PermittedThroughput to determine whether your storage volume has bumped you above your designated throughput setting.
Testing and benchmarking your configuration with fsbench
If you want to collect data about file system performance on your own EFS installation, use Posit’s benchmarking tool. The benchmarking tool runs a suite of file operations to help characterize system behavior and compare it against other known configurations.
For information on how to configure and run benchmark testing, please refer to the fsbench package documentation.