Amazon Simple Storage Service (S3)
Package Manager can also use Amazon Simple Storage Service (S3) as a storage provider. This integration requires AWS credentials and updates to the Package Manager configuration file.
As a best practice, AWS recommends that you specify credentials in the following order:
- Use IAM roles for Amazon EC2 (if your application is running on an Amazon EC2 instance). IAM roles provide temporary security credentials to your instance to make AWS calls. IAM roles provide an easy way to distribute and manage credentials on multiple Amazon EC2 instances.
- Use a shared credentials file. This credentials file is the same one used by other SDKs and the AWS CLI. If you’re already using a shared credentials file, you can also use it for this purpose.
- Use environment variables. Setting environment variables is useful if you’re doing development work on a machine other than an Amazon EC2 instance.
If you select IAM roles for Amazon EC2 instances, Package Manager will automatically use the instance’s credentials.
See the AWS CLI Configuration for detailed documentation on configuring your environment for interaction with AWS.
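Before configuring Package Manager, it can help to check which of the credential sources listed above are actually present on the host. The following is a minimal sketch, not part of Package Manager: the function name `available_credential_sources` is hypothetical, and it only checks for the standard shared credentials file (`~/.aws/credentials`, or the path in `AWS_SHARED_CREDENTIALS_FILE`) and the standard access key environment variables. It does not detect IAM roles, and it does not reproduce how the AWS SDK or Package Manager resolves credentials.

```python
import os
from pathlib import Path


def available_credential_sources(env=None, credentials_file=None):
    """Return the credential sources that appear to be configured.

    Hypothetical helper for illustration only. It checks for presence,
    not validity, and does not query EC2 instance metadata for IAM roles.
    """
    env = os.environ if env is None else env
    if credentials_file is None:
        # Standard AWS location, unless overridden by the environment.
        credentials_file = (
            env.get("AWS_SHARED_CREDENTIALS_FILE")
            or Path.home() / ".aws" / "credentials"
        )

    sources = []
    if Path(credentials_file).is_file():
        sources.append("shared credentials file")
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        sources.append("environment variables")
    return sources


if __name__ == "__main__":
    found = available_credential_sources()
    print("Detected credential sources:", found or "none")
```

Run the check as the same user that runs the Package Manager service, because the home directory and environment of that user determine what is found.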
The credentials Package Manager uses for S3 storage must have the following permissions for the bucket:
For testing with environment variables, create and edit a new file at
Environment="AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE"
Environment="AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
Environment="AWS_DEFAULT_REGION=us-west-2"
Then, reload the systemd process and restart the Package Manager service with:
On the Package Manager side, the Storage and S3Storage sections must be updated. Here is a simple example using the bucket my-s3-bucket, the region us-east-1, and a shared configuration:
[Storage]
; Sets all storage classes to use S3 instead of the `DataDir`
Default = s3

; Default S3 settings. This is the minimum-required setting for using S3.
[S3Storage]
Bucket = my-s3-bucket
Region = us-east-1
EnableSharedConfig = true
Users with advanced or specific needs can configure storage classes individually. For example, you could use this configuration if you only wanted to store internal R and CRAN packages in S3 and use local storage for everything else:
[Storage]
Packages = s3
CRAN = s3

[S3Storage]
Bucket = my-s3-bucket
Region = us-east-1
EnableSharedConfig = true

; Override default S3 settings for the "packages" class. This demonstrates
; all the available S3 configuration settings.
[S3Storage "packages"]
Bucket = another-s3-bucket
Prefix = rspm-packages
Profile = dev-rspm
Region = us-west-1
EnableSharedConfig = true
For more information on the storage classes, see the appendix.
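To see how a per-class section relates to the default S3 settings, the per-class configuration above can be parsed with a short script. This is only a sketch: the helper `effective_s3_settings` is hypothetical, and it assumes that values in `[S3Storage "<class>"]` take precedence over `[S3Storage]` and that unspecified keys fall back to the defaults. Package Manager's own parser may behave differently, so treat the output as a reading aid rather than a statement of its behavior.

```python
import configparser

# The per-class example from this section, reproduced for parsing.
SAMPLE = """
[Storage]
Packages = s3
CRAN = s3

[S3Storage]
Bucket = my-s3-bucket
Region = us-east-1
EnableSharedConfig = true

[S3Storage "packages"]
Bucket = another-s3-bucket
Prefix = rspm-packages
Profile = dev-rspm
Region = us-west-1
EnableSharedConfig = true
"""


def effective_s3_settings(text, storage_class):
    """Merge [S3Storage] with [S3Storage "<class>"] (assumed precedence)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(text)

    settings = {}
    if parser.has_section("S3Storage"):
        settings.update(parser["S3Storage"])

    # Assumption: a class-specific section overrides the defaults key by key.
    override = f'S3Storage "{storage_class.lower()}"'
    if parser.has_section(override):
        settings.update(parser[override])
    return settings


if __name__ == "__main__":
    for name in ("packages", "cran"):
        print(name, effective_s3_settings(SAMPLE, name))
```

Under these assumptions, the "packages" class resolves to another-s3-bucket in us-west-1 with the rspm-packages prefix, while the "cran" class uses my-s3-bucket in us-east-1.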
While all application data including package and cache information can be stored in S3, Package Manager requires an encryption key available at boot to function properly. Preserving this key between instances is necessary for both high availability and ephemeral environments.
Note that the key is generated automatically and stored at the path set by Server.EncryptionKeyPath. For ephemeral environments, or environments where an environment variable is preferred, the PACKAGEMANAGER_ENCRYPTION_KEY environment variable can also be used to supply this key. See the environment variables section of the appendix for more information.
For users with strict security requirements, Package Manager supports client-side encryption with S3 and KMS. This setup requires a symmetric KMS key and additional credential permissions:
It also requires including the KMS Key ID in the Package Manager configuration file, for example:
[S3Storage]
Bucket = my-s3-bucket
Region = us-east-1
EnableSharedConfig = true
KMSKeyID = XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
Package Manager uses Transport Layer Security (TLS) for all communication with S3. For customers who want additional security, we instead recommend using server-side encryption for Amazon S3 buckets.
Client-side encryption uses the Go implementation for AES/GCM. Due to this, objects to be encrypted or decrypted will be fully loaded into memory before encryption or decryption can occur. Users must allocate additional memory to avoid allocation failures. This will also result in slower upload and download speeds for clients.
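For capacity planning, a rough upper bound on the extra memory can be sketched as follows. The function `estimate_peak_buffer_bytes` is hypothetical, and the model is an assumption rather than a measurement of Package Manager: it supposes that, in the worst case, the full plaintext and the full ciphertext of each object are held in memory at the same time, and that each concurrent transfer needs its own buffers. The only fixed quantity is the 16-byte authentication tag that AES-GCM appends by default.

```python
GCM_TAG_BYTES = 16  # default AES-GCM authentication tag size


def estimate_peak_buffer_bytes(object_bytes, concurrent_transfers=1):
    """Worst-case buffer estimate for client-side encryption.

    Assumes plaintext and ciphertext are both resident at once and that
    every concurrent transfer holds its own pair of buffers.
    """
    if object_bytes < 0 or concurrent_transfers < 1:
        raise ValueError("sizes must be non-negative and transfers >= 1")
    per_transfer = object_bytes + (object_bytes + GCM_TAG_BYTES)
    return per_transfer * concurrent_transfers


if __name__ == "__main__":
    size = 500 * 1024 * 1024  # a 500 MiB object
    peak = estimate_peak_buffer_bytes(size, concurrent_transfers=4)
    print(f"Estimated peak buffers: {peak / 2**30:.2f} GiB")
```

With these assumptions, four concurrent transfers of a 500 MiB object would need roughly 3.9 GiB for buffers alone, on top of the memory Package Manager normally uses.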