Integrating Posit Workbench with Spark and sparklyr
sparklyr is an R interface for Apache Spark. It lets you install and connect to Spark, filter and aggregate Spark datasets using dplyr syntax, and then bring the results into R for analysis and visualization.
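As a rough illustration of that workflow, the sketch below connects to Spark, aggregates a table with dplyr verbs, and collects the result into R. It assumes a local Spark installation is available (for example, one installed with `spark_install()`) and uses `master = "local"` purely for demonstration; on a cluster the master value depends on your configuration. The table name `mtcars_spark` and the built-in `mtcars` data set are placeholders chosen for this example.

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark installation, used here only for illustration.
# On a Spark/Hadoop cluster the master value depends on your setup.
sc <- spark_connect(master = "local")

# Copy a small built-in data set into Spark (hypothetical table name).
cars_tbl <- copy_to(sc, mtcars, "mtcars_spark", overwrite = TRUE)

# Filter and aggregate inside Spark using dplyr syntax,
# then collect() the (small) result into an R data frame.
summary_df <- cars_tbl %>%
  filter(hp > 100) %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg, na.rm = TRUE)) %>%
  collect()

spark_disconnect(sc)
```

The filtering and aggregation run in Spark, and only the summarized rows are transferred to the R session by `collect()`, which keeps the amount of data moved into R small.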
You can install Posit Workbench, formerly RStudio Workbench[^1], within a Spark/Hadoop cluster and use sparklyr from R sessions.
The following articles describe how to integrate Workbench with a Spark cluster in different configurations:
- Using sparklyr with Cloudera CDH
- Using sparklyr with Amazon EMR
- Deployment and configuration options
Visit spark.rstudio.com for more information about sparklyr.