Sparklyr is an R interface for Apache Spark that allows you to:

  • Install and connect to Spark using YARN, Mesos, Livy or Kubernetes
  • Use dplyr to filter and aggregate Spark datasets and streams, then bring them into R for analysis and visualization
  • Use MLlib, H2O, XGBoost and GraphFrames to train models at scale in Spark
  • Create interoperable machine learning pipelines and productionize them with MLeap
  • Create extensions that call the full Spark API or run distributed R code to support new functionality
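
As a rough illustration of the first three points, the sketch below connects to a local Spark instance, aggregates a small table with dplyr, collects the result into R, and fits an MLlib linear regression. It assumes the sparklyr and dplyr packages are installed and that a local Spark installation is available (for example via `spark_install()`); the table name `mtcars_spark` and the choice of the built-in `mtcars` data are illustrative only.

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark installation exists. If not, uncomment the
# next line to download one.
# spark_install()

# Connect to a local Spark instance (a cluster would use a different master).
sc <- spark_connect(master = "local")

# Copy a small built-in R data frame into Spark; the table name is illustrative.
cars_tbl <- copy_to(sc, mtcars, "mtcars_spark", overwrite = TRUE)

# Filter and aggregate inside Spark, then bring the small result into R.
result <- cars_tbl %>%
  filter(hp >= 100) %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg, na.rm = TRUE)) %>%
  collect()

# Fit a linear regression with Spark MLlib on the Spark table.
fit <- ml_linear_regression(cars_tbl, mpg ~ wt + cyl)

spark_disconnect(sc)
```

Here `collect()` is the step that moves data from Spark into an ordinary R data frame, so it is usually best called only after the data has been reduced to a manageable size.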

Visit the sparklyr website for more information.