Autoscaling considerations

If you run Posit Connect with off-host execution and your Kubernetes cluster uses Karpenter to autoscale nodes, configure server and content pods to avoid disruption when Karpenter removes nodes.

Overview

Scaling up requires no Connect-specific configuration. The concerns on this page apply to scale-down only.

Kubernetes cluster autoscalers remove nodes from your cluster when those nodes appear underutilized. The principal risk for Connect is server-pod eviction: if Karpenter evicts a Connect server pod, in-progress content jobs that the evicted server instance owns can become orphaned and fail to complete.

To avoid this, isolate Connect server pods in a dedicated node pool that the autoscaler does not scale down, using taints and tolerations to pin them there, and add an eviction-prevention annotation so the autoscaler does not voluntarily disrupt them. Route content pods to a separate autoscaling node pool that the autoscaler can scale up and down.
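
The pod-side half of this setup can be sketched as a JSON merge patch. This is a minimal sketch under stated assumptions, not the documented configuration: the Deployment name rstudio-connect, the taint dedicated=connect-server:NoSchedule, and the use of kubectl patch are all hypothetical. The node label workload=connect-server and the karpenter.sh/do-not-disrupt annotation are the ones checked in the validation steps on this page. In a Helm-managed install, set the equivalent chart values instead, because a manual patch is overwritten on the next upgrade.

```shell
#!/bin/sh
# Emit a JSON merge patch for the Connect server Deployment's pod template.
# Assumptions (hypothetical, adjust to your cluster):
#   - server-pool nodes carry the label workload=connect-server
#   - server-pool nodes carry the taint dedicated=connect-server:NoSchedule
server_pod_patch() {
  cat <<'EOF'
{
  "spec": {
    "template": {
      "metadata": {
        "annotations": { "karpenter.sh/do-not-disrupt": "true" }
      },
      "spec": {
        "nodeSelector": { "workload": "connect-server" },
        "tolerations": [
          {
            "key": "dedicated",
            "operator": "Equal",
            "value": "connect-server",
            "effect": "NoSchedule"
          }
        ]
      }
    }
  }
}
EOF
}

# Apply it (needs cluster access; the Deployment name is an assumption).
# A merge patch replaces the whole tolerations list, so include any
# tolerations the pods already rely on:
#   kubectl patch deployment rstudio-connect --type merge -p "$(server_pod_patch)"
```

The toleration lets server pods schedule onto the tainted pool, the node selector keeps them from landing anywhere else, and the annotation asks Karpenter not to disrupt them voluntarily.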

Known limitations

Even with the recommended configuration, the following limitations apply when running Connect in autoscaling clusters:

  1. Orphaned jobs on server eviction: If a node failure, drain, or pressure-based eviction removes a Connect server pod, the content jobs that the server launched become orphaned and never complete. Re-run affected renders interactively from the Connect UI or via the server API; scheduled jobs recover at their next scheduled run.

  2. Interactive content consolidation: The recommended configuration protects rendered content from voluntary Karpenter disruption, but it does not protect interactive content such as Shiny and Streamlit applications. Adding an eviction-prevention annotation to interactive pods would prevent the content pool from scaling down whenever a user keeps a session open, so the recommended configuration tolerates this disruption to keep the content pool elastic.

  3. Interactive session loss: When autoscaling evicts interactive content such as Shiny applications, users lose their sessions and per-session state and must reconnect.

  4. Minimum process pinning: Content configured with min_processes >= 1 keeps at least one pod running continuously, which can prevent the content pool from scaling to zero.

  5. No automatic environment restore retry: If an eviction interrupts environment restoration, Connect retries the restore only the next time the content is published, redeployed, or otherwise rebuilt.

  6. No automatic render retry: Connect does not automatically retry render pods that an involuntary eviction terminates. The RenderedContentAnnotation setting prevents only voluntary Karpenter disruption; it does not protect render pods from node failure or pressure-based eviction.

Testing and validation

After applying the recommended configuration, validate the setup:

  1. Server pod placement: Confirm that the scheduler placed Connect server pods on the server pool:

    Terminal
    kubectl get pods -l app.kubernetes.io/name=rstudio-connect -o wide

    Cross-reference the NODE column against:

    Terminal
    kubectl get nodes -l workload=connect-server

  2. Server pod annotations: Confirm that the eviction-prevention annotation reached the pod:

    Terminal
    kubectl get pods -l app.kubernetes.io/name=rstudio-connect \
      -o jsonpath='{.items[*].metadata.annotations.karpenter\.sh/do-not-disrupt}'

  3. Content pod placement: Deploy a few content items, then confirm that content pods land on the content pool and spread across nodes:

    Terminal
    kubectl get pods -l app.kubernetes.io/name=rstudio_connect -o wide
  4. Voluntary disruption blocked for the server: With Karpenter installed, run kubectl get nodeclaims -w and watch its output while the cluster is otherwise idle. The server-pool NodeClaim should persist even when nodes become eligible for consolidation.

  5. Content pool scales down when idle: Stop all content, wait for the consolidateAfter period to elapse, and confirm that Karpenter removes the content-pool NodeClaim.
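
The cross-reference in step 1 can be automated rather than read by eye. The following is a minimal sketch: the helper name pods_off_pool and the file names pods.txt and nodes.txt are hypothetical, and the label selectors are the ones used in step 1.

```shell
#!/bin/sh
# pods_off_pool PODS_FILE NODES_FILE
#   PODS_FILE:  one "pod-name node-name" pair per line
#   NODES_FILE: one allowed node name per line
# Prints each pod that is unscheduled or runs on a node outside the list.
pods_off_pool() {
  awk '
    FILENAME == ARGV[1] { allowed[$1] = 1; next }  # first file: allowed nodes
    NF == 0             { next }                   # skip blank lines
    $2 == ""            { print $1 " is not scheduled"; next }
    !($2 in allowed)    { print $1 " is on " $2 }
  ' "$2" "$1"
}

# Usage against a live cluster (needs kubectl access):
#   kubectl get pods -l app.kubernetes.io/name=rstudio-connect \
#     -o jsonpath='{range .items[*]}{.metadata.name} {.spec.nodeName}{"\n"}{end}' > pods.txt
#   kubectl get nodes -l workload=connect-server -o name | sed 's|^node/||' > nodes.txt
#   pods_off_pool pods.txt nodes.txt
```

Empty output means every server pod runs on a server-pool node. Any printed line names a pod to investigate.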

Additional resources