Autoscaling considerations
If you run Posit Connect with off-host execution and your Kubernetes cluster uses Karpenter to autoscale nodes, configure server and content pods to avoid disruption when Karpenter removes nodes.
Overview
Scaling up requires no Connect-specific configuration. The concerns on this page apply to scale-down only.
Kubernetes cluster autoscalers remove nodes from your cluster when those nodes appear underutilized. The principal risk for Connect is server-pod eviction: if Karpenter evicts a Connect server pod, in-progress content jobs that the evicted server instance owns can become orphaned and fail to complete.
To avoid this, isolate Connect server pods in a dedicated node pool that the autoscaler does not scale down, using taints and tolerations to pin them there, and add an eviction-prevention annotation so the autoscaler does not voluntarily disrupt them. Route content pods to a separate autoscaling node pool that the autoscaler can scale up and down.
Recommended architecture
The recommended architecture for running Connect in autoscaling clusters is:
- Run Connect server pods in a dedicated node pool that the autoscaler does not voluntarily disrupt, and add an eviction-prevention annotation as defense-in-depth.
- Run content pods in a separate autoscaling node pool, and distribute them across nodes using topology spread constraints.
- Keep the two node pools disjoint by tainting the server pool and assigning a nodeSelector to content pods.
Configuring the cluster
Create two node pools:
- A server pool labeled workload=connect-server and tainted with workload=connect-server:NoSchedule. Disable consolidation, or configure it to consolidate only empty nodes.
- A content pool labeled workload=connect-content, with no taint, sized to autoscale based on content workload.
The following Karpenter manifests provision the server pool and the content pool on AWS EKS. Adapt the nodeClassRef for your cloud provider.
nodepool-connect-server.yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: connect-server
spec:
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 1m
  template:
    metadata:
      labels:
        workload: connect-server
    spec:
      taints:
        - key: workload
          value: connect-server
          effect: NoSchedule
      nodeClassRef:
        name: default
        kind: EC2NodeClass
        group: karpenter.k8s.aws
nodepool-connect-content.yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: connect-content
spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s
  template:
    metadata:
      labels:
        workload: connect-content
    spec:
      nodeClassRef:
        name: default
        kind: EC2NodeClass
        group: karpenter.k8s.aws
Configuring Connect
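The Helm overlay in this section assumes that both node pools already exist in the cluster. The following is a minimal sketch for applying the two NodePool manifests and listing the result. It assumes the manifests are saved in the current directory under the file names shown above, and that kubectl points at the target cluster. The function name apply_nodepools is illustrative.

```shell
# Sketch: create or update both NodePools, then list them.
# Assumes the two manifest files from this page are in the current directory.
apply_nodepools() {
  # Stop early if the API server rejects either manifest.
  kubectl apply \
    -f nodepool-connect-server.yaml \
    -f nodepool-connect-content.yaml || return 1

  # Both NodePools should now be listed by name.
  kubectl get nodepools connect-server connect-content
}
```

Run apply_nodepools once before installing or upgrading Connect. Re-running it is harmless because kubectl apply is idempotent.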
Apply the following overlay to your Helm values.yaml. The server fields configure the Connect server Deployment; the backends.kubernetes.defaultResourceJobBase block configures every content Job that Connect submits.
values.yaml
# Server pod: pin to the dedicated server pool.
tolerations:
  - key: workload
    operator: Equal
    value: connect-server
    effect: NoSchedule
nodeSelector:
  workload: connect-server
pod:
  annotations:
    karpenter.sh/do-not-disrupt: "true"
# Content pods: pin every Connect-launched Job to the content pool and
# spread them across nodes.
backends:
  kubernetes:
    enabled: true
    defaultResourceJobBase:
      spec:
        template:
          spec:
            nodeSelector:
              workload: connect-content
            topologySpreadConstraints:
              - maxSkew: 1
                topologyKey: kubernetes.io/hostname
                # ScheduleAnyway lets content launch even when the content
                # pool has only one node available; Karpenter scales up after.
                whenUnsatisfiable: ScheduleAnyway
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/name: rstudio_connect
# Apply do-not-disrupt to render jobs only. Interactive content
# remains subject to voluntary Karpenter disruption.
config:
  Kubernetes:
    RenderedContentAnnotation:
      - "karpenter.sh/do-not-disrupt=true"
The karpenter.sh/do-not-disrupt annotation blocks voluntary disruption such as consolidation, drift, and expiration. The configuration above applies it to the Connect server pod through pod.annotations and to rendered content jobs through config.Kubernetes.RenderedContentAnnotation. The configuration does not annotate interactive content pods, so Karpenter can still voluntarily disrupt them. The annotation does not protect against involuntary disruption from node failure, manual kubectl drain, or pressure-based eviction. See the Karpenter disruption documentation for the full disruption model.
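To roll out the overlay, a Helm upgrade along the following lines should work. This is a sketch under stated assumptions, not the only supported procedure: the release name and namespace (both defaulting to connect here) are hypothetical, the chart repository is assumed to be registered under the alias rstudio, and values.yaml is assumed to be your complete values file with the overlay merged in.

```shell
# Sketch: apply values.yaml to a Connect Helm release.
# Assumptions: release "connect" in namespace "connect" (both hypothetical),
# and the chart repository is already added under the alias "rstudio".
apply_connect_values() {
  local release namespace values
  release="${1:-connect}"
  namespace="${2:-connect}"
  values="${3:-values.yaml}"

  helm upgrade --install "$release" rstudio/rstudio-connect \
    --namespace "$namespace" \
    --values "$values"
}
```

For example, apply_connect_values my-release my-namespace ./values.yaml upgrades an existing release in place. The scheduler moves the server pod to the server pool when the Deployment rolls.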
Known limitations
Even with the recommended configuration, the following limitations apply when running Connect in autoscaling clusters:
Orphaned jobs on server eviction: If a node failure, drain, or pressure-based eviction removes a Connect server pod, the content jobs that the server launched become orphaned and never complete. Re-run affected renders interactively from the Connect UI or via the server API; scheduled jobs recover at their next scheduled run.
Interactive content consolidation: The recommended configuration protects rendered content from voluntary Karpenter disruption but not interactive content such as Shiny and Streamlit applications. Adding an eviction-prevention annotation to interactive pods would prevent the content pool from scaling down whenever a user keeps a session open, so the recommended configuration tolerates this disruption to keep the content pool elastic.
Interactive session loss: When autoscaling evicts interactive content such as Shiny applications, users lose their sessions and per-session state and must reconnect.
Minimum process pinning: Content configured with min_processes >= 1 keeps at least one pod running continuously, which can prevent the content pool from scaling to zero.
No automatic environment restore retry: If an eviction interrupts environment restoration, Connect retries the restore only the next time the content is published, redeployed, or otherwise rebuilt.
No automatic render retry: Connect does not automatically retry render pods that an involuntary eviction terminates. The RenderedContentAnnotation shown above prevents voluntary Karpenter disruption.
Testing and validation
After applying these values, validate the setup:
Server pod placement: Confirm that the scheduler placed Connect server pods on the server pool:
Terminal
kubectl get pods -l app.kubernetes.io/name=rstudio-connect -o wide
Cross-reference the NODE column against:
Terminal
kubectl get nodes -l workload=connect-server
Server pod annotations: Confirm that the eviction-prevention annotation reached the pod:
Terminal
kubectl get pods -l app.kubernetes.io/name=rstudio-connect \
  -o jsonpath='{.items[*].metadata.annotations.karpenter\.sh/do-not-disrupt}'
Content pod placement: Deploy a few content items, then confirm that content pods land on the content pool and spread across nodes:
Terminal
kubectl get pods -l app.kubernetes.io/name=rstudio_connect -o wide
Voluntary disruption blocked for the server: With Karpenter installed, watch kubectl get nodeclaims -w while the cluster is otherwise idle. The server-pool NodeClaim should persist even when nodes become eligible for consolidation.
Content pool scales down when idle: Stop all content, wait for consolidateAfter, and confirm that Karpenter removes the content-pool NodeClaim.
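The two placement checks above compare the NODE column against a node list by eye. As a convenience, they could be scripted roughly as follows. This is a sketch, not part of Connect or Karpenter: the function name check_pod_placement is hypothetical, and the namespace default of default is an assumption to replace with the namespace where Connect runs. The label selectors are the ones used on this page.

```shell
# Sketch: check that every pod matching a label runs on a node carrying a
# given node label. Returns 0 if all match, 1 on a mismatch, 2 if no
# scheduled pods were found.
# Usage: check_pod_placement <pod-label> <node-label> [namespace]
check_pod_placement() {
  local pod_label node_label ns pool_nodes pod_nodes node status
  pod_label="$1"
  node_label="$2"
  ns="${3:-default}"   # assumption: adjust to your Connect namespace

  # Space-separated names of the nodes in the expected pool.
  pool_nodes="$(kubectl get nodes -l "$node_label" \
    -o jsonpath='{.items[*].metadata.name}')"
  # Space-separated node names that the matching pods are scheduled on.
  pod_nodes="$(kubectl get pods --namespace "$ns" -l "$pod_label" \
    -o jsonpath='{.items[*].spec.nodeName}')"

  if [ -z "$pod_nodes" ]; then
    echo "no scheduled pods match $pod_label in namespace $ns" >&2
    return 2
  fi

  status=0
  for node in $pod_nodes; do
    case " $pool_nodes " in
      *" $node "*) echo "OK: $node" ;;
      *) echo "FAIL: $node is not labeled $node_label"; status=1 ;;
    esac
  done
  return "$status"
}
```

For the server pool, call check_pod_placement app.kubernetes.io/name=rstudio-connect workload=connect-server. For content, call check_pod_placement app.kubernetes.io/name=rstudio_connect workload=connect-content. A node name printed more than once means that several matching pods share that node.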