Capacity planning - content with non-default process settings
Problem
You want to identify content that may be a resource outlier and review the capacity planning decisions behind it.
When a content item’s process settings are null, it uses the server-level defaults from the [Scheduler] configuration section in Posit Connect. Non-null values indicate that a publisher or administrator has explicitly overridden the defaults for that content item.
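As a minimal sketch of this fallback rule, the snippet below resolves the effective settings for a single content item. The `SERVER_DEFAULTS` values and the `effective_settings` helper are hypothetical and exist only for illustration; take the real defaults from the [Scheduler] section of your own Connect configuration.

```python
# Hypothetical placeholder defaults for illustration only.
# Replace them with the values from your server's [Scheduler] section.
SERVER_DEFAULTS = {
    "max_processes": 3,
    "min_processes": 0,
    "max_conns_per_process": 20,
    "load_factor": 0.5,
    "idle_timeout": 5,
}


def effective_settings(item, defaults=SERVER_DEFAULTS):
    """Return (effective values, names of overridden settings) for one item."""
    effective = {}
    overridden = []
    for name, default in defaults.items():
        value = item.get(name)
        if value is None:
            # Null (or missing) means the server-level default applies.
            effective[name] = default
        else:
            # Non-null means the item explicitly overrides the default.
            effective[name] = value
            overridden.append(name)
    return effective, overridden


# A content item shaped like a row of the content listing (illustrative data).
item = {"title": "Risk Model", "max_processes": None, "idle_timeout": 3600}
effective, overridden = effective_settings(item)
print(overridden)
print(effective["idle_timeout"], effective["max_processes"])
```

An item whose `overridden` list is empty runs entirely on server defaults and is out of scope for this recipe.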
Solution
Retrieve all content from the server, filter for items with non-default process settings (max_processes, min_processes, max_conns_per_process, load_factor, or idle_timeout), and cross-reference with usage data to assess whether the overrides are justified.
from posit import connect
import polars as pl

client = connect.Client()

# Retrieve all content items (include owner details)
all_content = client.content.find(include="owner")
content_df = pl.DataFrame(all_content, infer_schema_length=None)

# Define the process settings columns to check
process_settings = [
    "max_processes",
    "min_processes",
    "max_conns_per_process",
    "load_factor",
    "idle_timeout",
]

# Filter to content with at least one non-null process setting
content_with_overrides = (
    content_df
    .filter(
        pl.any_horizontal([pl.col(col).is_not_null() for col in process_settings])
    )
    .with_columns(
        pl.col("owner")
        .map_elements(lambda x: x["username"], return_dtype=pl.String)
        .alias("owner_username")
    )
    .select(
        ["guid", "title", "owner_username", "app_mode"]
        + process_settings
        + ["dashboard_url"]
    )
)

The resulting table lists all content with at least one customized process setting.
>>> content_with_overrides
shape: (4, 10)
┌──────────────────┬───────────────────┬────────────────┬──────────┬───────────────┬───────────────┬───────────────────────┬─────────────┬──────────────┬──────────────────┐
│ guid ┆ title ┆ owner_username ┆ app_mode ┆ max_processes ┆ min_processes ┆ max_conns_per_process ┆ load_factor ┆ idle_timeout ┆ dashboard_url │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str ┆ str ┆ i64 ┆ i64 ┆ i64 ┆ f64 ┆ i64 ┆ str │
╞══════════════════╪═══════════════════╪════════════════╪══════════╪═══════════════╪═══════════════╪═══════════════════════╪═════════════╪══════════════╪══════════════════╡
│ 5258049f-fe5e-… ┆ Sales Dashboard ┆ publisher1 ┆ shiny ┆ 10 ┆ 2 ┆ null ┆ null ┆ null ┆ https://connect… │
│ 11471207-1059-… ┆ Forecast API ┆ publisher2 ┆ python- ┆ 5 ┆ 1 ┆ 50 ┆ 0.5 ┆ null ┆ https://connect… │
│ ┆ ┆ ┆ api ┆ ┆ ┆ ┆ ┆ ┆ │
│ deec1ee8-3f14-… ┆ Risk Model ┆ publisher1 ┆ python- ┆ null ┆ null ┆ null ┆ null ┆ 3600 ┆ https://connect… │
│ ┆ ┆ ┆ api ┆ ┆ ┆ ┆ ┆ ┆ │
│ a1b2c3d4-e5f6-… ┆ Analytics App ┆ publisher3 ┆ shiny ┆ 20 ┆ 5 ┆ 20 ┆ 0.8 ┆ 900 ┆ https://connect… │
└──────────────────┴───────────────────┴────────────────┴──────────┴───────────────┴───────────────┴───────────────────────┴─────────────┴──────────────┴──────────────────┘

Cross-referencing with usage data
To assess whether the overrides are justified, cross-reference with content usage data. Rather than fetching all usage data from the server, query usage only for the content items that have overrides.
from datetime import datetime, timedelta, timezone

# Retrieve usage data from the last 90 days, only for content with overrides
since = datetime.now(timezone.utc) - timedelta(days=90)
override_guids = content_with_overrides["guid"].to_list()

usage_records = []
for guid in override_guids:
    records = client.metrics.usage.find(
        content_guid=guid, start=since.isoformat()
    )
    usage_records.extend(records)
usage_df = pl.DataFrame(usage_records, infer_schema_length=None)

# Count usage events per content item
usage_counts = (
    usage_df
    .group_by("content_guid")
    .agg(pl.len().alias("usage_count"))
)

# Join usage data with the overrides table
overrides_with_usage = (
    content_with_overrides
    .join(usage_counts, left_on="guid", right_on="content_guid", how="left")
    .with_columns(pl.col("usage_count").fill_null(0))
    .sort("usage_count", descending=True)
)

The resulting table shows which content items have process overrides alongside their usage volume.
>>> overrides_with_usage.select(["title", "max_processes", "min_processes", "usage_count"])
shape: (4, 4)
┌───────────────────┬───────────────┬───────────────┬─────────────┐
│ title ┆ max_processes ┆ min_processes ┆ usage_count │
│ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 ┆ u32 │
╞═══════════════════╪═══════════════╪═══════════════╪═════════════╡
│ Sales Dashboard ┆ 10 ┆ 2 ┆ 5230 │
│ Forecast API ┆ 5 ┆ 1 ┆ 1843 │
│ Analytics App ┆ 20 ┆ 5 ┆ 42 │
│ Risk Model ┆ null ┆ null ┆ 0 │
└───────────────────┴───────────────┴───────────────┴─────────────┘

Full example
from posit import connect
from datetime import datetime, timedelta, timezone
import polars as pl

client = connect.Client()

# Retrieve all content items (include owner details)
all_content = client.content.find(include="owner")
content_df = pl.DataFrame(all_content, infer_schema_length=None)

# Define the process settings columns to check
process_settings = [
    "max_processes",
    "min_processes",
    "max_conns_per_process",
    "load_factor",
    "idle_timeout",
]

# Filter to content with at least one non-null process setting
content_with_overrides = (
    content_df
    .filter(
        pl.any_horizontal([pl.col(col).is_not_null() for col in process_settings])
    )
    .with_columns(
        pl.col("owner")
        .map_elements(lambda x: x["username"], return_dtype=pl.String)
        .alias("owner_username")
    )
    .select(
        ["guid", "title", "owner_username", "app_mode"]
        + process_settings
        + ["dashboard_url"]
    )
)

# Retrieve usage data from the last 90 days, only for content with overrides
since = datetime.now(timezone.utc) - timedelta(days=90)
override_guids = content_with_overrides["guid"].to_list()

usage_records = []
for guid in override_guids:
    records = client.metrics.usage.find(
        content_guid=guid, start=since.isoformat()
    )
    usage_records.extend(records)
usage_df = pl.DataFrame(usage_records, infer_schema_length=None)

# Count usage events per content item
usage_counts = (
    usage_df
    .group_by("content_guid")
    .agg(pl.len().alias("usage_count"))
)

# Join usage data with the overrides table
overrides_with_usage = (
    content_with_overrides
    .join(usage_counts, left_on="guid", right_on="content_guid", how="left")
    .with_columns(pl.col("usage_count").fill_null(0))
    .sort("usage_count", descending=True)
)

library(connectapi)
library(dplyr)
library(tidyr)
client <- connect()
# Retrieve all content items
content_df <- get_content(client)
# Define the process settings columns to check
process_settings <- c(
  "max_processes",
  "min_processes",
  "max_conns_per_process",
  "load_factor",
  "idle_timeout"
)

# Filter to content with at least one non-null process setting
content_with_overrides <- content_df |>
  filter(if_any(all_of(process_settings), \(x) !is.na(x))) |>
  hoist(owner, owner_username = "username") |>
  select(guid, title, owner_username, app_mode,
         all_of(process_settings), dashboard_url)

The resulting table lists all content with at least one customized process setting.
> content_with_overrides
# A tibble: 4 x 10
  guid            title           owner_username app_mode   max_processes min_processes max_conns_per_process load_factor idle_timeout dashboard_url
  <chr>           <chr>           <chr>          <chr>              <int>         <int>                 <int>       <dbl>        <int> <chr>
1 5258049f-fe5e-… Sales Dashboard publisher1     shiny                 10             2                    NA        NA             NA https://connect…
2 11471207-1059-… Forecast API    publisher2     python-api             5             1                    50         0.5           NA https://connect…
3 deec1ee8-3f14-… Risk Model      publisher1     python-api            NA            NA                    NA        NA           3600 https://connect…
4 a1b2c3d4-e5f6-… Analytics App   publisher3     shiny                 20             5                    20         0.8          900 https://connect…

Cross-referencing with usage data
To assess whether the overrides are justified, cross-reference with content usage data. Rather than fetching all usage data from the server, query usage only for the content items that have overrides.
# Retrieve usage data from the last 90 days, only for content with overrides
since <- Sys.time() - as.difftime(90, units = "days")
override_guids <- content_with_overrides$guid
usage <- purrr::map(override_guids, \(guid) {
  shiny <- get_usage_shiny(client, content_guid = guid, from = since,
                           limit = Inf)
  static <- get_usage_static(client, content_guid = guid, from = since,
                             limit = Inf) |>
    rename(started = time)
  bind_rows(shiny, static)
}) |>
  purrr::list_rbind()

# Count usage events per content item
usage_counts <- usage |>
  count(content_guid, name = "usage_count")

# Join usage data with the overrides table
overrides_with_usage <- content_with_overrides |>
  left_join(usage_counts, by = c("guid" = "content_guid")) |>
  mutate(usage_count = replace_na(usage_count, 0)) |>
  arrange(desc(usage_count))

The resulting table shows which content items have process overrides alongside their usage volume.
> overrides_with_usage |> select(title, max_processes, min_processes, usage_count)
# A tibble: 4 x 4
  title           max_processes min_processes usage_count
  <chr>                   <int>         <int>       <int>
1 Sales Dashboard            10             2        5230
2 Forecast API                5             1        1843
3 Analytics App              20             5          42
4 Risk Model                 NA            NA           0

Full example
library(connectapi)
library(dplyr)
library(tidyr)
library(purrr)
client <- connect()
# Retrieve all content items
content_df <- get_content(client)
# Define the process settings columns to check
process_settings <- c(
  "max_processes",
  "min_processes",
  "max_conns_per_process",
  "load_factor",
  "idle_timeout"
)

# Filter to content with at least one non-null process setting
content_with_overrides <- content_df |>
  filter(if_any(all_of(process_settings), \(x) !is.na(x))) |>
  hoist(owner, owner_username = "username") |>
  select(guid, title, owner_username, app_mode,
         all_of(process_settings), dashboard_url)

# Retrieve usage data from the last 90 days, only for content with overrides
since <- Sys.time() - as.difftime(90, units = "days")
override_guids <- content_with_overrides$guid

usage <- map(override_guids, \(guid) {
  shiny <- get_usage_shiny(client, content_guid = guid, from = since,
                           limit = Inf)
  static <- get_usage_static(client, content_guid = guid, from = since,
                             limit = Inf) |>
    rename(started = time)
  bind_rows(shiny, static)
}) |>
  list_rbind()

# Count usage events per content item
usage_counts <- usage |>
  count(content_guid, name = "usage_count")

# Join usage data with the overrides table
overrides_with_usage <- content_with_overrides |>
  left_join(usage_counts, by = c("guid" = "content_guid")) |>
  mutate(usage_count = replace_na(usage_count, 0)) |>
  arrange(desc(usage_count))

Discussion
Content items with high max_processes or min_processes values can consume significant server resources, and that cost is hard to justify when usage is low. Cross-referencing with usage data helps you identify potential mismatches:
- High max_processes with low max_conns_per_process: You might have over-provisioned the content. Consider load testing to determine whether the settings are justified.
- High min_processes with low usage: Idle processes consume memory even when no users are active. Consider lowering min_processes or removing the override entirely.
- idle_timeout set very high: Content processes stay alive longer, consuming resources during idle periods.
- Low load_factor: Spawns new processes aggressively, which may result in more concurrent processes than necessary.
Content with null values for all process settings uses the server-level defaults defined in the [Scheduler] section of the Connect configuration file.
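The heuristics in the list above can be sketched as a small rule function. Every threshold and the `review_flags` helper are assumptions made for illustration, not values recommended by Connect; `idle_timeout` is assumed to be expressed in seconds. Tune the cutoffs to your own server before relying on the output.

```python
# Illustrative thresholds only (assumptions, not Connect recommendations).
HIGH_MAX_PROCESSES = 10
LOW_MAX_CONNS = 5
HIGH_MIN_PROCESSES = 2
LOW_USAGE = 100            # usage events in the review window
HIGH_IDLE_TIMEOUT = 1800   # assumed to be seconds
LOW_LOAD_FACTOR = 0.25


def review_flags(row):
    """Return the heuristics from the Discussion list that a row (dict) triggers."""
    max_p = row.get("max_processes")
    min_p = row.get("min_processes")
    conns = row.get("max_conns_per_process")
    load = row.get("load_factor")
    idle = row.get("idle_timeout")
    usage = row.get("usage_count") or 0

    flags = []
    # Null settings fall back to server defaults, so they are never flagged.
    if (max_p is not None and conns is not None
            and max_p >= HIGH_MAX_PROCESSES and conns <= LOW_MAX_CONNS):
        flags.append("high max_processes with low max_conns_per_process")
    if min_p is not None and min_p >= HIGH_MIN_PROCESSES and usage < LOW_USAGE:
        flags.append("high min_processes with low usage")
    if idle is not None and idle >= HIGH_IDLE_TIMEOUT:
        flags.append("idle_timeout set very high")
    if load is not None and load <= LOW_LOAD_FACTOR:
        flags.append("low load_factor")
    return flags


# Two rows copied from the example output earlier in this recipe.
rows = [
    {"title": "Analytics App", "max_processes": 20, "min_processes": 5,
     "max_conns_per_process": 20, "load_factor": 0.8, "idle_timeout": 900,
     "usage_count": 42},
    {"title": "Risk Model", "max_processes": None, "min_processes": None,
     "max_conns_per_process": None, "load_factor": None, "idle_timeout": 3600,
     "usage_count": 0},
]
for row in rows:
    print(row["title"], review_flags(row))
```

With the Python table from the Solution, `overrides_with_usage.iter_rows(named=True)` yields dicts that can be passed to `review_flags` directly. A flag is a prompt for review, not proof that an override is wrong.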
See also
- Viewing Content Runtime Settings to create a table summarizing the overall usage of custom runtime settings.
- Finding Content with Custom Runtime Settings to find content with customized RunAs or RunAsCurrentUser settings.
- Viewing Content Usage Information for more details on retrieving usage data.
- Process management in the Admin Guide for details on how Connect manages content processes.