Connect request rejection guide

This guide provides metrics, query patterns, and investigation workflows for detecting and diagnosing request rejections in Posit Connect using OpenTelemetry signals.

Overview

This guide addresses the following operational question:

  • Are we rejecting users due to capacity? - Rejection rate monitoring, breakdown by reason and content type, and investigation workflows

Are we rejecting users due to capacity?

This question addresses whether Connect turns away users. Connect rejects requests when the server cannot serve them due to worker capacity exhaustion, license limits, or authentication/authorization failures.

Rejection signals to check

Total rejections

Question: Does Connect reject any requests?

Primary Metric: requests.rejected (Counter, unit: {request})

Dimensions:

  • rejection.reason: Why the request was rejected (capacity, license, auth)
  • rejection.type: What type of content or request was rejected (shiny, api, or a content type string)

Query Pattern:

Datadog: sum:requests.rejected{*}.as_rate()
Prometheus (PromQL): sum(rate(requests_rejected_total[5m]))
Generic: RATE(requests.rejected) over 5 minutes

Interpretation:

  • 0 = No rejections; Connect serves all requests
  • Any non-zero rate indicates Connect turns away users

Usage: Display as a time-series graph. Check this signal first when investigating whether users experience access problems. Any sustained non-zero rate warrants further investigation by reason and type.
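
To make the rate in these queries concrete, here is a minimal sketch of how a per-second rate can be derived from cumulative counter samples such as those of requests.rejected. The function name counter_rate and the (timestamp, value) sample layout are assumptions for illustration. Monitoring backends apply their own alignment and extrapolation rules, so this approximates the idea rather than reproducing any backend's exact algorithm.

```python
from typing import Sequence, Tuple


def counter_rate(samples: Sequence[Tuple[float, float]]) -> float:
    """Approximate per-second rate from (timestamp_seconds, cumulative_value) samples.

    A drop in value is treated as a counter reset (for example, after a
    process restart), and the post-reset value is counted as new increase.
    """
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        increase += (curr - prev) if curr >= prev else curr
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical samples over a 5-minute (300 s) window.
    window = [(0.0, 10.0), (150.0, 25.0), (300.0, 40.0)]
    print(counter_rate(window))
```

A result of 0.0 over the window corresponds to the "no rejections" case; any positive value means at least one request was rejected in that window.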


Rejections by reason

Question: Why does Connect reject requests?

Primary Metric: requests.rejected (Counter, unit: {request})

Dimensions:

  • rejection.reason: Reason for rejection

Query Pattern:

Datadog: sum:requests.rejected{*} by {rejection.reason}.as_rate()
Prometheus (PromQL): sum by (rejection_reason) (rate(requests_rejected_total[5m]))
Generic: RATE(requests.rejected) GROUP BY rejection.reason over 5 minutes

Interpretation:

The rejection.reason dimension has three possible values:

  • capacity - Worker or application capacity has been reached. Connect has no available workers to handle the request. This is the most common rejection reason during periods of high usage.
  • license - A license limit has been exceeded. Either the named user limit or the concurrent Shiny user limit has been reached.
  • auth - Authentication or authorization failure. Connect rejected the request due to invalid credentials, locked accounts, insufficient permissions, or group-based login restrictions.

Usage: Group by rejection.reason to understand the primary cause of rejections. Each reason points to a different remediation path.


Rejections by content type

Question: What types of content or requests does Connect reject?

Primary Metric: requests.rejected (Counter, unit: {request})

Dimensions:

  • rejection.type: Type of content or request

Query Pattern:

Datadog: sum:requests.rejected{*} by {rejection.type}.as_rate()
Prometheus (PromQL): sum by (rejection_type) (rate(requests_rejected_total[5m]))
Generic: RATE(requests.rejected) GROUP BY rejection.type over 5 minutes

Interpretation:

The rejection.type dimension indicates what was being accessed when the rejection occurred:

  • api - Connect rejected an API request (due to authentication failure or named user license limit)
  • shiny (with rejection.reason=license) - Connect rejected a Shiny application request because the concurrent Shiny user license limit was reached. This is distinct from capacity rejections: it means the server has hit its licensed limit for simultaneous Shiny users, not that worker processes are unavailable.
  • Content type strings (e.g., shiny, rmd-shiny, jupyter-voila, python-dash, with rejection.reason=capacity) - Connect rejected a content request due to worker capacity exhaustion. The value corresponds to the content’s application mode. Note that the shiny type can appear with either a license or a capacity reason; filter by rejection.reason to distinguish between the two.

Usage: Group by rejection.type to identify which content types are most affected by rejections. This helps prioritize capacity planning for specific workload types.
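
Because the meaning of rejection.type depends on rejection.reason, it can help to interpret the two dimensions together. The sketch below encodes the combinations described above; the function name describe_rejection and the returned strings are hypothetical, and combinations not covered in this guide are reported as unrecognized rather than guessed.

```python
def describe_rejection(reason: str, rejection_type: str) -> str:
    """Interpret a (rejection.reason, rejection.type) pair as described in this guide."""
    if reason == "capacity":
        # For capacity rejections the type is the content's application mode.
        return f"worker capacity exhausted for content type '{rejection_type}'"
    if reason == "license":
        if rejection_type == "shiny":
            return "concurrent Shiny user license limit reached"
        if rejection_type == "api":
            return "named user license limit reached"
    if reason == "auth":
        return "authentication or authorization failure"
    return "unrecognized combination"


if __name__ == "__main__":
    # The same type value reads differently depending on the reason.
    print(describe_rejection("license", "shiny"))
    print(describe_rejection("capacity", "shiny"))
```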


Rejection rate as percentage of total requests

Question: What fraction of requests does Connect reject?

Primary Metrics:

  • requests.rejected (Counter) - Total rejected requests
  • http.server.request.duration (Histogram) - Request duration histogram; its count series gives the total number of HTTP requests served

Query Pattern:

Datadog: sum:requests.rejected{*}.as_rate() / sum:http.server.request.duration.count{*}.as_rate() * 100
Prometheus (PromQL): sum(rate(requests_rejected_total[5m])) / sum(rate(http_server_request_duration_seconds_count[5m])) * 100
Generic: RATE(requests.rejected) / RATE(COUNT(http.server.request.duration)) * 100

Interpretation:

  • < 1% = Occasional rejections, likely acceptable
  • 1-5% = Noticeable impact on users, investigate the cause
  • > 5% = Significant user impact, requires immediate attention

Usage: Display as a percentage gauge or time-series. This provides context for the raw rejection count: a handful of rejections during peak traffic may be acceptable, while the same count during low traffic indicates a more serious issue.
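
The percentage and its interpretation bands can be sketched as a small helper, for example when post-processing query results in a script. The function names and band labels are hypothetical. The guide leaves the exact boundary values open, so this sketch places 1% and 5% in the middle band, and returning None when there is no traffic is a choice made for this example.

```python
from typing import Optional


def rejection_percentage(rejected_rate: float, total_rate: float) -> Optional[float]:
    """Rejected requests as a percentage of all requests, or None with no traffic."""
    if total_rate <= 0:
        return None
    return 100.0 * rejected_rate / total_rate


def classify_rejection_percentage(pct: float) -> str:
    """Map a percentage to the bands listed under Interpretation."""
    if pct < 1.0:
        return "acceptable"    # occasional rejections
    if pct <= 5.0:
        return "investigate"   # noticeable impact on users
    return "immediate"         # significant user impact


if __name__ == "__main__":
    # Hypothetical rates in requests per second.
    pct = rejection_percentage(rejected_rate=3.0, total_rate=100.0)
    if pct is not None:
        print(pct, classify_rejection_percentage(pct))
```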


Debugging

When rejections are detected, follow this workflow to identify the cause and remediation:

  1. Identify which rejection reason is elevated: Check requests.rejected grouped by rejection.reason to determine whether the issue is capacity, licensing, or authentication.

  2. For capacity rejections:

    • Check worker.pool.utilization to confirm worker pools are saturated (values near 1.0)
    • Review worker.pool.busy vs worker.pool.size to see how many workers are available
    • Identify which content types are affected by grouping rejections by rejection.type
    • Consider scaling worker capacity or optimizing long-running content
    • See the job queue operations guide for worker pool monitoring details
  3. For license rejections:

    • Check license.users.current vs license.users.limit for named user utilization
    • Check license.shiny_users.current vs license.shiny_users.limit for Shiny user utilization
    • Determine whether the rejection type is shiny (concurrent Shiny user limit) or api (named user limit)
    • See the license capacity guide for detailed license monitoring
  4. For auth rejections:

    • Review authentication configuration and provider health
    • Check for locked user accounts
    • Verify group-based access restrictions are configured correctly
    • Review Connect server logs for specific authentication error details
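
The workflow above can be sketched as a lookup from elevated rejection reasons to the follow-up checks listed in each step. The names NEXT_CHECKS and triage are hypothetical, and the input is assumed to be a mapping of rejection.reason values to rejection rates that was already obtained from a query grouped by reason.

```python
from typing import Dict, List, Tuple

# Follow-up checks per rejection.reason, taken from the steps above.
NEXT_CHECKS: Dict[str, List[str]] = {
    "capacity": [
        "worker.pool.utilization",
        "worker.pool.busy vs worker.pool.size",
        "requests.rejected grouped by rejection.type",
    ],
    "license": [
        "license.users.current vs license.users.limit",
        "license.shiny_users.current vs license.shiny_users.limit",
        "requests.rejected grouped by rejection.type (shiny or api)",
    ],
    "auth": [
        "authentication configuration and provider health",
        "locked user accounts",
        "group-based access restrictions",
        "Connect server logs",
    ],
}


def triage(rates_by_reason: Dict[str, float]) -> List[Tuple[str, List[str]]]:
    """Return reasons with a non-zero rate, highest first, with their checks."""
    elevated = [(reason, rate) for reason, rate in rates_by_reason.items() if rate > 0]
    elevated.sort(key=lambda item: item[1], reverse=True)
    return [(reason, NEXT_CHECKS.get(reason, [])) for reason, _ in elevated]


if __name__ == "__main__":
    # Hypothetical rates in rejections per second.
    for reason, checks in triage({"capacity": 0.4, "license": 0.0, "auth": 1.2}):
        print(reason, "->", "; ".join(checks))
```

Sorting by rate puts the dominant reason first, which matches step 1 of the workflow: start with whichever reason is most elevated.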

Common causes and remediation

Capacity rejections:

  • Insufficient worker capacity - The number of available workers is too low for the current load. Increase Applications.MaxProcs or scale horizontally.
  • Long-running content - Applications or reports that hold workers for extended periods reduce availability for other users. Identify long-running content through traces and optimize or set appropriate timeouts.
  • Burst traffic - Sudden spikes in concurrent users exceed available capacity. Consider auto-scaling or load balancing strategies.
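
As a rough illustration of confirming worker saturation before scaling, the sketch below derives a utilization fraction from busy and total worker counts. It assumes that utilization is the ratio of worker.pool.busy to worker.pool.size, and the 0.9 threshold for "near 1.0" is an arbitrary value chosen for this example; neither is stated by this guide.

```python
def pool_utilization(busy: int, size: int) -> float:
    """Fraction of workers in use; 0.0 when the pool size is not positive."""
    if size <= 0:
        return 0.0
    return busy / size


def is_saturated(busy: int, size: int, threshold: float = 0.9) -> bool:
    """True when utilization meets or exceeds the (assumed) threshold."""
    return pool_utilization(busy, size) >= threshold


if __name__ == "__main__":
    # Hypothetical readings of worker.pool.busy and worker.pool.size.
    print(pool_utilization(9, 10), is_saturated(9, 10))
```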

License rejections:

  • Approaching named user limit - The number of active users is near or at the licensed limit. Review user activity and consider upgrading the license.
  • Concurrent Shiny user limit - Too many users are simultaneously accessing Shiny applications. Stagger usage or upgrade the Shiny user limit in the license.

Authentication rejections:

  • Misconfigured authentication - Authentication provider changes or misconfigurations cause login failures. Verify provider settings and connectivity.
  • Expired credentials - Users with expired passwords or revoked access tokens fail authentication. Review user account status and credential policies.
  • Group restrictions - Connect denies access to users not in the required groups. Verify group membership and login group configuration.