Raw Metrics Chronicle Generates

Overview

Posit Chronicle gathers and processes metrics on a scheduled basis. This means you might not see metrics files immediately when first starting Chronicle. It also means there is a delay before the most recent data appears in the curated metrics files.

By default, Chronicle retrieves Prometheus metrics data once every 60 seconds and API-based metrics once every 20 minutes. Chronicle then aggregates and curates the metrics data once a day, shortly after 6 AM UTC.

For an exhaustive list of all the metrics Chronicle produces, see the following pages:

Data lifecycle management

Chronicle data accumulates over time. You can delete old data to free up storage space. The Chronicle data store (e.g., /var/lib/posit-chronicle/data) has several top-level directories, each with its own retention guidance:

  • Do not manipulate /private data directly. The Chronicle server manages this directory completely.
  • Do not read /hourly data directly. This directory holds the largest volume of data. You can delete hourly data once it is a few days old, but doing so is optional; you can also keep it indefinitely.
  • /daily data is relatively low-volume and represents “raw” data in Parquet format. Keep this data indefinitely unless available storage capacity is a major concern.
  • /curated data is low-volume and represents fully processed, report-ready data in Parquet format. Keep this data indefinitely.
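The cleanup guidance for /hourly can be sketched as a dry run that only lists candidate files and deletes nothing. This is a minimal sketch under several assumptions: the helper name find_stale_files is hypothetical, the hourly directory is taken to sit directly under the example data store path, file age is judged by modification time, and 7 days stands in for "a few days". Chronicle itself may organize or age its files differently, so review the output before removing anything.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def find_stale_files(directory, max_age_days, now=None):
    """Return files under `directory` last modified more than `max_age_days` ago.

    Hypothetical helper: it only reports candidates and never deletes them.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    root = Path(directory)
    if not root.is_dir():
        # Nothing to report if the directory does not exist on this host.
        return []
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )


if __name__ == "__main__":
    # Assumed location, based on the example data store path above.
    hourly_dir = "/var/lib/posit-chronicle/data/hourly"
    for path in find_stale_files(hourly_dir, max_age_days=7):
        print(path)
```

Listing first and deleting in a separate, reviewed step keeps the /private, /daily, and /curated directories out of scope by construction.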

Metrics details

This section provides technical details about the metrics Chronicle gathers and processes.

Common schema

Every metric shares a set of common column definitions, explained in the table below:

Attribute | Type | Description
timestamp | date-time | The time that this metric value was scraped.
host | string | The name of the machine where this metric value originated.
environment | string | The name of the environment specified in the Chronicle configuration.

Metric types

Each Chronicle metric includes a type:

Type | Definition | Example
gauge | The value for these metrics records a measurement at a point in time. Can increase or decrease. | The number of currently licensed active users on a Connect installation.
sum | The total count of events over a given observation window. Can only increase. | The total number of sessions launched on Workbench.
histogram | A set of counts of events over a given observation window. Each count corresponds to the count of events within a specified limit. | A set of counts of session startup times grouped by the startup duration (see detailed example below).
non-numeric | These metrics do not have a numeric value, but are a collection of observed data at a point in time. | The current version of Connect running on a given host.

Histogram example

Histogram values are grouped into buckets, each representing a range of durations. Each bucket has a limit, which marks the upper end of that range. The value for a bucket shows how many sessions had a duration less than or equal to that bucket’s limit and greater than the limit of the previous bucket.

For example, if Workbench reported these five session startup durations:

  • 8 seconds
  • 3 seconds
  • 42 seconds
  • 4 seconds
  • 325 seconds

The stored histogram bucket values would be:

Value Limit
0 0.0
0 1.0
2 5.0
1 10.0
0 30.0
1 60.0
0 300.0
1 Infinity

The row with limit 5.0 reports a value of 2, since both the 3 and 4 second durations from the example are less than or equal to 5.0 and greater than 1.0. The row with limit 10.0 reports a value of 1, which represents the 8 seconds duration from the example.
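The bucketing rule can be checked with a short Python sketch. It is illustrative only and is not Chronicle's implementation: the function name bucket_counts is hypothetical, and the limits are copied from the example table above.

```python
import math
from bisect import bisect_left

# Bucket upper limits, copied from the example table above.
LIMITS = [0.0, 1.0, 5.0, 10.0, 30.0, 60.0, 300.0, math.inf]


def bucket_counts(durations, limits=LIMITS):
    """Count durations per bucket, where previous limit < duration <= limit."""
    counts = [0] * len(limits)
    for duration in durations:
        # bisect_left returns the index of the first limit that is >= duration,
        # which is the bucket whose upper limit covers this duration.
        counts[bisect_left(limits, duration)] += 1
    return counts


counts = bucket_counts([8, 3, 42, 4, 325])
for value, limit in zip(counts, LIMITS):
    print(value, limit)
```

A duration exactly equal to a limit, such as 5 seconds, lands in that limit's bucket rather than the next one, which matches the "less than or equal to" wording above.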

Aggregation strategies

Chronicle aggregates each metric according to one of the following strategies, chosen based on the type of metric. Each individual metric defined in this section lists the aggregation strategy applied to its data, or N/A if that metric is not aggregated.

The examples below use this data series:

Timestamp Value
01:00 12
01:01 12
01:02 12
01:03 13
01:04 15
01:05 15
01:06 15
01:07 15
01:08 16

Deduplication aggregation

An observation is retained if it is either the first or the last observation in a run of consecutive identical values. With the example dataset above, deduplication aggregation would aggregate the series to:

Timestamp Value
01:00 12
01:02 12
01:03 13
01:04 15
01:07 15
01:08 16
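The following Python sketch reproduces the deduplicated series above. It assumes that "first or last observation with that value" refers to runs of consecutive identical values, which is consistent with the example but is an interpretation; the function name deduplicate is hypothetical and this is not Chronicle's own code.

```python
# Example data series from this section, as (timestamp, value) pairs.
SERIES = [
    ("01:00", 12), ("01:01", 12), ("01:02", 12),
    ("01:03", 13),
    ("01:04", 15), ("01:05", 15), ("01:06", 15), ("01:07", 15),
    ("01:08", 16),
]


def deduplicate(series):
    """Keep the first and last point of each run of consecutive equal values."""
    kept = []
    for i, (timestamp, value) in enumerate(series):
        first_of_run = i == 0 or series[i - 1][1] != value
        last_of_run = i == len(series) - 1 or series[i + 1][1] != value
        if first_of_run or last_of_run:
            kept.append((timestamp, value))
    return kept


for timestamp, value in deduplicate(SERIES):
    print(timestamp, value)
```

Keeping both ends of each run preserves when a value started and when it stopped being observed, while dropping the repeated points in between.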

Delta aggregation

Only the difference between consecutive values is recorded, and a difference is retained only if it is not 0. The first observation in the series has no predecessor, and in the example it appears with a value of 0. With the example dataset above, delta aggregation would aggregate the series to:

Timestamp Value
01:00 0
01:03 1
01:04 2
01:08 1
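The following Python sketch reproduces the delta series above. It assumes, based only on the example table, that the first observation is always emitted with a value of 0 as a starting point; the function name delta_aggregate is hypothetical and this is not Chronicle's own code.

```python
# Example data series from this section, as (timestamp, value) pairs.
SERIES = [
    ("01:00", 12), ("01:01", 12), ("01:02", 12),
    ("01:03", 13),
    ("01:04", 15), ("01:05", 15), ("01:06", 15), ("01:07", 15),
    ("01:08", 16),
]


def delta_aggregate(series):
    """Emit non-zero differences between consecutive values."""
    if not series:
        return []
    first_timestamp, previous = series[0]
    # Assumption taken from the example table: the first row is kept with 0.
    kept = [(first_timestamp, 0)]
    for timestamp, value in series[1:]:
        difference = value - previous
        if difference != 0:
            kept.append((timestamp, difference))
        previous = value
    return kept


for timestamp, delta in delta_aggregate(SERIES):
    print(timestamp, delta)
```

Summing the retained deltas gives 4, the total change from 12 to 16 across the example series.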

Full retention aggregation

All data is retained. Unlike metrics without a defined aggregation strategy, this approach combines hourly data and stores it in the corresponding daily folders.
