Raw Metrics Chronicle Generates
Overview
Posit Chronicle gathers and processes metrics on a scheduled basis. This means you might not see metrics files immediately when first starting Chronicle. It also means there is a delay before the most recent data appears in the curated metrics files.
By default, Chronicle retrieves Prometheus metrics data once every 60 seconds and API-based metrics once every 20 minutes. Chronicle then aggregates and curates the metrics data once a day, shortly after 6 AM UTC.
For an exhaustive list of all the metrics Chronicle produces, see the following pages:
Data lifecycle management
Chronicle data accumulates over time. You can delete old data to free up storage space. The Chronicle data store (e.g., /var/lib/posit-chronicle/data) has several top-level directories.
- Do not manipulate /private data directly. The Chronicle server manages this directory completely.
- Do not read /hourly data directly. It is the largest volume of data. You can delete hourly data after a few days. This is optional, and you can keep this data indefinitely if you like.
- /daily data is relatively low-volume and represents “raw” data in Parquet format. Keep this data indefinitely unless available storage capacity is a major concern.
- /curated data is low-volume and represents fully processed, report-ready data in Parquet format. Keep this data indefinitely.
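As a rough illustration of the optional hourly cleanup, the sketch below lists hourly files older than a chosen age without deleting anything. It rests on several assumptions that are not confirmed by this page: that the hourly directory is named `hourly` under the data store, that file modification time is a reasonable proxy for data age, and that seven days is an acceptable reading of "a few days". The function name `find_stale_hourly_files` is hypothetical.

```python
import time
from pathlib import Path


def find_stale_hourly_files(data_dir, max_age_days=7, now=None):
    """Return hourly files whose modification time is older than max_age_days.

    Assumption: hourly data lives in a "hourly" subdirectory of data_dir.
    This function only lists candidates; it never deletes anything.
    """
    hourly = Path(data_dir) / "hourly"
    if not hourly.is_dir():
        return []
    reference = time.time() if now is None else now
    cutoff = reference - max_age_days * 86400  # seconds per day
    return sorted(
        p for p in hourly.rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


if __name__ == "__main__":
    # Example data store path taken from the text above; adjust as needed.
    for path in find_stale_hourly_files("/var/lib/posit-chronicle/data"):
        print(path)  # review the list before removing files, e.g. with path.unlink()
```

Listing first and deleting in a separate, reviewed step keeps the cleanup away from the private, daily, and curated directories.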
Metrics details
This section provides technical details about the metrics Chronicle gathers and processes.
Common schema
Every metric shares a set of common column definitions, explained in the table below:
| Attribute | Type | Description |
|---|---|---|
| timestamp | date-time | The time that this metric value was scraped. |
| host | string | The name of the machine where this metric value originated. |
| environment | string | The name of the environment specified in the Chronicle configuration. |
Metric types
Each Chronicle metric includes a type:
| Type | Definition | Example |
|---|---|---|
| gauge | The value for these metrics records a measurement at a point in time. Can increase or decrease. | The number of currently licensed active users on a Connect installation. |
| sum | The total count of events over a given observation window. Can only increase. | The total number of sessions launched on Workbench. |
| histogram | A set of counts of events over a given observation window. Each count corresponds to the count of events within a specified limit. | A set of counts of session startup times grouped by the startup duration (see detailed example below). |
| non-numeric | These metrics do not have a numeric value, but are a collection of observed data at a point in time. | The current version of Connect running on a given host. |
Histogram example
Histogram values are grouped into buckets, each representing a range of durations. Each bucket has a limit, which marks the upper end of that range. The value for a bucket shows how many sessions had a duration less than or equal to that bucket’s limit and greater than the limit of the previous bucket.
For example, if Workbench reported these five session startup durations:
- 8 seconds
- 3 seconds
- 42 seconds
- 4 seconds
- 325 seconds
The stored histogram bucket values would be:
| Value | Limit |
|---|---|
| 0 | 0.0 |
| 0 | 1.0 |
| 2 | 5.0 |
| 1 | 10.0 |
| 0 | 30.0 |
| 1 | 60.0 |
| 0 | 300.0 |
| 1 | Infinity |
The row with limit 5.0 reports a value of 2, since both the 3 and 4 second durations from the example are less than or equal to 5.0 and greater than 1.0. The row with limit 10.0 reports a value of 1, which represents the 8 seconds duration from the example.
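The bucketing rule can be sketched in a few lines of Python. This is an illustration of the rule as described above, not Chronicle's implementation; the function name `bucket_counts` is hypothetical, and the limits are copied from the example table.

```python
import math
from bisect import bisect_left

# Bucket limits copied from the example table above.
LIMITS = [0.0, 1.0, 5.0, 10.0, 30.0, 60.0, 300.0, math.inf]


def bucket_counts(durations, limits=LIMITS):
    """Count durations per bucket: previous limit < duration <= bucket limit."""
    counts = [0] * len(limits)
    for duration in durations:
        # bisect_left returns the index of the first limit >= duration,
        # which is the bucket with the smallest limit that still covers it.
        counts[bisect_left(limits, duration)] += 1
    return list(zip(counts, limits))


# The five session startup durations from the example, in seconds.
for value, limit in bucket_counts([8, 3, 42, 4, 325]):
    print(value, limit)
```

The printed rows reproduce the Value and Limit columns of the table above, including the two durations counted under limit 5.0.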
Aggregation strategies
Based on the type of metric represented in the data, Chronicle aggregates metrics according to one of the following strategies. Each individual metric type defined in this section includes the aggregation approach used to aggregate its data, or N/A if that metric is not aggregated.
The examples below use this data series:
| Timestamp | Value |
|---|---|
| 01:00 | 12 |
| 01:01 | 12 |
| 01:02 | 12 |
| 01:03 | 13 |
| 01:04 | 15 |
| 01:05 | 15 |
| 01:06 | 15 |
| 01:07 | 15 |
| 01:08 | 16 |
Deduplication aggregation
A value is retained if it represents either the first or last observation with that value. With the example dataset above, deduplication aggregation would aggregate the series to:
| Timestamp | Value |
|---|---|
| 01:00 | 12 |
| 01:02 | 12 |
| 01:03 | 13 |
| 01:04 | 15 |
| 01:07 | 15 |
| 01:08 | 16 |
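A minimal Python sketch of this strategy follows. It assumes that "first or last observation with that value" refers to the first and last points of each run of consecutive identical values, which is consistent with the example but not stated explicitly; the function name `deduplicate` is hypothetical.

```python
# Example data series from the section above, as (timestamp, value) pairs.
SERIES = [
    ("01:00", 12), ("01:01", 12), ("01:02", 12),
    ("01:03", 13),
    ("01:04", 15), ("01:05", 15), ("01:06", 15), ("01:07", 15),
    ("01:08", 16),
]


def deduplicate(series):
    """Keep the first and last point of each run of equal consecutive values."""
    kept = []
    last_index = len(series) - 1
    for i, (timestamp, value) in enumerate(series):
        starts_run = i == 0 or series[i - 1][1] != value
        ends_run = i == last_index or series[i + 1][1] != value
        if starts_run or ends_run:
            kept.append((timestamp, value))
    return kept


for timestamp, value in deduplicate(SERIES):
    print(timestamp, value)
```

With the example series, the output has the same six rows as the table above.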
Delta aggregation
Only the difference between consecutive values is considered, and a difference is retained only if it is not 0. The first observation has no predecessor; in the example it appears with a value of 0. With the example dataset above, delta aggregation would aggregate the series to:
| Timestamp | Value |
|---|---|
| 01:00 | 0 |
| 01:03 | 1 |
| 01:04 | 2 |
| 01:08 | 1 |
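The same idea can be sketched in Python. This assumes, based only on the example table, that the first observation is kept as a baseline with a delta of 0; the function name `delta_aggregate` is hypothetical and this is not Chronicle's own code.

```python
# Example data series from the section above, as (timestamp, value) pairs.
SERIES = [
    ("01:00", 12), ("01:01", 12), ("01:02", 12),
    ("01:03", 13),
    ("01:04", 15), ("01:05", 15), ("01:06", 15), ("01:07", 15),
    ("01:08", 16),
]


def delta_aggregate(series):
    """Return (timestamp, delta) pairs, keeping only non-zero deltas.

    Assumption: the first observation is emitted with a delta of 0,
    as in the example table.
    """
    if not series:
        return []
    result = [(series[0][0], 0)]
    for (_, previous), (timestamp, current) in zip(series, series[1:]):
        difference = current - previous
        if difference != 0:
            result.append((timestamp, difference))
    return result


for timestamp, difference in delta_aggregate(SERIES):
    print(timestamp, difference)
```

With the example series, the output has the same four rows as the table above.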
Full retention aggregation
All data is retained. Unlike metrics without a defined aggregation strategy, this approach combines hourly data and stores it in the corresponding daily folders.