Chronicle stores the data it produces in parquet files. For most users, the Reports included with Chronicle are the easiest way to access this data. If you want to enhance those reports, write your own, or use the data for other purposes, this section describes how to access the data that Chronicle stores. You can also reference the code in the report QMD files.
Data Directory Structure
The Chronicle data directory is organized into a few subdirectories:
/var/lib/posit-chronicle/data
    /private
    /hourly
        /v2
            /<metric-name>
    /daily
        /v2
            /<metric-name>
The private directory contains transient data. This data is short-lived and should not be accessed by users.
Every hour, the private data is processed and stored in the hourly directory. This data is minimally processed and relatively high volume: it includes "duplicate" values, where a metric repeats the same value over a period of time. This data can be used for custom reporting, but reports must query it efficiently because of its volume.
Every day, the hourly data is further processed and aggregated into the daily directory. This processing eliminates duplicate values and significantly reduces the data volume. The aggregation applied varies by metric, and some metrics are not aggregated; the aggregation strategies are described below. The daily data is well suited for reporting.
The structure within hourly and daily is identical. Each contains one or more top-level vN subdirectories to delineate different versions of Chronicle's internal data schema for each metric. Individual metrics are stored under the appropriate version directory. Within each metric directory, data is organized by the date and time it was gathered.
The following is a partial example of the directory structure. Note that daily data is stored for each day, while hourly data is stored for each hour.
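For example, using placeholders for the date- and hour-level directory names (the exact naming may differ in your installation):

/var/lib/posit-chronicle/data
    /daily
        /v2
            /<metric-name>
                /<date>
    /hourly
        /v2
            /<metric-name>
                /<date-and-hour>

Each date or hour directory contains the parquet files gathered for that period.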
While parquet files are similar in concept to CSV files, they use a binary, column-oriented format optimized for read/write performance, so most text editors cannot open them without a plugin.
Both the RStudio IDE and Positron support viewing parquet files without additional extensions.
If you are using VS Code, our team recommends the Parquet Explorer extension to read and query parquet files directly in your editor.
Another common approach is to convert .parquet files into .csv files for easier viewing, using Python and the pandas library:
Terminal
>>> import pandas as pd
>>> df = pd.read_parquet('filename.parquet')
>>> df.to_csv('filename.csv')
Metrics Generated by Chronicle
Metrics are gathered and processed on a scheduled basis. This means that you may not see metrics files immediately when first starting Chronicle. It also means that there is a delay before the latest data shows up in the refined metrics files.
By default, the agent retrieves metrics data once every 60 seconds. The metrics data is processed into refined metrics once an hour. This process happens shortly after the top of the hour. The exact timing is not entirely predictable due to processing delays, but the refinement process typically completes by 15 minutes after the top of the hour. The hourly data is aggregated into daily data once a day, shortly after midnight, UTC.
The following is an exhaustive list of the metrics Chronicle produces, covering Chronicle itself, Posit Connect, and Posit Workbench. Each metric is listed with its metric type, aggregation strategy, and a brief description of what the metric represents.
These metrics are stored in separate folders based on the name of the metric. These folders are located within the configured storage location, which is /var/lib/posit-chronicle/data by default.
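For example, pandas (with the pyarrow engine installed) can read an entire metric directory in one call. The path below is only an illustration: it assumes the default storage location, schema version v2, and the connect_content_visits metric; adjust it to match your installation.

import pandas as pd

# Read every parquet file under a metric directory into one DataFrame.
# The path is an example; substitute your storage location, schema
# version, and metric name.
visits = pd.read_parquet("/var/lib/posit-chronicle/data/daily/v2/connect_content_visits")
print(visits.head())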
Chronicle
chronicle_status
Metric Type: non-numeric
Aggregation Strategy: N/A
Description: Status information from the Chronicle Server.
Connect
connect_build_info
Metric Type: non-numeric
Aggregation Strategy: Deduplication
Description: Build information for Connect.

connect_content_app_sessions_current
Metric Type: gauge
Aggregation Strategy: Deduplication
Description: The current number of active user sessions on a given piece of Shiny content.

connect_contents
Metric Type: non-numeric
Aggregation Strategy: Deduplication
Description: Each row of this metric records data about a single content item.

connect_content_visits
Metric Type: gauge
Aggregation Strategy: Full Retention
Description: Each row of this metric records a user visit to a content item.

connect_feature_usage
Metric Type: non-numeric
Aggregation Strategy: Deduplication
Description: Each row of this metric records whether a specific feature of Connect is currently used.

connect_groups
Metric Type: non-numeric
Aggregation Strategy: Deduplication
Description: Each row of this metric records data about a single user group.

connect_group_members
Metric Type: non-numeric
Aggregation Strategy: N/A
Description: Each row of this metric records a relationship between a user and a user group.

connect_installed_versions_python
Metric Type: non-numeric
Aggregation Strategy: Deduplication
Description: Each row of this metric records a version of Python which is currently installed.

connect_installed_versions_quarto
Metric Type: non-numeric
Aggregation Strategy: Deduplication
Description: Each row of this metric records a version of Quarto which is currently installed.

connect_installed_versions_r
Metric Type: non-numeric
Aggregation Strategy: Deduplication
Description: Each row of this metric records a version of R which is currently installed.

connect_installed_versions_tensorflow
Metric Type: non-numeric
Aggregation Strategy: Deduplication
Description: Each row of this metric records a version of TensorFlow which is currently installed.

connect_licensed_active_users
Metric Type: gauge
Aggregation Strategy: Deduplication
Description: The current number of users consuming license seats.

connect_license_user_seats
Metric Type: gauge
Aggregation Strategy: Deduplication
Description: The total number of licensed seats allowed.

connect_shiny_usage
Metric Type: gauge
Aggregation Strategy: Full Retention
Description: Each row of this metric records the duration of a user visit to a Shiny content item.

connect_tags
Metric Type: non-numeric
Aggregation Strategy: Deduplication
Description: Each row of this metric records data about a single content tag.

connect_users
Metric Type: non-numeric
Aggregation Strategy: Deduplication
Description: Each row of this metric records data about a single user.

up
Metric Type: gauge
Aggregation Strategy: N/A
Description: An indicator of whether the Chronicle agent was able to scrape Prometheus metrics from Connect. A value of 1 represents success.
Workbench
pwb_active_user_sessions
Metric Type: gauge
Aggregation Strategy: Deduplication
Description: Number of active user login sessions from this server.

pwb_build_info
Metric Type: non-numeric
Aggregation Strategy: Deduplication
Description: Build information for Workbench.

pwb_jobs_launched_total
Metric Type: sum
Aggregation Strategy: Delta
Description: A running total of all jobs launched in Workbench.

pwb_license_active_users
Metric Type: gauge
Aggregation Strategy: Deduplication
Description: The current number of users consuming license seats in Workbench.

pwb_license_user_seats
Metric Type: gauge
Aggregation Strategy: Deduplication
Description: The total number of licensed seats allowed in Workbench.

pwb_sessions_launched_total
Metric Type: sum
Aggregation Strategy: Delta
Description: A running total of all sessions launched in Workbench.

pwb_session_startup_duration_seconds_bucket
Metric Type: histogram
Aggregation Strategy: N/A
Description: A running total of counts of session startup durations divided into buckets based on the startup duration.

pwb_session_startup_duration_seconds_count
Metric Type: sum
Aggregation Strategy: N/A
Description: A running total of the number of sessions launched in Workbench.

pwb_session_startup_duration_seconds_sum
Metric Type: sum
Aggregation Strategy: N/A
Description: A running total of all session startup times in Workbench.

pwb_users
Metric Type: non-numeric
Aggregation Strategy: Deduplication
Description: Each row of this metric records data about a single user in Workbench.

up
Metric Type: gauge
Aggregation Strategy: N/A
Description: An indicator of whether the Chronicle agent was able to scrape Prometheus metrics from Workbench. A value of 1 represents success.
Metric Types
Each metric listed above includes one of the following types:
gauge
Definition: The value records a measurement at a point in time and can go up or down.
Example: The number of currently licensed active users on a Connect installation.

sum
Definition: The total count of events over a given observation window, which can only go up.
Example: The total number of sessions launched on Workbench.

histogram
Definition: A set of counts of events over a given observation window; each count corresponds to the number of events within a specified limit.
Example: A set of counts of session startup times grouped by startup duration (see the detailed example below).

non-numeric
Definition: These metrics don't have a numeric value, but are a collection of observed data at a point in time.
Example: The current version of Connect running on a given host.
Histogram example
Histogram values are grouped into sets called “buckets”. Each bucket has a duration threshold called a “limit”, and the value for a given limit indicates how many sessions started up in a duration less than or equal to that limit, and greater than the next smallest limit.
For example, if Workbench reported these 5 session startup durations:
8 seconds
3 seconds
42 seconds
4 seconds
325 seconds
The stored histogram bucket values would look like this:
value    limit
0        0.0
0        1.0
2        5.0
1        10.0
0        30.0
1        60.0
0        300.0
1        Infinity
The row with limit 5.0 reports a count of 2 as its value (representing the 3 and 4 second durations), the row with limit 10.0 reports a count of 1 (the 8 second duration), and so on.
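To make the bucketing concrete, here is a small Python sketch (not Chronicle's implementation) that reproduces the counts in the table above from the five example durations; the variable names are illustrative only:

import math

# Bucket limits as shown in the table above (seconds).
limits = [0.0, 1.0, 5.0, 10.0, 30.0, 60.0, 300.0, math.inf]

# The five example session startup durations (seconds).
durations = [8, 3, 42, 4, 325]

# Count how many durations fall into each bucket: greater than the
# previous limit and less than or equal to the current limit.
counts = []
previous = -math.inf
for limit in limits:
    counts.append(sum(1 for d in durations if previous < d <= limit))
    previous = limit

for count, limit in zip(counts, limits):
    print(f"value={count}  limit={limit}")
# Prints values 0, 0, 2, 1, 0, 1, 0, 1, matching the table.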
Aggregation Strategies
Based on the type of metric represented in the data, metrics are aggregated according to one of the following strategies. Each metric listed above indicates which aggregation strategy is applied to its data, or N/A if that metric is not currently aggregated.
The examples below reflect an aggregation of this data series:
Timestamp    Value
01:00        12
01:01        12
01:02        12
01:03        13
01:04        15
01:05        15
01:06        15
01:07        15
01:08        16
Deduplication Aggregation: with this approach, an observation is retained only if it is the first or last observation in a consecutive run of identical values. With the example dataset above, this approach would aggregate the series to the following (a sketch of this logic appears after the table):
Timestamp    Value
01:00        12
01:02        12
01:03        13
01:04        15
01:07        15
01:08        16
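For illustration, a minimal pandas sketch of this deduplication rule applied to the example series; it is not Chronicle's implementation, just a way to see the rule in action:

import pandas as pd

# The example series from above as (timestamp, value) pairs.
series = pd.DataFrame({
    "timestamp": ["01:00", "01:01", "01:02", "01:03", "01:04",
                  "01:05", "01:06", "01:07", "01:08"],
    "value": [12, 12, 12, 13, 15, 15, 15, 15, 16],
})

# Keep a row only if it is the first or last observation in a
# consecutive run of identical values.
run = (series["value"] != series["value"].shift()).cumsum()
first = run != run.shift()
last = run != run.shift(-1)
print(series[first | last])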
Delta Aggregation: with this approach, only the difference between consecutive values is stored, and a value is retained only if that difference is not 0; the first observation is kept as a baseline. With the example dataset above, this approach would aggregate the series to the following (a sketch of this logic appears after the table):
Timestamp    Value
01:00        0
01:03        1
01:04        2
01:08        1
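A comparable pandas sketch of the delta rule, again only an illustration of the behavior shown in the table:

import pandas as pd

# The same example series used above.
series = pd.DataFrame({
    "timestamp": ["01:00", "01:01", "01:02", "01:03", "01:04",
                  "01:05", "01:06", "01:07", "01:08"],
    "value": [12, 12, 12, 13, 15, 15, 15, 15, 16],
})

# Store the difference between consecutive values, then keep the first
# row (the baseline) plus any row whose difference is non-zero.
delta = series.copy()
delta["value"] = series["value"].diff().fillna(0).astype(int)
keep = (delta.index == delta.index[0]) | (delta["value"] != 0)
print(delta[keep])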
Full Retention Aggregation: with this approach, all data is retained. The difference between this approach and metrics with no defined aggregation strategy (N/A) is that full retention still combines the hourly data and stores it in the corresponding daily folders.