Using Chronicle data
Chronicle stores the data it produces in parquet files. The Reports included with Chronicle are the easiest way to access data for most users. If you want to enhance the reports, write your own reports, or use this data for other purposes, this section describes how to access the data that Chronicle stores. You can also reference the code that is in the report QMD files.
Data Directory Structure
The Chronicle data directory is organized into a few subdirectories:
/var/lib/posit-chronicle/data
    /private
    /hourly
        /v1
            /<metric-name>
    /daily
        /v1
            /<metric-name>
The private directory contains transient data. This data is short-lived and should not be accessed by users.
Every hour, the private data is processed and stored in the hourly directory. This data is minimally processed and relatively high volume. It includes "duplicate" values, where a metric does not change over a period of time. This data can be used for custom reporting, but reports must query it efficiently due to the volume of data.
Every day, the hourly data is further processed and aggregated into the daily directory. This processing eliminates duplicate values and significantly reduces the data volume. The specific nature of this aggregation varies by metric. The aggregation strategies are described below. The daily data is used by Chronicle reports, and it can also be used for custom reporting.
The structure within hourly and daily is identical. Each contains one or more top-level vN subdirectories to delineate different versions of Chronicle's internal data schema for each metric. Individual metrics are stored under the appropriate version directory. Within each metric directory, data is organized by the date/time of when it was gathered.
The following is a complete example. Note that daily data is stored in one directory per day, while hourly data is stored in one directory per hour.
├── daily
│   └── v1
│       ├── connect_content
│       │   └── 2024
│       │       └── 12
│       │           ├── 01
│       │           │   └── connect_content.parquet
│       │           ├── 02
│       │           │   └── connect_content.parquet
│       │           └── ...
│       └── connect_license_active_users
│           └── 2024
│               └── 12
│                   ├── 01
│                   │   └── connect_license_active_users.parquet
│                   ├── 02
│                   │   └── connect_license_active_users.parquet
│                   └── ...
└── hourly
    └── v1
        ├── connect_content
        │   └── 2024
        │       └── 12
        │           ├── 01
        │           │   ├── 00
        │           │   │   └── connect_content.parquet
        │           │   ├── 01
        │           │   │   └── connect_content.parquet
        │           │   ├── ...
        │           │   └── 23
        │           │       └── connect_content.parquet
        │           └── 02
        │               ├── 00
        │               │   └── connect_content.parquet
        │               ├── 01
        │               │   └── connect_content.parquet
        │               ├── ...
        │               └── 23
        │                   └── connect_content.parquet
        └── connect_license_active_users
            └── 2024
                └── 12
                    ├── 01
                    └── ...
Reading parquet data
While parquet files are similar in concept to csv files, they are optimized for read/write performance and are therefore unreadable in most text editors without the help of plugins.
The RStudio IDE is a great place to read parquet files and to run the R and Python scripts below. See the Posit download page if you would like to install R and RStudio (open source Desktop edition).
If you are using VSCode, our team recommends the Parquet Explorer plugin to read and query parquet files directly in your editor.
Another common trick is to convert .parquet files into .csv files for easier viewing, leveraging Python and the pandas library:
Terminal
>>> import pandas as pd
>>> df = pd.read_parquet('filename.parquet')
>>> df.to_csv('filename.csv')
Using R
These scripts have been tested with R version 4.2.3. If you run into errors installing packages, you may need to upgrade your R version; in particular, arrow may fail to install if the R version is too old.
The examples in the Apache Arrow documentation on reading parquet files show how to read data stored locally or in S3 into an arrow table class.
Opening local Chronicle data with R
You can read the May, 2024 partition of Chronicle's parquet data into an arrow table with the following:
library(arrow)
# Collecting user data from a file
base_path <- '/var/lib/posit-chronicle/data/hourly'
users <- arrow::open_dataset(paste0(base_path, "/v1/users/2024/05/"))
Opening Chronicle data in S3 with R
You can read parquet contents for May 2024 from an S3 bucket into an arrow table with the following:
# Imports
library(arrow)
library(paws)
library(urltools)

# Set s3 bucket ----
s3_bucket <- "s3://{{YOUR_BUCKET_NAME}}"
svc <- s3(config = list(region = "us-east-2"))
bucket_str <- svc$list_objects(Bucket = urltools::domain(s3_bucket))

# Collecting user data
users_bucket <- paste0(s3_bucket, "/hourly/v1/users/2024/05")
users <- open_dataset(users_bucket,
                      hive_style = FALSE,
                      format = "parquet")
Querying Chronicle data with R
Once you have run one of the above to bring your users parquet data into an arrow table, you can begin querying it:
library(arrow)
library(tidyverse)
# Viewing user data
users_head <- head(users, 5) |>
  collect()
print(users_head)
Metrics Generated by Chronicle
Metrics are gathered and processed on a scheduled basis. This means that you may not see metrics files immediately when first starting Chronicle. It also means that there is a delay before the latest data shows up in the refined metrics files.
By default, the agent retrieves metrics data once every 60 seconds. The metrics data is processed into refined metrics once an hour. This process happens shortly after the top of the hour. The exact timing is not entirely predictable due to processing delays, but the refinement process typically completes by 15 minutes after the top of the hour.
Approaches to aggregation
Based on the type of metric represented in the data, metrics are aggregated according to one of the following strategies. Each individual metric type listed below includes an indication of which aggregation approach is employed to aggregate its data, or N/A if that metric is not currently aggregated.
The examples below reflect an aggregation of this data series:
Timestamp | Value |
---|---|
01:00 | 12 |
01:01 | 12 |
01:02 | 12 |
01:03 | 13 |
01:04 | 15 |
01:05 | 15 |
01:06 | 15 |
01:07 | 15 |
01:08 | 16 |
- Deduplication Aggregation: with this approach, a value is retained if it is either the first or last observation in a consecutive run of that value. With the example dataset above, this approach would aggregate the series to:
Timestamp | Value |
---|---|
01:00 | 12 |
01:02 | 12 |
01:03 | 13 |
01:04 | 15 |
01:07 | 15 |
01:08 | 16 |
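The deduplication rule can be sketched in a few lines of Python (an illustration of the strategy, not Chronicle's actual code): a row survives if it is the first or last observation in a consecutive run of equal values.

```python
def deduplicate(series):
    """Keep a (timestamp, value) pair only if it is the first or last
    observation in a consecutive run of that value."""
    kept = []
    for i, (ts, value) in enumerate(series):
        first = i == 0 or series[i - 1][1] != value
        last = i == len(series) - 1 or series[i + 1][1] != value
        if first or last:
            kept.append((ts, value))
    return kept

# The example data series from above.
series = [
    ("01:00", 12), ("01:01", 12), ("01:02", 12),
    ("01:03", 13),
    ("01:04", 15), ("01:05", 15), ("01:06", 15), ("01:07", 15),
    ("01:08", 16),
]
print(deduplicate(series))
# [('01:00', 12), ('01:02', 12), ('01:03', 13),
#  ('01:04', 15), ('01:07', 15), ('01:08', 16)]
```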
- Delta Aggregation: with this approach, only the difference between consecutive values is considered, and a value is retained only if the difference is not 0. With the example dataset above, this approach would aggregate the series to:
Timestamp | Value |
---|---|
01:00 | 0 |
01:03 | 1 |
01:04 | 2 |
01:08 | 1 |
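A sketch of the delta rule in Python (again an illustration, not Chronicle's code). Note that, as in the table above, the first observation is kept as a baseline with a delta of 0; after that, a row is kept only when its value differs from the previous one.

```python
def delta_aggregate(series):
    """Reduce (timestamp, value) pairs to the points where the value
    changes, storing the size of the change; the first observation is
    kept as a baseline with a delta of 0."""
    kept = []
    for i, (ts, value) in enumerate(series):
        if i == 0:
            kept.append((ts, 0))
        elif value != series[i - 1][1]:
            kept.append((ts, value - series[i - 1][1]))
    return kept

# The example data series from above.
series = [
    ("01:00", 12), ("01:01", 12), ("01:02", 12),
    ("01:03", 13),
    ("01:04", 15), ("01:05", 15), ("01:06", 15), ("01:07", 15),
    ("01:08", 16),
]
print(delta_aggregate(series))
# [('01:00', 0), ('01:03', 1), ('01:04', 2), ('01:08', 1)]
```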
Available metrics
The following is a non-exhaustive list of the product metrics Chronicle produces from Posit Connect, Posit Package Manager, and Posit Workbench. These metrics are stored in separate subfolders of the configured storage location, which is /var/lib/posit-chronicle/data by default.
All metrics files include the following columns:
Name | Description |
---|---|
timestamp | The time in UTC when the observation was recorded by the Chronicle agent. |
type | The metric type (gauge, sum, etc.). |
cluster | Reserved for future use. This column is always empty. |
environment | A user-defined environment label set via the agent configuration. See the Advanced Agent Configuration appendix for setup instructions. |
service | The source of the metric. One of connect, package-manager, or workbench. |
host | The host name where the Chronicle agent that reported the metric observation is running. |
os | Detailed operating system information for the host on which the Chronicle agent that reported this observation is running. |
Refined metrics
In addition to the columns described above, each refined metric includes a column called value, which contains the value observed regardless of the underlying numeric type.
Each of these refined metrics is stored in a separate subfolder named after the refined metric. For example, the data related to the connect_content_hits_total refined metric is stored in the v1/connect_content_hits_total subfolder of the configured Chronicle storage location.
connect_build_info
Build information for Connect. NOTE: The value for this metric is always 1.
- Subfolder:
v1/connect_build_info
- Metric type: gauge
- Requirements: A valid administrator Connect API key.
- Aggregation strategy: Deduplication Aggregation
- Additional columns:
  - version: the current version of Connect
  - build: the version with the build commit hash appended
connect_content
The current number of content items published in Connect.
- Subfolder:
v1/connect_content
- Metric type: gauge
- Requirements: Connect 2024.02.0 or later with metrics enabled.
- Aggregation strategy: Deduplication Aggregation
- Additional columns:
  - content_type: the content type of the item visited
connect_content_app_sessions_current
The current number of active user sessions on a given piece of Shiny content.
- Subfolder:
v1/connect_content_app_sessions_current
- Metric type: gauge
- Requirements: Connect 2024.02.0 or later with metrics enabled; some columns (annotated with * below) also require a valid administrator Connect API key.
- Aggregation strategy: Deduplication Aggregation
- Additional columns:
  - content_id: the internal ID of the content item visited
  - user_id: the internal ID of the user who visited the content item
  - content_name*: the internal name of the content item visited
  - content_title*: the user-visible title of the content item visited
  - content_type*: the content type of the item visited
  - user_name*: the username of the user who visited the content item
connect_content_hits_total
The running total of user visits to a specific piece of content.
- Subfolder:
v1/connect_content_hits_total
- Metric type: sum
- Requirements: Connect 2024.02.0 or later with metrics enabled; some columns (annotated with * below) also require a valid administrator Connect API key.
- Aggregation strategy: Delta Aggregation
- Additional columns:
  - content_id: the internal ID of the content item visited
  - user_id: the internal ID of the user who visited the content item
  - content_name*: the internal name of the content item visited
  - content_title*: the user-visible title of the content item visited
  - content_type*: the content type of the item visited
  - user_name*: the username of the user who visited the content item
connect_installed_versions_python
A count of the versions of Python which are currently installed.
- Subfolder:
v1/connect_installed_versions_python
- Metric type: gauge
- Requirements: A valid administrator Connect API key.
- Aggregation strategy: Deduplication Aggregation
- Additional columns:
  - versions: a list of the versions which are installed.
connect_installed_versions_r
A count of the versions of R which are currently installed.
- Subfolder:
v1/connect_installed_versions_r
- Metric type: gauge
- Requirements: A valid administrator Connect API key.
- Aggregation strategy: Deduplication Aggregation
- Additional columns:
  - versions: a list of the versions which are installed.
connect_license_active_users
The current number of users consuming license seats in Connect.
- Subfolder:
v1/connect_license_active_users
- Metric type: gauge
- Requirements: A valid administrator Connect API key.
- Aggregation strategy: Deduplication Aggregation
- Additional columns: None
connect_license_user_seats
The total number of licensed seats allowed in Connect.
- Subfolder:
v1/connect_license_user_seats
- Metric type: gauge
- Requirements: A valid administrator Connect API key.
- Aggregation strategy: Deduplication Aggregation
- Additional columns: None
connect_users
A metric used to capture a list of users in Connect. The value of this metric is always 1.
- Subfolder:
v1/connect_users
- Metric type: gauge
- Requirements: A valid administrator Connect API key.
- Aggregation strategy: Deduplication Aggregation
- Additional columns:
  - id: The ID of the user.
  - username: The username of the user (the name they use when logging in).
  - email: The email address of the user.
  - first_name: The first name of the user.
  - last_name: The last name of the user.
  - role: The role of the user (e.g., publisher, viewer).
  - created_at: The timestamp when the user was created.
  - updated_at: The timestamp when the user was most recently updated.
  - last_active_at: The timestamp when the user was most recently active (logged in) in Posit Connect.
pwb_license_active_users
The current number of users consuming license seats in Workbench.
- Subfolder:
v1/pwb_license_active_users
- Metric type: gauge
- Requirements: Workbench 2024.04.0 or later with metrics enabled.
- Aggregation strategy: Deduplication Aggregation
- Additional columns: None
pwb_license_user_seats
The total number of licensed seats allowed in Workbench.
- Subfolder:
v1/pwb_license_user_seats
- Metric type: gauge
- Requirements: Workbench 2024.04.0 or later with metrics enabled.
- Aggregation strategy: Deduplication Aggregation
- Additional columns: None
pwb_build_info
Build information for RStudio Server/Workbench. NOTE: The value for this metric is always 1.
- Subfolder:
v1/pwb_build_info
- Metric type: gauge
- Requirements: Workbench 2024.04.0 or later with metrics enabled.
- Aggregation strategy: Deduplication Aggregation
- Additional columns:
  - version: the current version of Workbench
  - release_name: the release name of the Workbench version
pwb_session_startup_duration_seconds_bucket
A running total of counts of session startup durations. These counts are divided into buckets based on the startup duration. Each bucket has a duration threshold called a “limit”, and the value for a given limit indicates how many sessions started up in a duration less than or equal to that limit, and greater than the next smallest limit.
For example, if Workbench reported these 5 session startup durations:
- 8 seconds
- 3 seconds
- 42 seconds
- 4 seconds
- 325 seconds
The stored histogram bucket values would look like this:
value | limit |
---|---|
0 | 0.0 |
0 | 1.0 |
2 | 5.0 |
1 | 10.0 |
0 | 30.0 |
1 | 60.0 |
0 | 300.0 |
1 | Infinity |
The row with limit 5.0 reports a count of 2 as its value (representing the 3 and 4 second durations), the row with limit 10.0 reports a count of 1 (the 8 second duration), and so on.
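This bucketing can be reproduced with a short Python sketch (an illustration of the histogram semantics described above, not Chronicle's implementation): each duration is counted in the smallest bucket whose limit is greater than or equal to it.

```python
import math
from bisect import bisect_left

# Bucket limits from the example table (upper bound of each bucket).
limits = [0.0, 1.0, 5.0, 10.0, 30.0, 60.0, 300.0, math.inf]
# The five example session startup durations, in seconds.
durations = [8, 3, 42, 4, 325]

# Count each duration in the smallest bucket whose limit is >= it.
counts = [0] * len(limits)
for d in durations:
    counts[bisect_left(limits, d)] += 1

for value, limit in zip(counts, limits):
    print(value, limit)
# counts == [0, 0, 2, 1, 0, 1, 0, 1], matching the table above
```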
- Subfolder:
v1/pwb_session_startup_duration_seconds_bucket
- Metric type: histogram
- Requirements: Workbench 2024.04.0 or later with metrics enabled.
- Aggregation strategy: N/A
- Additional columns:
  - limit: The time in seconds which is the upper bound of the associated bucket and the lower bound of the bucket with the next limit value.
  - session_type: The type of session (e.g., vscode, rstudio-pro, etc.) launched by the user.
pwb_session_startup_duration_seconds_count
A running total of the number of sessions launched in Workbench.
- Subfolder:
v1/pwb_session_startup_duration_seconds_count
- Metric type: sum
- Requirements: Workbench 2024.04.0 or later with metrics enabled.
- Aggregation strategy: N/A
- Additional columns:
  - session_type: The type of session (e.g., vscode, rstudio-pro, etc.) launched by the user.
pwb_session_startup_duration_seconds_sum
A running total of all session startup time in Workbench.
- Subfolder:
v1/pwb_session_startup_duration_seconds_sum
- Metric type: sum
- Requirements: Workbench 2024.04.0 or later with metrics enabled.
- Aggregation strategy: N/A
- Additional columns:
  - session_type: The type of session (e.g., vscode, rstudio-pro, etc.) launched by the user.
pwb_sessions_launched_total
A running total of all sessions launched in Workbench.
- Subfolder:
v1/pwb_sessions_launched_total
- Metric type: sum
- Requirements: Workbench 2024.04.0 or later with metrics enabled.
- Aggregation strategy: Delta Aggregation
- Additional columns:
  - session_type: The type of session (e.g., vscode, rstudio-pro, etc.) launched by the user.
pwb_jobs_launched_total
A running total of all jobs launched in Workbench.
- Subfolder:
v1/pwb_jobs_launched_total
- Metric type: sum
- Requirements: Workbench 2024.09.0 or later with metrics enabled.
- Aggregation strategy: Delta Aggregation
- Additional columns:
  - job_type: The type of job (e.g., r) launched by the user.
pwb_users
A list of all users in Workbench. NOTE: The value for this metric is always 1.
- Subfolder:
v1/pwb_users
- Metric type: gauge
- Requirements: Workbench 2024.12.0 or later with a valid administrator API key.
- Aggregation strategy: Deduplication Aggregation
- Additional columns:
  - id: The UID of the Workbench user.
  - guid: The GUID of the Workbench user.
  - username: The username of the Workbench user.
  - email: The email address of the Workbench user.
  - status: The status of the Workbench user (Active, Inactive).
  - is_admin: True if the Workbench user is an administrator.
  - is_super_admin: True if the Workbench user is an administrator superuser.
  - role: The role of the Workbench user (User, Administrator, Superuser).
  - last_active_at: Timestamp of the Workbench user's last sign-in.
  - created_at: Timestamp when the Workbench user was created.