Git-Backed Content
Posit Connect publishers can deploy content to Connect in a variety of ways. One mechanism is to create content on Connect directly from a Git repository. The User Guide contains instructions on the steps taken by publishers.
Content deployed from a Git repository is indicated both in the content listing with the text “from Git” and in the content’s Settings -> Info with a Git-specific section containing metadata about the source repository.
A given source repository can be associated with any number of applications.
Requirements
Git >=1.7.12 is required for Git-backed content.
The version of Git installed in the same environment as Posit Connect is detected at process startup time, and is subject to the PATH set in the process environment. Alternatively, the path to the git executable can be set directly via Git.Executable.
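As a minimal sketch of the second option, the fragment below pins the git executable explicitly instead of relying on PATH. The path /usr/local/bin/git is a hypothetical location chosen for illustration; substitute wherever a suitable Git build lives on your server.

```ini
; /etc/rstudio-connect/rstudio-connect.gcfg
[Git]
; Hypothetical path, shown for illustration only.
Executable = /usr/local/bin/git
```

Pinning the executable can be useful when the PATH seen by the Connect process resolves to a Git older than the one you intend to use.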
If the version of Git cannot be detected, or if the version is lower than the required minimum of 1.7.12, an error is logged and startup is aborted. See Disabling to learn how to disable Git-backed content support.
Publishing
Git-backed content can be associated with either public or private Git repositories, and can use https:// (recommended) or http:// remotes.
The Applications.RunAs user runs all git commands necessary for retrieving repository data.
Currently, Posit Connect Git-backed publishing does not support Git Large File Storage (LFS).
Public repositories
Publicly-accessible source repositories can be configured by any user with a publisher role.
Private repositories
Private source repositories can be accessed by setting GitCredential.Host, GitCredential.Username, and GitCredential.Password. This works with any authentication scheme based on HTTP basic auth, such as GitHub personal access tokens.
; /etc/rstudio-connect/rstudio-connect.gcfg
[GitCredential]
Host = github.com
Username = accountName
Password = <encrypted-string>
Protocol = https
The encrypted string value of GitCredential.Password should be generated via rscadmin.
You can support multiple hosts by including additional GitCredential sections. Each host can only have a single credential. When configuring multiple credentials, you must name each configuration section. If multiple GitCredential sections share a name, the last one is used.
; /etc/rstudio-connect/rstudio-connect.gcfg
[GitCredential "github"]
Host = github.com
Username = accountName
Password = <encrypted-string>
Protocol = https
[GitCredential "selfhosted"]
Host = git.mycompany.com
Username = accountName
Password = <encrypted-string>
Protocol = https
Connect can also use Git credentials for github.com, gitlab.com, and bitbucket.org to restore R packages from private Git repositories hosted on those services. See the section on using private R packages from private Git repositories for details.
Git credentials do not support private Python packages stored on GitHub.
Automatic fetch & deploy
Git-backed content differs from content that is directly published to Posit Connect in that indirect user actions, such as pushing a Git commit to the associated repository, can result in a new deployment. The decision to trigger a new deployment is made after a periodic fetch of the source repository, and a deployment only occurs if the specified branch and path prefix have changed since the last fetched commit.
Periodic Git fetches are automatically performed at ~15 minute intervals for each Git-backed application. This interval duration can be modified via the Git.PollingFrequency setting. To turn off automatic Git fetches entirely, set the value to 0.
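A minimal sketch of lengthening the fetch interval is shown below. The value 30m and its duration syntax are assumptions made for illustration; consult the configuration appendix for the exact format the setting accepts. Only the value 0, which disables automatic fetches, is taken from the text above.

```ini
; /etc/rstudio-connect/rstudio-connect.gcfg
[Git]
; Assumed duration syntax: fetch roughly every 30 minutes.
; Use 0 instead to turn off automatic fetches entirely.
PollingFrequency = 30m
```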
The same polling frequency interval and re-deployment logic are used for public and private repositories.
Operational considerations
While Git-backed content is being managed, various background processes can be spawned and data written to file storage. As a result, the number and type of processes running on the system can vary independently of users interacting with Posit Connect. The size and composition of file-based storage can likewise change over time without user interaction.
Repository data
All Git repository clones and cached metadata are stored relative to the Server.DataDir at {Server.DataDir}/git.
Automatic cleanup
An automatic cleanup runs at startup time and periodically during runtime. It removes any Git repository data that is no longer in use, including Git repository clones that are not associated with content and temporary clones fetched prior to first deploy.
Concurrency limits
The Git.Concurrency setting limits the number of concurrent parent processes that can be spawned for Git operations. The limit is imposed at the individual Connect process level, meaning that the total concurrency of a clustered deployment is the configured limit multiplied by the number of cluster members.
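For illustration, the fragment below caps Git operations at a hypothetical value of 4 parent processes per Connect process. The number is an assumption, not a recommended or default value.

```ini
; /etc/rstudio-connect/rstudio-connect.gcfg
[Git]
; Hypothetical limit, applied per Connect process.
Concurrency = 4
```

Because the limit is per process, a three-node cluster using this fragment could run up to 12 concurrent Git parent processes in total.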
Process Management
Git operations are run in a process sandbox using the configured supervisor command, if any. It is particularly important to ensure that the supervisor command does not write anything to standard output, as this interferes in unpredictable ways with Git operations. Supervisor scripts must write all informational messages to standard error rather than standard output.
Disabling
If Git-backed content is not desired, such as when the Git version requirement cannot be met, it can be disabled by setting Git.Enabled to false.
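In the configuration file, that setting would look like the following sketch, using the same file location as the earlier examples:

```ini
; /etc/rstudio-connect/rstudio-connect.gcfg
[Git]
; Turn off Git-backed content support.
Enabled = false
```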