Google BigQuery Integration with Python
Problem
A Publisher wants to keep track of the latest information added to a public dataset in Google Cloud’s BigQuery service.
Solution
Using the Google OAuth integration configured by their Connect Administrator, the Publisher can pull the latest table information from BigQuery directly into their published content.
requirements.txt
numpy==1.26.4
streamlit
google-cloud-bigquery
pandas==1.3.4
posit-sdk==0.5.0
protobuf==5.28.3
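With the dependencies pinned above, a local environment can be prepared the usual way:

Terminal
pip install -r requirements.txt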
app.py
# -*- coding: utf-8 -*-
# mypy: ignore-errors
import os
import streamlit as st
from posit import connect
import pandas as pd
import google.oauth2.credentials as creds
from google.cloud import bigquery as bq
# initialize the Connect Python SDK client if running on Connect
if os.getenv("RSTUDIO_PRODUCT") == "CONNECT":
    client = connect.Client()

    # grab the user session token from the request headers
    user_session_token = st.context.headers.get("Posit-Connect-User-Session-Token")

    try:
        # fetch the access token using the session token
        access_token = client.oauth.get_credentials(user_session_token).get(
            "access_token"
        )
    except connect.errors.ClientError as e:
        if e.error_code == 219:
            st.write(
                "Your Google refresh token has expired. Please log out and log back in to the integration."
            )
            # stop rendering; the BigQuery calls below need a valid token
            st.stop()
        raise
else:
    # running locally: fall back to a token supplied via the environment
    access_token = os.getenv("GOOGLE_TOKEN")

tables = []
project = os.getenv("GOOGLE_PROJECT")
credentials = creds.Credentials(access_token)
dataset = os.getenv("BIGQUERY_DATASET")
bq_client = bq.Client(credentials=credentials, project=project)

# get tables from the provided dataset
for table in bq_client.list_tables(dataset):
    tables.append(table.full_table_id)

df = pd.DataFrame({"tables": tables})

st.write("I found the following tables in your provided dataset:")
st.write(df)
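Listing tables is only a starting point; the same authenticated client can also run queries against the dataset. The sketch below is a hypothetical extension of app.py, not part of the recipe: the table name my_table and the column created_at are placeholders you would replace with real names from your dataset.

# hypothetical extension: show the ten most recent rows of one table
# (`my_table` and `created_at` are placeholder names, not part of the recipe)
query = f"""
    SELECT *
    FROM `{project}.{dataset}.my_table`
    ORDER BY created_at DESC
    LIMIT 10
"""
recent_rows = bq_client.query(query).to_dataframe()
st.write(recent_rows)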
Running the app locally
Terminal
export CONNECT_SERVER=<connect-server>
export CONNECT_API_KEY=<connect-api-key>
BIGQUERY_DATASET=<bigquery-dataset> GOOGLE_PROJECT=<google-project> GOOGLE_TOKEN=<google-token> streamlit run app.py
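If you need an access token for local testing, one option (assuming the gcloud CLI is installed and authenticated against your project) is to mint a short-lived token and substitute it for <google-token> above:

Terminal
gcloud auth print-access-token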