
Introduction
I have become an avid fan of Google Colab notebooks for many reasons, many of which are outlined here. I use them for learning, prototyping, and showcasing projects, especially as I do more work in data science and generative AI.
To access third-party APIs such as OpenAI, which hosts the GPT-3.5 and GPT-4 models, notebook users have to supply credentials. Setting these credentials directly in the notebook opens a serious security hole, especially if the notebook is then shared for educational purposes. A slightly more secure alternative is to keep the credentials in an environment file, .env, which is loaded at runtime to retrieve them.
A much more secure approach is to store secrets in a third-party secrets manager and have the notebook user authenticate to that service to gain access to their secrets. Given that Google Colab is hosted on Google Cloud, GCP Secret Manager is a natural fit.
Demonstration
Suppose we need to access an API that serves real-time stock market data. The API requires an API key to be stored in the environment variable STOCKSERVICE_API_KEY.
Here are the steps:
Create Secrets in GCP Secret Manager
We first create the secret STOCKSERVICE_API_KEY in GCP Secret Manager. Instructions to do so are provided here.
Suppose we created the secret at this path:
projects/999999999999/secrets/STOCKSERVICE_API_KEY
where the GCP project id is 999999999999
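The path above names the secret itself; reading its value also requires a version suffix such as versions/latest. As a minimal sketch, the helper below assembles the full resource name from its parts. The name build_secret_version_name is hypothetical and not part of the client library.

```python
def build_secret_version_name(project_id: str, secret_name: str, version: str = "latest") -> str:
    """Build the resource name of one version of a secret (hypothetical helper)."""
    return f"projects/{project_id}/secrets/{secret_name}/versions/{version}"


# Prints: projects/999999999999/secrets/STOCKSERVICE_API_KEY/versions/latest
print(build_secret_version_name("999999999999", "STOCKSERVICE_API_KEY"))
```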
Install needed libraries in the Google Colab notebook
We need the Python package google-cloud-secret-manager to be available. Run the following in our notebook:
!pip install google-cloud-secret-manager
Import the secretmanager module
from google.cloud import secretmanager
Define a function to obtain secrets from Secret Manager
def get_gcp_secret(project_id, secret_name):
    client = secretmanager.SecretManagerServiceClient()
    # Build the resource name of the latest version of the secret
    resource_str = f"projects/{project_id}/secrets/"
    resource_str += f"{secret_name}/versions/latest"
    # Get api key as secret from GCP
    response = client.access_secret_version(name=resource_str)
    # The payload is returned as bytes, so decode it to a string
    secret = response.payload.data.decode('utf-8')
    return secret
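One possible variation on the function above, offered as a sketch rather than the canonical approach, is to accept an optional client argument so the lookup can be exercised without network access or GCP credentials. The names get_gcp_secret_with_client and FakeClient are hypothetical; the only library call assumed is access_secret_version(name=...), as used above.

```python
from types import SimpleNamespace


def get_gcp_secret_with_client(project_id, secret_name, client=None, version="latest"):
    """Fetch a secret value, creating a real client only when none is supplied."""
    if client is None:
        # Imported lazily so the function can be tested without the library installed.
        from google.cloud import secretmanager
        client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_name}/versions/{version}"
    response = client.access_secret_version(name=name)
    return response.payload.data.decode("utf-8")


class FakeClient:
    """Hypothetical stand-in that mimics the shape of the real response."""

    def __init__(self, secrets):
        self._secrets = secrets  # maps resource name -> bytes payload

    def access_secret_version(self, name):
        payload = SimpleNamespace(data=self._secrets[name])
        return SimpleNamespace(payload=payload)


fake = FakeClient({
    "projects/999999999999/secrets/STOCKSERVICE_API_KEY/versions/latest": b"dummy-key",
})
# Prints: dummy-key
print(get_gcp_secret_with_client("999999999999", "STOCKSERVICE_API_KEY", client=fake))
```

When no client is passed, the function behaves like the original and talks to GCP directly.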
Now authenticate to GCP
from google.colab import auth
auth.authenticate_user()
Note that this is an interactive step: the user is asked to allow the notebook to access their Google credentials, and must consent before access is granted.
Call the get_gcp_secret function to retrieve the secret
proj_id = "999999999999"
ss_api_key = get_gcp_secret(proj_id, "STOCKSERVICE_API_KEY")
We can now use our secret to make calls to the API:
ss_client = stockapi.Client(api_key=ss_api_key)
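Since the stock service is described as reading its key from the STOCKSERVICE_API_KEY environment variable, the retrieved value can also be exported for the current process. This is a minimal sketch, assuming the key has already been fetched; export_secret and the placeholder value are illustrative only.

```python
import os


def export_secret(var_name: str, value: str) -> None:
    """Expose a secret to libraries that read it from the environment."""
    if not value:
        raise ValueError(f"refusing to set {var_name} to an empty value")
    os.environ[var_name] = value


# Placeholder for illustration; in the notebook this would be ss_api_key.
export_secret("STOCKSERVICE_API_KEY", "dummy-key")
print("STOCKSERVICE_API_KEY" in os.environ)  # prints True
```

The variable is set only in the runtime's process environment, so the key itself never needs to appear in a notebook cell.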
Reference Links:
StackOverflow – How to hide secret keys in Google Colaboratory..