Secrets access in Google Colab Notebooks

Introduction

I have become an avid fan of Google Colab notebooks for many reasons, many of which are outlined here. I use them for learning, prototyping, and showcasing projects, especially as I do more data science and generative AI work.

To access third-party APIs such as OpenAI, which hosts the GPT-3.5 and GPT-4 models, notebook users have to supply credentials. Setting these credentials directly in the notebook opens a serious security hole, especially if the notebook is then shared for educational purposes. A slightly more secure alternative is to put the credentials in an environment file, .env, which is read at runtime to retrieve them.
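
As a rough illustration of the .env approach, the sketch below reads KEY=VALUE lines from a file into the process environment using only the standard library. The function name load_env_file, the variable EXAMPLE_API_KEY, and the file contents are assumptions made for this example; in practice a package such as python-dotenv is commonly used for this.

```python
import os
import tempfile

def load_env_file(path):
    """Read KEY=VALUE lines from path into os.environ and return them as a dict."""
    loaded = {}
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            # Skip blank lines, comments, and anything without an '='
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key = key.strip()
            value = value.strip().strip('"').strip("'")
            os.environ[key] = value
            loaded[key] = value
    return loaded

# Demo with a throwaway file; the key is a dummy value, not a real credential
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as tmp:
    tmp.write("# example only\nEXAMPLE_API_KEY=dummy-key\n")

env_values = load_env_file(tmp.name)
os.remove(tmp.name)
```

The file still sits next to the notebook in plain text, which is why this approach is only slightly more secure.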

A much more secure approach is to store secrets in a dedicated secrets manager and have the notebook user authenticate to that service to gain access to their secrets. Given that Google Colab is hosted on Google Cloud, GCP Secret Manager is a natural fit.

Demonstration

Suppose we need to access an API that provides real-time stock market data. The API requires an API key to be stored in the environment variable STOCKSERVICE_API_KEY.

Here are the steps:

Create Secrets in GCP Secret Manager

We first create the secret STOCKSERVICE_API_KEY in GCP Secret Manager. Instructions to do so are provided here.

Suppose we created the secret at this path:

projects/999999999999/secrets/STOCKSERVICE_API_KEY

where the GCP project id is 999999999999
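
When a secret is read, Secret Manager addresses a specific version of it, so the path above gets a /versions/... suffix; latest is an alias for the most recently created version. A small helper makes that explicit. The name secret_version_path is an assumption for this sketch, not part of the library.

```python
def secret_version_path(project_id, secret_name, version="latest"):
    """Build the resource name of one version of a secret."""
    return f"projects/{project_id}/secrets/{secret_name}/versions/{version}"

path = secret_version_path("999999999999", "STOCKSERVICE_API_KEY")
print(path)
```

Passing a version number as a string instead of "latest" would pin the notebook to one specific version of the secret.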

Install needed libraries in the Google Colab notebook

We need the google-cloud-secret-manager Python package to be available.

Run

!pip install google-cloud-secret-manager

in our notebook.

Import the secretmanager module

from google.cloud import secretmanager

Define a function to obtain secrets from Secret Manager

def get_gcp_secret(project_id, secret_name):
    """Return the latest version of a secret from GCP Secret Manager as a string."""
    client = secretmanager.SecretManagerServiceClient()

    # Full resource name of the latest version of the secret
    resource_str = f"projects/{project_id}/secrets/"
    resource_str += f"{secret_name}/versions/latest"

    # Fetch the secret version and decode its payload (bytes) to text
    response = client.access_secret_version(name=resource_str)
    secret = response.payload.data.decode('utf-8')
    return secret
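
The payload comes back as bytes, which is why the function decodes it. To check that logic without contacting GCP, the same steps can be run against a stand-in client. FakeSecretClient and read_secret below are hypothetical names for this sketch; only the access_secret_version(name=...) call and the response.payload.data shape mirror the function above.

```python
from types import SimpleNamespace

class FakeSecretClient:
    """Hypothetical stand-in that mimics the one client method used above."""

    def __init__(self, secrets):
        self._secrets = secrets  # maps resource name -> bytes payload

    def access_secret_version(self, name):
        payload = SimpleNamespace(data=self._secrets[name])
        return SimpleNamespace(payload=payload)

def read_secret(client, project_id, secret_name):
    """Same steps as get_gcp_secret, but with the client passed in."""
    resource_str = f"projects/{project_id}/secrets/{secret_name}/versions/latest"
    response = client.access_secret_version(name=resource_str)
    return response.payload.data.decode("utf-8")

# Dummy payload for illustration only, not a real credential
fake = FakeSecretClient({
    "projects/999999999999/secrets/STOCKSERVICE_API_KEY/versions/latest": b"dummy-key",
})
value = read_secret(fake, "999999999999", "STOCKSERVICE_API_KEY")
```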

Now authenticate to GCP

from google.colab import auth
auth.authenticate_user()

Note that this is an interactive step: the user is asked to allow the notebook to access their Google credentials, and must consent before access is granted.
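
Since google.colab exists only inside Colab, a shared notebook opened elsewhere fails at the import. One possible guard is sketched below; the function name authenticate_if_in_colab is an assumption, and outside Colab the Secret Manager client would have to rely on whatever Google credentials are configured in that environment.

```python
def authenticate_if_in_colab():
    """Run the Colab auth prompt when available; return whether it ran."""
    try:
        from google.colab import auth  # only importable inside Colab
    except ImportError:
        return False
    auth.authenticate_user()
    return True

in_colab = authenticate_if_in_colab()
```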

Call the get_gcp_secret function to retrieve the secret

proj_id = "999999999999"
ss_api_key = get_gcp_secret(proj_id, "STOCKSERVICE_API_KEY")

We can now use the secret to make calls to the API (stockapi here stands for the stock service's client library):

ss_client = stockapi.Client(api_key=ss_api_key)
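
If the client library reads the key from the STOCKSERVICE_API_KEY environment variable, as assumed at the start of the demonstration, the retrieved value can be placed in the notebook process's environment instead. This is a minimal sketch: export_secret is a hypothetical helper, and the dummy string stands in for the value returned by get_gcp_secret.

```python
import os

def export_secret(name, value):
    """Put a secret into this process's environment for libraries that read os.environ."""
    if not value:
        raise ValueError(f"refusing to export an empty value for {name}")
    os.environ[name] = value

# Placeholder for the value returned by get_gcp_secret (not a real key)
ss_api_key = "dummy-key-for-illustration"
export_secret("STOCKSERVICE_API_KEY", ss_api_key)
```

The variable lives only in the running notebook process, so nothing is written to the notebook file as long as the value is never printed in a cell output.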

Reference Links:

StackOverflow – How to hide secret keys in Google Colaboratory..