1. Setup - API Configuration and Connection Test

Published

March 13, 2026

Step 1: Get an API key

  1. Go to Google AI Studio and sign in with your Google account
  2. Click “Get API key” in the sidebar or top-right
  3. Click “Create API key” → “Create API key in new project”
  4. Copy the generated key (format: AIzaSy..., ~39 characters)
Note

Unlike some services, you can view your API key again later in AI Studio. However, never commit it to GitHub or include it in paper replication code.

Free tier vs Paid tier

  • Free tier: Sufficient for a 10-video pilot. However, your data is used to improve Google’s models.
  • Paid tier: If data exposure is a concern, you can switch to paid starting at $5. Your data will not be used for model training.

Step 2: Environment setup

Store the API key as an environment variable

Never hardcode the key in your scripts.

# Option A: Current terminal session only (for testing)
export GOOGLE_API_KEY="AIzaSy..."

# Option B: Persistent (recommended)
echo 'export GOOGLE_API_KEY="AIzaSy..."' >> ~/.zshrc
source ~/.zshrc

# Verify
echo $GOOGLE_API_KEY
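
Note that `echo $GOOGLE_API_KEY` prints the full key to the terminal, which can end up in screen shares or shell logs. As a hedged alternative, the sketch below confirms the variable is set while showing only a short prefix and the key length. The helper names `mask_key` and `describe_api_key` are made up for this illustration and are not part of any Google library.

```python
import os


def mask_key(key, visible=6):
    """Return a display-safe form of a secret: a short prefix plus its length."""
    return f"{key[:visible]}... ({len(key)} chars)"


def describe_api_key(env_var="GOOGLE_API_KEY"):
    """Report whether the environment variable is set, without revealing the key."""
    key = os.environ.get(env_var, "")
    if not key:
        return f"{env_var} is not set"
    return f"{env_var} is set: {mask_key(key)}"


print(describe_api_key())
```

A key copied from AI Studio should show the `AIzaSy` prefix and a length of roughly 39 characters, matching the format described in Step 1.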

Install Python packages

pip install google-genai
pip install umap-learn hdbscan
pip install seaborn scikit-learn pandas numpy
# Check ffmpeg (needed for audio extraction)
ffmpeg -version
# If missing: brew install ffmpeg
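
To confirm the installation in one pass, a small check along these lines may help. It is a sketch using only the standard library; the helper name `missing_modules` is hypothetical. It relies on the import names of the packages installed above (`umap-learn` imports as `umap`, `scikit-learn` as `sklearn`, `google-genai` as `google.genai`).

```python
import importlib.util
import shutil


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "google") is absent or malformed.
            found = False
        if not found:
            missing.append(name)
    return missing


REQUIRED = ["google.genai", "umap", "hdbscan", "seaborn", "sklearn", "pandas", "numpy"]

print("Missing modules:", missing_modules(REQUIRED) or "none")
print("ffmpeg on PATH:", shutil.which("ffmpeg") is not None)
```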

Step 3: Connection test (text embedding)

Start with the simplest test. Text is much cheaper and faster than video.

import os
from google import genai
from google.genai import types
import numpy as np

# Initialize client
client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])

# Text embedding test
result = client.models.embed_content(
    model="gemini-embedding-2-preview",
    contents="Political communication strategies on YouTube Shorts"
)

vec = np.array(result.embeddings[0].values)
print(f"Dimensions: {vec.shape}")          # (3072,)
print(f"First 5 values: {vec[:5]}")
print(f"Vector norm: {np.linalg.norm(vec):.4f}")

If this runs without error, the API connection is working.
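
Beyond "no error", it can be worth checking that the returned vector looks sane. The sketch below is one possible check, written with the standard library only: it verifies the length against the 3072 dimensions shown in the comment above, rejects NaN or infinite values, and returns the Euclidean norm. The function name `check_embedding` is an assumption for illustration.

```python
import math


def check_embedding(values, expected_dim=3072):
    """Validate an embedding (a sequence of floats) and return its Euclidean norm."""
    if len(values) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(values)}")
    if not all(math.isfinite(v) for v in values):
        raise ValueError("embedding contains NaN or infinite values")
    return math.sqrt(sum(v * v for v in values))


# Hypothetical usage with the result from the connection test:
#   norm = check_embedding(result.embeddings[0].values)
```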

Important

Critical: how the contents parameter works
  • contents="string" or contents=["a", "b", "c"] → returns separate embeddings for each item
  • contents=types.Content(parts=[part1, part2, part3]) → returns one unified embedding

Since we need a single vector per Short, we must use types.Content(parts=[...]). The Google blog example uses the list format, which would produce separate embeddings. Do not copy it as-is.
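
The difference between the two call shapes can be sketched as follows. This is a minimal illustration with text parts only, assuming the `google-genai` package from Step 2 and the model name used in the connection test. The helpers `build_unified_content`, `pick_single_vector`, and `embed_as_one` are hypothetical names, and how video or audio parts are attached is not shown here.

```python
def build_unified_content(texts):
    """Wrap several text snippets in ONE Content so they yield one embedding."""
    from google.genai import types  # lazy import; requires google-genai

    parts = [types.Part.from_text(text=t) for t in texts]
    return types.Content(parts=parts)


def pick_single_vector(vectors):
    """Return the only vector in `vectors`, or fail loudly if there are several."""
    if len(vectors) != 1:
        raise ValueError(
            f"expected 1 embedding, got {len(vectors)}; "
            "was a plain list passed as contents instead of types.Content?"
        )
    return list(vectors[0])


def embed_as_one(client, texts, model="gemini-embedding-2-preview"):
    """Embed all `texts` together and return a single vector."""
    result = client.models.embed_content(
        model=model,
        contents=build_unified_content(texts),
    )
    return pick_single_vector([e.values for e in result.embeddings])


# Hypothetical usage, with `client` created as in the connection test:
#   vec = embed_as_one(client, ["title text", "transcript text"])
```

The explicit length check is a design choice: if a list slips through as `contents`, the code stops with a clear message instead of silently keeping only the first embedding.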

Troubleshooting

Error                    | Cause                      | Fix
DefaultCredentialsError  | API key not set            | export GOOGLE_API_KEY=...
Resource exhausted       | Free tier limit hit        | Wait 24h or switch to Paid
Invalid MIME type        | Extension/MIME mismatch    | Use video/mp4, audio/mpeg
PROCESSING timeout       | File API processing delay  | Increase time.sleep() to 5s
File not found           | 48h auto-deletion          | Re-upload the file
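
For the PROCESSING timeout row, a polling loop with a fixed sleep interval and an overall time limit is one way to structure the wait. The sketch below is generic: `wait_until_active` is a hypothetical helper that takes any callable returning the current state as a string, so it can be tried without network access. The usage in the trailing comments assumes the `client.files.upload` and `client.files.get` calls of the `google-genai` SDK and a placeholder file name.

```python
import time


def wait_until_active(fetch_state, interval_s=5.0, timeout_s=300.0, sleep=time.sleep):
    """Poll `fetch_state()` until it stops returning "PROCESSING".

    Returns the final state string, or raises TimeoutError once roughly
    `timeout_s` seconds have been spent sleeping.
    """
    waited = 0.0
    while True:
        state = fetch_state()
        if state != "PROCESSING":
            return state
        if waited >= timeout_s:
            raise TimeoutError(f"still PROCESSING after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Hypothetical usage (assumes a configured `client` and a local file "short.mp4"):
#   uploaded = client.files.upload(file="short.mp4")
#   state = wait_until_active(lambda: client.files.get(name=uploaded.name).state.name)
#   if state != "ACTIVE":
#       raise RuntimeError(f"upload ended in state {state}")
```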