Gemini is Google’s latest and most advanced large language model, with multimodal support for text, images, audio, and video. Integrating Gemini (the model family behind the assistant formerly known as Bard) into your internal applications and products can change how employees work with those applications and how customers use your products. Before you can call the Gemini models through the API, you need to create an API key. This guide walks through obtaining a Gemini API key, running a first test call, and the other factors to consider when building with the API.
Step 1: Sign in to Your Google Account
Before you start, it’s recommended to connect through a U.S. network node. Open the Google homepage and sign in using the button in the top right corner.
Signing into Google
Step 2: Access “Google AI Studio”
You can find the login page here. Then, click on the “Gemini API” tab or the “Learn More About Gemini API” button.
Alternatively, you can directly visit the Gemini API login page.
Accessing Gemini API page
Step 3: Click “Get API Key in Google AI Studio”
Click the central button on the page to obtain the API key.
Clicking on Gemini's API key button
Step 4: Review the Terms of Service
A pop-up window will appear, asking you to agree to Google API’s terms of service and Gemini API’s additional terms of service.
You may optionally subscribe to email notifications to receive the latest updates from Google AI and participate in specific research projects, though this is not required.
Gemini API terms of service message
Check the first box; other boxes are optional. Then, click continue.
Step 5: Create the API Key
Now, you can click “Create API Key.”
Where you can create a unique API key
Next, choose to create the API key in a new project or within an existing project.
Pop-up that asks you where to create the API key
After selecting an option, the API key will be generated automatically!
Remember to store this API key in a secure location to prevent unauthorized access.
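One common way to keep the key out of source code is to read it from an environment variable. The following is a minimal sketch, not the only approach: the variable name GEMINI_API_KEY and the helper names load_api_key and mask_key are assumptions made for this example.

```python
import os


def load_api_key(var_name="GEMINI_API_KEY", env=None):
    """Return the API key stored in an environment variable.

    GEMINI_API_KEY is a name assumed for this example; use whatever
    variable name your own deployment defines.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key


def mask_key(key, visible=4):
    """Hide all but the last few characters so the key is safer to log."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

With this in place, the key can be passed as genai.configure(api_key=load_api_key()) instead of being typed into the script.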
Sample Code for Text Generation API Call
# Setup
import google.generativeai as genai

genai.configure(api_key='xxx')  # Enter your API key here

# List the available models and the generation methods each one supports
for m in genai.list_models():
    print(m.name)
    print(m.supported_generation_methods)

# Generate content
model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content("Tell me who you are.")
print(response.text)  # Prints only the generated text
Sample response, shown in an OpenAI-style chat-completion JSON layout (print(response.text) itself prints only the text in the content field):
{
  "id": "chatcmpl-9a7620aa7def44329cc3f79d334d15b1",
  "model": "gemini-1.5-flash",
  "object": "chat.completion",
  "created": 1730879061,
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I am a large language model, trained by Google."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 17,
    "total_tokens": 29
  }
}
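If you receive a response as JSON in the layout of the sample above, the useful fields can be pulled out with the standard library. This is only a sketch under that assumption; summarize_response is an illustrative helper name, and the sample is copied from the article rather than from a live call.

```python
import json

# The sample response from the article, as a JSON string.
SAMPLE = """
{
  "id": "chatcmpl-9a7620aa7def44329cc3f79d334d15b1",
  "model": "gemini-1.5-flash",
  "object": "chat.completion",
  "created": 1730879061,
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I am a large language model, trained by Google."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 17, "total_tokens": 29}
}
"""


def summarize_response(raw):
    """Extract the reply text, finish reason, and token count from the JSON."""
    data = json.loads(raw)
    first = data["choices"][0]
    usage = data["usage"]
    return {
        "text": first["message"]["content"],
        "finish_reason": first["finish_reason"],
        # Recompute the total instead of trusting the reported field.
        "total_tokens": usage["prompt_tokens"] + usage["completion_tokens"],
    }


summary = summarize_response(SAMPLE)
print(summary["text"])
print(summary["total_tokens"])
```

Checking finish_reason before using the text is a cheap way to notice truncated or blocked replies.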
Other Considerations When Building with the Gemini API
Before building with the Gemini API, you should also understand the following aspects:
Pricing Plans
Gemini 1.5 Flash, Gemini 1.5 Pro, and Gemini 1.0 Pro each have their own pricing, with both a free tier and a pay-as-you-go tier.
The main differences between the tiers are the rate limits, the input and output pricing, whether context caching is available, and whether your inputs and outputs are used to improve the product.
Learn more about Gemini’s pricing plans here.
Rate Limits
As mentioned, rate limits vary by model and plan. Each model and plan is measured against several limits: requests per minute, tokens per minute, and requests per day.
Gemini 1.5 Flash Free Plan Rate Limits
Learn more about Gemini rate limits here.
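To stay under a requests-per-minute limit, you can throttle calls on the client side. The sketch below uses a simple sliding window. RequestRateLimiter is a hypothetical helper written for this article, not part of any Gemini SDK, and it tracks only request counts, not tokens per minute or daily totals.

```python
import time
from collections import deque


class RequestRateLimiter:
    """Allow at most `max_requests` in any `window_seconds` window."""

    def __init__(self, max_requests, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent = deque()  # timestamps of requests still inside the window

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0.0 if none)."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def acquire(self, sleep=time.sleep):
        """Block until a slot is free, then record the request."""
        while True:
            delay = self.wait_time()
            if delay <= 0:
                break
            sleep(delay)
        self._sent.append(self._clock())
```

A typical use is to create one limiter, for example RequestRateLimiter(max_requests=10), and call limiter.acquire() before each model.generate_content(...) call. The value 10 is a placeholder; use the limit published for your model and plan.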
Common Errors
While various errors may occur, the following are noteworthy:
- 400 INVALID_ARGUMENT: the request body is malformed, for example because of a typo or a missing required field
- 403 PERMISSION_DENIED: your API key does not have the required access to the model, or you tried to use a tuned model without proper authentication
- 404 NOT_FOUND: the server could not locate the requested resource
- 500 INTERNAL: an issue on Gemini’s side. You can try submitting the same request to another model, or wait and retry
Learn more about possible API errors when using Gemini here.
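The advice to wait and retry can be sketched as a small backoff wrapper. The example treats 500 and the 429 quota error covered in the FAQ as retryable and re-raises everything else. ApiError is a hypothetical stand-in for whatever exception your SDK or HTTP client raises, and call_with_retry is an illustrative name.

```python
import random
import time


class ApiError(Exception):
    """Hypothetical stand-in for an SDK error that carries an HTTP status."""

    def __init__(self, status, message=""):
        super().__init__(f"{status}: {message}")
        self.status = status


RETRYABLE = {429, 500}  # quota exceeded, internal error


def call_with_retry(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); retry transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except ApiError as err:
            # 400, 403 and 404 point at the request itself, so sending
            # the same call again would not help; re-raise immediately.
            if err.status not in RETRYABLE or attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... plus up to 10% jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 10))
```

The jitter spreads out retries from several clients so they do not all hit the service again at the same moment.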
SDK
To integrate faster and with less effort, you can use any of the Gemini API’s SDKs.
The SDKs cover several languages and platforms, including Python, Node.js, Go, Dart, Android, Swift, and Web; the API can also be called directly over REST.
Learn more about prerequisites, installation instructions, and more for each SDK here.
Available Capabilities
With the Gemini API, you can access a variety of LLM functions:
- Text Generation: From a text prompt, you can get summaries, descriptions of visual assets, translations into other languages, copy in a specific format (e.g., a blog post) and tone, and more.
- Visual: You can use an image or video as input and receive summaries or answers to specific questions about that image or video.
- Audio: By using a recording as input, you can get a summary, answers to specific questions, or a transcript of the audio file.
- Long Context: With the large context windows of Gemini models (e.g., 2 million tokens with Gemini 1.5 Pro), you can pass long-form text, video, and audio as input and get summaries, answers to specific questions, and more.
Safety Filters
The Gemini API always blocks “core harmful” content, but you can adjust specific safety filters (harassment, hate speech, sexually explicit content, and dangerous content) on individual requests to better suit your needs.
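As a rough sketch of per-request filter adjustment with the Python SDK used earlier: safety settings can be passed as a list of category and threshold pairs. The helper names build_safety_settings and generate_with_filters are illustrative, and the category and threshold strings are given as I understand the Gemini API to name them, so check them against the current documentation before relying on them.

```python
# Category and threshold names as understood from the Gemini API docs
# (an assumption of this example; verify against the current reference).
CATEGORIES = (
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
)
THRESHOLDS = (
    "BLOCK_NONE",
    "BLOCK_ONLY_HIGH",
    "BLOCK_MEDIUM_AND_ABOVE",
    "BLOCK_LOW_AND_ABOVE",
)


def build_safety_settings(default="BLOCK_MEDIUM_AND_ABOVE", overrides=None):
    """Return one {category, threshold} entry per adjustable filter."""
    overrides = overrides or {}
    unknown = set(overrides) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"Unknown categories: {sorted(unknown)}")
    settings = []
    for category in CATEGORIES:
        threshold = overrides.get(category, default)
        if threshold not in THRESHOLDS:
            raise ValueError(f"Unknown threshold: {threshold}")
        settings.append({"category": category, "threshold": threshold})
    return settings


def generate_with_filters(prompt, settings, model_name="gemini-1.5-flash"):
    """Send one request with these settings (needs network and an API key)."""
    # Imported here so the helpers above work without the SDK installed.
    # Assumes genai.configure(api_key=...) has already been called.
    import google.generativeai as genai

    model = genai.GenerativeModel(model_name)
    return model.generate_content(prompt, safety_settings=settings)
```

For example, build_safety_settings(overrides={"HARM_CATEGORY_HARASSMENT": "BLOCK_ONLY_HIGH"}) loosens only the harassment filter and leaves the other three at the default threshold.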
FAQs
Which regions is Gemini available in?
Gemini 1.0 Pro and Gemini 1.0 Pro Vision are available in Asia, the U.S., and Europe.
How is my input data handled?
Google ensures that its team follows AI/machine learning privacy commitments through rigorous data governance practices.
Will my data be cached?
Google may cache customer inputs and outputs for Gemini models to speed up responses to subsequent prompts, stored for up to 24 hours.
How do I resolve a quota (429) error when making an API request?
This indicates excessive demand or requests exceeding the project quota. Check if your request rate is below your project’s quota.
Summary
This article walked through obtaining a Gemini API key: signing in to a Google account, opening Google AI Studio, creating the key, and running a first test call. It also covered key considerations when using the Gemini API, including pricing plans, rate limits, common errors, SDKs, available capabilities, and safety filters, to help developers integrate multimodal processing into their applications and keep API calls running smoothly.