Google Gemini API

What is the Gemini API?

The Gemini API is a set of tools and interfaces that lets developers integrate Google’s Gemini family of large language models (LLMs) into their applications and services. It is your access point to these models.

First, get your API key.

Here’s a breakdown of what that means:

Key Concepts:

  • Gemini Family of LLMs: Gemini refers to Google’s suite of multimodal AI models, designed to understand and generate text, code, images, audio, and video. They come in various sizes (e.g., Gemini Ultra, Gemini Pro, Gemini Nano) optimized for different use cases and performance requirements.
  • API (Application Programming Interface): An API is like a messenger that allows different software applications to communicate with each other. The Gemini API provides a standardized way for your application to send requests to the Gemini models and receive responses.
{$I .\GAPIKEY.INC}
 
function TRestClient3_AskChatGPTGemini(askstream: string; 
                                   aResponseHeader:TRestResponseHandler):string;
var JPostdat: string;
    jo: TJSON; rest: TRestResource;
begin
 (* JPostDat:= '{'+
    '"model": "text-davinci-003",'+
    '"prompt": "%s",'+
    '"max_tokens": 2048,'+
    '"temperature": 0.15}';   *)
   JPostDat:= '{'+
    '"contents": [{'+
    '"parts": [{"text": "%s"}]'+
    '}]}';  
 
  with TRestClient.create(self) do begin
      rest:= Resource('https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key='
                                               +GEMINI_APIKEY);
      println('@addr:'+objtostr(rest));
      rest.ContentType('application/json');
      //rest.Authorization('Bearer '+CHATGPT_APIKEY2);               
      ConnectionType:= hctWinInet;
      OnResponse:= @TRestOnResponseEvent2;
    try
      jo:= TJSON.Create(); 
      jo.parse(rest.Post(TStringStream.createstring(format(JPostDat,[askstream])))); 
      //result:= jo.values['candidates'].asarray[0].asobject['parts'].asobject['text'].asstring;
      result:= jo.values['candidates'].asarray[0].asobject['content'].asobject['parts'].asarray[0].asobject['text'].asstring;
      //writeln(formatjson(jo.stringify));
    finally
      Free;
      jo.Free;
    except 
      writeln('EWI_Exc:'+ExceptiontoString(exceptiontype,exceptionparam));
    end; 
  end; //with   
end; 
 
Call:
writeln('resreal: '+TRestClient3_AskChatGPTGemini('Why is the earth blue?', Nil));
  • Integration: The Gemini API allows you to seamlessly incorporate the capabilities of these AI models into your applications. This means you can use them to build features that understand language, generate content, translate text, analyze data, and much more.
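
For comparison, the request that the Pascal function above sends can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names build_request and ask_gemini are made up for illustration, while the endpoint pattern and JSON body are copied from the Pascal code. Building the request is kept separate from sending it, so the first part can be inspected without a network connection or a valid key.

```python
import json
import urllib.request

# Endpoint pattern taken from the Pascal example above.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(api_key: str, prompt: str,
                  model: str = "gemini-1.5-flash") -> urllib.request.Request:
    """Build (but do not send) a generateContent POST request. Hypothetical helper."""
    url = f"{BASE_URL}/{model}:generateContent?key={api_key}"
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_gemini(api_key: str, prompt: str) -> str:
    """Send the request and return the first candidate's text.

    Needs network access and a valid key, so it is not called here.
    """
    with urllib.request.urlopen(build_request(api_key, prompt)) as resp:
        reply = json.load(resp)
    return reply["candidates"][0]["content"]["parts"][0]["text"]
```

The return path in ask_gemini mirrors the Pascal access chain: candidates[0], then content, then parts[0], then text.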

JSON Structure for Gemini API

The Gemini API uses the following basic structure when specifying generation parameters like temperature:

{
  "contents": [
    {
      "parts": [
        {
          "text": "Your text prompt goes here."
        }
      ]
    }
  ],
  "generation_config": {
      "temperature": 0.7,
      "max_output_tokens": 100
    }
}
JPostDat:= '{'+
    '"contents": [{'+
    '"parts": [{"text": "%s"}]'+
    '            }],               '+
    '"generation_config": {        '+
    '  "temperature": 0.7,         '+
    '  "max_output_tokens": 100    '+
    '   }                          '+
    '}';  
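
One caveat with the Format/%s template: the prompt is inserted verbatim, so a prompt containing a double quote or a line break produces invalid JSON. The following Python sketch shows one way around that, by building the same structure as a dictionary and letting a JSON serializer do the escaping. The function name build_payload and its default values are assumptions for illustration; the field names are the ones shown above.

```python
import json


def build_payload(prompt: str, temperature: float = 0.7,
                  max_output_tokens: int = 100) -> str:
    """Serialize a generateContent body; json.dumps escapes quotes and newlines."""
    payload = {
        "contents": [{"parts": [{"text": prompt}]}],
        "generation_config": {
            "temperature": temperature,
            "max_output_tokens": max_output_tokens,
        },
    }
    return json.dumps(payload)


# A prompt with embedded double quotes stays valid JSON after serialization.
body = build_payload('Why is the "blue planet" blue?')
print(body)
```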

What You Can Do with the Gemini API:

The Gemini API opens up a wide range of possibilities, including:

  • Natural Language Processing (NLP):
    • Text generation: Generate creative content like poems, stories, articles, code, etc.
    • Text summarization: Condense large blocks of text into key points.
    • Text translation: Translate text between various languages.
    • Sentiment analysis: Determine the emotional tone of text.
    • Question answering: Provide answers to questions based on provided text.
    • Chatbots and conversational AI: Build interactive and engaging chatbot experiences.
  • Multimodal Understanding:
    • Image understanding: Extract information and context from images.
    • Video analysis: Analyze and understand video content.
    • Audio transcription and analysis: Convert speech to text and extract insights.
    • Combined text and image reasoning: Understand relationships between text and visual content.
  • Code Generation:
    • Code completion: Suggest code snippets and complete lines of code.
    • Code explanation: Provide explanations of existing code.
    • Code translation: Convert code between different programming languages.
  • Customization: Fine-tune the models for specific tasks and industries.
  • And much more: The capabilities are constantly evolving, making the possibilities virtually limitless.

Key Features of the Gemini API:

  • Scalability and Reliability: Built on Google’s robust infrastructure to handle high volumes of requests.
  • Ease of Use: Provides well-documented APIs with various client libraries (e.g., Python, Java, Go) to simplify integration.
  • Security: Implements security measures to protect your data and the models.
  • Flexibility: Supports various input and output formats, allowing for diverse application integration.
  • Regular Updates: Models are continuously improved with the latest advancements in AI research.

How to Access the Gemini API:

  1. Google AI Studio or Google Cloud Platform (GCP): The quickest route is an API key from Google AI Studio. Alternatively, the Gemini models are available through Vertex AI on GCP, for which you’ll need a GCP project with the API enabled.
  2. API Keys and Authentication: You’ll use API keys or other authentication mechanisms to authorize your application to use the API.
  3. Client Libraries: Use the provided client libraries for your programming language of choice to interact with the API.
  4. Billing: Paid usage is typically billed by the number of input and output tokens processed, and rates vary by model.
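
For step 2, a common pattern is to read the key from an environment variable instead of writing it into the script. The sketch below assumes the variable is called GEMINI_API_KEY; the helper name load_api_key is hypothetical.

```python
import os


def load_api_key(env=None, name: str = "GEMINI_API_KEY") -> str:
    """Return the API key from the environment, failing loudly if it is missing."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key
```

Passing a plain dictionary as env makes the function easy to try without touching the real environment, for example load_api_key({"GEMINI_API_KEY": "abc"}).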

Get started with the Gemini API in Python

The Google AI Python SDK is the easiest way for Python developers to build with the Gemini API. The Gemini API gives you access to Gemini models created by Google DeepMind. Gemini models are built from the ground up to be multimodal, so you can reason seamlessly across text, images, and code.

  1. Go to Google AI Studio.
  2. Log in with your Google account.
  3. Create an API key.
  4. Try a Python SDK quickstart in the Gemini API Cookbook.
  5. For detailed instructions, try the Python SDK tutorial on ai.google.dev.
#Get the Python Package
pip install -U google-generativeai

#Import the SDK and configure your API key.
import google.generativeai as genai
import os

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

#Create a model and run a prompt.

#model = genai.GenerativeModel('gemini-pro')
model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content("Why is the earth blue?")
print(response.text)

Answer: The Earth is blue because of the oceans. Water absorbs longer wavelengths of light (like red and yellow) and reflects shorter wavelengths (like blue). Since the Earth’s surface is about 71% water, the blue light reflected from the oceans dominates the color we see from space.

const DEF_llmfunc = 
   'def llm_function(query):                                          '+LF+
   '  response = model.generate_content(query) # Generate response    '+LF+
   //'  st.markdown(response.text) # Print it out using streamlit       ';  
   '  print(response.text) # Print it out using NOT streamlit       ';  

procedure Gemini_PySolution(loc: string);
begin
 with TPythonEngine.Create(Nil) do begin
   //pythonhome:= PYHOME64;
   loaddll;
   autofinalize:= false;
   try
     Execstring('from math import sqrt');
     Execstring('import streamlit as st');
     Execstring('from bs4 import BeautifulSoup');
     Execstring('import os, requests');
     Execstring('import google.generativeai as genai');
    //  Execstring('from PIL import Image; import os, math, collections');
     
     Execstr('import io, sys');      
     //Execstring('import plotly.express as px');
     Execstr(PY_count); 
     Execstr(DEF_llmfunc);
     Execstr(DEF_GEMiniInput);
     execstr('output = io.StringIO()');
     execstr('sys.stdout = output');  
     
     execstr('st.title("Proposal Calls") # Title for the page');
     
     execstr('os.environ[''GOOGLE_API_KEY'']='+loadfile3(exepath+'geminiapikey.txt')+'');
     //execstr('os.environ[''GOOGLE_API_KEY'']="'+loadfile3(exepath+'geminiapikey2.txt')+'"');
     execstr('genai.configure(api_key = os.environ[''GOOGLE_API_KEY''])');
     //execstr('model = genai.GenerativeModel(''gemini-pro'')');
     
     execstr('model = genai.GenerativeModel(''gemini-1.5-flash'')');
     execstr('response = model.generate_content("Why is the earth blue?")');
     execstr('print(response.text) ');
     //execstr('(read_http_input()) ');

     println(stringreplace(evalstr('output.getvalue().strip()'),
                            #10,#13#10,[rfReplaceAll, rfIgnoreCase])); 
   except
      raiseError;
   finally
      UnloadDll;
      Free;
   end;
 end; 
end; 
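
The procedure above gets the model's answer back into maXbox by pointing sys.stdout at an io.StringIO buffer and reading the buffer afterwards. The same capture technique can be sketched in plain Python with contextlib.redirect_stdout, which also restores stdout automatically. The names run_captured and fake_llm_function are made up for this sketch; the fake function only echoes its input so that no API call is needed.

```python
import contextlib
import io


def run_captured(func, *args) -> str:
    """Run func(*args) and return everything it printed, stripped of outer whitespace."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args)
    return buffer.getvalue().strip()


def fake_llm_function(query):
    # Stand-in for llm_function: prints instead of calling the API.
    print(f"echo: {query}")


captured = run_captured(fake_llm_function, "Why is the earth blue?")
# Same line-ending conversion as the StringReplace call in the Pascal code.
windows_text = captured.replace("\n", "\r\n")
```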

In Summary:

The Gemini API is the bridge that lets developers integrate Google’s Gemini models into their applications, whether to build new solutions, enhance existing services, or create new user experiences.

Interact with Python or Delphi

That’s what Google AI Studio says:

Okay, let’s talk about using maXbox5 in conjunction with the Gemini API. While maXbox5 is a powerful environment for scripting and automation, there are some important limitations and considerations when it comes to directly programming with the Gemini API within maXbox5 itself:

Challenges:

  1. Direct API Access:
    • maXbox5 is primarily designed for its own scripting language (based on Pascal) and system-level interactions. It doesn’t inherently have built-in support for making direct HTTP requests or handling complex JSON responses required by web APIs like the Gemini API.
  2. Authentication:
    • Accessing the Gemini API requires authentication with API keys or OAuth tokens. Managing these securely and implementing the complex authentication protocols within maXbox5 would be quite challenging.
  3. Complex Data Structures:
    • The Gemini API returns data in complex JSON formats, which maXbox5’s scripting language might not readily parse. You’d need to write custom parsing logic, which could become quite cumbersome.
  4. External Libraries and Dependencies:
    • The most practical way to interact with the Gemini API is using a programming language with a suitable client library (e.g., Python’s google-generativeai library). maXbox5’s scripting environment doesn’t allow direct inclusion of these external libraries.
  5. Performance:
    • If you tried to handle all API requests and response processing directly within maXbox5’s scripting, performance might become a concern, especially for larger or more complex interactions.

How You Could Potentially Integrate (with Limitations):

Given the limitations, directly calling the Gemini API from within maXbox5’s scripts isn’t the most practical approach. Here’s how you could theoretically integrate them, but with significant limitations:

  1. Using External Executables or Scripts:
    • Python Script: You could write a separate Python script that uses the google-generativeai library to interact with the Gemini API. The script would accept input arguments and return results.
    • maXbox5 Calls: Your maXbox5 script could then use the ExecuteProcess function (or similar) to call this external Python script, passing data via command-line arguments.
    • Parsing Results: You’d need to parse the output from the Python script (likely in JSON format) back into a format that maXbox5 can use. This would involve string manipulation and potentially some custom parsing functions.
    // maXbox5 script example (VERY SIMPLIFIED):
    var
      PythonScript: string;
      Prompt: string;
      Output: string;
      Cmd: string;
    begin
      PythonScript := 'C:\path\to\your\gemini_script.py';
      Prompt := 'Write a short poem about flowers';
      Cmd := Format('%s "%s"', [PythonScript, Prompt]);
      Output := ExecuteProcess(Cmd, nil);
      if Output <> '' then begin
        writeln('Output from Python: ', Output);
        // parse the output for needed values
      end;
    end;
  2. Using HTTP GET/POST (Very Complicated):
    • You could potentially use the HTTP functions (if available in maXbox5) to try and construct direct HTTP requests to the Gemini API.
 JPostDat:= '{'+
    '"contents": [{'+
    '"parts": [{"text": "%s"}]'+
    '}]}';  
 
  with TRestClient.create(self) do begin
      rest:= Resource('https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key='
                                               +GEMINI_APIKEY);
      println('@addr:'+objtostr(rest));
      rest.ContentType('application/json');
      //rest.Authorization('Bearer '+CHATGPT_APIKEY2);               
      ConnectionType:= hctWinInet;
      OnResponse:= @TRestOnResponseEvent2;
    try
      jo:= TJSON.Create(); 
      jo.parse(rest.Post(TStringStream.createstring(format(JPostDat,[askstream])))); 
    • However, this would involve complex manual construction of HTTP headers, JSON data, handling authentication tokens, and processing the responses. This would be incredibly complex and error-prone. I wouldn’t recommend it.
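
The Python side of option 1 could look roughly like the sketch below. It is only an illustration under assumptions: the file name gemini_script.py comes from the example path above, the JSON output shape with "ok", "text" and "error" is invented here, and the API call is passed in as a function so the script logic can be tried with a stand-in instead of the real SDK.

```python
import json
import sys


def default_generate(prompt: str) -> str:
    """Call Gemini via the SDK; imported lazily so the module loads without it."""
    import os
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-flash")
    return model.generate_content(prompt).text


def main(argv, generate=default_generate) -> str:
    """Return one line of JSON describing either the answer or the failure."""
    if len(argv) < 2:
        return json.dumps({"ok": False, "error": "usage: gemini_script.py PROMPT"})
    try:
        return json.dumps({"ok": True, "text": generate(argv[1])})
    except Exception as exc:  # report any failure to the caller as JSON
        return json.dumps({"ok": False, "error": str(exc)})


if __name__ == "__main__":
    print(main(sys.argv))
```

Emitting a single JSON line keeps the parsing on the calling side simple: the caller checks "ok" first and then reads either "text" or "error".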

Why It’s Not Recommended:

  • Complexity: The integration approach is quite complex. It would involve a considerable amount of custom coding and is far from being a natural fit for maXbox5’s intended use.
  • Maintenance: This setup becomes brittle and difficult to maintain. If the Gemini API changes, you would likely need to update both the Python script and your maXbox5 script.
  • Performance: The overhead of executing external processes, parsing data, etc., would negatively impact the performance of your automation.
  • Security Risks: Hardcoding credentials within scripts, even in external ones, poses security risks.

Best Approach:

If your goal is to use the Gemini API for scripting and automation tasks, it’s highly recommended to use a more suitable scripting language like Python, Delphi or JavaScript, which offer excellent libraries for interacting with web APIs like the Gemini API and much better JSON handling capabilities.

Conclusion:

While you could technically try to integrate maXbox5 with the Gemini API through external scripts or direct HTTP requests, it’s not a practical or efficient approach. maXbox5 is best used for its core strengths in system automation and scripting with its own internal environment. For interacting with complex APIs like the Gemini API, other languages and ecosystems are better suited.

If you can share what kind of task you’re trying to accomplish, I can suggest better tools and approaches.

JSON Test Template

const JSONGEMINI_TEST =

 ' {                                                     '+LF+
 '   "candidates": [                                     '+LF+
 '       {                                               '+LF+
 '           "content": {                                '+LF+
 '               "parts": [                              '+LF+
 '                   {                                   '+LF+
 '                      "text": "The Earth is blue because of the oceans.  Water absorbs longer wavelengths of light,         '+LF+
 '                      like red and yellow, and reflects shorter wavelengths, like blue.  Since the Earth''s surface is       '+LF+
 '                       about 70% water, the blue light reflected from the oceans dominates the light we see from space,     '+LF+
 '                       making the planet appear blue."                                                                    '+LF+
 '                   }                                   '+LF+ 
 '               ],                                      '+LF+ 
 '               "role": "model"                         '+LF+ 
 '           },                                          '+LF+ 
 '           "finishReason": "STOP",                     '+LF+ 
 '           "avgLogprobs": -0.10744373122258,           '+LF+ 
 '           "parts": null                               '+LF+ 
 '       }                                               '+LF+ 
 '   ],                                                  '+LF+ 
 '   "usageMetadata": {                                  '+LF+ 
 '       "promptTokenCount": 6,                          '+LF+ 
 '       "candidatesTokenCount": 67,                     '+LF+ 
 '       "totalTokenCount": 73                           '+LF+ 
 '   },                                                  '+LF+ 
 '   "modelVersion": "gemini-1.5-flash"                  '+LF+ 
'}                                                       ';

getGemini AI__________________: 
@addr:TRestResource@AA15A2B0
@addr:TRestClient@B0861030 response cont: {
    "candidates": [
        {
            "content": {
                "parts": [
                    {
                        "text": "The Italian word \"conclusione\" translates to **Schlussfolgerung** or **Fazit** in German.  Both words convey the idea of a conclusion, but there's a subtle difference:\n\n* **Schlussfolgerung** emphasizes the process of drawing a conclusion, the reasoning leading to it.  It's more formal.\n\n* **Fazit** is a more concise summary of the main points, the final statement. It's often used in a less formal context.\n\nSo, depending on the context of the Italian \"conclusione,\" either \"Schlussfolgerung\" or \"Fazit\" could be the best translation.\n"
                    }
                ],
                "role": "model"
            },
            "finishReason": "STOP",
            "avgLogprobs": -0.10391504961745184
        }
    ],
    "usageMetadata": {
        "promptTokenCount": 11,
        "candidatesTokenCount": 133,
        "totalTokenCount": 144
    },
    "modelVersion": "gemini-1.5-flash"
} response code: 200 enabled compression True content-encoding: verifycert: True

And here is the access from different JSON libraries:

jsn:= TMcJsonItem.Create(); 
jsn.AsJSON:=formatjson(JSONGEMINI_TEST);
writ(jsn.values['candidates'].items[0]['content'].items[0]['text'].items[0].asstring);
    jsno:= jsn.values['candidates'].items[0]['content'].items[0]['text'].asobject;
    writ(StringReplace(jsno['text'].asstring,'\n',CRLF,[rfReplaceAll]));

try
   jo:= TJSON.Create(); 
   jo.parse(rest.Post(TStringStream.createstring(format(JPostDat,[askstream])))); 
   result:= jo.values['candidates'].asarray[0].asobject['content'].asobject['parts'].asarray[0].asobject['text'].asstring;
     //writeln(formatjson(jo.stringify));
   finally
     Free;
     jo.Free;
   except  
      writeln('EWI_Exc:'+ExceptiontoString(exceptiontype,exceptionparam));
   end; 

You see candidates and parts are arrays of objects!
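
The same navigation can be sketched in Python, where the arrays become lists and the objects become dictionaries. The SAMPLE string below is a trimmed, hand-made response modeled on the test template above (split into two parts purely for illustration), and extract_text is a hypothetical helper that joins all text parts of the first candidate and returns an empty string when there are none.

```python
import json

# Hand-made sample modeled on the response structure shown above.
SAMPLE = """
{
  "candidates": [
    {
      "content": {
        "parts": [
          {"text": "The Earth is blue "},
          {"text": "because of the oceans."}
        ],
        "role": "model"
      },
      "finishReason": "STOP"
    }
  ],
  "usageMetadata": {"promptTokenCount": 6, "candidatesTokenCount": 67, "totalTokenCount": 73}
}
"""


def extract_text(response: dict) -> str:
    """Join the text of every part in the first candidate; '' if nothing is there."""
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts") or []
    return "".join(part.get("text", "") for part in parts)


print(extract_text(json.loads(SAMPLE)))  # prints: The Earth is blue because of the oceans.
```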


Published by maxbox4

4 thoughts on “Google Gemini API”

    • generation_config: This is an object that includes settings for how the model generates text, such as the temperature, max_output_tokens, top_p, top_k, etc.
      • temperature: (Float) The temperature value you want to use for text generation (usually between 0 and 2).
      • max_output_tokens: (Integer) The maximum number of tokens to generate in the response. You should set the maximum tokens to not go over the context window limit of the model. You’ll get an error if you try to go over the model context limit.

  1. For the latest version the stringstream has to be set with writestring:

    sstream:= TStringStream.create('');

    sstream.writestring(format(JPostDat,[askstream]));

    jo.parse(rest.Post(sstream));
