Server.Enrichment.AI.Gemini

Query Gemini AI for analysis of data.

Paramaters:

  • PrePrompt - Added as preprompt. Default is: “You are a Cyber Incident Responder and need to analyze data. You have an eye for detail and like to use short precise technical language. Analyze the following data and provide summary analysis:”
  • Prompt - Is User prompt as string: When pushing a dict object via PromtData good practice is add some strings related to the type of data for analysis or artifact name to provide context.
  • PromptData - add optional object to be serialized and added to the User prompt.
  • Model - Model to use for your request. Default is gemini-2.0-flash
  • MaxTokens - Set max token size default 64000

This artifact can be called from within another artifact (such as one looking for files) to enrich the data made available by that artifact.

e.g

LET results = SELECT field1, field2, field3.... FROM source() WHERE ...
SELECT * FROM Artifact.Server.Enrichment.AI.Gemini(Prompt="Review Autoruns data:",PromptData=results)

NOTE: there appears to be a bug in serialize() that is causing some issues in some large collections. If you run into issues try reducing token size and filtering unnecessary data. This is resolved in head 0.74.2 commit:99dba70 and full 1000000 tokens can be used


name: Server.Enrichment.AI.Gemini
author: Matt Green - @mgreen27
description: |
  Query Gemini AI for analysis of data.
  
  Paramaters:
  
  * `PrePrompt` - Added as preprompt. Default is: 
  "You are a Cyber Incident Responder and need to analyze data. You have an eye 
  for detail and like to use short precise technical language. Analyze the 
  following data and provide summary analysis:"
  * `Prompt` - Is User prompt as string: When pushing a dict object via 
  PromtData good practice is add some strings related to the type of data for 
  analysis or artifact name to provide context.
  * `PromptData` - add optional object to be serialized and added to the User prompt.
  * `Model` - Model to use for your request. Default is gemini-2.0-flash
  * `MaxTokens` - Set max token size  default 64000
  
  This artifact can be called from within another artifact (such as one looking 
  for files) to enrich the data made available by that artifact.
  
  e.g
  
  `LET results = SELECT field1, field2, field3.... FROM source() WHERE ...`  
  `SELECT * FROM Artifact.Server.Enrichment.AI.Gemini(Prompt="Review Autoruns data:",PromptData=results)`

  NOTE: there appears to be a bug in serialize() that is causing some issues in some large collections. 
  If you run into issues try reducing token size and filtering unnecessary data. 
  This is resolved in head 0.74.2 commit:99dba70 and full 1000000 tokens can be used
  
type: SERVER

parameters:
    - name: PrePrompt
      type: string
      description: |
        Prompt to send with data. For example, when asking 
        a question, then providing data separately
      default: |
        You are a Cyber Incident responder and need to analyse forensic 
        collections. You have an eye for detail and like to use short precise 
        technical language. Your PRIMARY goal is to analyse the following data 
        and provide summary analysis:
    - name: Prompt
      type: string
      default: what is prefetch?
    - name: PromptData
      type: string
      description: The data sent to Google - this data is serialised and added to the prompt
    - name: Model
      type: string
      description: The model used for processing the prompt
      default: gemini-2.0-flash
    - name: GeminiApiKey
      type: string
      description: Token for Gemini. Leave blank here if using server metadata store.
    - name: MaxTokens
      type: int
      default: 100000

sources:
  - query: |
        LET Creds <= if(
            condition=GeminiApiKey,
            then=GeminiApiKey,
            else=server_metadata().GeminiApiKey)
        LET parts = if(condition=PromptData,
                        then= dict(text=PrePrompt + Prompt + serialize(item=PromptData)),
                        else= dict(text=PrePrompt + Prompt)
                    )
        LET Data = dict(contents=dict(parts=[parts,]))

        SELECT
            parts.text as UserPrompt,
            parse_json(data=Content).candidates[0].content.parts[0].text AS ResponseText,
            parse_json(data=Content) AS ResponseDetails
        FROM http_client(
            url='https://generativelanguage.googleapis.com/v1beta/models/' + Model + ':generateContent?key=' + Creds,
            headers=dict(`Content-Type`="application/json"),
            method="POST",
            data=Data
        )

column_types:
  - name: ResponseText
    type: nobreak