Module ai.googleapis.vertex
Overview
This module offers APIs for connecting with models hosted on Google Vertex AI, including Google Gemini models and partner models from Anthropic, Mistral, Meta, DeepSeek, Qwen, Kimi, and MiniMax available through the Vertex AI Model Garden.
Prerequisites
Before using this module in your Ballerina application, you must have a Google Cloud project with Vertex AI enabled.
- Create a Google Cloud account and set up a project.
- Enable the Vertex AI API for your project.
- Set up authentication using one of the supported methods below.
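The project setup above can be sketched with the gcloud CLI (a minimal sketch, assuming the CLI is installed and you are logged in; `my-gcp-project` is a placeholder project ID):

```shell
# Select the Google Cloud project and enable the Vertex AI API.
gcloud config set project my-gcp-project
gcloud services enable aiplatform.googleapis.com

# For the OAuth2 refresh-token option, generate application default
# credentials; the client ID, client secret, and refresh token are
# written to ~/.config/gcloud/application_default_credentials.json.
gcloud auth application-default login
```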
Quickstart
To use the ai.googleapis.vertex module in your Ballerina application, update the .bal file as follows:
Step 1: Import the module
Import the ai.googleapis.vertex module.
```ballerina
import ballerinax/ai.googleapis.vertex;
```
Step 2: Initialize the Model Provider
Three authentication options are supported:
Option 1 — OAuth2 refresh token (recommended for development; credentials from ~/.config/gcloud/application_default_credentials.json):
```ballerina
import ballerina/ai;
import ballerinax/ai.googleapis.vertex;

final ai:ModelProvider vertexModel = check new vertex:ModelProvider(
    auth = {
        clientId: "your-client-id",
        clientSecret: "your-client-secret",
        refreshToken: "your-refresh-token"
    },
    projectId = "your-gcp-project-id",
    location = "us-central1",
    model = "google/gemini-2.0-flash"
);
```
Option 2 — Service account JSON key file (recommended for production):
```ballerina
final ai:ModelProvider vertexModel = check new vertex:ModelProvider(
    auth = "/path/to/service-account-key.json",
    projectId = "your-gcp-project-id",
    location = "us-central1",
    model = "google/gemini-2.0-flash"
);
```
Option 3 — Service account inline credentials (use when you need to override scopes):
```ballerina
final ai:ModelProvider vertexModel = check new vertex:ModelProvider(
    auth = {
        clientEmail: "your-sa@your-project.iam.gserviceaccount.com",
        privateKey: "-----BEGIN RSA PRIVATE KEY-----\n..."
    },
    projectId = "your-gcp-project-id",
    location = "us-central1",
    model = "google/gemini-2.0-flash"
);
```
Step 3: Invoke chat completion
```ballerina
ai:ChatMessage[] chatMessages = [{role: "user", content: "hi"}];
ai:ChatAssistantMessage response = check vertexModel->chat(chatMessages, tools = []);
chatMessages.push(response);
```
Step 4: Generate typed output
```ballerina
type Sentiment record {|
    string label;
    decimal score;
|};

@ai:JsonSchema {
    "type": "object",
    "required": ["label", "score"],
    "properties": {
        "label": {"type": "string", "enum": ["positive", "neutral", "negative"]},
        "score": {"type": "number"}
    }
}
type SentimentType Sentiment;

Sentiment|error result = vertexModel->generate(
    `Analyze the sentiment of: "I love this product!"`
);
```
Step 5: Use an embedding provider
```ballerina
import ballerina/ai;
import ballerinax/ai.googleapis.vertex;

final ai:EmbeddingProvider vertexEmbedding = check new vertex:EmbeddingProvider(
    auth = "/path/to/service-account-key.json",
    projectId = "your-gcp-project-id",
    location = "us-central1"
);

ai:Embedding embedding = check vertexEmbedding->embed(<ai:TextChunk>{content: "Hello, world!"});
```