nadheeshjihan/ai.agent

0.3.1

Overview

This module provides the functionality required to build ReAct agents using large language models (LLMs).

Prerequisites

Before using this module in your Ballerina application, complete the following:

Create an OpenAI account.
Obtain an API key by following these instructions.

Alternatively, it is possible to use an Azure OpenAI account by completing the following steps.

Create an Azure account.
Create an Azure OpenAI resource.
Obtain the tokens. Refer to the Azure OpenAI Authentication guide to learn how to generate and use tokens.

Tool

A tool refers to a single action used to retrieve, process, or manipulate data. It can be a function or an API call, which may require certain inputs following a specific input schema.

Function as a tool

When using a Ballerina function as a tool, the function should adhere to the following template:


isolated function functionName(record parameters) returns anydata|error {
    // function body 
}

In this template, record parameters represents a Ballerina record that contains the input parameters for the function. If the function doesn't require any inputs, it can be defined without any parameters. The function has the flexibility to return any data type or an error. It is important to note that the function needs to be an isolated function to ensure concurrency safety.

To define a tool using the above function, you can use the following syntax:


agent:Tool exampleTool = {
    name: "exampleTool", // used as an identifier 
    description: "defines the purpose of the function", // provides information about the behavior
    inputSchema: {
        // a JSON schema that defines the inputs to the function (if applicable)
    },
    caller: functionName // a pointer to the function
};
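
As a concrete illustration of the function template, the sketch below defines a small function that could serve as a tool. The record `SumInput` and the function `calculateSum` are hypothetical names chosen for this example and are not part of the ai.agent module.

```ballerina
import ballerina/io;

// Hypothetical input record used only for this illustration.
type SumInput record {|
    int[] numbers;
|};

// Follows the template: isolated, takes a single record parameter,
// and returns a value that belongs to anydata|error.
isolated function calculateSum(SumInput input) returns int|error {
    if input.numbers.length() == 0 {
        return error("numbers must not be empty");
    }
    int total = 0;
    foreach int n in input.numbers {
        total += n;
    }
    return total;
}

public function main() returns error? {
    // Call the function directly, the same way a tool caller would pass a record.
    int total = check calculateSum({numbers: [1, 2, 3]});
    io:println(total);
}
```

A function shaped like this can then be assigned to the caller field of a tool definition, with an input schema that describes the numbers field.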

HTTP resource as a Tool

To use an API resource as a tool, an HTTP tool definition can be created as follows.


agent:HttpTool httpResourceTool = {
    name: "exampleTool", // used as an identifier
    description: "defines the purpose of the API resource", // provides information about the behavior
    path: "/path/resourceA/", // path to the resource
    method: agent:GET, // the HTTP request method (e.g., GET, POST, DELETE, PUT, etc.)
    queryParams: {
        // a JSON schema defining the query parameters of the HTTP resource
    },
    pathParams: {
        // a JSON schema defining path parameters of the HTTP resource
    },
    requestBody: {
        // a JSON schema defining the request body of the HTTP resource
    }
};

Tools from Interface Definition Languages (IDLs)

You can automatically extract tools from a valid OpenAPI specification (3.x) using the extractToolsFromOpenApiSpec function, as demonstrated below:


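string openApiPath = "<PATH TO THE JSON FILE>";
// The function returns an HttpApiSpecification record; the extracted tools are in its tools field
agent:HttpApiSpecification apiSpecification = check agent:extractToolsFromOpenApiSpec(openApiPath);
agent:HttpTool[] tools = apiSpecification.tools;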

Tool Input Schema

The tool utilizes a JSON schema to define the input schema. This schema specifies the expected structure of the Ballerina record required by the Ballerina function, as well as the parameters (query/path) and payload for an HTTP API call.

For example, the input schema for a Ballerina record can be defined as follows:

Ballerina record:


type SendEmailInput record {|
    string recipient = "<DEFAULT EMAIL>"; // should be an email address from the contacts
    string subject;
    string messageBody;
    string contentType?;
|};

JSON input schema:


agent:ObjectInputSchema schema = {
    'type: agent:OBJECT,
    properties: {
        recipient: {'type: agent:STRING, description: "should be an email address from the contacts", default: "<DEFAULT EMAIL>"},
        subject: {'type: agent:STRING},
        messageBody: {'type: agent:STRING},
        contentType: {'const: "text/plain"} // constant value
    }
};

ToolKit

A toolkit groups a collection of tools that share common attributes. It also lets you extend the module with new types of tools.

For example, consider an HTTP service with multiple resources. These resources typically share the same service URL and client configuration. An HttpServiceToolKit groups all the HttpTools that belong to the resources of that service.

The HttpServiceToolKit also extends the definition of a Tool to cover HttpTool specifics, encapsulating the HTTP-related details. Because it interprets each HttpTool as a Tool, you don't need to write a separate Tool for each resource of an HTTP service.


agent:HttpTool resource1 = {
    // defines resource 1
};

// ... definitions of the remaining resources ...

agent:HttpTool resourceN = {
    // defines resource N
};

agent:HttpServiceToolKit serviceAToolKit = check new (
    serviceUrl,
    [resource1, resourceN], // list every HttpTool of the service here
    httpClientConfigs,
    httpHeaders
);

Model

This is a large language model (LLM) instance. Currently, the agent module has support for the following LLM APIs.

OpenAI GPT3


agent:Gpt3Model model = check new ({auth: {token: <OPENAI API KEY>}});

OpenAI ChatGPT (e.g. GPT3.5, GPT4)


agent:ChatGptModel model = check new ({auth: {token: <OPENAI API KEY>}});

Azure OpenAI GPT3


// serviceUrl, deploymentId, and apiVersion are string values for your Azure OpenAI resource
agent:AzureGpt3Model model = check new ({auth: {token: <AZURE OPENAI API KEY>}}, serviceUrl, deploymentId, apiVersion);

Agent

The agent executes natural language (NL) commands by leveraging the reasoning and text generation capabilities of LLMs. It follows the ReAct framework.

To create an agent, you need an LLM model and a set of Tool (or ToolKit) definitions.


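// model is any agent:LlmModel instance; the remaining arguments are agent:Tool or agent:BaseToolKit values
agent:Agent agent = check new (model, exampleTool, serviceAToolKit);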

There are multiple ways to utilize the agent.

Agent.run() for batch execution

The agent can be executed without interruptions using Agent.run(). It attempts to fully execute the given NL command and returns the results at each step.


agent:ExecutionStep[] execution = agent.run("<NL COMMAND>", maxIter = 10);
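
The returned steps can be inspected once the run finishes. The following is a minimal sketch that relies only on the thought and observation fields listed for ExecutionStep in the Records section; `describeStep` and `printTrace` are hypothetical helpers, not part of the module.

```ballerina
import ballerina/io;
import nadheeshjihan/ai.agent;

// Hypothetical helper: renders one execution step as a single line of text.
function describeStep(int index, agent:ExecutionStep step) returns string {
    any|error observation = step?.observation;
    if observation is error {
        return string `step ${index}: failed (${observation.message()})`;
    }
    if observation is () {
        return string `step ${index}: no observation`;
    }
    return string `step ${index}: ${observation.toString()}`;
}

// Hypothetical helper: prints every step of a finished run.
function printTrace(agent:ExecutionStep[] steps) {
    foreach int i in 0 ..< steps.length() {
        io:println(describeStep(i + 1, steps[i]));
    }
}
```

Calling printTrace(execution) after agent.run() would then list the observation, or the error, produced at each step.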

AgentIterator for foreach execution

The agent can also act as an iterator, providing reasoning and output from the tool at each step while executing the command.


agent:AgentIterator agentIterator = agent.getIterator("<NL COMMAND>");
foreach agent:ExecutionStep|error step in agentIterator{
    // logic goes here
    // can decide whether to continue/rollback/exit the loop based on the observation from the tool
}

AgentExecutor for streaming execution

The agent can be executed as a stream using AgentExecutor. This allows more flexibility in controlling the agent's execution. AgentExecutor can take previous execution steps as input to resume a task that was partially executed.

This approach is useful in scenarios where you need to remove specific steps during execution (e.g., unsuccessful or older steps). It also allows for manual execution in certain cases (e.g., handling specific errors or obtaining user inputs). You can manipulate the execution trace as required using AgentExecutor.


string QUERY = "<NL COMMAND>";
agent:AgentExecutor agentExecutor = agent.getExecutor(QUERY);
agent:ExecutionStep[] trace = [];
while agentExecutor.hasNext() {
    // next() returns nil when there is nothing left to execute
    record {|agent:ExecutionStep|error value;|}? result = agentExecutor.next();
    if result is () {
        break;
    }
    agent:ExecutionStep step = check result.value;
    any|error observation = step?.observation;
    if observation is error {
        // handle the error using a tool
        // push manually to trace the execution steps of the manually performed actions if needed
        // clean the traces if needed
        agentExecutor = agent.getExecutor(QUERY, trace);
        continue;
    }
    trace.push(step);
}
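
For finer-grained manual execution, the reason, act, and update functions listed for AgentExecutor in the Classes section can be called one at a time. The sketch below is based only on those documented signatures. It assumes that update must be called to record each step in the executor's history and that hasNext reflects the updated state; `runManually` and `maxSteps` are hypothetical names introduced for this example.

```ballerina
import nadheeshjihan/ai.agent;

// Hypothetical helper: drives an executor step by step and returns the collected trace.
function runManually(agent:AgentExecutor executor, int maxSteps) returns agent:ExecutionStep[]|error {
    agent:ExecutionStep[] trace = [];
    int count = 0;
    while executor.hasNext() && count < maxSteps {
        // Ask the LLM for the next thought.
        string thought = check executor.reason();
        // A place to inspect or veto the thought before any tool is invoked.
        any|error observation = executor.act(thought);
        agent:ExecutionStep step = {thought: thought, observation: observation};
        // Assumption: the executor only learns about the step through update().
        executor.update(step);
        trace.push(step);
        count += 1;
    }
    return trace;
}
```

Splitting reasoning from acting this way leaves room to ask for user confirmation between the two calls, at the cost of managing the history yourself.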

Quickstart

Let's walk through the usage of the ai.agent library using this sample. The example demonstrates the use of two types of tools:

To send a Google email, we utilize the sendMessage function from the ballerinax/googleapis.gmail connector as a tool.
HttpTools are used to create and list WiFi accounts through the GuestWiFi HTTP service.
- List available WiFi accounts: GET /guest-wifi-accounts/{ownerEmail}
- Create a new WiFi account: POST /guest-wifi-accounts

Follow the steps below to create a simple sample:

Step 1 - Import library

import nadheeshjihan/ai.agent;
import ballerinax/googleapis.gmail;

Step 2 - Prepare the Gmail `gmail->sendMessage` action as a function (optional)

First, we need to wrap the connector actions using another function since Ballerina doesn't allow invoking remote function pointers directly. Here, we create the sendEmail function that wraps the connector action.


isolated function sendEmail(gmail:MessageRequest messageRequest) returns string|error {
    gmail:Client gmail = check new ({auth: {token: gmailToken}});
    gmail:Message|error sendMessage = gmail->sendMessage(messageRequest);
    if sendMessage is gmail:Message {
        return sendMessage.toString();
    }
    return "Error while sending the email: " + sendMessage.message();
}

Step 3 - Defining Tools for the Agent

First, define the sendEmail function as a tool.


agent:Tool sendEmailTool = {
    name: "Send mail",
    description: "useful to send emails to a given recipient",
    inputSchema: {
        properties: {
            recipient: {'type: agent:STRING},
            subject: {'type: agent:STRING},
            messageBody: {'type: agent:STRING},
            contentType: {'const: "text/plain"}
        }
    },
    caller: sendEmail
};

Next, create HttpTools for the resources of the GuestWiFi HTTP service. Then use HttpServiceToolKit to create a toolkit for that HTTP service.


agent:HttpTool listWifiHttpTool = {
    name: "List wifi",
    path: "/guest-wifi-accounts/{ownerEmail}",
    method: agent:GET,
    description: "useful to list the guest wifi accounts."
};

agent:HttpTool createWifiHttpTool = {
    name: "Create wifi",
    path: "/guest-wifi-accounts",
    method: agent:POST,
    description: "useful to create a guest wifi account.",
    requestBody: {
        'type: agent:OBJECT,
        properties: {
            email: {'type: agent:STRING},
            username: {'type: agent:STRING},
            password: {'type: agent:STRING}
        }
    }
};   

agent:HttpServiceToolKit wifiServiceToolKit = check new (wifiServiceUrl, [listWifiHttpTool, createWifiHttpTool], {
    auth: {
        tokenUrl: wifiServiceTokenUrl,
        clientId: wifiServiceClientId,
        clientSecret: wifiServiceClientSecret
    }
});

Note that when creating the HttpServiceToolKit for the GuestWiFi service, we provide the service URL and authentication configurations to the HttpServiceToolKit initializer to establish the connection with the service.

Step 4 - Create the Agent

To create the agent, we first need to initialize a model (e.g., GPT3, GPT4). In this example, we initialize the agent with the ChatGptModel model as follows:


agent:ChatGptModel model = check new ({auth: {token:  <OPENAI API KEY>}});
agent:Agent agent = check new (model, wifiServiceToolKit, sendEmailTool);

Step 5 - Run the agent

Now we can run the agent with NL commands from the user. In this case, we use a string template and pass the unknown values as interpolations.


string queryTemplate = string`create a new guest WiFi account for email ${wifiOwnerEmail} with user ${wifiUsername} and password ${wifiPassword}. Send the available list of WiFi accounts for that email to ${recipientEmail}`;

agent:ExecutionStep[] run = agent.run(queryTemplate);
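
If the same command is issued for different users, the template can be wrapped in a function so that the unknown values become parameters. This is a minimal sketch; `buildQuery` is a hypothetical helper and not part of the module, and the values passed in main are placeholders.

```ballerina
import ballerina/io;

// Hypothetical helper: fills the query template with the values known at runtime.
function buildQuery(string wifiOwnerEmail, string wifiUsername, string wifiPassword, string recipientEmail) returns string {
    return string `create a new guest WiFi account for email ${wifiOwnerEmail} with user ${wifiUsername} and password ${wifiPassword}. Send the available list of WiFi accounts for that email to ${recipientEmail}`;
}

public function main() {
    // Placeholder values for illustration only.
    string query = buildQuery("owner@example.com", "guest1", "secret", "recipient@example.com");
    io:println(query);
}
```

The resulting string can be passed to agent.run() in place of the inline template.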

Output

Let's examine the output produced by the above example. Assuming the following natural language (NL) command is given to the agent:

NL Command: create a new guest WiFi account for email johnny@wso2.com with user guest123 and password john123. Send the available list of WiFi accounts for that email to alexa@wso2.com

The agent will proceed with multiple reasoning-action iterations following the ReAct Framework.

Agent creates a new WiFi account for owner johnny@wso2.com:

 Reasoning iteration: 1
 Thought: We need to create a new guest WiFi account with the given username and password, and then list the available WiFi accounts for the email owner and send it to a specified recipient. 
 Action: 
 ```
 {
 "tool": "Create wifi",
 "tool_input": {
     "requestBody": {
     "email": "johnny@wso2.com",
     "username": "guest123",
     "password": "john123"
     }
 }
 }
 ```
 Observation: Successfully added the wifi account

Agent finds existing guest WiFi accounts under the owner johnny@wso2.com:

 Reasoning iteration: 2
 Thought: Now we need to list the available WiFi accounts for the email owner and send it to a specified recipient.
 Action:
 ```
 {
 "tool": "List wifi",
 "tool_input": {
     "path": "/guest-wifi-accounts/johnny@wso2.com"
 }
 }
 ```

Observation: ["guest123.guestOf.johnny","newGuest.guestOf.johnny"]

Agent sends an email to alexa@wso2.com with the information about the existing accounts:

In this step, the agent is responsible for generating the email subject and message body as well. The user provides only the recipient's email.

 Reasoning iteration: 3
 Thought: Finally, we need to send the available wifi list to the specified recipient.
 Action:
 ```
 {
 "tool": "Send mail",
 "tool_input": {
     "recipient": "alexa@wso2.com",
     "subject": "Available Wifi List",
     "messageBody": "The available wifi accounts for johnny@wso2.com are: guest123.guestOf.johnny, newGuest.guestOf.johnny"
 }
 }
 ```
 Observation: {"threadId":"1884d1bda3d2c286","id":"1884d1bda3d2c286","labelIds":["SENT"]}

Agent concludes the task:

Reasoning iteration: 4
Thought: I now know the final answer
Final Answer: Successfully created a new guest wifi account with username "guest123" and password "john123" for the email owner "johnny@wso2.com". The available wifi accounts for "johnny@wso2.com" are "guest123.guestOf.johnny" and "newGuest.guestOf.johnny", and this list has been sent to the specified recipient "alexa@wso2.com".

As a result, alexa@wso2.com will receive an email generated by the agent with the subject "Available Wifi List" and the message body "The available wifi accounts for johnny@wso2.com are: guest123.guestOf.johnny, newGuest.guestOf.johnny".

Functions

extractToolsFromOpenApiSpec

function extractToolsFromOpenApiSpec(string filePath, *AdditionInfoFlags additionInfoFlags) returns HttpApiSpecification & readonly|error

Extracts the HTTP tools from the given OpenAPI specification file.

Parameters

filePath string - Path to the OpenAPI specification file

additionInfoFlags *AdditionInfoFlags - Flags to extract additional information from the OpenAPI specification

Return Type

HttpApiSpecification & readonly|error - HttpApiSpecification record with the extracted tools

Classes

ai.agent: Agent

Isolated

Agent implementation that executes tools with an LLM, adding computational power and knowledge to the LLM.

Constructor

Initialize an Agent.

init (LlmModel model, (BaseToolKit|Tool)... tools)

model LlmModel - LLM model instance

tools (BaseToolKit|Tool)... -

getExecutor

Isolated Function

function getExecutor(string query, ExecutionStep[] previousSteps, string|map<json> context) returns AgentExecutor

Initialize the agent executor for a given query. Agent executor is useful for streaming-like execution of the agent.

Parameters

query string - User's query

previousSteps ExecutionStep[] (default []) - Execution steps previously taken by the agent for the given query

context string|map<json> (default {}) - Context information to be used by the LLM

Return Type

AgentExecutor - AgentExecutor instance

getIterator

Isolated Function

function getIterator(string query, string|map<json> context) returns AgentIterator

Initialize the agent iterator for a given query. Agent iterator is useful for foreach execution of the agent.

Parameters

query string - User's query

context string|map<json> (default {}) - Context information to be used by the LLM

Return Type

AgentIterator - AgentIterator instance

run

Isolated Function

function run(string query, int maxIter, string|map<json> context, boolean verbose) returns ExecutionStep[]

Execute the agent for a given user's query.

Parameters

query string - Natural language command for the agent

maxIter int (default 5) - Maximum number of iterations the agent runs to execute the task

context string|map<json> (default {}) - Context values to be used by the agent to execute the task

verbose boolean (default true) - If true, then print the reasoning steps

Return Type

ExecutionStep[] - Returns the execution steps tracing the agent's reasoning and outputs from the tools

ai.agent: AgentExecutor

hasNext

Isolated Function

function hasNext() returns boolean

Checks whether agent has more steps to execute.

Return Type

boolean - True if agent has more steps to execute, false otherwise

reason

Isolated Function

function reason() returns string|error

Reason the next step of the agent.

Return Type

string|error - Thought to be executed by the agent or an error if the reasoning failed

act

Isolated Function

function act(string thought) returns any|error

Execute the next step of the agent.

Parameters

thought string - Thought to be executed by the agent

Return Type

any|error - Observations from the tool can be any|error|null

update

Isolated Function

function update(ExecutionStep step)

Update the agent with the latest execution step.

Parameters

step ExecutionStep - Latest step to be added to the history

next

Isolated Function

function next() returns record {| value ExecutionStep|error |}?

Execute the next step of the agent.

Return Type

record {| value ExecutionStep|error |}? - A record with ExecutionStep or error

ai.agent: AgentIterator

iterator

function iterator() returns object {
        public function next() returns record {|ExecutionStep|error value;|}?;
    }

ai.agent: AzureGpt3Model

Isolated

Constructor

Initializes the GPT-3 model with the given connection configuration and model configuration.

init (ConnectionConfig connectionConfig, string serviceUrl, string deploymentId, string apiVersion, AzureGpt3ModelConfig modelConfig)

connectionConfig ConnectionConfig - Connection Configuration for Azure OpenAI text client

serviceUrl string - Service URL for Azure OpenAI service

deploymentId string - Deployment ID for Azure OpenAI model instance

apiVersion string - API version for Azure OpenAI model instance

modelConfig AzureGpt3ModelConfig {} - Model Configuration for Azure OpenAI text client

generate

Isolated Function

function generate(PromptConstruct prompt) returns string|error

Method included from *LlmModel

complete

Isolated Function

function complete(string prompt, string? stop) returns string|error

Completes the given prompt using the GPT3 model.

Parameters

prompt string - Prompt to be completed

stop string? (default ()) - Stop sequence to stop the completion

Return Type

string|error - Completed prompt or error if the completion fails

ai.agent: ChatGptModel

Isolated

Constructor

Initializes the ChatGPT model with the given connection configuration and model configuration.

init (ConnectionConfig connectionConfig, ChatGptModelConfig modelConfig)

connectionConfig ConnectionConfig - Connection Configuration for OpenAI chat client

modelConfig ChatGptModelConfig {} - Model Configuration for OpenAI chat client

generate

Isolated Function

function generate(PromptConstruct prompt) returns string|error

Method included from *LlmModel

chatComplete

Isolated Function

function chatComplete(ChatCompletionRequestMessage[] messages, string? stop) returns string|error

Completes the given prompt using the ChatGPT model.

Parameters

messages ChatCompletionRequestMessage[] - Messages to be completed

stop string? (default ()) - Stop sequence to stop the completion

Return Type

string|error - Completed message or error if the completion fails

ai.agent: Gpt3Model

Isolated

Constructor

Initializes the GPT-3 model with the given connection configuration and model configuration.

init (ConnectionConfig connectionConfig, Gpt3ModelConfig modelConfig)

connectionConfig ConnectionConfig - Connection Configuration for OpenAI text client

modelConfig Gpt3ModelConfig {} - Model Configuration for OpenAI text client

generate

Isolated Function

function generate(PromptConstruct prompt) returns string|error

Method included from *LlmModel

complete

Isolated Function

function complete(string prompt, string? stop) returns string|error

Completes the given prompt using the GPT3 model.

Parameters

prompt string - Prompt to be completed

stop string? (default ()) - Stop sequence to stop the completion

Return Type

string|error - Completed prompt or error if the completion fails

ai.agent: HttpServiceToolKit

Isolated

Constructor

Initializes the toolkit with the given service URL and HTTP tools.

init (string serviceUrl, HttpTool[] httpTools, ClientConfiguration clientConfig, HttpHeader headers)

serviceUrl string - The url of the service to be called

httpTools HttpTool[] - The http tools to be initialized

clientConfig ClientConfiguration {} - The http client configuration associated to the tools

headers HttpHeader {} - The http headers to be used in the requests

getTools

Isolated Function

function getTools() returns Tool[]|error

Method included from *BaseToolKit

Enums

ai.agent: HttpMethod

Members

GET

POST

DELETE

PUT

PATCH

HEAD

OPTIONS

ai.agent: InputType

Members

STRING

INTEGER

FLOAT

BOOLEAN

NUMBER

OBJECT

ARRAY

Records

ai.agent: AdditionInfoFlags

Closed record

Fields

extractDescription boolean(default false) -

extractDefault boolean(default false) -

ai.agent: AllOfInputSchema

Closed record

Fields

allOf ObjectInputSchema[] -

ai.agent: AnyOfInputSchema

Closed record

Fields

anyOf ObjectInputSchema[] -

ai.agent: ArrayInputSchema

Closed record

Fields

Fields Included from *BaseInputTypeSchema

type InputType
description string
default json

'type ARRAY(default ARRAY) -

items JsonSubSchema -

default json[]? -

ai.agent: AzureGpt3ModelConfig

Read Only Closed record

Fields

model string(default GPT3_MODEL_NAME) -

temperature decimal(default DEFAULT_TEMPERATURE) -

max_tokens int(default DEFAULT_MAX_TOKEN_COUNT) -

stop never? -

prompt never? -

ai.agent: BaseInputTypeSchema

Closed record

Fields

'type InputType -

description string? -

default json? -

ai.agent: ChatGptModelConfig

Read Only Closed record

Fields

model string(default GPT3_5_MODEL_NAME) -

temperature decimal(default DEFAULT_TEMPERATURE) -

messages never? -

stop never? -

ai.agent: ConstantValueSchema

Closed record

Fields

'const json -

ai.agent: ExecutionStep

Closed record

Fields

thought string -

observation any|error? -

ai.agent: Gpt3ModelConfig

Read Only Closed record

Fields

model string(default GPT3_MODEL_NAME) -

temperature decimal(default DEFAULT_TEMPERATURE) -

max_tokens int(default DEFAULT_MAX_TOKEN_COUNT) -

stop never? -

prompt never? -

ai.agent: HttpApiSpecification

Closed record

Fields

serviceUrl string? -

tools HttpTool[] -

ai.agent: HttpHeader

Read Only

Fields

string|string[]... - Rest field

ai.agent: HttpTool

Closed record

Fields

name string -

description string -

method HttpMethod -

path string -

queryParams JsonInputSchema? -

pathParams JsonInputSchema? -

requestBody JsonInputSchema? -

ai.agent: NotInputSchema

Closed record

Fields

not JsonSubSchema -

ai.agent: ObjectInputSchema

Closed record

Fields

Fields Included from *BaseInputTypeSchema

type InputType
description string
default json

'type OBJECT(default OBJECT) -

required string[]? -

properties map<JsonSubSchema> -

ai.agent: OneOfInputSchema

Closed record

Fields

oneOf JsonSubSchema[] -

ai.agent: PrimitiveInputSchema

Closed record

Fields

Fields Included from *BaseInputTypeSchema

type InputType
description string
default json

'type STRING|INTEGER|NUMBER|FLOAT|BOOLEAN -

format string? -

pattern string? -

'enum string[]? -

ai.agent: PromptConstruct

Closed record

Fields

instruction string -

query string -

history ExecutionStep[] -

ai.agent: SimpleInputSchema

Fields

'type never? -

string|SimpleInputSchema|SimpleInputSchema[]... - Rest field

ai.agent: Tool

Closed record

Fields

name string -

description string -

inputSchema JsonInputSchema?(default ()) -

caller function() () -

Object types

ai.agent: BaseToolKit

Distinct

Allows implementing custom toolkits by extending this type.

ai.agent: LlmModel

Distinct

Extendable LLM model object that can be used for completion tasks. Useful to initialize the agents.

Union types

ai.agent: JsonInputSchema

JsonInputSchema

ai.agent: JsonSubSchema

JsonInputSchema|PrimitiveInputSchema|ConstantValueSchema

JsonSubSchema

Import

import nadheeshjihan/ai.agent;

Other versions

0.3.1

0.3.0 0.1.0

Metadata

Released date: over 2 years ago

Version: 0.3.1

Compatibility

Platform: any

Ballerina version: 2201.5.0

GraalVM compatible: Yes

Pull count

Total: 23

Current version: 16

Dependencies

ballerina/log/2.7.0 ballerina/http/2.7.0 ballerinax/openai.chat/1.0.4
