
API reference

Introduction

You can interact with the API through HTTP requests from any programming language, such as Python, JavaScript, or PHP.

Authentication

The BOMML API uses API bearer tokens for authentication. Visit your API Tokens page to create and retrieve the API token you will use for authenticating your requests.

Please remember to always keep your API tokens secure and secret! Do not share them with unauthorized parties or expose them in client-side code (browsers, apps, websites). Store them securely in your own backend, which makes the requests, and pass your API tokens in securely, for example from environment variables or secret storage.

All API requests must include your API token in an `Authorization` HTTP header as follows:

Authorization: Bearer BOMML_API_TOKEN
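In Python, for example, you can build this header from an environment variable so the token never appears in your source code (a minimal sketch; `BOMML_API_TOKEN` is assumed to be set in your shell or secret storage):

```python
import os

# Read the token from the environment so it is never hard-coded.
token = os.environ.get("BOMML_API_TOKEN", "")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {token}",
}
```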

Making requests

You can paste the command below into your terminal to run your first API request using curl. Make sure to replace `$BOMML_API_TOKEN` with your own secret API token.

curl https://api.bomml.ai/api/v1/completions/chat \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $BOMML_API_TOKEN" \
-d '{
    "model": "meta-llama/Llama-2-70b-chat-hf",
    "messages": [{"role": "user", "content": "Write me a Haiku"}],
    "max_tokens": 256,
    "temperature": 0.6
}'
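The same request can be made from Python using only the standard library. This sketch mirrors the curl command above: it builds the request object with the same URL, headers, and JSON body; the commented-out `urlopen` call is what actually sends it (it requires a valid token and network access):

```python
import json
import os
import urllib.request

# Same payload as in the curl example above.
payload = {
    "model": "meta-llama/Llama-2-70b-chat-hf",
    "messages": [{"role": "user", "content": "Write me a Haiku"}],
    "max_tokens": 256,
    "temperature": 0.6,
}

req = urllib.request.Request(
    "https://api.bomml.ai/api/v1/completions/chat",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer " + os.environ.get("BOMML_API_TOKEN", ""),
    },
    method="POST",
)

# Sending the request (needs a valid token and network access):
# with urllib.request.urlopen(req) as resp:
#     result = json.loads(resp.read())
```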

This API request queries the `meta-llama/Llama-2-70b-chat-hf` model (for more models, please see our models endpoint) to complete or follow the given text prompt/instructions. You should get the following response back:

{
	"id": "cmpl-187c6927307b44f9a4fca738b6071af0",
	"object": "text_completion",
	"created": 1694297703,
	"model": "meta-llama\/Llama-2-70b-chat-hf",
	"choices": [{
		"index": 0,
		"text": "Sure! Here is a haiku:\n\nSnowflakes gently fall\nBlanketing the landscape white\nWinter's peaceful hush",
		"logprobs": null,
		"finish_reason": "stop"
	}],
	"usage": {
		"prompt_tokens": 26,
		"total_tokens": 61,
		"completion_tokens": 35
	}
}
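Once decoded from JSON, the fields you usually want are the generated text of the first choice and the token usage. A small sketch, using a dict shaped like the example response above (in practice this would come from parsing the HTTP response body):

```python
# A response shaped like the example above; in practice this comes from
# json.loads() on the HTTP response body.
response = {
    "id": "cmpl-187c6927307b44f9a4fca738b6071af0",
    "object": "text_completion",
    "created": 1694297703,
    "model": "meta-llama/Llama-2-70b-chat-hf",
    "choices": [{
        "index": 0,
        "text": "Sure! Here is a haiku:\n\nSnowflakes gently fall\nBlanketing the landscape white\nWinter's peaceful hush",
        "logprobs": None,
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 26, "total_tokens": 61, "completion_tokens": 35},
}

# Pull out the generated text and the billed token count.
generated = response["choices"][0]["text"]
total_tokens = response["usage"]["total_tokens"]
```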

Chat

Given a list of messages comprising a conversation between the user and the AI, the model will return a response based on the chain of messages.

The chat completion object

Represents a chat completion response returned by the model, based on the provided input.

id
string
A unique identifier for the chat completion.

object
string
The object type, which is always `text_completion`.

created
integer
The Unix timestamp (in seconds) of when the chat completion was created.

model
string
The model used for the chat completion.

choices
array
A list of chat completion choices.

usage
object
Usage statistics for the chat completion request.
{
	"id": "cmpl-a8624144abe14943a4146b9dfecf2f76",
	"object": "text_completion",
	"created": 1694305075,
	"model": "meta-llama\/Llama-2-70b-chat-hf",
	"choices": [{
		"index": 0,
		"text": "Sun sets slowly down\nGolden hues upon the sea\nPeaceful evening sky",
		"logprobs": null,
		"finish_reason": "stop"
	}],
	"usage": {
		"prompt_tokens": 27,
		"total_tokens": 47,
		"completion_tokens": 20
	}
}

Create chat completion

POST https://api.bomml.ai/api/v1/completions/chat
Creates a model response for the given chat conversation.
Request body

model
string
ID of the model to use. See the model endpoint compatibility table for details on which models work with the Chat API.

messages
array
A list of messages comprising the conversation so far.

temperature
number or null
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

top_p
number or null
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

stream
boolean or null
Whether to stream back partial progress. If set, tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.

stop
string / array / null
Up to 4 sequences where the API will stop generating further tokens.

max_tokens
integer or null
The maximum number of tokens to generate in the chat completion.

presence_penalty
number or null
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

frequency_penalty
number or null
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Returns

Returns a chat completion object, or a streamed sequence of chat completion chunk objects if the request is streamed.
Example request
curl https://api.bomml.ai/api/v1/completions/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $BOMML_API_TOKEN" \
  -d '{
    "model": "meta-llama\/Llama-2-70b-chat-hf",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Write me a Haiku"
      }
    ]
  }'
Response
{
	"id": "cmpl-a8624144abe14943a4146b9dfecf2f76",
	"object": "text_completion",
	"created": 1694305075,
	"model": "meta-llama\/Llama-2-70b-chat-hf",
	"choices": [{
		"index": 0,
		"text": "Sun sets slowly down\nGolden hues upon the sea\nPeaceful evening sky",
		"logprobs": null,
		"finish_reason": "stop"
	}],
	"usage": {
		"prompt_tokens": 27,
		"total_tokens": 47,
		"completion_tokens": 20
	}
}
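When `stream` is set, the response arrives as data-only server-sent events instead of one JSON document. A minimal sketch of consuming such a stream, assuming each event line has the form `data: {json}` and the stream ends with a `data: [DONE]` message, as described above:

```python
import json

def parse_sse_lines(lines):
    """Yield decoded chunk objects from data-only SSE lines,
    stopping at the 'data: [DONE]' terminator."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alive lines, etc.
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        yield json.loads(data)

# Example with a simulated stream (real lines would come from the
# HTTP response body as it arrives):
stream = [
    'data: {"choices": [{"text": "Sun"}]}',
    'data: {"choices": [{"text": " sets"}]}',
    "data: [DONE]",
]
chunks = list(parse_sse_lines(stream))
```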

Completions

Given a prompt, the model will return one or more predicted completions. This can also be used with instructions to execute tasks.

The completion object

Represents a completion response from the API based on the provided input.

id
string
A unique identifier for the completion.

object
string
The object type, which is always `text_completion`.

created
integer
The Unix timestamp (in seconds) of when the completion was created.

model
string
The model used for the completion.

choices
array
The list of completion choices the model generated for the input prompt.

usage
object
Usage statistics for the completion request.
{
	"id": "cmpl-a8624144abe14943a4146b9dfecf2f76",
	"object": "text_completion",
	"created": 1694305075,
	"model": "meta-llama\/Llama-2-70b-chat-hf",
	"choices": [{
		"index": 0,
		"text": "Sun sets slowly down\nGolden hues upon the sea\nPeaceful evening sky",
		"logprobs": null,
		"finish_reason": "stop"
	}],
	"usage": {
		"prompt_tokens": 27,
		"total_tokens": 47,
		"completion_tokens": 20
	}
}

Create completion

POST https://api.bomml.ai/api/v1/completions
Creates a completion for the provided prompt and parameters.
Request body

model
string
ID of the model to use. See the model endpoint compatibility table for details on which models work with the Completions API.

prompt
array
The prompt for the model to complete, or to use as instructions for text generation.

temperature
number or null
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

top_p
number or null
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

stream
boolean or null
Whether to stream back partial progress. If set, tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.

stop
string / array / null
Up to 4 sequences where the API will stop generating further tokens.

max_tokens
integer or null
The maximum number of tokens to generate in the completion.

presence_penalty
number or null
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

frequency_penalty
number or null
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Returns

Returns a completion object, or a streamed sequence of completion chunk objects if the request is streamed.
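A Python equivalent of the curl example below can be sketched with the standard library; the endpoint and fields are taken from this reference, and the commented-out `urlopen` call is what actually sends the request:

```python
import json
import os
import urllib.request

# Build the POST request for the completions endpoint.
payload = {
    "model": "meta-llama/Llama-2-70b-chat-hf",
    "prompt": "Write me a Haiku",
    "max_tokens": 256,
}

req = urllib.request.Request(
    "https://api.bomml.ai/api/v1/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer " + os.environ.get("BOMML_API_TOKEN", ""),
    },
    method="POST",
)

# Sending the request (needs a valid token and network access):
# with urllib.request.urlopen(req) as resp:
#     completion = json.loads(resp.read())
#     print(completion["choices"][0]["text"])
```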
Example request
curl https://api.bomml.ai/api/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $BOMML_API_TOKEN" \
  -d '{
    "model": "meta-llama\/Llama-2-70b-chat-hf",
    "prompt": "Write me a Haiku"
  }'
Response
{
	"id": "cmpl-a8624144abe14943a4146b9dfecf2f76",
	"object": "text_completion",
	"created": 1694305075,
	"model": "meta-llama\/Llama-2-70b-chat-hf",
	"choices": [{
		"index": 0,
		"text": "Sun sets slowly down\nGolden hues upon the sea\nPeaceful evening sky",
		"logprobs": null,
		"finish_reason": "stop"
	}],
	"usage": {
		"prompt_tokens": 27,
		"total_tokens": 47,
		"completion_tokens": 20
	}
}

Models

List and describe the various AI models available through the BOMML API. For additional information, please check the Models documentation to understand what each model is capable of and the differences between them.

The model object

Describes a BOMML AI model that is available via the API.

id
string
The unique model identifier.

name
string
The name of the model.

key
string
The unique model key which can be referenced via API requests.

description
string
A short text description about the model.

created_at
integer
The Unix timestamp (in seconds) when the model was created/added.

updated_at
integer
The Unix timestamp (in seconds) when the model was updated/modified.
{
	"id": 1,
	"name": "Llama 2 70B",
	"key": "meta-llama\/Llama-2-70b-chat-hf",
	"description": null,
	"created_at": null,
	"updated_at": null
}

List models

GET https://api.bomml.ai/api/v1/models
Lists the currently available models and provides basic information about each one, such as its description, supported tasks, etc.
Returns

An array of model objects.
[{
	"id": 1,
	"name": "Llama 2 70B",
	"key": "meta-llama\/Llama-2-70b-chat-hf",
	"description": null,
	"created_at": null,
	"updated_at": null
}, {
	"id": 2,
	"name": "Llama-2-13b-chat-hf",
	"key": "meta-llama\/Llama-2-13b-chat-hf",
	"description": null,
	"created_at": null,
	"updated_at": null
}]
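Since this is a simple authenticated GET, a stdlib-only Python sketch is short; the commented-out `urlopen` call is what actually fetches the list:

```python
import json
import os
import urllib.request

# Build the GET request for the models endpoint.
req = urllib.request.Request(
    "https://api.bomml.ai/api/v1/models",
    headers={
        "Authorization": "Bearer " + os.environ.get("BOMML_API_TOKEN", ""),
    },
)

# Fetching and listing the model keys (needs a valid token and network access):
# with urllib.request.urlopen(req) as resp:
#     models = json.loads(resp.read())
#     keys = [m["key"] for m in models]  # e.g. "meta-llama/Llama-2-70b-chat-hf"
```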