
API reference

Introduction

You can interact with the API through HTTP requests from any programming language, such as Python, JavaScript, or PHP.

Authentication

The BOMML API uses API bearer tokens for authentication. Visit your API Tokens page to create and retrieve the API token you will use for authenticating your requests.

Please remember to always keep your API tokens secure and secret! Do not share them with unauthorized parties or expose them in client-side code (browsers, apps, websites). Store them securely in your own backend, which makes the requests, and pass your API tokens in securely, for example from environment variables or secret storage.

All API requests must include your API token in an `Authorization` HTTP header as follows:

Authorization: Bearer BOMML_API_TOKEN
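In Python, for example, you can build this header from an environment variable so the token never appears in your source code (a minimal sketch; `BOMML_API_TOKEN` is assumed to be set in your shell or secret storage):

```python
import os

# Read the token from the environment so it is never hard-coded.
token = os.environ.get("BOMML_API_TOKEN", "")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {token}",
}
```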

Making requests

You can paste the command below into your terminal to run your first API request using curl. Make sure to replace `$BOMML_API_TOKEN` with your own secret API token.

curl https://api.bomml.ai/api/v1/completions/chat \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $BOMML_API_TOKEN" \
-d '{
    "model": "meta-llama/Llama-2-70b-chat-hf",
    "messages": [{"role": "user", "content": "Write me a Haiku"}],
    "max_tokens": 256,
    "temperature": 0.6
}'
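The same request can be made from Python using only the standard library. This sketch mirrors the curl command above: it builds the request object with the same URL, headers, and JSON body; the commented-out `urlopen` call is what actually sends it (it requires a valid token and network access):

```python
import json
import os
import urllib.request

# Same payload as in the curl example above.
payload = {
    "model": "meta-llama/Llama-2-70b-chat-hf",
    "messages": [{"role": "user", "content": "Write me a Haiku"}],
    "max_tokens": 256,
    "temperature": 0.6,
}

req = urllib.request.Request(
    "https://api.bomml.ai/api/v1/completions/chat",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer " + os.environ.get("BOMML_API_TOKEN", ""),
    },
    method="POST",
)

# Sending the request (needs a valid token and network access):
# with urllib.request.urlopen(req) as resp:
#     result = json.loads(resp.read())
```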

This API request queries the `meta-llama/Llama-2-70b-chat-hf` model (for more models, please see our models endpoint) to complete or follow the given text prompt/instructions. You should get the following response back:

{
	"id": "cmpl-187c6927307b44f9a4fca738b6071af0",
	"object": "text_completion",
	"created": 1694297703,
	"model": "meta-llama\/Llama-2-70b-chat-hf",
	"choices": [{
		"index": 0,
		"text": "Sure! Here is a haiku:\n\nSnowflakes gently fall\nBlanketing the landscape white\nWinter's peaceful hush",
		"logprobs": null,
		"finish_reason": "stop"
	}],
	"usage": {
		"prompt_tokens": 26,
		"total_tokens": 61,
		"completion_tokens": 35
	}
}
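Once decoded from JSON, the fields you usually want are the generated text of the first choice and the token usage. A small sketch, using a dict shaped like the example response above (in practice this would come from parsing the HTTP response body):

```python
# A response shaped like the example above; in practice this comes from
# json.loads() on the HTTP response body.
response = {
    "id": "cmpl-187c6927307b44f9a4fca738b6071af0",
    "object": "text_completion",
    "created": 1694297703,
    "model": "meta-llama/Llama-2-70b-chat-hf",
    "choices": [{
        "index": 0,
        "text": "Sure! Here is a haiku:\n\nSnowflakes gently fall\nBlanketing the landscape white\nWinter's peaceful hush",
        "logprobs": None,
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 26, "total_tokens": 61, "completion_tokens": 35},
}

# Pull out the generated text and the billed token count.
generated = response["choices"][0]["text"]
total_tokens = response["usage"]["total_tokens"]
```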

Chat

Given a list of messages comprising a conversation between the user and the AI, the model will return a response based on the chain of messages.

The chat completion object

Represents a chat completion response returned by the model, based on the provided input.

id
string
A unique identifier for the chat completion.

object
string
The object type, which is always `text_completion`.

created
integer
The Unix timestamp (in seconds) of when the chat completion was created.

model
string
The model used for the chat completion.

choices
array
A list of chat completion choices.

usage
object
Usage statistics for the chat completion request.
{
	"id": "cmpl-a8624144abe14943a4146b9dfecf2f76",
	"object": "text_completion",
	"created": 1694305075,
	"model": "meta-llama\/Llama-2-70b-chat-hf",
	"choices": [{
		"index": 0,
		"text": "Sun sets slowly down\nGolden hues upon the sea\nPeaceful evening sky",
		"logprobs": null,
		"finish_reason": "stop"
	}],
	"usage": {
		"prompt_tokens": 27,
		"total_tokens": 47,
		"completion_tokens": 20
	}
}

Create chat completion

POST https://api.bomml.ai/api/v1/completions/chat
Creates a model response for the given chat conversation.
Request body

model
string
ID of the model to use. See the model endpoint compatibility table for details on which models work with the Chat API.

messages
array
A list of messages comprising the conversation so far.

temperature
number or null
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

top_p
number or null
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

stream
boolean or null
Whether to stream back partial progress. If set, tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.

stop
string / array / null
Up to 4 sequences where the API will stop generating further tokens.

max_tokens
integer or null
The maximum number of tokens to generate in the chat completion.

presence_penalty
number or null
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

frequency_penalty
number or null
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Returns

Returns a chat completion object, or a streamed sequence of chat completion chunk objects if the request is streamed.
Example request
curl https://api.bomml.ai/api/v1/completions/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $BOMML_API_TOKEN" \
  -d '{
    "model": "meta-llama\/Llama-2-70b-chat-hf",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Write me a Haiku"
      }
    ]
  }'
Response
{
	"id": "cmpl-a8624144abe14943a4146b9dfecf2f76",
	"object": "text_completion",
	"created": 1694305075,
	"model": "meta-llama\/Llama-2-70b-chat-hf",
	"choices": [{
		"index": 0,
		"text": "Sun sets slowly down\nGolden hues upon the sea\nPeaceful evening sky",
		"logprobs": null,
		"finish_reason": "stop"
	}],
	"usage": {
		"prompt_tokens": 27,
		"total_tokens": 47,
		"completion_tokens": 20
	}
}
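When `stream` is set, the response arrives as data-only server-sent events instead of one JSON document. A minimal sketch of consuming such a stream, assuming each event line has the form `data: {json}` and the stream ends with a `data: [DONE]` message, as described above:

```python
import json

def parse_sse_lines(lines):
    """Yield decoded chunk objects from data-only SSE lines,
    stopping at the 'data: [DONE]' terminator."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alive lines, etc.
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        yield json.loads(data)

# Example with a simulated stream (real lines would come from the
# HTTP response body as it arrives):
stream = [
    'data: {"choices": [{"text": "Sun"}]}',
    'data: {"choices": [{"text": " sets"}]}',
    "data: [DONE]",
]
chunks = list(parse_sse_lines(stream))
```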

Completions

Given a prompt, the model will return one or more predicted completions. This can also be used with instructions to execute tasks.

The completion object

Represents a completion response from the API based on the provided input.

id
string
A unique identifier for the completion.

object
string
The object type, which is always `text_completion`.

created
integer
The Unix timestamp (in seconds) of when the completion was created.

model
string
The model used for the completion.

choices
array
The list of completion choices the model generated for the input prompt.

usage
object
Usage statistics for the completion request.
{
	"id": "cmpl-a8624144abe14943a4146b9dfecf2f76",
	"object": "text_completion",
	"created": 1694305075,
	"model": "meta-llama\/Llama-2-70b-chat-hf",
	"choices": [{
		"index": 0,
		"text": "Sun sets slowly down\nGolden hues upon the sea\nPeaceful evening sky",
		"logprobs": null,
		"finish_reason": "stop"
	}],
	"usage": {
		"prompt_tokens": 27,
		"total_tokens": 47,
		"completion_tokens": 20
	}
}

Create completion

POST https://api.bomml.ai/api/v1/completions
Creates a completion for the provided prompt and parameters.
Request body

model
string
ID of the model to use. See the model endpoint compatibility table for details on which models work with the Completions API.

prompt
array
The prompt for the model to complete, or to use as instructions for text generation.

temperature
number or null
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

top_p
number or null
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

stream
boolean or null
Whether to stream back partial progress. If set, tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.

stop
string / array / null
Up to 4 sequences where the API will stop generating further tokens.

max_tokens
integer or null
The maximum number of tokens to generate in the completion.

presence_penalty
number or null
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

frequency_penalty
number or null
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Returns

Returns a completion object, or a streamed sequence of completion chunk objects if the request is streamed.
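A Python equivalent of the curl example below can be sketched with the standard library; the endpoint and fields are taken from this reference, and the commented-out `urlopen` call is what actually sends the request:

```python
import json
import os
import urllib.request

# Build the POST request for the completions endpoint.
payload = {
    "model": "meta-llama/Llama-2-70b-chat-hf",
    "prompt": "Write me a Haiku",
    "max_tokens": 256,
}

req = urllib.request.Request(
    "https://api.bomml.ai/api/v1/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer " + os.environ.get("BOMML_API_TOKEN", ""),
    },
    method="POST",
)

# Sending the request (needs a valid token and network access):
# with urllib.request.urlopen(req) as resp:
#     completion = json.loads(resp.read())
#     print(completion["choices"][0]["text"])
```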
Example request
curl https://api.bomml.ai/api/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $BOMML_API_TOKEN" \
  -d '{
    "model": "meta-llama\/Llama-2-70b-chat-hf",
    "prompt": "Write me a Haiku"
  }'
Response
{
	"id": "cmpl-a8624144abe14943a4146b9dfecf2f76",
	"object": "text_completion",
	"created": 1694305075,
	"model": "meta-llama\/Llama-2-70b-chat-hf",
	"choices": [{
		"index": 0,
		"text": "Sun sets slowly down\nGolden hues upon the sea\nPeaceful evening sky",
		"logprobs": null,
		"finish_reason": "stop"
	}],
	"usage": {
		"prompt_tokens": 27,
		"total_tokens": 47,
		"completion_tokens": 20
	}
}

Models

List and describe the various AI models available through the BOMML API. For additional information, please check the Models documentation to understand what each model is capable of and the differences between them.

The model object

Describes a BOMML AI model that is available via the API.

id
string
The unique model identifier.

name
string
The name of the model.

key
string
The unique model key which can be referenced via API requests.

description
string
A short text description about the model.

created_at
integer
The Unix timestamp (in seconds) when the model was created/added.

updated_at
integer
The Unix timestamp (in seconds) when the model was updated/modified.
{
	"id": 1,
	"name": "Llama 2 70B",
	"key": "meta-llama\/Llama-2-70b-chat-hf",
	"description": null,
	"created_at": null,
	"updated_at": null
}

List models

GET https://api.bomml.ai/api/v1/models
Lists the currently available models and provides basic information about each one, such as its description, supported tasks, etc.
Returns

An array of model objects.
[{
	"id": 1,
	"name": "Llama 2 70B",
	"key": "meta-llama\/Llama-2-70b-chat-hf",
	"description": null,
	"created_at": null,
	"updated_at": null
}, {
	"id": 2,
	"name": "Llama-2-13b-chat-hf",
	"key": "meta-llama\/Llama-2-13b-chat-hf",
	"description": null,
	"created_at": null,
	"updated_at": null
}]
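Since this is a simple authenticated GET, a stdlib-only Python sketch is short; the commented-out `urlopen` call is what actually fetches the list:

```python
import json
import os
import urllib.request

# Build the GET request for the models endpoint.
req = urllib.request.Request(
    "https://api.bomml.ai/api/v1/models",
    headers={
        "Authorization": "Bearer " + os.environ.get("BOMML_API_TOKEN", ""),
    },
)

# Fetching and listing the model keys (needs a valid token and network access):
# with urllib.request.urlopen(req) as resp:
#     models = json.loads(resp.read())
#     keys = [m["key"] for m in models]  # e.g. "meta-llama/Llama-2-70b-chat-hf"
```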