
Chat Completion (Chat)

URL

POST https://demo.cdc.datenfab.com/model/api/v1/chat/completions

Request Headers

  • Content-Type: application/json
  • Authorization: Bearer $YOUR_API_KEY

Request Body Parameters

  • model (string, required): The name of the model to use.
  • messages (array, required): An array of message objects, each containing role and content. The role identifies the author of the message (system, user, or assistant); content holds the text of the message.
  • temperature (float, optional): Controls the randomness of the generated text. A value between 0 and 1; default is 0.7. Higher values such as 0.8 make the output more random, while lower values such as 0.2 make it more focused. It is recommended to adjust either this or top_p, but not both at the same time.
  • top_p (float, optional): Nucleus sampling; controls the diversity of token selection. Default is 0.7, maximum is 1. The model considers only the tokens whose cumulative probability reaches top_p, so a value of 0.1 means only the top 10% of tokens by probability are considered. It is recommended to adjust either this or temperature, but not both at the same time.
  • max_tokens (integer, optional): The maximum number of tokens that can be generated in a chat completion. Default is 200; maximum is 4096.
  • presence_penalty (float, optional): Default is 0; values range between -2.0 and 2.0. Positive values penalize new tokens based on whether they already appear in the text so far, encouraging the model to talk about new topics.
  • frequency_penalty (float, optional): Default is 0; values range between -2.0 and 2.0. Positive values penalize tokens in proportion to their existing frequency in the text so far, reducing the chance of repeating the same lines verbatim.
  • stop (string, optional): A sequence at which the API will stop generating further tokens. The returned text will not contain the stop sequence.
  • stream (bool, optional): If set to true, the response is sent back incrementally as a stream; the stream is terminated by a data: [DONE] message (a streaming sketch follows this list).
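
Streaming Example (Sketch)

When stream is set to true, the response arrives as server-sent events. The sketch below, written in Python with the requests package, shows one plausible way to read such a stream; it assumes the chunks follow the common OpenAI-style chat.completion.chunk format with incremental delta fields, which this page does not document explicitly.

import json
import os

import requests

API_KEY = os.environ["YOUR_API_KEY"]  # assumes the key is exported as an environment variable
URL = "https://demo.cdc.datenfab.com/model/api/v1/chat/completions"

payload = {
    "model": "Qwen/Qwen2-7B-Instruct",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": True,
}

with requests.post(
    URL,
    headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
    json=payload,
    stream=True,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        # SSE lines look like "data: {...}"; blank lines are keep-alive separators.
        if not line or not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":  # end-of-stream marker described above
            break
        chunk = json.loads(data)
        # Assumption: each chunk carries an incremental delta for choice 0.
        delta = chunk["choices"][0].get("delta", {})
        print(delta.get("content", ""), end="", flush=True)
print()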

Response

  • Status Code: 200 OK
  • Body:
{
  "id": "cmpl-bd29f845b8d041cbae17771a2d2112b6",
  "object": "chat.completion",
  "created": 1721965175,
  "model": "Qwen/Qwen2-7B-Instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Large models, also known as large-scale pretrained models or ultra-large models.",
        "tool_calls": []
      },
      "logprobs": null,
      "finish_reason": "stop",
      "stop_reason": "\n"
    }
  ],
  "usage": {
    "prompt_tokens": 30,
    "total_tokens": 122,
    "completion_tokens": 92
  }
}
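
The fields a client usually consumes are choices[0].message.content (the assistant's reply) and usage (token accounting). The small Python helper below is an illustrative sketch only (the function name is hypothetical); it pulls both out of a parsed response body:

def summarize(response_json: dict) -> str:
    """Pull the assistant reply and token usage out of a chat.completion body."""
    choice = response_json["choices"][0]
    usage = response_json["usage"]
    return (
        f"{choice['message']['content']}\n"
        f"(finish_reason={choice['finish_reason']}, "
        f"prompt={usage['prompt_tokens']}, completion={usage['completion_tokens']}, "
        f"total={usage['total_tokens']})"
    )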

Example API Call

curl -X POST "https://demo.cdc.datenfab.com/model/api/v1/chat/completions" \
  -H "Authorization: Bearer $YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2-7B-Instruct",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    "temperature": 0.7,
    "top_p": 0.7,
    "max_tokens": 200
  }'
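
Example API Call (Python)

The same request can be issued from Python. This is a sketch using the requests package (an assumption; any HTTP client works), reading the key from the YOUR_API_KEY environment variable:

import os

import requests

resp = requests.post(
    "https://demo.cdc.datenfab.com/model/api/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {os.environ['YOUR_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "model": "Qwen/Qwen2-7B-Instruct",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello!"},
        ],
        "temperature": 0.7,
        "top_p": 0.7,
        "max_tokens": 200,
    },
    timeout=60,
)
resp.raise_for_status()  # raise on non-200 status codes
print(resp.json()["choices"][0]["message"]["content"])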