Documentation - Chat

Overview

OpenAI offers many ways to use its chat models in software; the main consideration is whether to use a “bulk response” or a “streaming response” in your project.


The most obvious difference is that with a “bulk response” there is a slight delay before your user sees anything from the model, and then the entire response appears all at once, whereas with a “streaming response” the model sends each token (roughly 3-4 characters, including spaces and punctuation) as it is generated. Many users prefer a “streaming response” because it feels both faster and more natural than watching a paragraph or more suddenly appear after a pause.

Side Note: Tool Usage is “Bulk Response” Only

Tool Calls (functions) are currently supported by OpenAI only when using the “bulk response” method. OpenAI says Tool Calls will be added to “streaming responses” in the future, and the “ChatCrafters AI Suite” will be updated as soon as that change is made.

Basic Chat Model Usage

Using a “bulk response” is the easiest way to use the OpenAI Chat models.

Basic Chat Conversation

Using a “bulk response” to create a simple chat bot using OpenAI Chat models.
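A simple chat bot of this kind can be sketched as follows. This is a minimal illustration, assuming the official openai Python SDK (v1+) and an OPENAI_API_KEY environment variable; the model name "gpt-4o-mini" and the system prompt are placeholders, not part of this documentation.

```python
def build_messages(history, user_input,
                   system_prompt="You are a helpful assistant."):
    """Assemble the messages list that the Chat Completions API expects:
    one system message, the prior turns, then the new user message."""
    return [{"role": "system", "content": system_prompt},
            *history,
            {"role": "user", "content": user_input}]

if __name__ == "__main__":
    # Live usage sketch: requires the openai package (v1+) and an
    # OPENAI_API_KEY environment variable.
    from openai import OpenAI
    client = OpenAI()
    history = []
    while True:
        user_input = input("You: ")
        # "Bulk response": the call blocks until the full reply is ready.
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=build_messages(history, user_input),
        )
        reply = response.choices[0].message.content
        # Keep both turns so the model sees the whole conversation next time.
        history += [{"role": "user", "content": user_input},
                    {"role": "assistant", "content": reply}]
        print("Bot:", reply)
```

Because the conversation history is re-sent on every call, the model stays aware of earlier turns without any server-side session state.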

Streaming Chat Conversation

Using a “streaming response” is slightly more involved than a “bulk response”, but mostly because the output needs more handling.

Chat Conversation with Tool Calls

Currently, OpenAI only supports Tool Calls (functions) when using the “bulk response” method.
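A “bulk response” Tool Call round trip can be sketched as below, assuming the openai Python SDK (v1+). The `get_weather` function, its schema, and the model name are hypothetical placeholders for illustration:

```python
import json

def get_weather(city):
    """A toy local function the model can request; replace with real logic."""
    return f"It is sunny in {city}."

TOOL_REGISTRY = {"get_weather": get_weather}

def run_tool_call(name, arguments_json):
    """Dispatch a model-requested tool call to the matching local function.
    The model supplies arguments as a JSON string."""
    args = json.loads(arguments_json)
    return TOOL_REGISTRY[name](**args)

if __name__ == "__main__":
    # Live usage sketch: requires the openai package (v1+) and an
    # OPENAI_API_KEY environment variable.
    from openai import OpenAI
    client = OpenAI()
    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]
    messages = [{"role": "user", "content": "What's the weather in Oslo?"}]
    response = client.chat.completions.create(
        model="gpt-4o-mini", messages=messages, tools=tools)
    msg = response.choices[0].message
    if msg.tool_calls:
        messages.append(msg)
        for call in msg.tool_calls:
            result = run_tool_call(call.function.name,
                                   call.function.arguments)
            # Send each tool result back under the matching tool_call_id.
            messages.append({"role": "tool", "tool_call_id": call.id,
                             "content": result})
        # A second "bulk response" call lets the model turn the tool
        # results into its final answer.
        final = client.chat.completions.create(
            model="gpt-4o-mini", messages=messages)
        print(final.choices[0].message.content)
```

Keeping the dispatch logic in `run_tool_call` separates the local function registry from the API plumbing, which makes adding further tools a one-line change.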

© 2024 Chat Crafters, LLC