OpenAI has recently released new features that showcase an agent-like architecture, such as the Assistant API. According to OpenAI:
The Assistants API allows you to build AI assistants within your own applications. An Assistant has instructions and can leverage models, tools, and files to respond to user queries. The Assistants API currently supports three types of tools: Code Interpreter, File Search, and Function calling.
While these developments are promising, they still lag behind LangChain, which enables the creation of agent-like systems powered by LLMs with greater flexibility in processing natural language input and executing context-based actions.
However, this is only the beginning.
At a high level, interaction with the Assistant API can be envisioned as a loop:
- Given a user input, an LLM is called to determine whether to provide a response or take specific actions.
- If the LLM's answer suffices to respond to the query, the loop ends.
- If an action results in a new observation, this observation is included in the prompt, and the LLM is called again.
- The loop then restarts.
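The loop above can be sketched in plain Python. Note that the llm function, the action set, and the tuple protocol below are hypothetical stand-ins for illustration, not part of any API:

```python
def agent_loop(user_input, llm, actions, max_turns=5):
    """Minimal sketch of the observe-act loop described above.

    `llm` stands in for the model call: given the prompt, it returns either
    ("answer", text) or ("action", action_name, args).
    `actions` maps action names to plain Python functions.
    """
    prompt = user_input
    for _ in range(max_turns):
        decision = llm(prompt)
        if decision[0] == "answer":      # the LLM's answer suffices: stop
            return decision[1]
        _, name, args = decision         # otherwise, run the chosen action
        observation = actions[name](*args)
        # Fold the new observation into the prompt and call the LLM again
        prompt = f"{prompt}\nObservation: {observation}"
    return "Max turns reached"

# Toy run: the fake LLM asks for an action, then answers once it sees an observation
def fake_llm(prompt):
    if "Observation" in prompt:
        return ("answer", "The tax is 6000.0")
    return ("action", "calculate_tax", ("50000",))

print(agent_loop("Tax on 50000?", fake_llm, {"calculate_tax": lambda r: float(r) * 0.12}))
# The tax is 6000.0
```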
Unfortunately, despite the announced advantages, I found the documentation for the API to be poorly done, especially regarding interactions with custom function calls and building apps using frameworks like Streamlit.
In this blog post, I'll guide you through building an AI assistant using the OpenAI Assistant API with custom function calls, paired with a Streamlit interface, to help those interested in using the Assistant API effectively.
I'll demonstrate a simple example: an AI assistant capable of calculating tax based on a given revenue. LangChain users might naturally think of implementing this by creating an agent with a "tax computation" tool.
This tool would include the necessary computation steps and a well-designed prompt to ensure the LLM knows when to call the tool whenever a question involves revenue or tax.
However, this process isn't exactly the same with the OpenAI Assistant API. While the code interpreter and file search tools can be used directly in a straightforward manner according to OpenAI's documentation, custom tools require a slightly different approach.
assistant = client.beta.assistants.create(
    name="Data visualizer",
    description="You are great at creating beautiful data visualizations. You analyze data present in .csv files, understand trends, and come up with data visualizations relevant to those trends. You also share a brief text summary of the trends observed.",
    model="gpt-4o",
    tools=[{"type": "code_interpreter"}],
)
Let's break it down step by step. We aim to:
- Define a function that computes tax based on a given revenue.
- Develop a tool using this function.
- Create an assistant that can access this tool and call it whenever tax computation is required.
Please note that the tax computation tool described in the following paragraphs is designed as a toy example to demonstrate how to use the API discussed in this post. It should not be used for actual tax calculations.
Consider the following piecewise function, which returns the tax value for a given revenue. Note that the input is set as a string for simpler parsing:
def calculate_tax(revenue: str):
    try:
        revenue = float(revenue)
    except ValueError:
        raise ValueError("The revenue should be a string representation of a number.")

    if revenue <= 10000:
        tax = 0
    elif revenue <= 30000:
        tax = 0.10 * (revenue - 10000)
    elif revenue <= 70000:
        tax = 2000 + 0.20 * (revenue - 30000)
    elif revenue <= 150000:
        tax = 10000 + 0.30 * (revenue - 70000)
    else:
        tax = 34000 + 0.40 * (revenue - 150000)
    return tax
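A few spot checks of the bracket arithmetic, computed by hand with the same formulas: for example, a revenue of 50,000 lands in the third bracket, giving 2000 + 0.20 × (50,000 − 30,000) = 6,000.

```python
# Spot-check the bracket formulas by hand (same arithmetic as calculate_tax):
assert round(0.10 * (20000 - 10000), 2) == 1000.0             # 20,000 → second bracket
assert round(2000 + 0.20 * (50000 - 30000), 2) == 6000.0      # 50,000 → third bracket
assert round(34000 + 0.40 * (200000 - 150000), 2) == 54000.0  # 200,000 → top bracket
print("bracket arithmetic checks out")
```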
Next, we define the tool schema and the assistant:
function_tools = [
    {
        "type": "function",
        "function": {
            "name": "calculate_tax",
            "description": "Get the tax for given revenue in euro",
            "parameters": {
                "type": "object",
                "properties": {
                    "revenue": {
                        "type": "string",
                        "description": "Annual revenue in euro"
                    }
                },
                "required": ["revenue"]
            }
        }
    }
]
# Define the assistant
assistant = client.beta.assistants.create(
    name="Assistant",
    instructions="",
    tools=function_tools,
    model="gpt-4o",
)
Now, the essential point:
How does the assistant use the function when "calculate_tax" is called? This part is poorly documented by OpenAI, and many users might get confused the first time they use it. To address this, we need to define an EventHandler to manage the different events in the response stream, in particular the event emitted when the "calculate_tax" tool is called.
# Note: requires "import ast" at the top of the file
def handle_requires_action(self, data, run_id):
    tool_outputs = []
    for tool in data.required_action.submit_tool_outputs.tool_calls:
        if tool.function.name == "calculate_tax":
            try:
                # Extract revenue from tool parameters
                revenue = ast.literal_eval(tool.function.arguments)["revenue"]
                # Call your calculate_tax function to get the tax
                tax_result = calculate_tax(revenue)
                # Append tool output in the required format
                tool_outputs.append({"tool_call_id": tool.id, "output": f"{tax_result}"})
            except ValueError as e:
                # Handle any errors when calculating tax
                tool_outputs.append({"tool_call_id": tool.id, "error": str(e)})
    # Submit all tool_outputs at the same time
    self.submit_tool_outputs(tool_outputs)
The code above works as follows. For each tool call that requires action:
- Check whether the function name is "calculate_tax".
- Extract the revenue value from the tool parameters.
- Call the calculate_tax function with the revenue to compute the tax. (This is where the real interaction happens.)
- After processing all tool calls, submit the collected results.
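To show where handle_requires_action plugs in: in practice it lives in a subclass of the SDK's AssistantEventHandler, whose on_event callback fires when the run pauses with a thread.run.requires_action event. The sketch below stubs the SDK objects with SimpleNamespace so the dispatch logic is visible and runnable on its own; the structure of the stubbed event mirrors my reading of the streaming events, and the real class would inherit from openai.AssistantEventHandler and submit the outputs through the SDK.

```python
import ast
from types import SimpleNamespace

class EventHandler:
    """Dispatch sketch; the real class subclasses openai.AssistantEventHandler."""

    def __init__(self):
        self.submitted = []  # stand-in for the SDK's submit-tool-outputs call

    def on_event(self, event):
        # The run pauses with this event when the model wants a custom tool run
        if event.event == "thread.run.requires_action":
            self.handle_requires_action(event.data, event.data.id)

    def handle_requires_action(self, data, run_id):
        tool_outputs = []
        for tool in data.required_action.submit_tool_outputs.tool_calls:
            if tool.function.name == "calculate_tax":
                # Parse the JSON-ish arguments string, as in the handler above
                revenue = ast.literal_eval(tool.function.arguments)["revenue"]
                # In the real app this would be calculate_tax(revenue)
                tool_outputs.append({"tool_call_id": tool.id, "output": revenue})
        self.submitted = tool_outputs  # real code: submit via the SDK

# Simulate the event the stream would deliver:
call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="calculate_tax", arguments='{"revenue": "50000"}'),
)
event = SimpleNamespace(
    event="thread.run.requires_action",
    data=SimpleNamespace(
        id="run_1",
        required_action=SimpleNamespace(
            submit_tool_outputs=SimpleNamespace(tool_calls=[call])
        ),
    ),
)
handler = EventHandler()
handler.on_event(event)
print(handler.submitted)  # [{'tool_call_id': 'call_1', 'output': '50000'}]
```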
You can now interact with the assistant following these standard steps documented by OpenAI (for that reason, I won't go into much detail in this section):
- Create a thread: This represents a conversation between a user and the assistant.
- Add user messages: These can include both text and files, which are added to the thread.
- Create a run: Use the model and tools associated with the assistant to generate a response. This response is then added back to the thread.
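For completeness, the first two steps look roughly like this (following OpenAI's documented calls; the message text is my example, and client is the OpenAI client created earlier):

```python
# Create a thread: one conversation between the user and the assistant
thread = client.beta.threads.create()

# Add a user message to the thread
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What is the tax for a revenue of 50,000 euros?",
)

# The run itself is created against the assistant; the streaming variant
# is shown in the next snippet.
```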
The code snippet below demonstrates how to run the assistant in my specific use case: it sets up a streaming interaction with the assistant using specific parameters, including a thread ID and an assistant ID. An EventHandler instance manages events during the stream. The stream.until_done() method keeps the stream active until all interactions are complete, and the with statement ensures that the stream is properly closed afterward.
with client.beta.threads.runs.stream(
    thread_id=st.session_state.thread_id,
    assistant_id=assistant.id,
    event_handler=EventHandler(),
    temperature=0,
) as stream:
    stream.until_done()
While my post could end here, I've seen numerous inquiries on the Streamlit forum (like this one) where users struggle to get streaming to work in the interface, even though it functions perfectly in the terminal. This prompted me to dig deeper.
To successfully integrate streaming into your app, you need to extend the functionality of the EventHandler class mentioned earlier, specifically to handle text creation, text deltas, and text completion. Here are the three key steps required to display text in the Streamlit interface while managing chat history:
- Handling text creation (on_text_created): Initiates and displays a new text box for each response from the assistant, updating the UI to reflect the status of previous actions.
- Handling text deltas (on_text_delta): Dynamically updates the current text box as the assistant generates text, enabling incremental modifications without refreshing the entire UI.
- Handling text completion (on_text_done): Finalizes each interaction phase by adding a new empty text box, preparing for the next interaction. Additionally, it records completed conversation segments in chat_history.
As an illustration, consider the following code snippet for managing text deltas:
def on_text_delta(self, delta: TextDelta, snapshot: Text):
    """
    Handler for when a text delta is created
    """
    # Clear the latest text box
    st.session_state.text_boxes[-1].empty()

    # If there is new text, append it to the latest element in the assistant text list
    if delta.value:
        st.session_state.assistant_text[-1] += delta.value

    # Re-display the updated assistant text in the latest text box
    st.session_state.text_boxes[-1].info("".join(st.session_state["assistant_text"][-1]))
This code accomplishes three main tasks:
- Clearing the latest text box: Empties the content of the latest text box (st.session_state.text_boxes[-1]) to prepare it for new input.
- Appending the delta value to the assistant text: If new text (delta.value) is present, it is appended to the ongoing assistant text stored in st.session_state.assistant_text[-1].
- Re-displaying the updated assistant text: Updates the content of the latest text box to reflect the combined content of all assistant text gathered so far (st.session_state["assistant_text"][-1]).
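For the other two handlers, one possible shape, based on the responsibilities listed above, is sketched below. The session-state keys match the delta handler; the details are assumptions on my part, not the exact implementation:

```python
def on_text_created(self, text: Text):
    """Create a fresh text box when the assistant starts a new response."""
    st.session_state.text_boxes.append(st.empty())
    st.session_state.assistant_text.append("")

def on_text_done(self, text: Text):
    """Archive the finished segment and prepare a box for the next one."""
    st.session_state.chat_history.append(
        {"role": "assistant", "content": st.session_state.assistant_text[-1]}
    )
    st.session_state.text_boxes.append(st.empty())
    st.session_state.assistant_text.append("")
```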
This blog post demonstrated how to use the OpenAI Assistant API and Streamlit to build an AI assistant capable of calculating tax.
I did this simple project to highlight the capabilities of the Assistant API despite its less-than-clear documentation. My goal was to clear up ambiguities and provide some guidance for those interested in using the Assistant API. I hope this post has been helpful and encourages you to explore further possibilities with this powerful tool.
Due to space constraints, I've tried to avoid including unnecessary code snippets. If needed, please visit my GitHub repository to view the complete implementation.