A Simple Guide to the Most Powerful Pre-Built LLM Callback Handlers

Rupak (Bob) Roy - II
5 min read · Aug 14, 2024


Learn About Callback Handlers Like StdOutCallbackHandler: Key Events, Common Use Cases, and How to Implement Callbacks in LLMs.

Hi everyone, today we will look into LLM callbacks, one of the powerful built-in functions we can use to handle complex use cases.

So what are LLM Callbacks?

LLM Callbacks refer to functions or mechanisms that are triggered at specific points during the processing of a prompt or the generation of a response. They allow developers to intervene, log, or modify the behavior of the model at various stages of the interaction. Callbacks are particularly useful for monitoring, debugging, and customizing the model’s output.

How Do Callbacks Work?
Triggered Events: Callbacks are tied to specific events or stages in the processing workflow, such as before the prompt is sent to the model, after the model generates a response, or when an error occurs.
Custom Functions: Developers can define custom functions that are executed when these events are triggered. These functions can perform various tasks, such as logging information, modifying the input or output, or even halting the process, as sketched below.
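
To make this concrete, here is a minimal sketch of a custom handler that hooks into those triggered events (the class name EventLogger and the print statements are purely illustrative, not part of any library):

from langchain.callbacks.base import BaseCallbackHandler

class EventLogger(BaseCallbackHandler):
    """Illustrative handler that reacts to the main LLM events."""

    def on_llm_start(self, serialized, prompts, **kwargs):
        # Fires just before the prompt is sent to the model
        print("Prompt sent:", prompts)

    def on_llm_end(self, response, **kwargs):
        # Fires after the model has generated its response
        print("Response received:", response.generations[0][0].text)

    def on_llm_error(self, error, **kwargs):
        # Fires when the call raises an exception
        print("Error:", error)

We will attach handlers like this through the callbacks argument in the examples that follow.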

Common Use Cases
Logging: Track the inputs, outputs, and intermediate states to understand how the model is performing or to keep a record of interactions.

Example: Log every response generated by the model for later analysis.
Error Handling: Automatically handle errors or exceptions that occur during processing.

Example: If the model fails to generate a response, a callback could retry the request or provide a default response.
Input/Output Modification: Adjust the prompt or the model’s response before it is finalized.

Example: Automatically append additional context to the prompt before it is sent to the model.
Monitoring: Keep track of performance metrics, such as response time or token usage (see the sketch after this list).

Example: Measure how many tokens the model uses for each response and log this data.
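
Here is a hedged sketch of that monitoring idea, a handler that times each LLM call (the LatencyMonitor name and the simple time.time() timing are assumptions for illustration):

import time
from langchain.callbacks.base import BaseCallbackHandler

class LatencyMonitor(BaseCallbackHandler):
    """Illustrative handler that records how long each LLM call takes."""

    def __init__(self):
        self.latencies = []
        self._start = None

    def on_llm_start(self, serialized, prompts, **kwargs):
        # Remember when the request left for the model
        self._start = time.time()

    def on_llm_end(self, response, **kwargs):
        # Record the elapsed time for this call
        if self._start is not None:
            self.latencies.append(time.time() - self._start)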

Example: Suppose you are using a language model to generate customer support responses. A callback could be set up to check if the generated response contains certain keywords (like “refund” or “escalate”) and automatically flag these for review by a human agent.
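
A hedged sketch of that customer-support idea (the keyword list and the flagging action are placeholders; in practice you would push flagged responses to a review queue):

from langchain.callbacks.base import BaseCallbackHandler

REVIEW_KEYWORDS = {"refund", "escalate"}  # placeholder review triggers

class ReviewFlagHandler(BaseCallbackHandler):
    """Illustrative handler that flags responses containing review keywords."""

    def on_llm_end(self, response, **kwargs):
        text = response.generations[0][0].text.lower()
        if any(keyword in text for keyword in REVIEW_KEYWORDS):
            # A real system might notify a human agent here
            print("Flagged for human review:", text)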

We will initialize the LLM using Hugging Face model API calls, which provide better token limits than OpenAI.

First, log in to Hugging Face and generate an API key (Access Token).
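
You can pass the token directly to the endpoint (as done below), or export it as an environment variable; a minimal sketch assuming the standard HUGGINGFACEHUB_API_TOKEN variable:

import os

# Access Token generated from your Hugging Face account settings
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "hf_yourkey"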

#######################################################
# Set up the LLM Environment
#######################################################
from langchain_community.llms.huggingface_endpoint import HuggingFaceEndpoint
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.text_splitter import CharacterTextSplitter
from langchain.chains.mapreduce import MapReduceChain

##################################################
# Model API call
repo_id = "mistralai/Mistral-7B-Instruct-v0.2"
llm = HuggingFaceEndpoint(
    repo_id=repo_id,
    max_length=128,
    temperature=0.5,
    huggingfacehub_api_token="hf_yourkey")

# StdOutCallbackHandler: simply logs all events to stdout


from langchain.callbacks import StdOutCallbackHandler
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

prompt_template = PromptTemplate(
    input_variables=["input"],
    template="Tell me a joke about {input}")
chain = LLMChain(llm=llm, prompt=prompt_template)

handler = StdOutCallbackHandler()
#The most basic handler is the StdOutCallbackHandler, which simply logs all events to stdout
chain.invoke(input="icecream", callbacks=[handler])

#Example 2: creating a custom callback handler


#-------------------------------------------
#Example 2 ---------------------------------
#-------------------------------------------
from langchain.callbacks.base import BaseCallbackHandler

class MyCustomHandler(BaseCallbackHandler):
    def on_llm_end(self, response, **kwargs) -> None:
        print("RESPONSE:", response)

chain.invoke(input="icecream", callbacks=[MyCustomHandler()])

#Example 3:


from langchain_core.callbacks import StdOutCallbackHandler
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate

handler = StdOutCallbackHandler()
# llm already initialized above
prompt = PromptTemplate.from_template("1 + {number} = ")

# Constructor callback: First, let's explicitly set the StdOutCallbackHandler when initializing our chain
chain = LLMChain(llm=llm, prompt=prompt, callbacks=[handler])
chain.invoke({"number":2})

# Use verbose flag: Then, let's use the `verbose` flag to achieve the same result
chain = LLMChain(llm=llm, prompt=prompt, verbose=True)
chain.invoke({"number":2})

# Request callbacks: Finally, let's use the request `callbacks` to achieve the same result as verbose = True
chain = LLMChain(llm=llm, prompt=prompt)
chain.invoke({"number":2}, {"callbacks":[handler]})

#Example 4: LangChain has a convenience context manager function to make it easy to track costs, token usage, etc.
We have already seen this commonly used to track cost.

from langchain.callbacks import get_openai_callback
with get_openai_callback() as cb:
    chain.invoke({"number": 2})

print(cb.total_cost)
print(cb.total_tokens)

# you might get 0 for total cost and tokens because we are using the Hugging Face API, not OpenAI
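
If you still want rough token counts for a Hugging Face model, one option is a small custom handler; a hedged sketch, assuming a simple whitespace split (a real count would use the model's tokenizer):

from langchain.callbacks.base import BaseCallbackHandler

class TokenCountHandler(BaseCallbackHandler):
    """Illustrative handler keeping a rough running count of completion tokens."""

    def __init__(self):
        self.completion_tokens = 0

    def on_llm_end(self, response, **kwargs):
        for generation_list in response.generations:
            for generation in generation_list:
                # Whitespace split is only an approximation of real tokenization
                self.completion_tokens += len(generation.text.split())

counter = TokenCountHandler()
chain.invoke({"number": 2}, {"callbacks": [counter]})
print("Approximate completion tokens:", counter.completion_tokens)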

Where to Pass Callbacks?

  1. Constructor callbacks: defined in the constructor, e.g. LLMChain(callbacks=[handler], tags=['a-tag'])
  2. Request callbacks: defined in the 'invoke' method used for issuing a request. In this case, the callbacks are used for that specific request only. With this method, callbacks are passed through the config parameter:
handler = StdOutCallbackHandler()
# llm already initialized above
prompt = PromptTemplate.from_template("1 + {number} = ")
config = {
    "callbacks": [handler]
}
chain = prompt | llm
chain.invoke({"number": 2}, config=config)

That's it! I hope you find these base callback templates useful for real-time feedback on the model's output, with enhanced debugging, efficiency, and customization.

Next, we will go through LLM evaluation using prompting, such as LLM scores, memory scores, retriever scores, BLEU, ROUGE score, and more.

Thanks for your time. If you enjoyed this short article, there are tons of topics in advanced analytics, data science, and machine learning available in my Medium repo. https://medium.com/@bobrupakroy

Some of my alternative internet presences are Facebook, Instagram, Udemy, Blogger, Issuu, Slideshare, Scribd, and more.

Also available on Quora @ https://www.quora.com/profile/Rupak-Bob-Roy

Let me know if you need anything. Talk Soon.

Check out the links; I hope they help.

udemy: https://www.udemy.com/user/rupak-roy-2/
try out this place!


Rupak (Bob) Roy - II

Things I write about frequently on Medium: Data Science, Machine Learning, Deep Learning, NLP, and many other random topics of interest. ~ Let's stay connected!