Harness LLM Output Parsers for Structured AI
Unlock the full guide to easy setup of output parsers: from CommaSeparatedList to Pydantic, JSON, and more.
Hi everyone, today we will look into some powerful pre-built LangChain functions that help us format LLM outputs/results. We will call them LLM output parsers.
Here are the most commonly used ones:
- CommaSeparatedListOutputParser
- StructuredOutputParser / ResponseSchema
- JsonOutputParser: with & without Pydantic
- PydanticOutputParser
- DatetimeOutputParser
Let’s get started.
As in our previous articles, we will be using Hugging Face model API calls, which provide better token limits than OpenAI.
First, log in to Hugging Face and generate an API key (Access Token).
#######################################################
# Set up the LLM environment
#######################################################
from langchain_community.llms.huggingface_endpoint import HuggingFaceEndpoint
from langchain.prompts import PromptTemplate
##################################################
# Model API call
repo_id = "mistralai/Mistral-7B-Instruct-v0.2"
llm = HuggingFaceEndpoint(
    repo_id=repo_id,
    max_new_tokens=128,   # cap the response length
    temperature=0.5,
    huggingfacehub_api_token="hf_yourkey"  # replace with your own Access Token
)
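A quick aside: rather than hardcoding the key, you can export it as an environment variable, which HuggingFaceEndpoint picks up automatically. A minimal sketch, assuming the token is stored in HUGGINGFACEHUB_API_TOKEN:
import os
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "hf_yourkey"  # or set it in your shell/.env file
llm = HuggingFaceEndpoint(
    repo_id=repo_id,
    max_new_tokens=128,
    temperature=0.5
)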
1. CommaSeparatedListOutputParser
Parses the output of an LLM call into a comma-separated list.
#############################################
# CommaSeparatedListOutputParser ############
#############################################
from langchain.output_parsers import CommaSeparatedListOutputParser
output_parser = CommaSeparatedListOutputParser()
# view the format instructions the parser will inject into the prompt
format_instructions = output_parser.get_format_instructions()
print(format_instructions)
prompt = PromptTemplate(
    template="Provide 5 examples of {query}.\n{format_instructions}",
    input_variables=["query"],
    partial_variables={"format_instructions": format_instructions}
)
# reuse the Mistral endpoint defined above
formatted_prompt = prompt.format(query="Currencies")
output = llm.invoke(formatted_prompt)
print(output)
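Note that output is still one raw string. To turn it into an actual Python list, run it back through the parser:
parsed = output_parser.parse(output)
print(parsed)  # a Python list, e.g. ['Dollar', 'Euro', ...]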
# another approach using LCEL ------------
output_parser = CommaSeparatedListOutputParser()
prompt = PromptTemplate(
    template="Provide 5 examples of {query}.\n{format_instructions}",
    input_variables=["query"],
    partial_variables={"format_instructions": format_instructions}
)
chain = prompt | llm | output_parser
results = chain.invoke({"query": "chocolates"})
print(results)
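Because the parser is the last link in the chain, results is already a Python list of strings here; no separate parse() call is needed.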
2. JSON format using StructuredOutputParser and ResponseSchema
StructuredOutputParser can be used when you want the model to return multiple named fields. While the Pydantic/JSON parsers are more powerful, this one is useful for less capable models.
ResponseSchema defines the schema for one field of a structured response.
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
response_schemas = [
    ResponseSchema(name="currency", description="answer to the user's question"),
    ResponseSchema(name="abbreviation", description="what is the abbreviation of that currency")
]
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
print(output_parser)
format_instructions = output_parser.get_format_instructions()
print(format_instructions)
prompt = PromptTemplate(
    template="answer the user's question as best as possible.\n{format_instructions}\n{query}",
    input_variables=["query"],
    partial_variables={"format_instructions": format_instructions}
)
#prompt = prompt.format(query="list me the currencies of europe")
output = llm.invoke(prompt.format(query="list me the currencies of europe"))
print(output)
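Assuming the model followed the format instructions, the raw output is a markdown-fenced JSON snippet. The parser's parse() method converts it into a Python dict keyed by the ResponseSchema names:
parsed = output_parser.parse(output)
print(parsed["currency"])
print(parsed["abbreviation"])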
# scenario with a query the schema was not designed for
output = llm.invoke(prompt.format(query="list me the chocolates?"))
print(output)
If you wish to use this for chocolates, change the ResponseSchema entries accordingly:
ResponseSchema(name="chocolates", description="answer to the user's question"),
ResponseSchema(name="abbreviation", description="what is the abbreviation of that chocolate")
3. JsonOutputParser: with & without Pydantic
Parses the output of an LLM call into a JSON object.
# with Pydantic ----------------------------
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field
# Define your desired data structure.
class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")
# And a query intended to prompt a language model to populate the data structure.
joke_query = "Tell me a joke."
# Set up a parser + inject instructions into the prompt template.
parser = JsonOutputParser(pydantic_object=Joke)
prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)
chain = prompt | llm | parser
chain.invoke({"query": joke_query})
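Note that even with pydantic_object set, JsonOutputParser returns a plain Python dict rather than a Joke instance, so the fields can be indexed directly:
result = chain.invoke({"query": joke_query})
print(result["setup"])
print(result["punchline"])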
# without Pydantic --------------------------
joke_query = "Tell me a joke."
parser = JsonOutputParser()  # no pydantic_object this time
prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)
chain = prompt | llm | parser
chain.invoke({"query": joke_query})
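A nice property of JsonOutputParser in an LCEL chain is that it can stream partial JSON objects while the model is still generating. A minimal sketch (depending on whether the endpoint streams tokens, you may see many partial dicts or just the final one):
for chunk in chain.stream({"query": joke_query}):
    print(chunk)  # progressively more complete dicts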
4. PydanticOutputParser
Parses an output using a pydantic model.
#######################################################
# PydanticOutputParser ################################
#######################################################
from langchain.prompts import ChatPromptTemplate
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field, validator
from typing import List
# reuse the Mistral endpoint defined above
# Define your desired data structure.
class BrandInfo(BaseModel):
    brand_name: str = Field(description="This is the name of the brand")
    reasoning: str = Field(description="These are the reasons for the score")
    likelihood_of_success: int = Field(description="This is an integer score between 1-10")

    # You can easily add custom validation logic with Pydantic.
    @validator('likelihood_of_success')
    def check_score(cls, field):
        if field > 10:
            raise ValueError("Badly formed score")
        return field
# Set up a parser + inject instructions into the prompt template.
pydantic_parser = PydanticOutputParser(pydantic_object=BrandInfo)
format_instructions = pydantic_parser.get_format_instructions()
template_string = """You are a master branding consultant who specializes in naming brands. \
You come up with catchy and memorable brand names.
Take the brand description below, delimited by triple backticks, and use it to create a name for the brand.
brand description: ```{brand_description}```
Then, based on the description and your hot new brand name, give the brand a score from 1-10 for how likely it is to succeed.
{format_instructions}
"""
prompt = ChatPromptTemplate.from_template(template=template_string)
messages = prompt.format_messages(
    brand_description="a cool hip new sneaker brand aimed at rich kids",
    format_instructions=format_instructions
)
print(messages[0].content)
output = llm.invoke(messages[0].content)
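The raw output is still plain text. Running it through the parser gives back a validated BrandInfo object (the validator above raises if the score is malformed), whose fields you can access as normal attributes:
brand = pydantic_parser.parse(output)
print(brand.brand_name)
print(brand.likelihood_of_success)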
5. DatetimeOutputParser
from langchain.output_parsers import DatetimeOutputParser
output_parser = DatetimeOutputParser(format="%d-%m-%Y")  # the default format is "%Y-%m-%dT%H:%M:%S.%fZ"
format_instructions = output_parser.get_format_instructions()
print(format_instructions)
prompt = PromptTemplate(
    template="{question}\n{format_instructions}",
    input_variables=["question"],
    partial_variables={"format_instructions": format_instructions}
)
chain = prompt | llm | output_parser
result = chain.invoke({"question": "when is the independence day of India?"})
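The parser returns a real datetime.datetime object rather than a string, so, assuming the model answers correctly, you can format or compare it like any other datetime:
print(type(result))                  # <class 'datetime.datetime'>
print(result.strftime("%d-%m-%Y"))   # e.g. 15-08-1947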
That’s it. We have learned different ways of formatting LLM outputs, which plays an important role in API calls and in integrating with external applications.
Once again, thanks for your time. I hope you enjoyed this. I tried my best to gather details from across the ecosystem and simplify them as much as I could.
In the next article, we will explore ways to prompt our LLM, because prompt tuning is not prompt engineering.
Until then, feel free to reach out. Thanks for your time; if you enjoyed this short article, there are tons of topics in advanced analytics, data science, and machine learning available in my Medium repo. https://medium.com/@bobrupakroy
Some of my alternative internet presences are Facebook, Instagram, Udemy, Blogger, Issuu, Slideshare, Scribd, and more.
Also available on Quora @ https://www.quora.com/profile/Rupak-Bob-Roy
Let me know if you need anything. Talk Soon.
Check out the links; I hope they help.