LLM Fine-tuning Use Case

Customer Service Automation

A customer service call can cost an organization upwards of $40, so automating customer support processes can reduce overhead by millions of dollars. Learn how to fine-tune open-source LLMs to automatically classify support issues and generate customer responses.

The Predibase Solution

Streamline customer support operations with LLMs

  • Efficiently fine-tune open-source LLMs like Llama2 on customer support transcripts
  • Instantly deploy and prompt your fine-tuned LLM on serverless endpoints
  • Automate issue identification and generate content for agents to respond

Unstructured Text

  • Call transcripts
  • Customer Emails
  • Chat Logs
  • Product Knowledge Base

Customer Service Automation with LLMs

Use Cases

  • Automatically prioritize tickets based on issue
  • Classify and route issues to the appropriate teams (see the sketch after this list)
  • Generate customer responses to help agents be more efficient
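
As a purely illustrative sketch (not part of the Predibase API), here is how the task type predicted by a fine-tuned LLM might drive ticket routing and prioritization; the team names and priority levels below are hypothetical:

# Hypothetical post-processing: map the task type predicted by the
# fine-tuned LLM to a support team and a ticket priority.
ROUTING = {
    "replace card": ("card-services", "high"),
    "transfer money": ("payments", "high"),
    "check balance": ("self-service-bot", "low"),
    "order checks": ("fulfillment", "low"),
    "pay bill": ("payments", "medium"),
    "reset password": ("account-security", "high"),
    "schedule appointment": ("branch-ops", "medium"),
    "get branch hours": ("self-service-bot", "low"),
}

def route_ticket(task_type: str) -> tuple[str, str]:
    """Return (team, priority) for a predicted task type."""
    # Anything unrecognized ("none of the above") falls back to human triage.
    return ROUTING.get(task_type.strip().lower(), ("triage", "medium"))

team, priority = route_ticket("replace card")
print(team, priority)  # card-services high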

Fine-tune and serve your own LLM for Customer Support

Efficiently fine-tune any open-source LLM with built-in optimizations like quantization, LoRA, and memory-efficient distributed training, combined with right-sized GPU engines. Instantly serve and prompt your fine-tuned LLMs with cost-efficient serverless endpoints built on top of open-source LoRAX. Read the full tutorial.

# Connect to Predibase (requires an API token from your account settings)
from predibase import Predibase

pb = Predibase(api_token="<YOUR_API_TOKEN>")

# Kick off the fine-tuning job and track the learning curves for your adapter in the Predibase UI
adapter = pb.finetuning.jobs.create(
    config={
        "base_model": "HuggingFaceH4/zephyr-7b-beta",  # specify a HuggingFace LLM to fine-tune
        "epochs": 5,
        "learning_rate": 0.0002,
    },
    dataset=my_dataset,  # a dataset uploaded to Predibase beforehand (e.g., with pb.datasets.from_file)
    repo="my_adapter",
    description='Fine-tune "zephyr-7b-beta" on my customer support dataset for classifying call intents.',
)

# Dynamically load the fine-tuned adapter for serverless inference.
# The deployment name below is illustrative; use a serverless deployment
# that serves the adapter's base model.
client = pb.deployments.client("zephyr-7b-beta")
prompt = """
    Consider the case of a customer contacting the support center.
    The term "task type" refers to the reason the customer contacted support.

    ### The possible task types are: ###
    - replace card
    - transfer money
    - check balance
    - order checks
    - pay bill
    - reset password
    - schedule appointment
    - get branch hours
    - none of the above

    Summarize the issue/question/reason that drove the customer to contact support:

    ### Transcript:
    <caller> hello <agent> hello this is [unintelligible] national bank my name is jennifer <agent> how can i help you today <caller> hi my name is james william <caller> i lost my debit card <caller> can you send me a new one <agent> yes <agent> uh which card or would you like to replace <caller> my debit card <agent> okay i've ordered your replacement debit card is there anything else i can help you with today <caller> no that's gonna be all for me today <agent> [noise] <agent> alright thank you for calling have a great day <caller> you too bye <agent> [noise] <agent> [noise]

    ### Task Type:
"""

print(
    client.generate(
        prompt,
        adapter_id="my_adapter/1",  # the adapter repo and version created above
        max_new_tokens=256,
        temperature=0.1,
    ).generated_text
)
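
For the transcript above, the fine-tuned adapter should return a task type of "replace card"; the low temperature keeps the output stable enough for downstream routing.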

Example code in Predibase for illustrative purposes only

Resources to Get Started

Ready to efficiently fine-tune and serve your own LLM?