Agentic AI at Scale: Marsh McLennan Saves 1M+ Hours

March 12, 2025 · 4 min read

Large Language Models (LLMs) have the potential to create greater efficiency, productivity, and savings for enterprises, but getting them to perform accurately for specific use cases remains a challenge. While these models possess vast general knowledge, they often need refinement to truly excel in specialized contexts. Agentic AI, or AI agents, can bridge the gap between theoretical capabilities and practical utility by performing tasks and solving problems specific to an organization's needs.

Using Predibase's fine-tuning capabilities for a couple of specific use cases, global professional services firm Marsh McLennan launched LenAI, a powerful AI assistant that enables Marsh McLennan employees to leverage institutional knowledge and deliver leading industry expertise to customers quickly and accurately. Their journey demonstrates how thoughtful model customization can bridge the gap between AI's theoretical capabilities and real-world impact.

Background

Marsh McLennan is the world’s leading professional services firm in the areas of risk, strategy and people, with more than 90,000 employees advising clients in 130 countries. Their innovative AI assistant, LenAI, provides Marsh McLennan employees with instant access to accumulated expertise to find the right resources and answers to deliver to customers quickly. 

Marsh McLennan’s Chief Information and Operations Officer Paul Beswick leads 5,000 technologists. Under his leadership, the company adopted a forward-looking approach to generative AI, betting on its potential value before many organizations recognized its enterprise-grade applications. According to Paul, the company decided to move quickly and proactively:

“Our journey with generative AI started before it became a mainstream tool,” Paul explains. “By early 2023, we were making secure APIs available to our teams. By mid-year, we put an LLM-based assistant into pilot, and soon after that, LenAI was launched to our entire global workforce. Today, we handle around 20 million requests annually. It’s become something nearly everyone uses, with upwards of 90,000 people tapping into it.”

Limited intent recognition hampers effectiveness

LenAI, initially built on GPT-3.5, started as a chat client available to answer team members' questions. After about a month, the team expanded it to connect to many internal systems, including Microsoft SharePoint, internal news feeds, media assets, and specialized company-specific tools. By uniting many sources of truth in a single chat interface, their experts were better equipped to support clients.

The base GPT-3.5 model occasionally struggled with intent recognition, a crucial capability for any agentic AI, sometimes selecting the wrong tool for a user's query. For example, when users requested simple tasks like drafting emails, the model would return documents containing the keyword 'email' rather than launching the email client. This semantic gap between user intent and system response highlighted the need for fine-tuning.
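The intent-recognition step described above can be thought of as a routing layer that maps each query to a tool before any response is generated. The following is a minimal sketch of that idea; the tool names and keyword rules are hypothetical, and a production system like LenAI would use a fine-tuned model rather than keyword matching:

```python
# Minimal sketch of an intent-routing layer for an agentic assistant.
# Tool names and keyword rules here are illustrative assumptions, not
# LenAI's actual implementation.

def route_intent(query: str) -> str:
    """Map a user query to the tool best suited to handle it."""
    q = query.lower()
    # Action verbs signal a task to perform, not a document to retrieve:
    if any(verb in q for verb in ("draft", "write", "compose")) and "email" in q:
        return "email_client"      # launch the email tool
    if any(word in q for word in ("find", "search", "look up")):
        return "document_search"   # search SharePoint, news feeds, etc.
    return "chat"                  # fall back to open-ended conversation

print(route_intent("Draft an email to the client about renewal terms"))
```

The failure mode described above corresponds to the router taking the `document_search` branch on a "draft an email" request because it keys on the word "email" rather than the action verb.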


Fine-tuning LLMs for better results

Marsh McLennan realized that to reach the next level of accuracy and drive task automation with higher precision, they would need to fine-tune their models. To do so, they worked with Predibase, whose platform leverages techniques like low-rank adaptation (LoRA).
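LoRA makes fine-tuning tractable by freezing the pretrained weights and learning only small low-rank update matrices. The NumPy sketch below illustrates the core idea with illustrative dimensions; it is a conceptual sketch of the math, not Predibase's implementation:

```python
import numpy as np

# Conceptual sketch of low-rank adaptation (LoRA): the frozen base
# weight W is augmented with a trainable update B @ A whose rank r
# is far smaller than the weight dimensions. Shapes are illustrative.

d, k, r = 512, 512, 8                     # layer dims and LoRA rank
rng = np.random.default_rng(0)

W = rng.standard_normal((d, k))           # frozen pretrained weight
A = rng.standard_normal((r, k)) * 0.01    # trainable, rank r
B = np.zeros((d, r))                      # trainable, zero-initialized

# Effective weight during/after fine-tuning. Because B starts at zero,
# the adapted model is identical to the base model before training.
W_adapted = W + B @ A

trainable, full = A.size + B.size, W.size
print(f"trainable params: {trainable} of {full} "
      f"({100 * trainable / full:.1f}%)")
```

Because only `A` and `B` are updated (here about 3% of the layer's parameters, and far less at realistic model scale), many task-specific adapters can be trained cheaply and served on top of one shared base model.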

“With Predibase, the complexity dropped,” says Paul. “This unlocked a new wave of automation use cases.”

7% accuracy gain over GPT-4o-mini

Fine-tuning not only ensured data sovereignty but also improved accuracy: their fine-tuned Llama-3.1-8B-Instruct model achieved a notable 10-12% increase over GPT-3.5 and a 7% improvement over GPT-4o-mini.

Furthermore, fine-tuning with Turbo LoRA, Predibase's innovation that accelerates SLM throughput while improving accuracy, and deploying on a GPU that supports FP8 quantization enabled LenAI to reduce round-trip request time to below 4 seconds, something previously impossible with GPT-3.5.
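Part of why low-precision quantization cuts latency is simple arithmetic: storing weights in 8 bits instead of 16 or 32 halves or quarters memory traffic, which dominates inference time for large models. The sketch below uses symmetric int8 quantization as a stand-in for FP8 (NumPy has no FP8 type), purely to illustrate the memory/precision trade-off:

```python
import numpy as np

# Illustrative sketch of 8-bit weight quantization. This uses int8
# symmetric quantization as a stand-in for FP8, which NumPy lacks.

def quantize(w: np.ndarray):
    scale = np.abs(w).max() / 127.0           # map max magnitude to int8 range
    q = np.round(w / scale).astype(np.int8)   # 1 byte per weight
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.default_rng(1).standard_normal(1024).astype(np.float32)
q, scale = quantize(w)
w_hat = dequantize(q, scale)

print("bytes:", w.nbytes, "->", q.nbytes)     # 4x less memory traffic
print("max abs error:", float(np.abs(w - w_hat).max()))
```

The reconstruction error is bounded by half a quantization step, which is why well-calibrated 8-bit inference typically preserves accuracy while substantially reducing latency.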

The team recently reached an important milestone. Since launching in December of 2023, LenAI has supported over 25 million cumulative queries across the organization. This translated to a massive increase in productivity, saving at least 1 million hours of team time in its first year alone.

“Since then, we’ve put the power into our colleagues' hands by allowing them to experiment with the same technology that LenAI is using,” said Paul. “We’re offering it to any development team across the organization who wants to experiment with it for a new use case, meaning that the success we’ve seen so far will only grow over time.”

Conclusion

Any organization's backbone is its people and the knowledge and years of experience they draw upon to improve their clients' businesses. Marsh McLennan identified specific use cases where a fine-tuned AI model powered by Predibase empowered their employees with the accurate information they needed to do their jobs efficiently every day.

Building on this success, Marsh McLennan continues to expand LenAI’s capabilities. By democratizing access to LenAI and fine-tuning technology across development teams, the company fosters innovation and radically transforms enterprise knowledge management.

What could you achieve with Predibase’s fine-tuning capabilities? Try it for yourself with our 30-day free trial.

FAQ

Why did Marsh McLennan fine-tune their LLMs instead of using GPT-3.5 or GPT-4o-mini?

Off-the-shelf models like GPT-3.5 and GPT-4o-mini struggled with intent recognition and accuracy in specialized enterprise use cases. By fine-tuning Llama-3.1-8B-Instruct using Predibase and Turbo LoRA, Marsh McLennan improved accuracy by 7-12% and reduced response latency to below 4 seconds.

What is agentic AI and how does it help enterprises?

Agentic AI refers to autonomous AI agents that can perform actions and make decisions on behalf of users. In enterprise settings, agentic AI systems like LenAI automate workflows, access internal systems, and generate outputs tailored to business-specific tasks—saving time and improving productivity.

How does Predibase support enterprise LLM fine-tuning?

Predibase provides an end-to-end platform for efficient fine-tuning using methods like LoRA and Turbo LoRA. It enables organizations to maintain data sovereignty, accelerate model throughput, and fine-tune models like LLaMA or Mistral without managing complex infrastructure.

What are the benefits of Turbo LoRA in LLM training?

Turbo LoRA is a Predibase innovation that enhances LoRA by increasing throughput and supporting lower-precision quantization (e.g., FP8). This allows for faster training and inference while preserving or improving model accuracy—crucial for latency-sensitive enterprise use cases. Read more about Turbo LoRA.

How much productivity did Marsh McLennan gain from fine-tuning LLMs?

By fine-tuning Llama models using Predibase, Marsh McLennan saved over 1 million hours of team time in less than a year. This was achieved through faster, more accurate responses from their agentic AI system, LenAI, and increased employee efficiency.
