How data collaboration platforms can help companies build better AI

By José Parra-Moyano, Karl Schmedders, and Alex “Sandy” Pentland

Large language models (LLMs) like GPT-4 have captivated business leaders with the promise of enhanced decision-making, streamlined operations, and innovation. Companies such as Zendesk and Slack have started using LLMs in customer support to improve satisfaction and reduce costs. Meanwhile, Goldman Sachs and GitHub are using similar AI models to help developers write code. Likewise, Unilever is using LLMs to respond to customer messages, generate product listings, and even minimize food waste. Yet, off the shelf, LLMs are not the plug-and-play solution companies might be hoping for. When confronted with an organization’s unique context, they often underperform.

To overcome this challenge, business leaders have turned to fine-tuning: taking an existing LLM and training it further on organization-specific data so that it learns the organization’s nuances and quirks. Armed with greater context and tailored to the organization’s needs, fine-tuned models can deliver a customized AI experience that improves organizational performance. Bloomberg’s BloombergGPT, an AI research model fine-tuned with Bloomberg’s proprietary data, exemplifies how companies can gain a strategic advantage by tailoring AI models with industry-specific data.

However, companies that want to train fine-tuned models face three immediate challenges. First, fine-tuning requires extensive, high-quality data, a scarce resource for many enterprises. Second, LLMs are trained on publicly available data from the internet, so they may not account for the nuances of specific communities or users, which can result in biased answers and a lack of diversity and pluralism in the generated content. Third, training fine-tuned models with users’ personal data may result in privacy violations, because that data was originally gathered for a different purpose.
