Data is the Foundation of Language Models | by Cameron R. Wolfe, Ph.D. | Oct, 2023


How high-quality data impacts every aspect of the LLM training pipeline…

(Photo by Joshua Sortino on Unsplash)

Large Language Models (LLMs) have been around for quite some time, but only recently has their impressive performance warranted significant attention from the broader AI community. With this in mind, we might begin to question the origin of the current LLM movement. What was it that actually made recent models so impressive compared to their predecessors? Although one could point to a variety of factors, one especially impactful advancement was the ability to perform alignment. In other words, we figured out how to train LLMs to not just output the most likely next word, but to output text that will satisfy the goals of a human, whether by following an instruction or retrieving important information.

“We hypothesize that alignment can be a simple process where the model learns the style or format for interacting with users, to expose the knowledge and capabilities that were already acquired during pretraining” — from [1]

This overview will study the role and impact of alignment, as well as the interplay between alignment and pre-training. Interestingly, these ideas were explored by the recent LIMA model [1], which performs alignment by simply fine-tuning a pre-trained LLM over a semi-manually curated corpus of only 1,000 high-quality response examples. We will learn that the alignment process, although critical, primarily teaches an LLM steerability and correct behavior or style, while most knowledge is gained during pre-training. As such, alignment can be performed successfully even with minimal training data. However, we will see that the impact of data quality and diversity on both alignment and other avenues of LLM training (e.g., pre-training, fine-tuning, etc.) is absolutely massive.

“LLMs are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences.” — from [1]
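Both stages in the quote above ultimately optimize the same next-token objective; the practical difference in supervised instruction tuning is that the loss is typically computed only on the response tokens, with the prompt masked out. The sketch below is a minimal, hypothetical illustration of that masking idea (the function name and toy numbers are my own, not from LIMA's codebase):

```python
import math

def masked_nll(token_logprobs, prompt_len):
    """Average negative log-likelihood over response tokens only.

    token_logprobs: per-position log-probability the model assigned to the
    correct next token. Positions before prompt_len belong to the prompt and
    are masked out, so the loss (and, conceptually, the gradient) reflects
    only the quality of the response.
    """
    response = token_logprobs[prompt_len:]
    return -sum(response) / len(response)

# Toy sequence: 3 prompt tokens followed by 2 response tokens.
logprobs = [math.log(0.5), math.log(0.9), math.log(0.1),   # prompt (masked)
            math.log(0.8), math.log(0.4)]                  # response (scored)
loss = masked_nll(logprobs, prompt_len=3)
```

In real training frameworks this masking is usually expressed by setting the prompt positions' labels to an ignore index in the cross-entropy loss, but the effect is the same: the model is rewarded only for producing the curated response, which is how a small, high-quality dataset like LIMA's 1,000 examples can steer style without re-teaching knowledge.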

Although language models have been studied from a variety of different perspectives in recent months, the creation of these…

