As reported by Multiplatform.ai, Microsoft has developed a new way to train large language models (LLMs) to better understand and generate text in specific domains. The new method is more cost-effective than previous approaches while producing LLMs that perform better on domain-specific tasks.
LLMs are good at understanding and generating general-purpose text, but they are less capable in specialized domains such as biology, finance, or law.
The researchers explored three main approaches to creating these specialized models. The first builds a model from the ground up, which is complex and resource-intensive. The second refines an existing model with additional training, which may not work equally well for all tasks. The third, which Microsoft decided to focus on, leverages existing knowledge about a field to teach the model.
This third approach, called domain-adaptive pretraining, customizes an LLM by continuing to train it on a large text dataset from the target domain. The training helps the model learn the vocabulary and concepts that matter in that domain.
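The core idea can be illustrated with a toy stand-in for an LLM: the training objective stays the same, only the corpus changes. In the sketch below, a bigram counter plays the role of a next-token-prediction model; the corpora, function name, and data are all illustrative, not Microsoft's actual setup.

```python
from collections import Counter

def train(counts: Counter, corpus: list[str]) -> Counter:
    """Count word bigrams -- a toy stand-in for next-token-prediction training."""
    for doc in corpus:
        tokens = doc.lower().split()
        counts.update(zip(tokens, tokens[1:]))
    return counts

# Hypothetical corpora for illustration only.
general = ["the cat sat on the mat", "the dog ran in the park"]
biomed = ["insulin regulates glucose uptake", "glucose fuels the cell"]

model = train(Counter(), general)  # general pretraining
model = train(model, biomed)       # domain-adaptive pretraining: same objective,
                                   # continued on a domain-specific corpus
```

After the second pass, the model has absorbed domain word pairings (e.g. "regulates glucose") that never appeared in the general corpus, which is the intuition behind continuing pretraining on domain text.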
Microsoft researchers have found that domain-adaptive pretraining can be done more cost-effectively by transforming raw corpora into reading comprehension texts. A reading comprehension text pairs a passage with questions whose answers require the reader to understand the passage.
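A minimal sketch of that transformation is below: a raw domain passage is appended with simple comprehension tasks mined from it. The task templates and the term-mining heuristic here are illustrative assumptions, not the paper's exact rules.

```python
import re

def to_reading_comprehension(passage: str) -> str:
    """Turn a raw passage into a reading-comprehension text: the passage
    followed by comprehension tasks mined from it (illustrative templates)."""
    tasks = ["Question: What is this passage mainly about?\nAnswer:"]
    # Crude term mining: capitalized words of 4+ letters stand in for domain terms.
    terms = sorted(set(re.findall(r"\b[A-Z][a-z]{3,}\b", passage)))[:3]
    if terms:
        tasks.append(
            "Question: Write a sentence that uses the terms "
            + ", ".join(terms) + ".\nAnswer:"
        )
    return passage + "\n\n" + "\n\n".join(tasks)

# Hypothetical one-document corpus for illustration.
corpus = ["Insulin regulates glucose uptake. Hepatocytes store glycogen."]
dataset = [to_reading_comprehension(doc) for doc in corpus]
print(dataset[0])
```

Each raw document thus becomes a self-contained passage-plus-questions training example, which is the format the researchers use in place of the raw corpus.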
Microsoft researchers have shown that AdaptLLM, a model trained using domain-adaptive pretraining on reading comprehension texts, performs better on domain-specific tasks in fields such as biology, finance, and law.