OnMSFT.com

Tech giants turn to synthetic data to power advanced AI models

OnMSFT Staff
July 19, 2023
2 min read

As reported by the Financial Times, AI companies are exploring a new way to obtain data for powerful generative models: creating it from scratch. Microsoft, OpenAI, and Cohere are among those using synthetic data, meaning computer-generated information, to train their large language models (LLMs) because of the limits of human-made data.

The launch of ChatGPT by Microsoft-backed OpenAI has led to a range of products that generate plausible text, images, or code from simple prompts. Generative AI has attracted significant interest, and tech giants like Google, Microsoft, and Meta are competing in the field.

LLMs powering chatbots like ChatGPT and Google’s Bard primarily rely on web scraping techniques to accumulate data from books, articles, social media, videos, and more.

However, as generative AI software becomes increasingly sophisticated, AI companies face challenges around data access and privacy. Synthetic data offers a cost-effective solution.

Cohere and competitors use synthetic data generated by AI models and fine-tuned by humans. For example, Cohere might use two AI models simulating a conversation between a math tutor and a student to train a model on advanced mathematics.

Recent research from Microsoft shows that synthetic data can effectively train smaller, simpler models. In one instance, a synthetic dataset of short stories generated by GPT-4 was used to train a simple LLM to produce coherent and grammatically correct stories.

Startups like Scale AI and Gretel.ai offer synthetic data as a service, with the aim of preserving privacy and removing biases. Financial institutions, for example, use synthetic data to examine fraud scenarios, among other applications.

Critics warn that training on raw AI-generated data could degrade the technology over time by introducing falsehoods. Nevertheless, AI researchers see synthetic data as a path to superintelligent AI that can create knowledge and ask questions.
