
Microsoft-affiliated study reveals vulnerabilities and toxicity risks in GPT-4

OnMSFT Staff
October 17, 2023

A recent scientific paper co-authored by Microsoft researchers scrutinized the “trustworthiness” and potential toxicity of large language models (LLMs), specifically focusing on OpenAI’s GPT-4 and its predecessor, GPT-3.5.

The research team found that GPT-4, although generally more reliable than GPT-3.5 on standard benchmarks, is more susceptible to "jailbreaking" prompts that bypass the model's safety measures. Because GPT-4 follows instructions more precisely, these prompts can lead it astray and cause it to generate harmful content.

The co-authors’ blog post accompanying the paper states, “We also find that although GPT-4 is usually more trustworthy than GPT-3.5 on standard benchmarks, GPT-4 is more vulnerable given jailbreaking system or user prompts, which are maliciously designed to bypass the security measures of LLMs, potentially because GPT-4 follows (misleading) instructions more precisely.”

Microsoft's involvement in research that appears to cast OpenAI's GPT-4 in a negative light may seem surprising, but the research team worked with Microsoft product groups to confirm that the potential vulnerabilities identified do not impact current customer-facing services. That is in part because finished AI applications apply a range of mitigation approaches to address potential harms that may occur at the model level of the technology.

The blog post notes that mitigation approaches are in place to address potential harms at the model level, and that OpenAI has been made aware of the vulnerabilities identified in the system.

Furthermore, the study revealed that GPT-4 was more prone to leaking private and sensitive data, including email addresses, compared to other LLMs.

As the scientific community continues to explore the capabilities of LLMs, ensuring their ethical and responsible deployment remains a critical priority for the industry. Let us know your views on this in the comments section below.
