OnMSFT.com

Microsoft-affiliated study reveals vulnerabilities and toxicity risks in GPT-4

OnMSFT Staff
October 17, 2023

A recent scientific paper co-authored by Microsoft researchers scrutinized the “trustworthiness” and potential toxicity of large language models (LLMs), specifically focusing on OpenAI’s GPT-4 and its predecessor, GPT-3.5.

The research team found that GPT-4, although generally more reliable than GPT-3.5 on standard benchmarks, is more susceptible to “jailbreaking” prompts that bypass the model’s safety measures. Because GPT-4 tends to follow instructions more precisely, including misleading ones, such prompts can lead it astray and cause it to generate harmful content.

The co-authors’ blog post accompanying the paper states, “We also find that although GPT-4 is usually more trustworthy than GPT-3.5 on standard benchmarks, GPT-4 is more vulnerable given jailbreaking system or user prompts, which are maliciously designed to bypass the security measures of LLMs, potentially because GPT-4 follows (misleading) instructions more precisely.”

It may seem surprising that Microsoft would be involved in research that appears to cast OpenAI’s GPT-4 in a negative light. However, the research team worked with Microsoft product groups to confirm that the potential vulnerabilities identified do not impact current customer-facing services.

This is in part because finished AI applications apply a range of mitigation approaches to address potential harms that may occur at the model level of the technology.

According to the blog post, OpenAI has also been made aware of the potential vulnerabilities identified in the research.

Furthermore, the study revealed that GPT-4 was more prone than other LLMs to leaking private and sensitive data, including email addresses.

As the scientific community continues to explore the capabilities of LLMs, ensuring their ethical and responsible deployment remains a critical priority for the industry. Let us know your views on this in the comments section below.
