Microsoft-affiliated Research Finds Flaws in GPT-4 - Slashdot

Source: techcrunch.com

Posted by msmash on Tuesday October 17, 2023 @10:40AM from the closer-look dept.

Sometimes, following instructions too precisely can land you in hot water -- if you're a large language model, that is. From a report: That's the conclusion reached by a new, Microsoft-affiliated scientific paper that looked at the "trustworthiness" -- and toxicity -- of large language models (LLMs) including OpenAI's GPT-4 and GPT-3.5, GPT-4's predecessor. The co-authors write that, possibly because GPT-4 is more likely to follow the instructions of "jailbreaking" prompts that bypass the model's built-in safety measures, GPT-4 can be more easily prompted than other LLMs to spout toxic, biased text. In other words, GPT-4's good "intentions" and improved comprehension can -- in the wrong hands -- lead it astray.

"We find that although GPT-4 is usually more trustworthy than GPT-3.5 on standard benchmarks, GPT-4 is more vulnerable given jailbreaking system or user prompts, which are maliciously designed to bypass the security measures of LLMs, potentially because GPT-4 follows (misleading) instructions more precisely," the co-authors write in a blog post accompanying the paper. Now, why would Microsoft greenlight research that casts an OpenAI product it itself uses (GPT-4 powers Microsoft's Bing Chat chatbot) in a poor light? The answer lies in a note within the blog post: "[T]he research team worked with Microsoft product groups to confirm that the potential vulnerabilities identified do not impact current customer-facing services. This is in part true because finished AI applications apply a range of mitigation approaches to address potential harms that may occur at the model level of the technology. In addition, we have shared our research with GPT's developer, OpenAI, which has noted the potential vulnerabilities in the system cards for relevant models."

