2

Microsoft releases automated PyRIT red teaming tool for finding AI model risks

 6 months ago
source link: https://siliconangle.com/2024/02/23/microsoft-releases-automated-pyrit-red-teaming-tool-finding-ai-model-risks/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Microsoft releases automated PyRIT red teaming tool for finding AI model risks

building-1011876_1280.jpg
AI

Members of a Microsoft Corp. team tasked with using hacker tactics to find cybersecurity issues have open-sourced an internal tool, PyRIT, that can help developers find risks in their artificial intelligence models.

The researchers released the code for the framework on Thursday. According to Microsoft, PyRIT can automatically generate thousands of adversarial AI prompts to test if a neural network effectively withstands hacking attempts. The tool is geared toward processing text, but it was built in a way that allows developers to add support for other types of AI input such as images.

PyRIT started out as a collection of scripts that Microsoft’s AI Red Team developed for internal use. The team is responsible for simulating cyberattacks against new AI models to find weak points before hackers do. The researchers steadily expanded the scripts with additional features until the code base grew into the framework released this week as PyRIT.

Developers have to test a newly created AI model for several types of risks before deploying it to production. They must search for cybersecurity risks, such as prompts that might cause the model to write malware. Software teams also need to look for cases where the AI may hallucinate, as well as determine if it can be tricked into revealing sensitive information from its training dataset.

Further complicating the task is that some models generate not only text but also other types of output such as images. Vulnerability tests must be repeated separately across each output type, as well as across each software interface through which users interact with the AI. Those factors mean that thoroughly testing a neural network requires developers to craft up to thousands of adversarial prompts, which is often impractical.

Microsoft created PyRIT to remove that limitation. According to the company, the framework allows developers to specify a certain type of adversarial AI input and automatically generate thousands of prompts that meet the criteria. Those prompts can be used to test an AI that is implemented in the form of a web service, as well as models offered via an application programming interface.

“PyRIT is not a replacement for manual red teaming of generative AI systems,” Microsoft’s researchers stressed in a blog post detailing the framework. “Instead, it augments an AI red teamer’s existing domain expertise and automates the tedious tasks for them.”

PyRIT can not only generate adversarial prompts but also evaluate how the target model responds. According to Microsoft, a built-in scoring engine automatically determines whether the model a developer is testing produced harmful output in response to a prompt. Software teams have the option to swap the default scoring engine with an external neural network built for the same task.

Because it’s capable of analyzing AI responses, PyRIT lends itself to performing so-called multiturn risk evaluations. The framework can enter an adversarial prompt into an AI, analyze the response and adjust its next prompt accordingly to make it more effective. “While single-turn attack strategies are faster in computation time, multiturn red teaming allows for more realistic adversarial behavior and more advanced attack strategies,” Microsoft’s researchers explained.

Photo: efes/Pixabay

A message from John Furrier, co-founder of SiliconANGLE:

Your vote of support is important to us and it helps us keep the content FREE.

One click below supports our mission to provide free, deep, and relevant content.  

Join our community on YouTube

Join the community that includes more than 15,000 #CubeAlumni experts, including Amazon.com CEO Andy Jassy, Dell Technologies founder and CEO Michael Dell, Intel CEO Pat Gelsinger, and many more luminaries and experts.

“TheCUBE is an important partner to the industry. You guys really are a part of our events and we really appreciate you coming and I know people appreciate the content you create as well” – Andy Jassy

THANK YOU


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK