The Information Highway

The Information Highway

Font size: +
2 minutes reading time (467 words)

Vulnerabilities found in Microsoft Azure AI

Threat update

Significant vulnerabilities in Microsoft's Azure AI Content Safety services have been discovered. These vulnerabilities enable attackers to bypass safeguards and deploy harmful AI-generated content.

Technical Detail and Additional Info

What is the threat?

Attackers are using techniques such as 'Character Injection' and 'Adversarial ML Evasion', to exploit Azure AI Content Safety services.

  • Character Injection: A technique that involves altering text by inserting or replacing characters with specific symbols or sequences, such as diacritics, homoglyphs, numeric substitutions, space injections, or zero-width characters. Therefore, these subtle modifications can trick the model into misclassifying content, enabling attackers to influence the model's interpretation and interfere with the analysis. The objective is to bypass the guardrail by causing it to incorrectly classify the content.
  • Adversarial Machine Learning (AML): involves altering input data using specific techniques to mislead the model's predictions. These techniques include perturbations, word substitutions, misspellings, and other manipulations. By carefully choosing and modifying words, attackers can cause the model to misinterpret the intended meaning of the input.

Once the attacker bypasses both the AI Text Moderation and Prompt Shield guardrails, they can inject harmful content, manipulate the model's responses, or compromise sensitive information. This exposure challenges our perception of what it takes to create effective AI guardrails.

Why is it noteworthy?

Azure AI Content Safety is a cloud-based service designed to assist developers in establishing safety and security guardrails for AI applications by identifying and managing inappropriate content. It employs advanced techniques to filter out harmful material, including hate speech and explicit or objectionable content. Azure OpenAI leverages a large language model (LLM) equipped with Prompt Shield and AI Text Moderation guardrails to validate inputs and AI-generated content. Many people rely on Microsoft's Azure AI Content Safety service for responsible AI behavior.

However, the two security vulnerabilities found within these guardrails, which are intended to protect AI models from jailbreaks and prompt injection attacks, means that attackers can bypass both the AI Text Moderation and Prompt Shield guardrails, allowing them to inject harmful content, manipulate the model's responses, or even compromise sensitive information.

What is the exposure or risk?

These vulnerabilities mean that developers and users must be more careful of any harmful, inappropriate, or manipulated content appearing in their AI-generated outputs.

What are the recommendations?

 LBT Technology Group recommends the following actions to protect your environment against these vulnerabilities:

  • Inspect the data returned by AI models regularly, to detect and mitigate risks associated with malicious or unpredictable user prompts.
  • Establish company-wide verification mechanisms to ensure all AI models in use are legitimate and secure.
  • Use an AI Gateway to help ensure consistent security across AI workloads.

References

Palo Alto Networks warns of potential PAN-OS RCE v...
Zero-click flaw in Synology NAS devices

Related Posts

 

Comments

No comments made yet. Be the first to submit a comment
Saturday, 23 November 2024

Captcha Image

Top Breaches Of 2023

Customers Affected In T-Mobile Breach
Accounts Affected In MOVEit Breach
Customers Affected In MCNA Insurance Data Breach
Individuals Affected In PharMerica Data Breach
Users Affected In ChatGPT Major Data Breach
*Founder Shield End of Year 2023