Toxicity

Toxicity Operations

Toxicity | Detection by Text

The Toxicity | Detection by Text operation classifies and scores harmful content in text, whether it comes from the user or from the LLM.

Input Configuration

Module Configuration

This refers to the MuleSoft Inference LLM Configuration set up in the Getting Started section.

General Operation Fields

  • Text: The input text to be checked for harmful content.

XML Configuration

Below is the XML configuration for this operation:

<mac-inference:toxicity-detection-text
    doc:name="Toxicity detection text"
    doc:id="b5770a5b-d3f9-47ba-acec-ab0bd41e4188"
    config-ref="ModerationMistral">
    <mac-inference:text><![CDATA[#[payload.prompt]]]></mac-inference:text>
</mac-inference:toxicity-detection-text>

Output Configuration

Response Payload

This operation responds with a JSON payload containing the overall toxicity flag and a score for each category.

Example Response Payload

{
  "payload": {
    "flagged": true,
    "categories": [
      {
        "illicit/violent": 0.0000025466403947055455,
        "self-harm/instructions": 0.00023480495744356635,
        "harassment": 0.9798945372458964,
        "violence/graphic": 0.000005920916517463734,
        "illicit": 0.000013552078562406772,
        "self-harm/intent": 0.0002233150331012493,
        "hate/threatening": 0.0000012029639084557005,
        "sexual/minors": 0.0000024300240743279605,
        "harassment/threatening": 0.0007499928075102617,
        "hate": 0.00720390551996062,
        "self-harm": 0.0004822186797755494,
        "sexual": 0.00012644219446392274,
        "violence": 0.0004960569708019355
      }
    ]
  }
}
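
The flagged field indicates whether any category crossed the moderation threshold, and the categories array carries one score per category. Below is a DataWeave sketch, not part of the connector, that extracts the flag and the highest-scoring category for logging or routing. The outer "payload" unwrapping guard and the field paths are assumptions based on the example above, so adjust them to match the structure your flow actually receives.

%dw 2.0
output application/json
// Unwrap the outer "payload" key if present (assumption; adjust to your actual structure)
var result = payload.payload default payload
var scores = result.categories[0] default {}
---
{
    flagged: result.flagged,
    // Highest-scoring category, useful for logging or fine-grained routing
    topCategory: (scores pluck ((value, key) -> { category: key as String, score: value })
        orderBy ((item) -> item.score))[-1]
}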

Example Use Cases

This operation can be particularly useful in various scenarios, such as:

  • Detecting Toxic Inputs: Detect and block toxic user input before it is sent to the LLM (see the flow sketch after this list).
  • Detecting Harmful Responses: Filter out toxic LLM responses that could be harmful to the user.
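
For the first use case, the flagged field can drive a choice router that stops the flow before the prompt ever reaches the LLM. The sketch below is illustrative only: the flow name, the userPrompt variable, the APP:TOXIC_CONTENT error type, and the expression path payload.flagged are assumptions, so adapt them to your application and to how the connector nests the response in your flow.

<flow name="validate-user-prompt">
    <!-- Score the user's prompt before forwarding it to the LLM -->
    <mac-inference:toxicity-detection-text config-ref="ModerationMistral">
        <mac-inference:text><![CDATA[#[vars.userPrompt]]]></mac-inference:text>
    </mac-inference:toxicity-detection-text>
    <choice>
        <!-- Adjust the path (for example payload.payload.flagged) to match the actual response structure -->
        <when expression="#[payload.flagged default false]">
            <raise-error type="APP:TOXIC_CONTENT" description="The submitted prompt was flagged as toxic."/>
        </when>
        <otherwise>
            <!-- Safe to forward vars.userPrompt to the LLM operation -->
            <logger level="INFO" message="Prompt passed toxicity screening"/>
        </otherwise>
    </choice>
</flow>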