Identify Text Toxicity Using TensorFlow.js | by Kevin Hermawan | Apr, 2024


Online interactions can sometimes include harmful or toxic text that may negatively impact user experience and safety. Effective content moderation is essential to address this problem. This article explores a practical approach to identifying toxic text using TensorFlow.js and its pre-trained model.

TensorFlow.js is a JavaScript library that facilitates the training and deployment of machine learning models directly in the browser or on the server side with Node.js.

To implement toxicity identification, you need to install the necessary TensorFlow.js packages. The installation process varies depending on your environment:

Node.js

For server-side applications using Node.js:

npm install @tensorflow/tfjs-node @tensorflow-models/toxicity

Web Browser

To deploy the model directly in web browsers:

npm install @tensorflow/tfjs @tensorflow-models/toxicity

GPU Acceleration

For optimized performance with GPU acceleration:

npm install @tensorflow/tfjs-node-gpu @tensorflow-models/toxicity

After setting up the necessary packages, let’s dive into the code needed to implement toxicity identification.

Import the Required Libraries

Configure TensorFlow.js for your specific environment. Here, we are setting up for a Node.js environment, but adjust accordingly if you’re using TensorFlow.js in a browser.

// Adjust based on environment: tfjs-node for Node.js
import "@tensorflow/tfjs-node";

// Import pre-trained toxicity model
import toxicity from "@tensorflow-models/toxicity";
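If you are running in the browser instead, only the backend import changes; the model import stays the same. A minimal sketch, assuming the @tensorflow/tfjs package installed earlier:

// Browser variant: use the core tfjs package instead of tfjs-node
import "@tensorflow/tfjs";

// The toxicity model import is identical in both environments
import toxicity from "@tensorflow-models/toxicity";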

Set Up Model Parameters

Define the threshold for toxicity identification and specify the categories of toxic behavior you wish to identify. A higher threshold ensures that only highly confident identifications are considered.

const threshold = 0.9; // Text is considered toxic if confidence is 90% or higher

const toxicityLabels = [
  "identity_attack", "insult", "obscene", "severe_toxicity",
  "sexual_explicit", "threat", "toxicity"
];

More information about how the toxicity labels were calibrated can be found in the TensorFlow.js toxicity model's documentation.
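If your moderation policy only cares about a few categories, the label list passed to the model can be narrowed accordingly. A minimal sketch with assumed values (a 75% threshold and only the "insult" and "threat" labels):

// Hypothetical narrower setup: flag matches at 75% confidence and
// only score the "insult" and "threat" labels
import "@tensorflow/tfjs-node";
import toxicity from "@tensorflow-models/toxicity";

const strictModel = await toxicity.load(0.75, ["insult", "threat"]);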

Load the Model and Classify the Text

Create an asynchronous function to load the model with the defined settings and then classify the provided text.

async function predict(sentence: string) {
  // Load the toxicity model with the defined threshold and labels
  const model = await toxicity.load(threshold, toxicityLabels);

  // Classify the given sentence using the loaded model
  const results = await model.classify(sentence);

  return results;
}
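Because predict calls toxicity.load on every invocation, each classification pays the full model-loading cost. A minimal sketch of caching the loaded model between calls (cachedModel, loadModel, and predictCached are hypothetical helpers, not part of the toxicity package):

// Cache the loaded model so toxicity.load() only runs once
let cachedModel: Awaited<ReturnType<typeof toxicity.load>> | null = null;

async function loadModel() {
  if (!cachedModel) {
    cachedModel = await toxicity.load(threshold, toxicityLabels);
  }
  return cachedModel;
}

async function predictCached(sentence: string) {
  // Reuse the cached model for every subsequent classification
  const model = await loadModel();
  return model.classify(sentence);
}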

Analyze Sample Text

Test the model by analyzing a specific piece of text and then display the results. This example uses the text “You’re useless.” to demonstrate how the model evaluates and categorizes text.

const text = "You're useless.";

predict(text).then(predictions => {
  // Format the output to enhance readability
  const data = predictions.flatMap(prediction =>
    prediction.results.map(result => ({
      Label: prediction.label,
      Probabilities: Array.from(result.probabilities).join(", "),
      Match: result.match
    }))
  );

  console.log("Analysis for the text:", text);
  console.table(data);
});

Note: match is true when the confidence of toxicity exceeds the threshold and false when the confidence of non-toxicity does; if neither probability exceeds the threshold, match is set to null.

When analyzing the text “You’re useless.”, the model provides the following results:

Analysis for the text: You're useless.
┌───┬─────────────────┬────────────────────────────────────────────┬───────┐
│ │ Label │ Probabilities │ Match │
├───┼─────────────────┼────────────────────────────────────────────┼───────┤
│ 0 │ identity_attack │ 0.9979670643806458, 0.0020329488907009363 │ false │
│ 1 │ insult │ 0.022712312638759613, 0.977287769317627 │ true │
│ 2 │ obscene │ 0.9985853433609009, 0.0014147203182801604 │ false │
│ 3 │ severe_toxicity │ 0.9999992847442627, 6.91593925239431e-7 │ false │
│ 4 │ sexual_explicit │ 0.9996801614761353, 0.00031982286600396037 │ false │
│ 5 │ threat │ 0.9995089769363403, 0.0004910563002340496 │ false │
│ 6 │ toxicity │ 0.019180580973625183, 0.9808194041252136 │ true │
└───┴─────────────────┴────────────────────────────────────────────┴───────┘

This result reveals that the text is identified as an “insult” with a confidence of approximately 97.73% and as “toxicity” with a confidence of approximately 98.08%.
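The classify method also accepts an array of sentences, which is convenient when moderating several comments at once. A sketch under that assumption (moderate and the sample comments are illustrative):

// Classify several comments in one call; each label's results array
// contains one entry per input sentence, in the same order
async function moderate(comments: string[]) {
  const model = await toxicity.load(threshold, toxicityLabels);
  const predictions = await model.classify(comments);

  return comments.map((comment, i) => ({
    comment,
    // Labels whose match is true for this particular comment
    flaggedLabels: predictions
      .filter(prediction => prediction.results[i].match === true)
      .map(prediction => prediction.label)
  }));
}

moderate(["You're useless.", "Have a great day!"]).then(report => console.table(report));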

In this article, we explored the practical application of TensorFlow.js for identifying toxicity in text. We demonstrated how to use its pre-trained model to flag several categories of toxic language. Keep in mind that machine learning models have accuracy limitations, especially with sarcasm, ambiguity, or heavily context-dependent language.


