Priority
P2-High
OS type
Ubuntu
Hardware type
Gaudi2
Running nodes
Single Node
Description
Toxicity detection plays a critical role in guarding the inputs and outputs of large language models (LLMs) to ensure safe, respectful, and responsible content. Because LLMs are widely used in applications such as customer service, education, and social media, there is a significant risk that they will inadvertently produce or amplify harmful language when toxicity goes undetected. Many small language models (SLMs) and LLMs have been tuned as guardrails to detect toxicity, but they differ in their taxonomies and definitions of toxicity. This Toxicity Detection Evaluation script is intended to measure how well an LLM detects toxicity across many popular toxic-language datasets, regardless of the taxonomy it was tuned on, using the metrics most commonly applied to toxicity classification.
Supported toxicity datasets
Supported Metrics
- accuracy
- auprc (area under precision recall curve)
- auroc (area under the receiver operating characteristic curve)
- f1
- fpr (false positive rate)
- precision
- recall
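
For reference, the metrics listed above can be sketched in plain Python for a binary toxic / non-toxic task. This is only an illustrative sketch, not the proposed script: the function names, the `threshold=0.5` cutoff, and the convention that label `1` means "toxic" are all assumptions made here. AUPRC is estimated with average precision (ties in score are broken by input order), and AUROC with the pairwise-ranking definition; a real implementation would more likely call an established metrics library.

```python
from typing import Dict, Sequence


def _safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random toxic sample outscores a random non-toxic one.

    Ties count as half a win. Returns NaN if either class is absent.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auprc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Average precision, a step-wise estimate of the area under the PR curve."""
    n_pos = sum(1 for t in y_true if t == 1)
    if n_pos == 0:
        return float("nan")
    # Rank samples from highest to lowest score (stable sort keeps input order on ties).
    order = sorted(range(len(y_score)), key=lambda i: -y_score[i])
    tp, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            total += tp / rank  # precision at each recalled positive
    return total / n_pos


def toxicity_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Compute the seven listed metrics; label 1 is assumed to mean 'toxic'."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    return {
        "accuracy": _safe_div(tp + tn, len(y_true)),
        "auprc": auprc(y_true, y_score),
        "auroc": auroc(y_true, y_score),
        "f1": _safe_div(2 * precision * recall, precision + recall),
        "fpr": _safe_div(fp, fp + tn),
        "precision": precision,
        "recall": recall,
    }


if __name__ == "__main__":
    # Toy labels and scores, invented purely for illustration.
    labels = [1, 0, 1, 0, 1, 0]
    scores = [0.9, 0.2, 0.6, 0.7, 0.4, 0.1]
    for name, value in toxicity_metrics(labels, scores).items():
        print(f"{name}: {value:.4f}")
```

Note that accuracy, f1, fpr, precision, and recall depend on the chosen threshold, while auprc and auroc are computed from the raw scores, which is one reason to report both kinds when comparing guardrail models tuned on different taxonomies.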
Supported Models
@ashahba @mitalipo @qgao007