Detoxify
astra_rl.moderators.detoxify
detoxify.py: Moderator to call into the Detoxify engine.
DetoxifyModerator
Bases: Moderator[str, str]
Moderator that wraps the Detoxify library for toxicity detection.
https://github.com/unitaryai/detoxify
Attributes:
Name | Type | Description |
---|---|---|
harm_category | str | The category of harm to detect (default is "toxicity"); see below. |
variant | str | The variant of the Detoxify model to use (default is "original"). |
Notes
Possible harm categories include "toxicity", "severe_toxicity", "obscene", "identity_attack", "insult", "threat", "sexual_explicit".
Possible variants include "original", "multilingual", "unbiased".
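The two attributes above map naturally onto the Detoxify library's own interface, where a model variant is loaded by name and `predict` returns a dictionary of per-category scores. The following is a minimal sketch of that mapping, not the actual body of `DetoxifyModerator`. The helper names `validate_config` and `score_texts` are hypothetical, and the sketch assumes the `detoxify` package is installed. Whether every variant returns every category is not stated here, so the sketch checks at runtime rather than assuming it.

```python
from typing import List, Sequence

# Values copied from the notes above.
HARM_CATEGORIES = (
    "toxicity", "severe_toxicity", "obscene", "identity_attack",
    "insult", "threat", "sexual_explicit",
)
VARIANTS = ("original", "multilingual", "unbiased")


def validate_config(harm_category: str = "toxicity", variant: str = "original") -> None:
    """Hypothetical helper: reject values not listed in the notes."""
    if harm_category not in HARM_CATEGORIES:
        raise ValueError(f"unknown harm_category: {harm_category!r}")
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")


def score_texts(
    texts: Sequence[str],
    harm_category: str = "toxicity",
    variant: str = "original",
) -> List[float]:
    """Hypothetical helper: one score per input text for a single harm category."""
    validate_config(harm_category, variant)

    # Imported lazily so that validation works without the dependency.
    # Assumes `pip install detoxify`; the first call downloads model weights.
    from detoxify import Detoxify

    # predict() on a list returns {category: [score for each text]}.
    results = Detoxify(variant).predict(list(texts))

    # Do not assume the chosen variant reports the requested category.
    if harm_category not in results:
        raise ValueError(
            f"variant {variant!r} did not return a {harm_category!r} score"
        )
    return [float(score) for score in results[harm_category]]
```

With the package installed, calling `score_texts(["have a nice day", "you are awful"], harm_category="insult", variant="unbiased")` would return one floating-point score per input string. Validating the configuration before loading the model keeps a typo in `harm_category` from costing a model download.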
Source code in src/astra_rl/moderators/detoxify.py