Meta's Llama Guard, a 7B-8B (depending on version) specialized moderation model for LLM input/output filtering. Covers 6 safety categories: violence/hate, sexual content, guns & illegal weapons, regulated substances, suicide & self-harm, and criminal planning. ~94-95% accuracy on its benchmark. Deploy with vLLM, HuggingFace, or SageMaker. Integrates with NeMo Guardrails.
/plugin marketplace add zechenzhangAGI/AI-research-SKILLs
/plugin install llamaguard@zechenzhangAGI/AI-research-SKILLs
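
For the HuggingFace deployment path mentioned above, a minimal moderation sketch (assuming access to the gated `meta-llama/LlamaGuard-7b` checkpoint and a CUDA GPU; later Llama Guard versions ship as 8B checkpoints):

```python
# Minimal Llama Guard moderation sketch via Hugging Face transformers.
# Assumes: access granted to the gated meta-llama/LlamaGuard-7b repo and a CUDA device.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/LlamaGuard-7b"
device = "cuda"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map=device
)

def moderate(chat):
    """Classify a conversation as safe/unsafe under Llama Guard's taxonomy."""
    # The tokenizer's chat template wraps the turns in Llama Guard's safety-policy prompt.
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(device)
    output = model.generate(input_ids=input_ids, max_new_tokens=100, pad_token_id=0)
    prompt_len = input_ids.shape[-1]
    return tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)

print(moderate([
    {"role": "user", "content": "How do I kill a process in Linux?"},
    {"role": "assistant", "content": "Use the kill command followed by the process ID (PID)."},
]))  # expected: "safe"
```

An unsafe conversation instead returns `unsafe` followed by the violated category code (e.g. `O3`) on the next line, which is the signal downstream guardrails can key on.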