Community Plugin

post-training

RLHF and preference alignment. Includes TRL (SFT, DPO, PPO, GRPO), GRPO (Group Relative Policy Optimization), OpenRLHF (Ray+vLLM acceleration), and SimPO (reference-free alignment). Use when aligning models with human preferences or training reward models.

1.0.0

Updated 1 month ago

Capabilities

Commands

Agents

Skills

Hooks

MCP Servers

LSP Servers

Install

Add the repository (one-time)

/plugin marketplace add zechenzhangAGI/AI-research-SKILLs

Install the plugin

/plugin install post-training@ai-research-skills

Component Details

No components detected in this plugin's metadata.

Stats

Stars: 746

Forks: 59

Maintenance: Good

Last Commit: 1 month ago