Evaluates code generation models on 15+ benchmarks, including HumanEval, MBPP, and MultiPL-E, with pass@k metrics. Use when benchmarking code models, comparing coding abilities, testing multi-language support, or measuring code generation quality. An industry-standard harness from the BigCode Project, used by Hugging Face leaderboards.
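For reference, pass@k scores of this kind are typically computed with the unbiased estimator from the Codex paper, given n generations per problem of which c pass the tests. A minimal Python sketch of that formula (the function name is illustrative, not the harness's own API):

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations (c correct), passes."""
    if n - c < k:
        # Every possible k-subset contains at least one correct sample.
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Example: 20 generations, 3 passing, estimate pass@10.
print(pass_at_k(n=20, c=3, k=10))
```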
/plugin marketplace add zechenzhangAGI/AI-research-SKILLs
/plugin install zechenzhangagi-evaluating-code-models-11-evaluation-bigcode-evaluation-harness@zechenzhangAGI/AI-research-SKILLs

Claude Agent SDK Development Plugin
Implementation of the Ralph Wiggum technique: continuous self-referential AI loops for iterative development. Runs Claude in a while-true loop with the same prompt until the task is complete.
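A minimal sketch of that loop in Python, assuming the Claude Code CLI is on PATH and accepts a non-interactive prompt via `claude -p`; the sentinel file used to detect completion is an illustrative assumption, not part of the plugin:

```python
import subprocess
from pathlib import Path

# The same prompt is fed to Claude on every iteration (the Ralph Wiggum loop).
PROMPT = "Work on the task in TODO.md. When it is fully done, write DONE to status.txt."
STATUS = Path("status.txt")  # hypothetical completion signal for this sketch

while True:
    # Non-interactive run of the Claude Code CLI; ignore per-run failures and retry.
    subprocess.run(["claude", "-p", PROMPT], check=False)
    if STATUS.exists() and STATUS.read_text().strip() == "DONE":
        break
```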
Comprehensive toolkit for developing Claude Code plugins. Includes 7 expert skills covering hooks, MCP integration, commands, agents, and best practices. AI-assisted plugin creation and validation.
Comprehensive PR review agents specializing in comments, tests, error handling, type design, code quality, and code simplification.