By Mapika
Catch and fix LLM 'style smells' in non-fiction prose — quality signals, not an AI-detection verdict.
npx claudepluginhub mapika/claude-plugins --plugin plainly
Use this agent when the user wants an AI-sounding non-fiction draft rewritten into clean human prose, not just diagnosed. Typical triggers include "deslop this", "fix the AI tells in this draft", "make this read more human", and "rewrite this so it doesn't sound like ChatGPT". See "When to invoke" in the agent body for worked scenarios.
Use this agent to blindly compare two versions of a piece of non-fiction prose on the same topic and decide which reads more like a human genuinely wrote it. Typically dispatched by the deslopper agent after editing, to confirm a rewrite actually reads more human — not just that its metrics moved. The caller passes both texts in randomized A/B order; this agent never sees which is the edit or any metrics.
Use this agent to check a piece of non-fiction prose against the plainly "constitution" of LLM writing tells and return the semantic ones a regex cannot catch — manufactured antithesis (including contractions like "isn't just X — it's Y"), inflated boosters, pervasive hedging, copula-avoidance, formulaic intros/conclusions, and vague abstraction. Typically dispatched by the /plainly:check command and the deslopper agent as their cheap semantic-matching pass. See "When to invoke" in the body.
Modifies files
Hook triggers on file write and edit operations
Uses power tools
Uses Bash, Write, or Edit tools
My AI first drafts all had the same tells: a few stock words (delve, robust, leverage), the "it's not just X, it's Y" reflex, the tidy list of three. I got tired of deleting them by hand, so I wrote plainly.
It's a Claude Code plugin. It scores non-fiction prose for the habits that make it read as machine-written, and offers to fix them. It is not an AI detector and I don't pitch it as one. These are just bad writing habits, whoever has them. Works on essays, docs, blog posts, marketing copy. Python 3.11+, no dependencies.
claude plugin marketplace add Mapika/plainly
claude plugin install plainly@plainly
That gives you the /plainly:check command, the deslopper agent, and a writing skill.
To run the engine on a file without the plugin:
python3 plainly/scripts/prescan.py draft.md
It watches on its own. A hook runs after Claude writes or edits any .md, .txt, or
.rst file. If the result reads AI-generated, plainly says so and offers to clean it.
Clean prose and code get no comment. Tune or switch it off in .plainly.toml.
/plainly:check draft.md prints findings by severity, each with the line, the bad
span, a reason, and a rewrite. It reads the file. It never edits.
python3 plainly/scripts/prescan.py draft.md --json # the raw scan
density is the number to watch: tell-weight per 100 words. Human prose sits near 0.
Slop runs into the teens.
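A rough sketch of what "tell-weight per 100 words" means, assuming the score is a plain weighted count divided by word count. The word weights and the `density` function here are hypothetical stand-ins for illustration, not plainly's actual list or code.

```python
import re

# Hypothetical weights, for illustration only. plainly's real list is
# documented in SOURCES.md and is not reproduced here.
TELL_WEIGHTS = {"delve": 1.0, "robust": 0.5, "leverage": 0.5}


def density(text: str, weights: dict[str, float] = TELL_WEIGHTS) -> float:
    """Return summed tell weight per 100 words, or 0.0 for empty text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    total = sum(weights.get(word, 0.0) for word in words)
    return 100.0 * total / len(words)


# 20 words carrying a combined weight of 2.0, so density is 10.0.
sample = (
    "We delve into a robust plan and leverage the data that we "
    "already have on hand for this small project"
)
print(density(sample))
```

Under these made-up weights, three stock words in a 20-word sentence already put the sample at 10, while a sentence with none of them scores 0.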
The deslopper agent edits the flagged spans and hands the text back. Just say
"deslop draft.md". Two things matter here. It works a paragraph at a time and reverts any
edit that adds a new tell, fails to lower the score, or flattens the rhythm, so the text
can't come out worse than it went in. And it finishes with a before/after scorecard plus a
blind judge: a cheap subagent that reads both versions in random order and picks the more
human one without seeing the scores. It only claims a win when both agree.
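The three revert conditions can be sketched as a single gate. This is a minimal illustration, not the agent's real code: `sentence_cv` and `keep_edit` are hypothetical names, the sentence splitter is deliberately naive, and using the population standard deviation is an assumption. The 0.9 tolerance is the documented default for burstiness_tolerance.

```python
import re
from statistics import mean, pstdev


def sentence_cv(text: str) -> float:
    """Coefficient of variation of sentence lengths, measured in words."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return pstdev(lengths) / mean(lengths)


def keep_edit(
    before_tells: set[str],
    after_tells: set[str],
    before_score: float,
    after_score: float,
    before_cv: float,
    after_cv: float,
    tolerance: float = 0.9,
) -> bool:
    """Keep a paragraph edit only if it passes all three checks."""
    adds_new_tell = bool(after_tells - before_tells)
    lowers_score = after_score < before_score
    keeps_rhythm = after_cv >= before_cv * tolerance
    return lowers_score and keeps_rhythm and not adds_new_tell


print(keep_edit({"delve"}, set(), 6.0, 2.0, 0.5, 0.48))
```

Because every rejected edit falls back to the original paragraph, the worst case under this gate is an unchanged text.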
The writing-clean-prose skill loads itself when Claude writes prose for you, so the
tells don't show up to begin with.
In CI:
python3 plainly/scripts/prescan.py --diff --fail-over 4 # fail if a changed .md scores over 4
Drop a .plainly.toml in your repo root.
| Section | Key | Meaning |
|---|---|---|
| [severity] | critical, moderate | Cluster weight per paragraph to reach each tier. |
| [rules] | em_dash | The em-dash check. Off by default (noisy). |
| [burstiness] | min_cv | Flag docs whose sentence-length variation drops below this. |
| [concreteness] | min_mean | Flag paragraphs below this mean concreteness (1=abstract, 5=concrete). |
| [genre] | default | prose, docs, or marketing. Sets how hard marketing tells count. |
| [allow] | terms | Words never flagged (say, a product named "Tapestry"). |
| [deslop] | judge | Run the blind judge after editing. On by default. |
| [deslop] | burstiness_tolerance | A rewrite's cv must stay at or above the original's times this (default 0.9). |
A stdlib engine does the counting: structural patterns (participle tails, "not just X,
it's Y", tricolons, announcement clichés, puffery), a weighted word list of measured AI
vocabulary and marketing buzzwords, evidence-backed emoji, sentence-length variation, and
paragraph concreteness against the Brysbaert lexicon. Every entry is sourced in
SOURCES.md. The [genre] setting decides how hard
marketing tells count, so a punchy word in a technical doc isn't scored like a LinkedIn
post. Then a Haiku subagent catches the semantic tells a regex misses, with no API key
because it runs inside Claude Code. Weights are low. The score is about how tells cluster,
not any single word.
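To make the clustering idea concrete, here is a small sketch of per-paragraph weighting with one structural pattern and a short word list. Everything in it is hypothetical: the regex, the weights, the thresholds, and the "clean" label are placeholders, and the real engine covers far more patterns than this.

```python
import re

# One hypothetical structural pattern: "not just X, it's Y" and close variants.
NOT_JUST = re.compile(
    r"\b(?:isn't|is not|it's not|not)\s+just\b[^.!?]*?[,;]\s*(?:it's|it is|but)\b",
    re.IGNORECASE,
)

# Placeholder weights and thresholds, chosen only to make the example readable.
WORD_WEIGHTS = {"delve": 1.0, "robust": 0.5, "leverage": 0.5}
PATTERN_WEIGHTS = [(NOT_JUST, 1.5)]
THRESHOLDS = {"critical": 3.0, "moderate": 1.5}


def cluster_weight(paragraph: str) -> float:
    """Sum word weights plus the weight of each structural pattern found."""
    words = re.findall(r"[a-z']+", paragraph.lower())
    weight = sum(WORD_WEIGHTS.get(word, 0.0) for word in words)
    weight += sum(w for pattern, w in PATTERN_WEIGHTS if pattern.search(paragraph))
    return weight


def tier(paragraph: str) -> str:
    """Map a paragraph's cluster weight onto a severity tier."""
    weight = cluster_weight(paragraph)
    if weight >= THRESHOLDS["critical"]:
        return "critical"
    if weight >= THRESHOLDS["moderate"]:
        return "moderate"
    return "clean"


print(tier("A robust cache helps here."))
print(tier("It's not just a tool, it's a way to delve into robust data."))
```

With these placeholder numbers a lone "robust" stays under every threshold, while the same word next to a "not just" construction and "delve" adds up to the top tier.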
I measured it instead of guessing. Full writeup in eval/.
Against 2022-era AI it separates human from machine well: AUC 0.82, 4.3% false positives, best on scientific writing. Against 2026 frontier models it drops to 0.67, because the obvious tells are fading. That is the honest result, and the reason I call it a quality linter and not a detector. The fixes hold up though: a blind judge outside the test set preferred the deslopped version 95% of the time, and plainly's score always moved in the same direction.
The eval also caught the tool lying to me. On fresh frontier prose the score went flat. A textbook LinkedIn post ("thrilled to share", "double down", "the best is yet to come") scored 0.00. So I dug up the actual evidence and taught it the marketing register. Same post, before and after:
Python 3.11+. No third-party packages.
MIT.