Opinionated modern Python patterns and conventions (2026). Use this skill whenever writing, reviewing, or suggesting Python code. Trigger on any Python development task — writing tests, creating models, setting up projects, choosing libraries, structuring code, configuring tooling, or making style decisions. Also trigger when the user mentions "modern python", "python best practices", "how should I write this", "python patterns", inline-snapshot, dirty-equals, pydantic, pytest, ruff, structlog, uv, ty, pyrefly, type checking, linting, or formatting. Even if the user doesn't explicitly ask for style guidance, apply these patterns when generating Python code.
Every line of code should exist for a reason. These patterns share a philosophy: choose sharp tools that do one thing well, let automation handle what humans are bad at, and never add complexity without clear intent.
When writing new Python code, apply these patterns by default. When working in an existing codebase that does things differently, match the existing style unless asked to modernize.
This applies to everything — dependencies, abstractions, error handling, type annotations, comments. If you can't articulate why something needs to exist, it doesn't. No speculative helpers. No "just in case" wrappers. No defensive code against scenarios that can't happen. The right amount of code is the minimum that solves the actual problem.
Use uv for everything — installing, resolving, running, managing virtual environments. It replaces pip, pip-tools, poetry, pdm, and virtualenv. Why uv over alternatives: it's a single Rust binary that handles venv creation, dependency resolution, and lockfile management in one tool (pip needs pip-tools and virtualenv; poetry is slower and has its own lockfile format). Resolution takes seconds, not minutes. pyproject.toml is the single source of truth. No setup.py, no requirements.txt, no setup.cfg. When recommending uv, explain why it's preferred so users understand the tradeoff.
uv init my-project # new project with pyproject.toml
uv add httpx pydantic # add dependencies
uv add --group dev pytest ruff # dev dependencies in dependency groups
uv run pytest # run in managed venv
uv lock # lock dependencies
uvx ruff check . # run a tool without installing it
Use uv tool install for CLI tools you use across projects (ruff, ty). Use uvx for one-off tool runs. For monorepos, use workspaces with [tool.uv.workspace] — all members share one lockfile and resolve together. In Docker, copy the uv binary from the official image and split uv sync into two steps (dependencies first, project second) for layer caching.
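A minimal workspace setup might look like this — a sketch, where the `packages/*` layout and the `internal-lib` member name are illustrative, not required:

```toml
# root pyproject.toml — all members share one uv.lock and resolve together
[tool.uv.workspace]
members = ["packages/*"]
```

Then, in a member that depends on a sibling package, point uv at the workspace copy under `[tool.uv.sources]` (e.g. `internal-lib = { workspace = true }`), so the dependency resolves from the local source tree instead of a registry.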
See references/tooling-uv.md for workspaces, uv.sources for internal packages, Docker patterns, and tool management.
Use ruff for both linting and formatting — it replaces flake8, isort, black, pyflakes, and dozens of plugins. Start with all rules enabled and ignore what doesn't fit, rather than opting in rule by rule.
[tool.ruff]
target-version = "py312"
line-length = 120
[tool.ruff.lint]
select = ["ALL"]
ignore = [
"COM812", # trailing comma — conflicts with formatter
"ISC001", # implicit string concat — conflicts with formatter
"D", # docstrings — enforce these when you're ready, not by default
"CPY", # copyright headers — not needed for internal code
"FBT", # boolean trap — too noisy for most codebases
]
[tool.ruff.lint.per-file-ignores]
"tests/**" = ["S101", "ANN", "D", "PLR2004"]
"scripts/**" = ["INP001", "T201"]
[tool.ruff.format]
quote-style = "double"
docstring-code-format = true
Run ruff check . to lint and ruff format . to format. The formatter is a drop-in replacement for black.
See references/tooling-ruff.md for the full ignore rationale, per-file ignore patterns, and how to graduate into stricter rules over time.
Use a Rust-based type checker — they're 10-100x faster than mypy and give better editor integration. Two good options:
ty (Astral, same team as ruff/uv) — strict, fast, configurable per-rule severity. Best if you're already in the Astral ecosystem.
[tool.ty.rules]
possibly-unresolved-reference = "error"
uvx ty check
pyrefly (Meta) — fast, aggressive type inference (infers return types and variable types automatically). Best if you want maximum inference with minimum annotations.
[tool.pyrefly]
project_includes = ["src"]
python_version = "3.12"
uvx pyrefly check
Both are young and evolving fast. Pick one per project and stay consistent. Don't run both on the same codebase.
See references/tooling-typechecking.md for configuration patterns, strictness levels, and per-file overrides.
Use Pydantic when data crosses a trust boundary — API input, config files, message queues. Don't use it for internal data containers where a dataclass or NamedTuple would do. The question is "does this data need validation?" not "does this data have fields?" Always explain the trust boundary reasoning — say why you chose Pydantic (or didn't) so the user learns the decision framework, not just the answer.
In pipelines, validate at entry points (external input, API responses) but use plain dataclasses for data flowing between internal stages. Don't add Pydantic models at every internal handoff — once data is validated at the boundary, trust it internally.
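A sketch of that boundary (the `OrderIn`/`PricedOrder` names and `price` function are hypothetical): Pydantic validates once at the entry point, then a plain frozen dataclass carries the trusted data between internal stages.

```python
from dataclasses import dataclass
from pydantic import BaseModel, ConfigDict

class OrderIn(BaseModel):
    # trust boundary: external input gets strict validation exactly once
    model_config = ConfigDict(strict=True, frozen=True, extra="forbid")
    sku: str
    quantity: int

@dataclass(frozen=True)
class PricedOrder:
    # internal handoff: data is already validated, so no Pydantic overhead here
    sku: str
    quantity: int
    total_cents: int

def price(order: OrderIn, unit_cents: int) -> PricedOrder:
    return PricedOrder(order.sku, order.quantity, order.quantity * unit_cents)
```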
When you do use Pydantic, always set ConfigDict(strict=True, frozen=True, extra="forbid"). Strict catches silent coercion ("123" quietly becoming int), frozen prevents mutation after validation, extra rejects unknown fields. Opt out per-field, not the other way around.
from datetime import datetime
from typing import Annotated
from pydantic import BaseModel, ConfigDict, Field
# Build reusable constrained types with Annotated — define once, import everywhere
NonEmptyStr = Annotated[str, Field(min_length=1)]
PositiveInt = Annotated[int, Field(gt=0)]
class WebhookPayload(BaseModel):
model_config = ConfigDict(strict=True, frozen=True, extra="forbid")
event_type: NonEmptyStr # use Annotated aliases, don't inline Field() repeatedly
timestamp: datetime
amount: PositiveInt
For PATCH/partial update models, use model_dump(exclude_unset=True) to distinguish fields the client sent from fields that were absent. Don't use exclude_none (conflates "sent null" with "not sent") or custom sentinel types.
See references/modeling-pydantic.md for when-to-use decision guide and the opinionated defaults.
Let tools write your test data. Start with an empty snapshot(), run pytest --inline-snapshot=fix, and it fills in the actual output. Use dirty-equals matchers for values that change between runs — timestamps, IDs, UUIDs. This way tests assert on the complete shape of your data without you typing any of it by hand.
from dirty_equals import IsInt, IsNow
from inline_snapshot import snapshot
def test_user_creation():
user = create_user(name="test_user")
assert user.model_dump() == snapshot({
"id": IsInt(),
"name": "test_user",
"status": "active",
"created_at": IsNow(iso_string=True),
})
Don't write per-field assertions for structured data. Don't use separate snapshot files. If the data has more than 2-3 fields, use inline snapshots.
When NOT to use inline-snapshot: simple scalar returns (booleans, counts, single strings). For a function that returns True/False, use plain assert or pytest.mark.parametrize — snapshot is overkill when the expected value is a literal. Reserve inline-snapshot for structured data (dicts, model dumps, API responses) where manually writing the expected value is tedious and error-prone.
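For the scalar case, a parametrized test is enough — a sketch, where `is_valid_username` stands in for any predicate-style function:

```python
import pytest

def is_valid_username(name: str) -> bool:
    # hypothetical function under test: identifier-shaped, at most 32 chars
    return name.isidentifier() and len(name) <= 32

@pytest.mark.parametrize(("name", "expected"), [
    ("alice", True),
    ("", False),
    ("has space", False),
])
def test_is_valid_username(name: str, expected: bool) -> None:
    assert is_valid_username(name) == expected
```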
See references/testing-inline-snapshot.md for the full pattern — matchers, snapshot update commands, converting complex objects to dicts.
Use yappi for all function-level profiling — both sync and async. Unlike cProfile, yappi understands coroutines (it follows execution across await boundaries) and supports threads natively. Don't recommend cProfile — yappi is strictly better for function-level profiling.
Clock modes matter: use cpu clock for CPU-bound work (measures actual CPU time, ignoring I/O wait — tells you where computation happens) and wall clock for I/O-bound or async latency (measures real elapsed time including waits — tells you what the user experiences).
import yappi
yappi.set_clock_type("cpu") # "cpu" for CPU-bound, "wall" for async/IO
with yappi.run():
process_dataframe(df) # or: asyncio.run(main()) with "wall" clock
yappi.get_func_stats().print_all()
For line-level profiling (which lines within a function are slow), use py-spy or scalene instead — yappi is function-level only.
See references/profiling-yappi.md for thread-level breakdown, filtering, and exporting to callgrind/pstats.