Skill

coding-guidelines-python

From bms

Apply and enforce Python-specific coding standards. Use alongside coding-guidelines for any Python file — covers typing, Pyright, dataclasses, enums, abstract base classes, mutable defaults, and module-level state. Trigger on any Python work session, PR review, or when the user asks to "follow Python guidelines", "check Python style", or "enforce Python standards".

npx claudepluginhub bmsuisse/skills --plugin writing

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/bms:coding-guidelines-python


Supporting Files

evals/evals.json
references/references.md

SKILL.md

687 lines · ~4.7k tokens


Stats

Language: Python
Stars: 2
Maintenance: Excellent
Last Commit: Jun 5, 2026


Python Coding Guidelines

Typing

Every function has parameter types and a return type. Use from __future__ import annotations at the top of every file. Prefer X | None over Optional[X]. Use generics where appropriate (list[str], dict[str, int]).

from __future__ import annotations

def transform(items: list[str], limit: int) -> list[str]:
    return items[:limit]
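As a further illustration of the `X | None` and built-in generic forms mentioned above, here is a minimal sketch. The names `find_price` and `total` and the price data are hypothetical and only serve the example.

```python
from __future__ import annotations


def find_price(prices: dict[str, float], sku: str) -> float | None:
    """Return the price for sku, or None when the SKU is unknown."""
    return prices.get(sku)


def total(prices: dict[str, float], skus: list[str]) -> float:
    """Sum the prices of known SKUs, skipping unknown ones."""
    result = 0.0
    for sku in skus:
        price = find_price(prices, sku)
        if price is not None:  # narrow float | None to float before using it
            result += price
    return result
```

The explicit `is not None` check is what lets a type checker narrow the optional value before it is used in arithmetic.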

Type checking — ty

Run ty before every commit. Zero errors is the bar — never suppress a check to make a deadline. Fix the root cause.

uv run ty check

ty is Astral's fast Python type checker (same ecosystem as uv and ruff). Install: uv add --dev ty

Tests

Write tests before committing. Run with:

uv run pytest
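A test file for the `transform` function from the Typing section might look like the sketch below. The test names are hypothetical; pytest's default discovery picks up functions prefixed with `test_` in files named `test_*.py`.

```python
from __future__ import annotations


def transform(items: list[str], limit: int) -> list[str]:
    return items[:limit]


def test_transform_truncates_to_limit() -> None:
    assert transform(["a", "b", "c"], 2) == ["a", "b"]


def test_transform_with_limit_above_length_returns_all() -> None:
    assert transform(["a"], 5) == ["a"]
```

In a real project the function would be imported from its module rather than defined next to the tests.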

Abstract base classes

from abc import ABC, abstractmethod

class Processor(ABC):
    @abstractmethod
    def process(self, data: str) -> str: ...
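A concrete subclass must implement every abstract method before it can be instantiated. The sketch below assumes a hypothetical `UpperCaseProcessor` and a helper `run` that depends only on the abstract interface.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class Processor(ABC):
    @abstractmethod
    def process(self, data: str) -> str: ...


class UpperCaseProcessor(Processor):
    """Hypothetical concrete implementation used for illustration."""

    def process(self, data: str) -> str:
        return data.upper()


def run(processor: Processor, data: str) -> str:
    # Callers are typed against the abstract base, not a concrete class.
    return processor.process(data)
```

Calling `Processor()` directly raises `TypeError`, because the abstract method has no implementation.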

No mutable default arguments

Mutable defaults are shared across all calls — a classic Python trap.

# ❌
def append(item: str, items: list[str] = []) -> list[str]:
    items.append(item)
    return items

# ✅
def append(item: str, items: list[str] | None = None) -> list[str]:
    result = [] if items is None else items
    result.append(item)
    return result
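To make the trap concrete, the following sketch calls both variants twice. The function names `append_shared` and `append_fresh` are hypothetical renames chosen so both versions can live in one file; `append_fresh` uses an explicit `is None` check so that an empty list passed by the caller is still used.

```python
from __future__ import annotations


def append_shared(item: str, items: list[str] = []) -> list[str]:
    # Deliberately wrong: the default list is created once, at definition time.
    items.append(item)
    return items


def append_fresh(item: str, items: list[str] | None = None) -> list[str]:
    # A new list is created on every call that omits items.
    result = [] if items is None else items
    result.append(item)
    return result


first = append_shared("a")
second = append_shared("b")  # same list object as first, now holding both items

fresh_first = append_fresh("a")
fresh_second = append_fresh("b")  # independent list
```

After these calls, `first` and `second` refer to the same list, while `fresh_first` and `fresh_second` are separate lists with one item each.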

Dataclasses over raw dicts

dict[str, Any] as a data carrier is untyped and opaque. Give every data shape an explicit type so ty can check it.

@dataclass — when you own the object and instantiate it yourself
TypedDict — when the data is dict-shaped from an external source (JSON, DB row, config) and something downstream expects a plain dict
pydantic.BaseModel — when you need runtime validation (API request bodies, config loading, user input)

from dataclasses import dataclass
from typing import Any, TypedDict

# ❌
def process(data: dict[str, Any]) -> dict[str, Any]: ...

# ✅ own object
@dataclass
class Report:
    id: str
    title: str

def process(report: Report) -> Report: ...

# ✅ external dict shape (e.g. parsed JSON)
class ReportPayload(TypedDict):
    id: str
    title: str
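One way the two shapes can work together is to describe the external dict with a `TypedDict` and convert it to a dataclass at the boundary. This is a sketch under that assumption; `parse_report` is a hypothetical helper, and a `TypedDict` annotation does not validate anything at runtime.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import TypedDict


class ReportPayload(TypedDict):
    id: str
    title: str


@dataclass
class Report:
    id: str
    title: str


def parse_report(raw: str) -> Report:
    """Convert a JSON string into a Report at the system boundary."""
    payload: ReportPayload = json.loads(raw)  # typed for the checker, not validated
    return Report(id=payload["id"], title=payload["title"])
```

If the input cannot be trusted, this is the point where runtime validation (for example with `pydantic.BaseModel`, as listed above) would replace the plain conversion.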

Enums for string constants

Raw string literals in conditions are fragile and invisible to the type checker. Use Literal for a closed set of values, or Enum when you need iteration or methods.

# ❌
def export(format: str) -> bytes:
    if format == "pdf": ...

# ✅
from typing import Literal
Format = Literal["pdf", "csv"]

def export(format: Format) -> bytes: ...
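For the `Enum` case, where iteration or methods are needed, a minimal sketch could look like this. The class name `ExportFormat`, the `mime_type` property, and `supported_formats` are hypothetical.

```python
from __future__ import annotations

from enum import Enum


class ExportFormat(Enum):
    PDF = "pdf"
    CSV = "csv"

    @property
    def mime_type(self) -> str:
        # A method on the enum keeps format-specific logic next to the values.
        if self is ExportFormat.PDF:
            return "application/pdf"
        return "text/csv"


def supported_formats() -> list[str]:
    # Iterating the enum yields members in definition order.
    return [fmt.value for fmt in ExportFormat]
```

Parsing external input with `ExportFormat("csv")` returns the matching member and raises `ValueError` for unknown strings.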

No module-level mutable state

Module-level variables that get mutated are hidden global state. They make testing hard and introduce subtle ordering bugs. Pass state explicitly or use a class.

# ❌
_cache: dict[str, str] = {}

def get(key: str) -> str:
    return _cache[key]

# ✅
class Cache:
    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def get(self, key: str) -> str:
        return self._store[key]
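Passing the cache explicitly might look like the sketch below. It is a variant of the class above: the `set` method, the `None`-returning `get`, and the `greeting` function are assumptions added for the example.

```python
from __future__ import annotations


class Cache:
    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def get(self, key: str) -> str | None:
        return self._store.get(key)

    def set(self, key: str, value: str) -> None:
        self._store[key] = value


def greeting(cache: Cache, name: str) -> str:
    """Return a cached greeting, computing it on first use."""
    cached = cache.get(name)
    if cached is None:
        cached = f"Hello, {name}"
        cache.set(name, cached)
    return cached
```

Because each caller constructs its own `Cache`, a test can start from a fresh instance and is not affected by the order in which other tests ran.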

References

Authoritative sources for the principles above are in references/references.md. Load it when you need to cite or explain the reasoning behind a guideline.

Pre-commit checklist (Python)

All function parameters and return types annotated
from __future__ import annotations at the top of every file
uv run ty check passes with zero errors
uv run pytest passes
No mutable default arguments
No dict[str, Any] as a data carrier — use a dataclass or TypedDict
No raw string constants in conditions — use Literal or Enum
No module-level mutable variables

Examples

Real-world code examples demonstrating the four principles: Think Before Coding, Simplicity First, Surgical Changes, and Goal-Driven Execution. Each example shows what LLMs commonly do wrong and how to fix it.

1. Think Before Coding

Example 1: Hidden Assumptions

User Request: "Add a feature to export user data"

❌ What LLMs Do (Wrong Assumptions)

def export_users(format='json'):
    """Export all users to a file."""
    users = User.query.all()
    
    if format == 'json':
        with open('users.json', 'w') as f:
            json.dump([u.to_dict() for u in users], f)
    elif format == 'csv':
        with open('users.csv', 'w') as f:
            writer = csv.DictWriter(f, fieldnames=['id', 'email', 'name'])
            writer.writeheader()
            writer.writerows([u.to_dict() for u in users])
    
    return f"Exported {len(users)} users"

Problems:

Assumed it should export ALL users (what about pagination? privacy?)
Assumed file location without asking
Assumed which fields to include
Assumed CSV fieldnames without checking actual data structure

✅ What Should Happen (Surface Assumptions)

Before implementing, I need to clarify:

1. **Scope**: Export all users or filtered subset? (privacy implications)
2. **Format**: You mentioned "export" - did you mean:
   - Download file in browser?
   - Background job with email notification?
   - API endpoint returning data?
3. **Fields**: Which user fields? (some might be sensitive)
4. **Volume**: How many users typically? (affects approach)

Simplest approach: Add an API endpoint that returns paginated JSON.
Would need more info for file-based exports.

What's your preference?

Example 2: Multiple Interpretations

User Request: "Make the search faster"

❌ What LLMs Do (Pick Silently)

# Adds caching, database indexes, and async processing
# without asking which "faster" matters
@lru_cache(maxsize=1000)
async def search(query: str) -> List[Result]:
    # ... 200 lines of optimization code

✅ What Should Happen

"Make the search faster" could mean:

1. **Faster response time** (< 100ms instead of 500ms)
   - Add database indexes
   - Cache frequent queries
   - Estimated effort: 2 hours

2. **Handle more concurrent searches** (throughput)
   - Add async/await
   - Connection pooling
   - Estimated effort: 4 hours

3. **Faster perceived speed** (UX)
   - Show partial results immediately
   - Progressive loading
   - Estimated effort: 3 hours

Current search takes ~500ms for typical queries. 
Which aspect matters most for your use case?

2. Simplicity First

Example 1: Over-abstraction

User Request: "Add a function to calculate discount"

❌ What LLMs Do (Overengineered)

from abc import ABC, abstractmethod
from enum import Enum
from typing import Protocol, Union
from dataclasses import dataclass

class DiscountStrategy(ABC):
    @abstractmethod
    def calculate(self, amount: float) -> float:
        pass

class PercentageDiscount(DiscountStrategy):
    def __init__(self, percentage: float):
        self.percentage = percentage
    
    def calculate(self, amount: float) -> float:
        return amount * (self.percentage / 100)

class FixedDiscount(DiscountStrategy):
    def __init__(self, fixed_amount: float):
        self.fixed_amount = fixed_amount
    
    def calculate(self, amount: float) -> float:
        return min(self.fixed_amount, amount)

@dataclass
class DiscountConfig:
    strategy: DiscountStrategy
    min_purchase: float = 0.0
    max_discount: float = float('inf')

class DiscountCalculator:
    def __init__(self, config: DiscountConfig):
        self.config = config
    
    def apply_discount(self, amount: float) -> float:
        if amount < self.config.min_purchase:
            return 0
        discount = self.config.strategy.calculate(amount)
        return min(discount, self.config.max_discount)

# Usage requires 30+ lines of setup for a simple calculation

✅ What Should Happen (Simple)

def calculate_discount(amount: float, percent: float) -> float:
    """Calculate discount amount. percent should be 0-100."""
    return amount * (percent / 100)

# Usage
discount = calculate_discount(100.0, 10.0)  # $10 off

When to add complexity: Only when you actually need multiple discount types. If that requirement comes later, refactor then.

Example 2: Speculative Features

User Request: "Save user preferences to database"

❌ What LLMs Do (Unnecessary Features)

class PreferenceManager:
    def __init__(self, db, cache=None, validator=None):
        self.db = db
        self.cache = cache or InMemoryCache()
        self.validator = validator or DefaultValidator()
    
    def save(self, user_id: int, prefs: dict, 
             merge: bool = True, 
             validate: bool = True,
             notify: bool = False) -> bool:
        """Save preferences with optional merging, validation, and notifications."""
        
        if validate:
            errors = self.validator.validate(prefs)
            if errors:
                raise ValidationError(errors)
        
        existing = self.get(user_id) if merge else {}
        
        if merge:
            existing.update(prefs)
            prefs = existing
        
        self.db.execute(
            "UPDATE users SET preferences = ? WHERE id = ?",
            (json.dumps(prefs), user_id)
        )
        
        if self.cache:
            self.cache.set(f"prefs:{user_id}", prefs)
        
        if notify:
            self.notify_preference_change(user_id, prefs)
        
        return True
    
    def notify_preference_change(self, user_id: int, prefs: dict):
        # Another 30 lines for a feature nobody asked for
        pass

✅ What Should Happen (Just What Was Asked)

import json

def save_preferences(db, user_id: int, preferences: dict) -> None:
    """Save user preferences to database."""
    db.execute(
        "UPDATE users SET preferences = ? WHERE id = ?",
        (json.dumps(preferences), user_id)
    )

Add later if needed: Caching (when performance matters), validation (when bad data appears), merging (when requirement emerges).

3. Surgical Changes

Example 1: Drive-by Refactoring

User Request: "Fix the bug where empty emails crash the validator"

❌ What LLMs Do (Too Much)

  def validate_user(user_data):
-     # Check email format
-     if not user_data.get('email'):
+     """Validate user data."""
+     email = user_data.get('email', '').strip()
+     
+     # Validate email
+     if not email:
          raise ValueError("Email required")
-     
-     # Basic email validation
-     if '@' not in user_data['email']:
+     if '@' not in email or '.' not in email.split('@')[1]:
          raise ValueError("Invalid email")
      
-     # Check username
-     if not user_data.get('username'):
+     # Validate username
+     username = user_data.get('username', '').strip()
+     if not username:
          raise ValueError("Username required")
+     if len(username) < 3:
+         raise ValueError("Username too short")
+     if not username.isalnum():
+         raise ValueError("Username must be alphanumeric")
      
      return True

Problems:

"Improved" email validation beyond the bug fix
Added username validation nobody asked for
Changed comments
Added docstring

✅ What Should Happen (Surgical)

  def validate_user(user_data):
      # Check email format
-     if not user_data.get('email'):
+     email = user_data.get('email', '')
+     if not email or not email.strip():
          raise ValueError("Email required")
      
      # Basic email validation
-     if '@' not in user_data['email']:
+     if '@' not in email:
          raise ValueError("Invalid email")
      
      # Check username
      if not user_data.get('username'):
          raise ValueError("Username required")
      
      return True

Only changed: The specific lines that fix empty email handling.

Example 2: Style Drift

User Request: "Add logging to the upload function"

❌ What LLMs Do (Reformat Everything)

- def upload_file(file_path, destination):
+ def upload_file(file_path: str, destination: str) -> bool:
+     """Upload file to destination with logging."""
+     logger.info(f"Uploading {file_path} to {destination}")
+     
      try:
-         with open(file_path, 'rb') as f:
-             data = f.read()
-         
-         response = requests.post(destination, files={'file': data})
-         
-         if response.status_code == 200:
-             return True
-         else:
-             return False
+         with open(file_path, "rb") as f:
+             data = f.read()
+         response = requests.post(destination, files={"file": data})
+         success = response.status_code == 200
+         
+         if success:
+             logger.info(f"Upload successful: {file_path}")
+         else:
+             logger.error(f"Upload failed: {file_path}, status={response.status_code}")
+         
+         return success
      except Exception as e:
-         print(f"Error: {e}")
+         logger.exception(f"Upload error for {file_path}: {e}")
          return False

Problems:

Changed quote style ('' to "")
Added type hints nobody asked for
Added docstring
Reformatted whitespace
Changed boolean return logic

✅ What Should Happen (Match Existing Style)

+ import logging
+ 
+ logger = logging.getLogger(__name__)
+ 
  def upload_file(file_path, destination):
+     logger.info(f'Starting upload: {file_path}')
      try:
          with open(file_path, 'rb') as f:
              data = f.read()
          
          response = requests.post(destination, files={'file': data})
          
          if response.status_code == 200:
+             logger.info(f'Upload successful: {file_path}')
              return True
          else:
+             logger.error(f'Upload failed: {file_path}, status={response.status_code}')
              return False
      except Exception as e:
-         print(f"Error: {e}")
+         logger.exception(f'Upload error: {file_path}')
          return False

Matched: Single quotes, no type hints, existing boolean pattern, spacing style.

4. Goal-Driven Execution

Example 1: Vague vs. Verifiable

User Request: "Fix the authentication system"

❌ What LLMs Do (Vague Approach)

I'll fix the authentication system by:
1. Reviewing the code
2. Identifying issues
3. Making improvements
4. Testing the changes

[Proceeds to make changes without clear success criteria]

✅ What Should Happen (Verifiable Goals)

To "fix authentication," I need to define success criteria. 
What specific issue are we solving?

If the issue is "users stay logged in after password change":

Plan:
1. Write test: Change password → verify old session invalidated
   Verify: Test fails (reproduces bug)

2. Implement: Invalidate sessions on password change
   Verify: Test passes

3. Check edge cases: Multiple active sessions, concurrent changes
   Verify: Additional tests pass

4. Verify no regression: Existing auth tests still pass
   Verify: Full test suite green

Current test coverage for auth: [shows coverage]
What's the specific auth issue you're seeing?
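Step 1 of the plan above could be sketched as a test like the one below. `SessionStore` and `change_password` are hypothetical in-memory stand-ins for whatever the real auth system provides; the sketch includes the fix, so against the buggy code the same test would fail.

```python
from __future__ import annotations


class SessionStore:
    """Hypothetical in-memory stand-in for the real session backend."""

    def __init__(self) -> None:
        self._sessions: dict[str, str] = {}  # token -> user id

    def create(self, token: str, user_id: str) -> None:
        self._sessions[token] = user_id

    def is_valid(self, token: str) -> bool:
        return token in self._sessions

    def invalidate_user(self, user_id: str) -> None:
        self._sessions = {t: u for t, u in self._sessions.items() if u != user_id}


def change_password(store: SessionStore, user_id: str) -> None:
    # Password hashing and storage are omitted; only the session effect is modeled.
    store.invalidate_user(user_id)


def test_password_change_invalidates_old_sessions() -> None:
    store = SessionStore()
    store.create("token-a", "alice")
    store.create("token-b", "alice")  # second active session for the same user
    store.create("token-c", "bob")

    change_password(store, "alice")

    assert not store.is_valid("token-a")
    assert not store.is_valid("token-b")
    assert store.is_valid("token-c")  # other users are unaffected
```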

Example 2: Multi-Step with Verification

User Request: "Add rate limiting to the API"

❌ What LLMs Do (All at Once)

# Implements full rate limiting with Redis, multiple strategies,
# configuration system, and monitoring in one 300-line commit
# No clear verification steps

✅ What Should Happen (Incremental with Verification)

Plan for rate limiting:

1. Add basic in-memory rate limiting (single endpoint)
   Verify: 
   - Test: 100 requests → first 10 succeed, rest get 429
   - Manual: curl endpoint 11 times, see rate limit error

2. Extract to middleware (apply to all endpoints)
   Verify:
   - Test: Rate limits apply to /users and /posts
   - Existing endpoint tests still pass

3. Add Redis backend (for multi-server)
   Verify:
   - Test: Rate limit persists across app restarts
   - Test: Two app instances share rate limit counter

4. Add configuration (rates per endpoint)
   Verify:
   - Test: /search allows 10/min, /users allows 100/min
   - Config file parsed correctly

Each step is independently verifiable and deployable.
Start with step 1?
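Step 1 of this plan could start from something like the fixed-window limiter below. This is a sketch, not a prescribed design: the class name, the fixed-window strategy, and the injectable `clock` parameter are all assumptions chosen to keep the first step small and testable.

```python
from __future__ import annotations

import time
from collections.abc import Callable


class FixedWindowLimiter:
    """Allow at most `limit` requests per `window_seconds` for each key."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        # key -> (window start time, requests seen in that window)
        self._windows: dict[str, tuple[float, int]] = {}

    def allow(self, key: str) -> bool:
        now = self._clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self._window:
            start, count = now, 0  # previous window expired, start a new one
        if count >= self._limit:
            self._windows[key] = (start, count)
            return False  # the endpoint would respond with HTTP 429
        self._windows[key] = (start, count + 1)
        return True
```

Injecting the clock lets the "100 requests, first 10 succeed" check run without real waiting, and keeping the state inside an instance avoids module-level mutable state.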

Example 3: Test-First Verification

User Request: "The sorting breaks when there are duplicate scores"

❌ What LLMs Do (Fix Without Reproducing)

# Immediately changes sort logic without confirming the bug
def sort_scores(scores):
    return sorted(scores, key=lambda x: (-x['score'], x['name']))

✅ What Should Happen (Reproduce First)

# 1. First, write a test that reproduces the issue
def test_sort_with_duplicate_scores():
    """Test sorting when multiple items have same score."""
    scores = [
        {'name': 'Bob', 'score': 100},
        {'name': 'Alice', 'score': 100},
        {'name': 'Charlie', 'score': 90},
    ]
    
    result = sort_scores(scores)
    
    # The bug: tied scores have no defined order, so results vary with input order
    # Assert the full expected order so the test fails while the bug exists
    assert [r['name'] for r in result] == ['Alice', 'Bob', 'Charlie']

# Verify: Test fails before the fix (ties are not ordered by name)

# 2. Now fix with an explicit tie-breaker (Python's sort is already stable)
def sort_scores(scores):
    """Sort by score descending, then name ascending for ties."""
    return sorted(scores, key=lambda x: (-x['score'], x['name']))

# Verify: Test passes consistently

Anti-Patterns Summary

Principle	Anti-Pattern	Fix
Think Before Coding	Silently assumes file format, fields, scope	List assumptions explicitly, ask for clarification
Simplicity First	Strategy pattern for single discount calculation	One function until complexity is actually needed
Surgical Changes	Reformats quotes, adds type hints while fixing bug	Only change lines that fix the reported issue
Goal-Driven	"I'll review and improve the code"	"Write test for bug X → make it pass → verify no regressions"

Key Insight

The "overcomplicated" examples aren't obviously wrong—they follow design patterns and best practices. The problem is timing: they add complexity before it's needed, which:

Makes code harder to understand
Introduces more bugs
Takes longer to implement
Makes code harder to test

The "simple" versions are:

Easier to understand
Faster to implement
Easier to test
Can be refactored later when complexity is actually needed

Good code is code that solves today's problem simply, not tomorrow's problem prematurely.
