Skill

policyengine-code-style

PolicyEngine code writing style guide - formula optimization, direct returns, eliminating unnecessary variables

Install

Run in your terminal

npx claudepluginhub policyengine/policyengine-claude --plugin data-science

Tool Access

This skill uses the workspace's default tool permissions.

Skill Content

Similar Skills

cache-components

Guides Next.js Cache Components and Partial Prerendering (PPR) with cacheComponents enabled. Implements 'use cache', cacheLife(), cacheTag(), revalidateTag(), static/dynamic optimization, and cache debugging.

cache-components

138.6k

claude-opus-4-5-migration

2 files

Migrates code, prompts, and API calls from Claude Sonnet 4.0/4.5 or Opus 4.1 to Opus 4.5, updating model strings on Anthropic, AWS, GCP, Azure platforms.

claude-opus-4-5-migration

83.2k

hybrid-cloud-networking

1 file

Configures VPN and dedicated connections like Direct Connect, ExpressRoute, Interconnect for secure on-premises to AWS, Azure, GCP, OCI hybrid networking.

cloud-infrastructure

33.0k

Stats

Stars26

Forks5

Last CommitMar 26, 2026

Actions

View Source View Plugin View on GitHub View README

PolicyEngine Code Writing Style Guide

Essential patterns for writing clean, efficient PolicyEngine formulas.

Core Principles

Eliminate unnecessary intermediate variables
Use direct parameter/variable access
Return directly when possible
Combine boolean logic
Use correct period access (period vs period.this_year)
NO hardcoded values - use parameters or constants

Pattern 1: Direct Parameter Access

❌ Bad - Unnecessary intermediate variable

def formula(spm_unit, period, parameters):
    countable = spm_unit("tn_tanf_countable_resources", period)
    p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
    resource_limit = p.amount  # ❌ Unnecessary
    return countable <= resource_limit

✅ Good - Direct access

def formula(spm_unit, period, parameters):
    countable = spm_unit("tn_tanf_countable_resources", period)
    p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
    return countable <= p.amount

Pattern 2: Direct Return

❌ Bad - Unnecessary result variable

def formula(spm_unit, period, parameters):
    assets = spm_unit("spm_unit_assets", period.this_year)
    p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
    vehicle_exemption = p.vehicle_exemption  # ❌ Unnecessary
    countable = max_(assets - vehicle_exemption, 0)  # ❌ Unnecessary
    return countable

✅ Good - Direct return

def formula(spm_unit, period, parameters):
    assets = spm_unit("spm_unit_assets", period.this_year)
    p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
    return max_(assets - p.vehicle_exemption, 0)

Pattern 3: Combined Boolean Logic

❌ Bad - Too many intermediate booleans

def formula(spm_unit, period, parameters):
    person = spm_unit.members
    age = person("age", period.this_year)
    is_disabled = person("is_disabled", period.this_year)

    caretaker_is_60_or_older = spm_unit.any(age >= 60)  # ❌ Unnecessary
    caretaker_is_disabled = spm_unit.any(is_disabled)   # ❌ Unnecessary
    eligible = caretaker_is_60_or_older | caretaker_is_disabled  # ❌ Unnecessary

    return eligible

✅ Good - Combined logic

def formula(spm_unit, period, parameters):
    person = spm_unit.members
    age = person("age", period.this_year)
    is_disabled = person("is_disabled", period.this_year)

    return spm_unit.any((age >= 60) | is_disabled)

Pattern 4: Period Access - period vs period.this_year

# ❌ Bad
age = person("age", period)  # Gives age/12
monthly_income = person("employment_income", period.this_year) / MONTHS_IN_YEAR  # Redundant

# ✅ Good
age = person("age", period.this_year)  # Gets actual age
monthly_income = person("employment_income", period)  # Auto-converts to monthly

Income/flows → Use period (want monthly from annual)
Age/assets/counts/booleans → Use period.this_year (don't divide by 12)

See policyengine-period-patterns skill for the full decision tree and more examples.

Pattern 5: No Hardcoded Values

❌ Bad - Hardcoded numbers

def formula(spm_unit, period, parameters):
    size = spm_unit.nb_persons()
    capped_size = min_(size, 10)  # ❌ Hardcoded

    age = person("age", period.this_year)
    income = person("income", period) / 12  # ❌ Use MONTHS_IN_YEAR

    # ❌ Hardcoded thresholds
    if age >= 18 and age <= 65 and income < 2000:
        return True

✅ Good - Parameterized

def formula(spm_unit, period, parameters):
    p = parameters(period).gov.program
    capped_size = min_(spm_unit.nb_persons(), p.max_unit_size)  # ✅

    age = person("age", period.this_year)
    monthly_income = person("income", period)  # ✅ Auto-converts (no manual /12)

    age_eligible = (age >= p.age_min) & (age <= p.age_max)  # ✅
    income_eligible = monthly_income < p.income_threshold  # ✅

    return age_eligible & income_eligible

Use Framework Constants, Not Custom Parameters

PolicyEngine provides constants for universal conversion factors. Don't create parameters for these:

# ❌ BAD — created a weeks_per_month.yaml parameter:
weekly_subsidy * p.weeks_per_month

# ✅ GOOD — use framework constants:
weekly_subsidy * (WEEKS_IN_YEAR / MONTHS_IN_YEAR)

Available constants: MONTHS_IN_YEAR (12), WEEKS_IN_YEAR (52). Derive others from these.

CRITICAL: WEEKS_IN_YEAR is the integer 52, not 52.1429. Never use WEEKS_IN_YEAR * 7 for days per year — that gives 364, not 365. Use the literal 365 for days-per-year calculations.

Regulatory conversion factors take precedence. When a regulation cites a specific factor (e.g., "multiply by 4.3"), use that exact value even if a framework constant is close (WEEKS_IN_YEAR / MONTHS_IN_YEAR ≈ 4.333). The discrepancy may affect benefit amounts at boundary conditions.

Pattern 6: Streamline Variable Access

❌ Bad - Redundant steps

def formula(spm_unit, period, parameters):
    unit_size = spm_unit.nb_persons()  # ❌ Unnecessary
    max_size = 10  # ❌ Hardcoded
    capped_size = min_(unit_size, max_size)

    p = parameters(period).gov.states.tn.dhs.tanf.benefit
    spa = p.standard_payment_amount[capped_size]  # ❌ Unnecessary
    dgpa = p.differential_grant_payment_amount[capped_size]  # ❌ Unnecessary

    eligible = spm_unit("eligible_for_dgpa", period)
    return where(eligible, dgpa, spa)

✅ Good - Streamlined

def formula(spm_unit, period, parameters):
    p = parameters(period).gov.states.tn.dhs.tanf.benefit
    capped_size = min_(spm_unit.nb_persons(), p.max_unit_size)
    eligible = spm_unit("eligible_for_dgpa", period)

    return where(
        eligible,
        p.differential_grant_payment_amount[capped_size],
        p.standard_payment_amount[capped_size]
    )

When to Keep Intermediate Variables

✅ Keep when value is used multiple times

def formula(tax_unit, period, parameters):
    p = parameters(period).gov.irs.credits
    filing_status = tax_unit("filing_status", period)

    # ✅ Used multiple times - keep as variable
    threshold = p.phase_out.start[filing_status]

    income = tax_unit("adjusted_gross_income", period)
    excess = max_(0, income - threshold)
    reduction = (excess / p.phase_out.width) * threshold

    return max_(0, threshold - reduction)

✅ Keep when calculation is complex

def formula(spm_unit, period, parameters):
    p = parameters(period).gov.program
    gross_earned = spm_unit("gross_earned_income", period)

    # ✅ Complex multi-step calculation - break it down
    work_expense_deduction = min_(gross_earned * p.work_expense_rate, p.work_expense_max)
    after_work_expense = gross_earned - work_expense_deduction

    earned_disregard = after_work_expense * p.earned_disregard_rate
    countable_earned = after_work_expense - earned_disregard

    dependent_care = spm_unit("dependent_care_expenses", period)

    return max_(0, countable_earned - dependent_care)

✅ Break out complex expressions inside function calls

Don't inline complex calculations inside where(), max_(), or other function calls - give them descriptive names.

# ❌ BAD - Complex expression inlined in where()
return where(
    above_trigger,
    reduced_payment,
    max_(maximum_benefit - countable_income, 0),  # Hard to read
)

# ✅ GOOD - Break out into named variable
standard_payment = max_(maximum_benefit - countable_income, 0)
return where(
    above_trigger,
    reduced_payment,
    standard_payment,  # Clear what this represents
)

Another example:

# ❌ BAD - Multiple complex inlined expressions
return where(
    income > add(spm_unit, period, ["earned", "unearned"]) * p.rate,
    max_(benefit - (income * p.reduction_rate), 0),
    benefit,
)

# ✅ GOOD - Named variables explain the logic
gross_income = add(spm_unit, period, ["earned", "unearned"])
income_threshold = gross_income * p.rate
reduced_benefit = max_(benefit - (income * p.reduction_rate), 0)

return where(
    income > income_threshold,
    reduced_benefit,
    benefit,
)

Rule: If it's more than a simple variable or parameter access, give it a name.

Pattern 8: Use `add() > 0` Instead of `spm_unit.any()`

When checking if ANY member has a boolean property, use add() > 0 instead of spm_unit.members + spm_unit.any().

# ❌ LESS PREFERRED - verbose pattern:
person = spm_unit.members
has_citizen = spm_unit.any(
    person("is_citizen_or_legal_immigrant", period)
)

# ✅ BETTER - cleaner add() > 0 pattern:
immigration_eligible = add(spm_unit, period, ["is_citizen_or_legal_immigrant"]) > 0

Why this is better:

Avoids intermediate person = spm_unit.members variable
Consistent with add() patterns used elsewhere
More descriptive variable name (immigration_eligible vs has_citizen)
Single line instead of multiple

More examples:

# Check if any member is disabled
has_disabled_member = add(spm_unit, period, ["is_disabled"]) > 0

# Check if any member is elderly
has_elderly_member = add(spm_unit, period, ["is_elderly"]) > 0

# Check if any child is present
has_child = add(spm_unit, period, ["is_child"]) > 0

Pattern 9: Use Existing SPMUnit-Level Variables Directly

If an spm_unit-level variable exists, use it directly. Don't access through person level.

# ❌ BAD - Unnecessary person-level access when spm_unit variable exists:
person = spm_unit.members
demographic = person("is_person_demographic_tanf_eligible", period)
# Then aggregating back to spm_unit...

# ✅ GOOD - Use the spm_unit-level variable directly:
demographic_eligible = spm_unit("is_demographic_tanf_eligible", period)

Before writing code, check:

Does an spm_unit-level variable already exist?
If yes, use spm_unit("variable_name", period) directly
Only use spm_unit.members when you need person-level data that must be aggregated

# ✅ GOOD - Using existing spm_unit variables:
income_eligible = spm_unit("ar_tea_income_eligible", period)
resource_eligible = spm_unit("ar_tea_resource_eligible", period)
demographic_eligible = spm_unit("is_demographic_tanf_eligible", period)

return income_eligible & resource_eligible & demographic_eligible

Complete Example: Before vs After

❌ Before - Multiple Issues

def formula(person, period, parameters):
    # Wrong period access
    age = person("age", period)  # ❌ age/12
    assets = person("assets", period)  # ❌ assets/12
    annual_income = person("employment_income", period.this_year)
    monthly_income = annual_income / 12  # ❌ Use MONTHS_IN_YEAR

    # Hardcoded values
    min_age = 18  # ❌
    max_age = 64  # ❌
    asset_limit = 10000  # ❌
    income_limit = 2000  # ❌

    # Unnecessary intermediate variables
    age_check = (age >= min_age) & (age <= max_age)
    asset_check = assets <= asset_limit
    income_check = monthly_income <= income_limit
    eligible = age_check & asset_check & income_check

    return eligible

✅ After - Clean and Correct

def formula(person, period, parameters):
    p = parameters(period).gov.program.eligibility

    # Correct period access
    age = person("age", period.this_year)
    assets = person("assets", period.this_year)
    monthly_income = person("employment_income", period)

    # Direct return with combined logic
    return (
        (age >= p.age_min) & (age <= p.age_max) &
        (assets <= p.asset_limit) &
        (monthly_income <= p.income_threshold)
    )

Pattern 7: Minimal Comments

Code Should Be Self-Documenting

Variable names and structure should explain the code - not comments.

❌ Bad - Verbose explanatory comments

def formula(spm_unit, period, parameters):
    # Wisconsin disregards all earned income of dependent children (< 18)
    # Calculate earned income for adults only
    is_adult = spm_unit.members("age", period.this_year) >= 18  # Hard-coded!
    adult_earned = spm_unit.sum(
        spm_unit.members("tanf_gross_earned_income", period) * is_adult
    )

    # All unearned income is counted (including children's)
    gross_unearned = add(spm_unit, period, ["tanf_gross_unearned_income"])

    # NOTE: Wisconsin disregards many additional income sources that
    # are not separately tracked in PolicyEngine (educational aid, etc.)
    return max_(total_income - disregards, 0)

✅ Good - Clean self-documenting code

def formula(spm_unit, period, parameters):
    p = parameters(period).gov.states.wi.dcf.tanf.income

    is_adult = spm_unit.members("age", period.this_year) >= p.adult_age_threshold
    adult_earned = spm_unit.sum(
        spm_unit.members("tanf_gross_earned_income", period) * is_adult
    )
    gross_unearned = add(spm_unit, period, ["tanf_gross_unearned_income"])
    child_support = add(spm_unit, period, ["child_support_received"])

    return max_(adult_earned + gross_unearned - child_support, 0)

Comment Rules

NO comments explaining what code does - variable names should be clear
OK: Brief NOTE about implementation decisions (one line):
```
# NOTE: Disregard rate varies by calendar month
```
NO multi-line explanations of what the code calculates

Pattern 10: Remove Redundant Operations

Don't cap what's already capped

When a formula algebraically guarantees result <= upper_bound, remove the redundant min_():

# ❌ BAD — redundant (payment_standard - positive_income ≤ payment_standard always):
benefit = min_(max_(payment_standard - countable_income, 0), payment_standard)

# ✅ GOOD — max_(x, 0) already ensures result ≤ payment_standard:
benefit = max_(payment_standard - countable_income, 0)

Only use min_() when the bound can actually bind (e.g., capping subsidy at actual expenses or provider charges).

Quick Checklist

Before finalizing code:

policyengine-code-style

policyengine-code-style

PolicyEngine Code Writing Style Guide

Core Principles

Pattern 1: Direct Parameter Access

❌ Bad - Unnecessary intermediate variable

✅ Good - Direct access

Pattern 2: Direct Return

❌ Bad - Unnecessary result variable

✅ Good - Direct return

Pattern 3: Combined Boolean Logic

❌ Bad - Too many intermediate booleans

✅ Good - Combined logic

Pattern 4: Period Access - period vs period.this_year

Pattern 5: No Hardcoded Values

❌ Bad - Hardcoded numbers

✅ Good - Parameterized

Use Framework Constants, Not Custom Parameters

Pattern 6: Streamline Variable Access

❌ Bad - Redundant steps

✅ Good - Streamlined

When to Keep Intermediate Variables

✅ Keep when value is used multiple times

✅ Keep when calculation is complex

✅ Break out complex expressions inside function calls

Pattern 8: Use add() > 0 Instead of spm_unit.any()

Pattern 9: Use Existing SPMUnit-Level Variables Directly

Complete Example: Before vs After

❌ Before - Multiple Issues

✅ After - Clean and Correct

Pattern 7: Minimal Comments

Code Should Be Self-Documenting

❌ Bad - Verbose explanatory comments

✅ Good - Clean self-documenting code

Comment Rules

Pattern 10: Remove Redundant Operations

Don't cap what's already capped

Quick Checklist

PolicyEngine Code Writing Style Guide

Core Principles

Pattern 1: Direct Parameter Access

❌ Bad - Unnecessary intermediate variable

✅ Good - Direct access

Pattern 2: Direct Return

❌ Bad - Unnecessary result variable

✅ Good - Direct return

Pattern 3: Combined Boolean Logic

❌ Bad - Too many intermediate booleans

✅ Good - Combined logic

Pattern 4: Period Access - period vs period.this_year

Pattern 5: No Hardcoded Values

❌ Bad - Hardcoded numbers

✅ Good - Parameterized

Use Framework Constants, Not Custom Parameters

Pattern 6: Streamline Variable Access

❌ Bad - Redundant steps

✅ Good - Streamlined

When to Keep Intermediate Variables

✅ Keep when value is used multiple times

✅ Keep when calculation is complex

✅ Break out complex expressions inside function calls

Pattern 8: Use add() > 0 Instead of spm_unit.any()

Pattern 9: Use Existing SPMUnit-Level Variables Directly

Complete Example: Before vs After

❌ Before - Multiple Issues

✅ After - Clean and Correct

Pattern 7: Minimal Comments

Code Should Be Self-Documenting

❌ Bad - Verbose explanatory comments

✅ Good - Clean self-documenting code

Comment Rules

Pattern 10: Remove Redundant Operations

Don't cap what's already capped

Quick Checklist

Pattern 8: Use `add() > 0` Instead of `spm_unit.any()`

Pattern 8: Use `add() > 0` Instead of `spm_unit.any()`