PolicyEngine code writing style guide - formula optimization, direct returns, eliminating unnecessary variables
Essential patterns for writing clean, efficient PolicyEngine formulas.
# ❌ Bad
def formula(spm_unit, period, parameters):
    countable = spm_unit("tn_tanf_countable_resources", period)
    p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
    resource_limit = p.amount  # ❌ Unnecessary
    return countable <= resource_limit

# ✅ Good
def formula(spm_unit, period, parameters):
    countable = spm_unit("tn_tanf_countable_resources", period)
    p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
    return countable <= p.amount
# ❌ Bad
def formula(spm_unit, period, parameters):
    assets = spm_unit("spm_unit_assets", period.this_year)
    p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
    vehicle_exemption = p.vehicle_exemption  # ❌ Unnecessary
    countable = max_(assets - vehicle_exemption, 0)  # ❌ Unnecessary
    return countable

# ✅ Good
def formula(spm_unit, period, parameters):
    assets = spm_unit("spm_unit_assets", period.this_year)
    p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
    return max_(assets - p.vehicle_exemption, 0)
# ❌ Bad
def formula(spm_unit, period, parameters):
    person = spm_unit.members
    age = person("age", period.this_year)
    is_disabled = person("is_disabled", period.this_year)
    caretaker_is_60_or_older = spm_unit.any(age >= 60)  # ❌ Unnecessary
    caretaker_is_disabled = spm_unit.any(is_disabled)  # ❌ Unnecessary
    eligible = caretaker_is_60_or_older | caretaker_is_disabled  # ❌ Unnecessary
    return eligible

# ✅ Good
def formula(spm_unit, period, parameters):
    person = spm_unit.members
    age = person("age", period.this_year)
    is_disabled = person("is_disabled", period.this_year)
    return spm_unit.any((age >= 60) | is_disabled)
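The combined form works because "any member is 60+ OR any member is disabled" is the same as "any member is 60+ or disabled". A minimal sketch of that equivalence in plain Python follows; `unit_any` is a hypothetical stand-in for `spm_unit.any()`, not PolicyEngine's implementation, and the two-member household is made up.

```python
from itertools import product

def unit_any(member_flags):
    """Hypothetical stand-in for spm_unit.any(): True if any member flag is True."""
    return any(member_flags)

def separate_checks(age_60_plus, is_disabled):
    # Two aggregations, combined at the unit level afterwards.
    return unit_any(age_60_plus) or unit_any(is_disabled)

def combined_check(age_60_plus, is_disabled):
    # One aggregation over the member-level OR.
    return unit_any(a or d for a, d in zip(age_60_plus, is_disabled))

# Exhaustively compare both forms for every two-member household.
all_flags = list(product([False, True], repeat=2))
always_equal = all(
    separate_checks(ages, disabled) == combined_check(ages, disabled)
    for ages in all_flags
    for disabled in all_flags
)
print(always_equal)
```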
# ❌ Bad
age = person("age", period) # Gives age/12
monthly_income = person("employment_income", period.this_year) / MONTHS_IN_YEAR # Redundant
# ✅ Good
age = person("age", period.this_year) # Gets actual age
monthly_income = person("employment_income", period) # Auto-converts to monthly
Use period when you want a monthly amount from an annual variable, such as income. Use period.this_year for values such as age that must not be divided by 12. See the policyengine-period-patterns skill for the full decision tree and more examples.
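The effect described above can be sketched with a toy model. This is only an illustration under the assumption stated in the comments above (a yearly variable requested for a month comes back divided by 12); `read_annual_variable` is a hypothetical helper, not a PolicyEngine function, and the numbers are made up.

```python
MONTHS_IN_YEAR = 12  # same value as the framework constant named in this guide

def read_annual_variable(annual_value, requested_period):
    """Hypothetical model of reading a yearly variable.

    requested_period is "month" or "year". A monthly request returns the
    annual value divided by 12; a yearly request returns it unchanged.
    """
    if requested_period == "month":
        return annual_value / MONTHS_IN_YEAR
    return annual_value

age = 36
employment_income = 24_000

age_wrong = read_annual_variable(age, "month")   # age / 12, not a real age
age_right = read_annual_variable(age, "year")    # the actual age
monthly_income = read_annual_variable(employment_income, "month")

print(age_wrong, age_right, monthly_income)
```

Dividing `monthly_income` by 12 again would count the conversion twice, which is why the manual division is flagged as redundant.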
# ❌ Bad
def formula(spm_unit, period, parameters):
    size = spm_unit.nb_persons()
    capped_size = min_(size, 10)  # ❌ Hardcoded
    person = spm_unit.members
    age = person("age", period.this_year)
    income = person("income", period) / 12  # ❌ Hardcoded 12 (and period already converts to monthly)
    # ❌ Hardcoded thresholds
    if age >= 18 and age <= 65 and income < 2000:
        return True

# ✅ Good
def formula(spm_unit, period, parameters):
    p = parameters(period).gov.program
    capped_size = min_(spm_unit.nb_persons(), p.max_unit_size)  # ✅
    person = spm_unit.members
    age = person("age", period.this_year)
    monthly_income = person("income", period)  # ✅ Auto-converts (no manual /12)
    age_eligible = (age >= p.age_min) & (age <= p.age_max)  # ✅
    income_eligible = monthly_income < p.income_threshold  # ✅
    return age_eligible & income_eligible
PolicyEngine provides constants for universal conversion factors. Don't create parameters for these:
# ❌ BAD — created a weeks_per_month.yaml parameter:
weekly_subsidy * p.weeks_per_month
# ✅ GOOD — use framework constants:
weekly_subsidy * (WEEKS_IN_YEAR / MONTHS_IN_YEAR)
Available constants: MONTHS_IN_YEAR (12), WEEKS_IN_YEAR (52). Derive others from these.
CRITICAL: WEEKS_IN_YEAR is the integer 52, not 52.1429. Never use WEEKS_IN_YEAR * 7 for days per year — that gives 364, not 365. Use the literal 365 for days-per-year calculations.
Regulatory conversion factors take precedence. When a regulation cites a specific factor (e.g., "multiply by 4.3"), use that exact value even if a framework constant is close (WEEKS_IN_YEAR / MONTHS_IN_YEAR ≈ 4.333). The discrepancy may affect benefit amounts at boundary conditions.
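A short worked calculation makes these points concrete. The constants are redefined locally with the values stated above so the snippet runs on its own; the weekly subsidy of 100 is a made-up figure.

```python
# Local copies of the framework constants, using the values stated above.
MONTHS_IN_YEAR = 12
WEEKS_IN_YEAR = 52

weeks_per_month = WEEKS_IN_YEAR / MONTHS_IN_YEAR  # about 4.333
days_from_weeks = WEEKS_IN_YEAR * 7               # 364, one day short of 365

weekly_subsidy = 100.0  # hypothetical amount

# Monthly amount using the framework-derived factor.
monthly_by_constant = weekly_subsidy * weeks_per_month

# Monthly amount when a regulation says "multiply by 4.3".
monthly_by_regulation = weekly_subsidy * 4.3

difference = monthly_by_constant - monthly_by_regulation

print(round(weeks_per_month, 3), days_from_weeks)
print(round(monthly_by_constant, 2), round(monthly_by_regulation, 2), round(difference, 2))
```

On a weekly amount of 100, the two factors differ by a little over 3 per month, which is enough to move a household across an eligibility or benefit boundary.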
# ❌ Bad
def formula(spm_unit, period, parameters):
    unit_size = spm_unit.nb_persons()  # ❌ Unnecessary
    max_size = 10  # ❌ Hardcoded
    capped_size = min_(unit_size, max_size)
    p = parameters(period).gov.states.tn.dhs.tanf.benefit
    spa = p.standard_payment_amount[capped_size]  # ❌ Unnecessary
    dgpa = p.differential_grant_payment_amount[capped_size]  # ❌ Unnecessary
    eligible = spm_unit("eligible_for_dgpa", period)
    return where(eligible, dgpa, spa)

# ✅ Good
def formula(spm_unit, period, parameters):
    p = parameters(period).gov.states.tn.dhs.tanf.benefit
    capped_size = min_(spm_unit.nb_persons(), p.max_unit_size)
    eligible = spm_unit("eligible_for_dgpa", period)
    return where(
        eligible,
        p.differential_grant_payment_amount[capped_size],
        p.standard_payment_amount[capped_size]
    )
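The role of the capped size can be seen in a scalar sketch. Everything here is hypothetical: the payment tables, the `max_unit_size` of 3, and the `benefit` helper are invented for illustration and only mimic the lookup in the formula above.

```python
# Hypothetical parameter values; real amounts live in the parameter files.
max_unit_size = 3
standard_payment_amount = {1: 100, 2: 150, 3: 200}
differential_grant_payment_amount = {1: 120, 2: 180, 3: 240}

def benefit(unit_size, eligible_for_dgpa):
    """Scalar illustration of a size-indexed payment lookup with a cap."""
    # Units larger than the table's last row use the last row.
    capped_size = min(unit_size, max_unit_size)
    if eligible_for_dgpa:
        return differential_grant_payment_amount[capped_size]
    return standard_payment_amount[capped_size]

print(benefit(2, True), benefit(5, False))
```

Because the cap comes from a parameter, extending the table only requires a parameter change, not a code change.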
def formula(tax_unit, period, parameters):
    p = parameters(period).gov.irs.credits
    filing_status = tax_unit("filing_status", period)
    # ✅ Used multiple times - keep as variable
    threshold = p.phase_out.start[filing_status]
    income = tax_unit("adjusted_gross_income", period)
    excess = max_(0, income - threshold)
    reduction = (excess / p.phase_out.width) * threshold
    return max_(0, threshold - reduction)

def formula(spm_unit, period, parameters):
    p = parameters(period).gov.program
    gross_earned = spm_unit("gross_earned_income", period)
    # ✅ Complex multi-step calculation - break it down
    work_expense_deduction = min_(gross_earned * p.work_expense_rate, p.work_expense_max)
    after_work_expense = gross_earned - work_expense_deduction
    earned_disregard = after_work_expense * p.earned_disregard_rate
    countable_earned = after_work_expense - earned_disregard
    dependent_care = spm_unit("dependent_care_expenses", period)
    return max_(0, countable_earned - dependent_care)
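Tracing the multi-step calculation with numbers shows why the intermediate names help: each one can be checked on its own. The rates, the cap, and the income figures below are hypothetical, and plain `min`/`max` stand in for `min_`/`max_`.

```python
# Hypothetical parameter values, for illustration only.
work_expense_rate = 0.2
work_expense_max = 90.0
earned_disregard_rate = 0.5

# Hypothetical inputs for one unit.
gross_earned = 1_000.0
dependent_care = 100.0

# 20% of 1,000 is 200, but the deduction is capped at 90.
work_expense_deduction = min(gross_earned * work_expense_rate, work_expense_max)
after_work_expense = gross_earned - work_expense_deduction

# Half of the remaining earnings is disregarded.
earned_disregard = after_work_expense * earned_disregard_rate
countable_earned = after_work_expense - earned_disregard

# Dependent care comes off last, floored at zero.
result = max(0, countable_earned - dependent_care)

print(work_expense_deduction, after_work_expense, earned_disregard, countable_earned, result)
```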
Don't inline complex calculations inside where(), max_(), or other function calls - give them descriptive names.
# ❌ BAD - Complex expression inlined in where()
return where(
    above_trigger,
    reduced_payment,
    max_(maximum_benefit - countable_income, 0),  # Hard to read
)
# ✅ GOOD - Break out into named variable
standard_payment = max_(maximum_benefit - countable_income, 0)
return where(
    above_trigger,
    reduced_payment,
    standard_payment,  # Clear what this represents
)
Another example:
# ❌ BAD - Multiple complex inlined expressions
return where(
    income > add(spm_unit, period, ["earned", "unearned"]) * p.rate,
    max_(benefit - (income * p.reduction_rate), 0),
    benefit,
)
# ✅ GOOD - Named variables explain the logic
gross_income = add(spm_unit, period, ["earned", "unearned"])
income_threshold = gross_income * p.rate
reduced_benefit = max_(benefit - (income * p.reduction_rate), 0)
return where(
    income > income_threshold,
    reduced_benefit,
    benefit,
)
Rule: If an argument to where(), max_(), or a similar call is more than a simple variable or parameter access, give it a name.
add() > 0 Instead of spm_unit.any()
When checking whether any member has a boolean property, use add() > 0 instead of spm_unit.members + spm_unit.any().
# ❌ LESS PREFERRED - verbose pattern:
person = spm_unit.members
has_citizen = spm_unit.any(
    person("is_citizen_or_legal_immigrant", period)
)
# ✅ BETTER - cleaner add() > 0 pattern:
immigration_eligible = add(spm_unit, period, ["is_citizen_or_legal_immigrant"]) > 0
Why this is better:
- Avoids the intermediate person = spm_unit.members variable
- Consistent with add() patterns used elsewhere
- More descriptive result name (immigration_eligible vs has_citizen)
More examples:
# Check if any member is disabled
has_disabled_member = add(spm_unit, period, ["is_disabled"]) > 0
# Check if any member is elderly
has_elderly_member = add(spm_unit, period, ["is_elderly"]) > 0
# Check if any child is present
has_child = add(spm_unit, period, ["is_child"]) > 0
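The two patterns agree because summing boolean flags counts the members for whom the flag is true, so the sum is positive exactly when at least one flag is true. A minimal sketch in plain Python; `unit_any` and `unit_add` are hypothetical stand-ins for `spm_unit.any()` and `add()`, and the units are made up.

```python
def unit_any(member_flags):
    """Hypothetical stand-in for spm_unit.any()."""
    return any(member_flags)

def unit_add(member_flags):
    """Hypothetical stand-in for add(): booleans sum as 0 and 1."""
    return sum(member_flags)

# Made-up units, each with one boolean flag per member.
units = {
    "one_citizen": [False, True, False],
    "no_citizens": [False, False],
    "all_citizens": [True, True],
}

# For each unit, compute the result both ways.
results = {
    name: (unit_any(flags), unit_add(flags) > 0)
    for name, flags in units.items()
}
print(results)
```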
If an spm_unit-level variable exists, use it directly. Don't access through person level.
# ❌ BAD - Unnecessary person-level access when spm_unit variable exists:
person = spm_unit.members
demographic = person("is_person_demographic_tanf_eligible", period)
# Then aggregating back to spm_unit...
# ✅ GOOD - Use the spm_unit-level variable directly:
demographic_eligible = spm_unit("is_demographic_tanf_eligible", period)
Before writing code, check:
- If an spm_unit-level variable already exists, call spm_unit("variable_name", period) directly
- Only use spm_unit.members when you need person-level data that must be aggregated
# ✅ GOOD - Using existing spm_unit variables:
income_eligible = spm_unit("ar_tea_income_eligible", period)
resource_eligible = spm_unit("ar_tea_resource_eligible", period)
demographic_eligible = spm_unit("is_demographic_tanf_eligible", period)
return income_eligible & resource_eligible & demographic_eligible
# ❌ Bad
def formula(person, period, parameters):
    # Wrong period access
    age = person("age", period)  # ❌ age/12
    assets = person("assets", period)  # ❌ assets/12
    annual_income = person("employment_income", period.this_year)
    monthly_income = annual_income / 12  # ❌ Use MONTHS_IN_YEAR
    # Hardcoded values
    min_age = 18  # ❌
    max_age = 64  # ❌
    asset_limit = 10000  # ❌
    income_limit = 2000  # ❌
    # Unnecessary intermediate variables
    age_check = (age >= min_age) & (age <= max_age)
    asset_check = assets <= asset_limit
    income_check = monthly_income <= income_limit
    eligible = age_check & asset_check & income_check
    return eligible

# ✅ Good
def formula(person, period, parameters):
    p = parameters(period).gov.program.eligibility
    # Correct period access
    age = person("age", period.this_year)
    assets = person("assets", period.this_year)
    monthly_income = person("employment_income", period)
    # Direct return with combined logic
    return (
        (age >= p.age_min) & (age <= p.age_max) &
        (assets <= p.asset_limit) &
        (monthly_income <= p.income_threshold)
    )
Variable names and structure should explain the code - not comments.
# ❌ Bad
def formula(spm_unit, period, parameters):
    # Wisconsin disregards all earned income of dependent children (< 18)
    # Calculate earned income for adults only
    is_adult = spm_unit.members("age", period.this_year) >= 18  # Hard-coded!
    adult_earned = spm_unit.sum(
        spm_unit.members("tanf_gross_earned_income", period) * is_adult
    )
    # All unearned income is counted (including children's)
    gross_unearned = add(spm_unit, period, ["tanf_gross_unearned_income"])
    # NOTE: Wisconsin disregards many additional income sources that
    # are not separately tracked in PolicyEngine (educational aid, etc.)
    return max_(total_income - disregards, 0)

# ✅ Good
def formula(spm_unit, period, parameters):
    p = parameters(period).gov.states.wi.dcf.tanf.income
    is_adult = spm_unit.members("age", period.this_year) >= p.adult_age_threshold
    adult_earned = spm_unit.sum(
        spm_unit.members("tanf_gross_earned_income", period) * is_adult
    )
    gross_unearned = add(spm_unit, period, ["tanf_gross_unearned_income"])
    child_support = add(spm_unit, period, ["child_support_received"])
    return max_(adult_earned + gross_unearned - child_support, 0)
# NOTE: Disregard rate varies by calendar month
When a formula algebraically guarantees result <= upper_bound, remove the redundant min_():
# ❌ BAD - redundant (payment_standard - countable_income ≤ payment_standard whenever countable_income ≥ 0):
benefit = min_(max_(payment_standard - countable_income, 0), payment_standard)
# ✅ GOOD - subtracting non-negative income already keeps the result ≤ payment_standard; max_(x, 0) only adds the floor:
benefit = max_(payment_standard - countable_income, 0)
Only use min_() when the bound can actually bind (e.g., capping subsidy at actual expenses or provider charges).
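The guarantee can be checked numerically. The sketch below compares both forms over a grid of non-negative values using plain `min`/`max` in place of `min_`/`max_`; the function names and the grid are illustrative only. It also shows the condition the guarantee depends on: if income could be negative, the cap would bind and the `min_()` would no longer be redundant.

```python
def benefit_with_min(payment_standard, countable_income):
    # Form with the extra upper cap.
    return min(max(payment_standard - countable_income, 0), payment_standard)

def benefit_without_min(payment_standard, countable_income):
    # Form with only the floor at zero.
    return max(payment_standard - countable_income, 0)

# Collect every grid point where the two forms disagree.
mismatches = [
    (ps, inc)
    for ps in range(0, 501, 50)
    for inc in range(0, 701, 50)
    if benefit_with_min(ps, inc) != benefit_without_min(ps, inc)
]
print(mismatches)

# With negative income the two forms differ, so the cap is not redundant there.
capped = benefit_with_min(100, -50)
uncapped = benefit_without_min(100, -50)
print(capped, uncapped)
```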
Before finalizing code:
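- Income and other flow variables use period
- Age and assets use period.this_year
- Complex expressions are broken out of where(), max_() into named variables
- Parameters are accessed directly (p.amount not amount = p.amount)
- No redundant min_() / max_() when algebraically guaranteed
- Framework constants are used with their actual values (WEEKS_IN_YEAR = 52, not 52.14)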