Detects and removes duplicate transactions from overlapping bank, credit-card, and brokerage statement imports using composite key (account_id, date±1d, amount_cents, description_normalized). Outputs new transactions, suppressed duplicates with reasons, and near-duplicates for review.
npx claudepluginhub lyndonkl/claude --plugin thinking-frameworks-skills
This skill uses the workspace's default tool permissions.
Statement drops often overlap. A January statement covers December 15 → January 14; the December statement covers November 15 → December 14; the same December 15 transaction appears in both. This skill identifies those duplicates without losing legitimate same-day same-amount same-merchant repeat charges (e.g., two coffees in one day).
The caller provides:
- incoming: array of newly extracted transactions: {date, post_date, account_id, amount_cents, description_raw, source}.
- existing: array of transactions already in the store, with the same fields plus id.
A duplicate is identified by the tuple:
(account_id, amount_cents, description_normalized, |date_a - date_b| <= 1 day)
- description_normalized uses the same normalization as the categorizer (uppercase, strip vendor codes, strip geo, collapse spaces, drop dates).
- The 1-day tolerance absorbs date vs post_date mismatches between two sources.
- amount_cents is signed. Matching on abs(amount_cents) would let a refund pair with the original purchase; with the signed amount, refunds (opposite sign) are never duplicates of purchases.
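As a rough illustration of the key described above, the sketch below builds the signed composite key. The regular expressions in `normalize_description` are assumptions standing in for the categorizer's real rules, which are not shown in this document, and both function names are hypothetical.

```python
import re

def normalize_description(raw: str) -> str:
    """Illustrative normalizer; these patterns are assumptions, not the categorizer's rules."""
    s = raw.upper()
    s = re.sub(r"[*#]\S+", " ", s)                           # strip vendor codes like "*AB12CD" or "#1234"
    s = re.sub(r"\b\d{1,2}/\d{1,2}(/\d{2,4})?\b", " ", s)    # drop embedded dates like 12/15
    s = re.sub(r"\s+[A-Z]{2}$", " ", s.strip())              # strip a trailing two-letter geo code
    return re.sub(r"\s+", " ", s).strip()                    # collapse spaces

def dedupe_key(tx: dict) -> tuple:
    """Exact-match part of the key; the 1-day date window is checked separately."""
    return (tx["account_id"], tx["amount_cents"], normalize_description(tx["description_raw"]))

purchase = {"account_id": "chk", "amount_cents": -450, "description_raw": "Blue Bottle  #1234 12/15"}
refund = {"account_id": "chk", "amount_cents": 450, "description_raw": "Blue Bottle  #1234 12/16"}
```

Because the amount stays signed, `purchase` and `refund` produce different keys even though their normalized descriptions agree.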
Dedupe Progress:
- [ ] Step 1: Index existing transactions by (account_id, signed_amount, normalized_desc)
- [ ] Step 2: For each incoming, look up the index
- [ ] Step 3: Filter index hits by date proximity (≤ 1 day)
- [ ] Step 4: If no hit, mark as new
- [ ] Step 5: If exactly one hit, mark as duplicate of that id
- [ ] Step 6: If multiple hits, run the multi-instance same-day rule
- [ ] Step 7: Surface near-duplicates (different amount or desc) for review
Build existing_by_key[(account_id, amount_cents, description_normalized)] = [tx, …].
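A minimal sketch of that index, assuming each stored transaction already carries a precomputed `description_normalized` field (an assumption; the input schema above lists only `description_raw`). The function name `build_index` is hypothetical.

```python
from collections import defaultdict

def build_index(existing):
    """Group existing transactions by (account_id, signed amount, normalized description)."""
    existing_by_key = defaultdict(list)
    for tx in existing:
        key = (tx["account_id"], tx["amount_cents"], tx["description_normalized"])
        existing_by_key[key].append(tx)
    return existing_by_key

existing = [
    {"id": "a", "account_id": "chk", "amount_cents": -450, "description_normalized": "BLUE BOTTLE"},
    {"id": "b", "account_id": "chk", "amount_cents": -450, "description_normalized": "BLUE BOTTLE"},
    {"id": "c", "account_id": "chk", "amount_cents": 450, "description_normalized": "BLUE BOTTLE"},
]
index = build_index(existing)
```

Each bucket holds a list rather than a single record so that legitimate repeat charges with the same key remain distinguishable later.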
For each incoming transaction, compute its key tuple and look up the bucket.
For each candidate in the bucket, keep only those with |incoming.date − candidate.date| ≤ 1 day. Use min(date, post_date) on each side if post_date exists.
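The date filter might look like the following sketch. It assumes `date` and `post_date` are `datetime.date` objects and that a missing `post_date` is `None` or absent; the helper names are hypothetical.

```python
from datetime import date, timedelta

def effective_date(tx):
    """Use min(date, post_date) when post_date exists, otherwise date."""
    post = tx.get("post_date")
    return min(tx["date"], post) if post else tx["date"]

def within_window(incoming, candidate, max_days=1):
    """True when the two effective dates are at most max_days apart."""
    gap = abs(effective_date(incoming) - effective_date(candidate))
    return gap <= timedelta(days=max_days)

incoming_tx = {"date": date(2025, 12, 15), "post_date": date(2025, 12, 16)}
close_tx = {"date": date(2025, 12, 14)}
far_tx = {"date": date(2025, 12, 17), "post_date": None}
```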
If no candidate survives the date filter, mark decision: "new". The bookkeeper will append it to transactions.json.
If exactly one candidate survives, mark decision: "duplicate" and link duplicate_of: <existing_id>. Do not import.
When the existing store already has N transactions with the identical key on the same day, and the incoming batch contains M transactions with the same key on that day:
- M ≤ N: all incoming are considered duplicates of existing ones (1:1 pairing in date order).
- M > N: the first N incoming are duplicates; the remaining M − N are new transactions (legitimate same-day repeat charges, e.g., two coffees, gas-station pre-auth + final).
This rule preserves real repeat charges while still suppressing overlap-import duplicates.
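The pairing rule can be sketched as below, under the assumption that both lists have already been restricted to one key and one day. The function name `pair_same_day` and the tuple shape of its return value are hypothetical.

```python
def pair_same_day(existing_same_key, incoming_same_key):
    """Pair incoming with existing 1:1 in date order; unpaired incoming are new.

    Returns (duplicates, new) where duplicates is a list of
    (incoming_tx, existing_id) pairs.
    """
    existing_sorted = sorted(existing_same_key, key=lambda tx: tx["date"])
    incoming_sorted = sorted(incoming_same_key, key=lambda tx: tx["date"])
    # zip stops at the shorter list, so at most min(M, N) pairs are formed.
    duplicates = [(inc, ex["id"]) for inc, ex in zip(incoming_sorted, existing_sorted)]
    new = incoming_sorted[len(existing_sorted):]   # the remaining M - N, if any
    return duplicates, new

stored = [{"id": "tx_1", "date": "2025-12-15"}]
batch = [{"date": "2025-12-15", "source": "jan"}, {"date": "2025-12-15", "source": "jan"}]
dups, fresh = pair_same_day(stored, batch)
```

Here N = 1 and M = 2, so one incoming record is suppressed as a duplicate of `tx_1` and the other is kept as a new transaction.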
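A near-duplicate shares everything except amount or description and is within 1 day. These commonly arise when, for example, a pending charge is later finalized with a slightly different amount or merchant string.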
Emit these to review[] with both records side-by-side and a suggested action: keep_incoming_drop_existing | keep_existing_drop_incoming | keep_both | merge.
Compute a similarity score on near-misses, where amount, description, and date are per-field similarity terms:
near_dup_score = 0.4*amount + 0.4*description + 0.2*date.
Surface for review when 0.7 ≤ near_dup_score < 0.95. A score of 0.95 or above is treated as a duplicate; below 0.7 is treated as independent.
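The weights and thresholds above come from the text, but the per-field similarity functions are not specified. The sketch below fills them in with assumed definitions (relative amount difference, `difflib.SequenceMatcher` ratio for descriptions, a stepped date term), so its scores will not necessarily match those the skill produces.

```python
from difflib import SequenceMatcher

def amount_similarity(a: int, b: int) -> float:
    """Assumed definition: 1 minus the relative difference; opposite signs score 0."""
    if a == b:
        return 1.0
    if (a < 0) != (b < 0):
        return 0.0
    return 1.0 - abs(a - b) / max(abs(a), abs(b))

def description_similarity(a: str, b: str) -> float:
    """Assumed definition: character-level similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()

def date_similarity(days_apart: int) -> float:
    """Assumed definition: same day 1.0, one day apart 0.5, otherwise 0."""
    return {0: 1.0, 1: 0.5}.get(abs(days_apart), 0.0)

def near_dup_score(amount_a, amount_b, desc_a, desc_b, days_apart) -> float:
    return (0.4 * amount_similarity(amount_a, amount_b)
            + 0.4 * description_similarity(desc_a, desc_b)
            + 0.2 * date_similarity(days_apart))

def classify(score: float) -> str:
    """Apply the thresholds from the text."""
    if score >= 0.95:
        return "duplicate"
    if score >= 0.7:
        return "review"
    return "independent"
```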
```json
{
  "new": [
    { "id": "tx_20260115_017", "decision": "new" }
  ],
  "duplicates": [
    {
      "incoming_index": 4,
      "decision": "duplicate",
      "duplicate_of": "tx_20251220_003",
      "reason": "exact key match within 1 day window"
    }
  ],
  "review": [
    {
      "incoming_index": 12,
      "matched_existing_id": "tx_20260108_005",
      "near_dup_score": 0.86,
      "diff": {
        "amount_cents": [-4500, -4583],
        "description_raw": ["AMAZON PENDING", "AMZN MKTP US*AB12CD"]
      },
      "suggested_action": "keep_incoming_drop_existing",
      "rationale": "incoming is the finalized charge (post_date set, definite merchant code)"
    }
  ],
  "summary": {
    "incoming_total": 142,
    "new_count": 96,
    "duplicate_count": 44,
    "review_count": 2
  }
}
```
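In the example summary the three counts add up to incoming_total (96 + 44 + 2 = 142). Assuming every incoming transaction is meant to land in exactly one of the three lists, which the example suggests but the text does not state, a consistency check could be sketched as follows. The function name `summarize` is hypothetical.

```python
def summarize(result: dict, incoming_total: int) -> dict:
    """Build the summary block and verify that no incoming transaction was lost.

    Assumes each incoming record appears in exactly one of new / duplicates / review.
    """
    summary = {
        "incoming_total": incoming_total,
        "new_count": len(result["new"]),
        "duplicate_count": len(result["duplicates"]),
        "review_count": len(result["review"]),
    }
    accounted = summary["new_count"] + summary["duplicate_count"] + summary["review_count"]
    if accounted != incoming_total:
        raise ValueError(f"{incoming_total - accounted} incoming transactions unaccounted for")
    return summary

result = {"new": [{}] * 96, "duplicates": [{}] * 44, "review": [{}] * 2}
summary = summarize(result, 142)
```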
- Show description_raw strings to the human; do not show the normalized form.
- … existing.
- Record duplicate_of so the user can trace why a transaction did not appear in the new import.