From immich-photo-manager
Runs perceptual hash duplicate analysis on an Immich photo library to find cross-source and internal duplicates, generating a detailed report with removal recommendations.
npx claudepluginhub drolosoft/immich-photo-manager --plugin immich-photo-manager
Slash command: /immich-photo-manager:duplicate-report
Before doing ANYTHING else in this skill, call ping on the Immich MCP server.
- ping succeeds → proceed with the skill normally.
- ping fails or the MCP tools are not available → STOP. Do not continue. Tell the user:

❌ Immich is not connected. This plugin needs a running Immich MCP server to work.
Run /setup-immich-photo-manager to configure your Immich connection. You'll need:
- Your Immich server URL (e.g., http://192.168.1.100:2283)
- An Immich API key (how to create one)
- The MCP server configured (see /setup-immich-photo-manager)
Nothing in this plugin will work until the connection is configured.
Do NOT skip this check. Do NOT try to run any other tool first. Always ping, always block if it fails.
Generate a comprehensive duplicate analysis of an Immich photo library. Uses perceptual hashing to find visually identical photos even when they have different checksums (common when photos are exported from Apple Photos and Google Photos).
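When users import the same photo library from multiple sources (Apple Photos export, Google Takeout, manual folder copies), the files are often re-encoded by each platform. This means the copies are visually identical but no longer byte-identical, so their checksums differ and checksum-based matching misses them.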
Perceptual hashing (pHash) computes a fingerprint from the visual content of the image rather than its binary data, so two re-encoded copies of the same photo typically produce the same perceptual hash.
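To make the distinction concrete, here is a small standard-library illustration. It swaps in a toy average hash (not the pHash algorithm that imagehash implements) on a hand-written 4x4 grayscale grid, and simulates re-encoding as small per-pixel noise. The grid, the noise model, and the function names are all assumptions for illustration.

```python
import hashlib

def average_hash(pixels):
    """Toy average hash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)

def checksum(pixels):
    """SHA-1 over the raw pixel bytes, standing in for a file checksum."""
    return hashlib.sha1(bytes(p for row in pixels for p in row)).hexdigest()

# Hypothetical 4x4 grayscale image (values 0-255)
original = [
    [10, 200, 30, 220],
    [15, 210, 25, 230],
    [12, 205, 35, 215],
    [18, 198, 28, 225],
]

# Simulated re-encode: every pixel shifts slightly, as lossy compression might do
reencoded = [
    [p + (2 if (r + c) % 2 else -1) for c, p in enumerate(row)]
    for r, row in enumerate(original)
]

same_fingerprint = average_hash(original) == average_hash(reencoded)
same_checksum = checksum(original) == checksum(reencoded)
print(same_fingerprint, same_checksum)  # prints: True False
```

The bytes change, so the checksum changes, but the bright/dark pattern that the fingerprint encodes does not.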
The user's machine needs:
pip3 install Pillow imagehash pillow-heif --break-system-packages
- Pillow: image loading
- imagehash: perceptual hashing
- pillow-heif: HEIC/HEIF support (critical for Apple Photos)

Before running the full perceptual hash scan, check Immich's built-in ML duplicate detection:
result = get_duplicates()
This returns groups of visually similar assets detected by Immich's ML engine. Present the count and let the user resolve obvious duplicates immediately using resolve_duplicates.
This is fast (no disk scan needed) but may miss re-encoded copies across import sources. For comprehensive cross-source analysis, proceed to Step 1.
Note: resolve_duplicates handles Immich ML duplicates natively. Perceptual hashing (Steps 1–3 below) catches cross-source re-encoded duplicates that ML may miss.
Query Immich to identify distinct import sources from asset paths:
SELECT
  CASE
    WHEN "originalPath" LIKE '%Apple Fotos%' OR "originalPath" LIKE '%Apple Photos%' THEN 'Apple Photos'
    WHEN "originalPath" LIKE '%Google Fotos%' OR "originalPath" LIKE '%Google Photos%' THEN 'Google Photos'
    ELSE split_part("originalPath", '/', 5) -- or whatever level gives the source folder
  END as source,
  count(*) as total
FROM asset WHERE "deletedAt" IS NULL
GROUP BY source ORDER BY total DESC;
Present the sources to the user and ask which ones to compare.
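If the paths are already in hand (for example, from an asset listing), the same grouping can be sketched in Python. This mirrors the CASE expression above; the function name, the `folder_level` parameter, and the sample paths are hypothetical.

```python
from collections import Counter

def classify_source(original_path, folder_level=5):
    """Mirror the SQL CASE: known library names first, else one path segment."""
    if "Apple Fotos" in original_path or "Apple Photos" in original_path:
        return "Apple Photos"
    if "Google Fotos" in original_path or "Google Photos" in original_path:
        return "Google Photos"
    parts = original_path.split("/")
    # split_part is 1-indexed and returns '' when the field does not exist
    return parts[folder_level - 1] if len(parts) >= folder_level else ""

# Hypothetical paths, for illustration only
sample_paths = [
    "/mnt/media/library/Apple Photos/2019/IMG_0001.HEIC",
    "/mnt/media/library/Apple Fotos/2020/IMG_0150.HEIC",
    "/mnt/media/library/Google Photos/2019/IMG_0001.jpg",
    "/mnt/media/library/Camera Uploads/2021/DSC_0042.jpg",
]

totals = Counter(classify_source(p) for p in sample_paths)
for source, total in totals.most_common():  # largest source first
    print(source, total)
```

Adjust `folder_level` to whichever path segment holds the source folder in the library being analyzed.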
For each source directory, scan all image files and compute 256-bit perceptual hashes:
from pillow_heif import register_heif_opener
register_heif_opener()
from PIL import Image
import imagehash

def compute_phash(filepath):
    with Image.open(filepath) as img:
        if img.mode != 'RGB':
            img = img.convert('RGB')
        return str(imagehash.phash(img, hash_size=16))
Key parameters:
- hash_size=16 → 256-bit hash (high accuracy, very few false positives)
- ThreadPoolExecutor (NOT ProcessPoolExecutor: native HEIF libs deadlock on fork)

Expected performance: ~500 files/30 seconds on Apple Silicon, ~200 files/30 seconds on Intel.
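A minimal sketch of the threaded scan follows. It takes the hashing function as a parameter so that compute_phash can be passed in; `scan_source`, `IMAGE_EXTENSIONS`, and the default of 8 workers are assumptions, not part of the plugin. Files that fail to open are collected rather than aborting the scan.

```python
import os
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Assumed extension list; extend it to match the library being scanned
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".heic", ".heif"}

def scan_source(root, hash_fn, max_workers=8):
    """Hash every image under root; return ({hash: [paths]}, [(path, error)])."""
    paths = [
        os.path.join(dirpath, name)
        for dirpath, _dirs, files in os.walk(root)
        for name in files
        if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
    ]

    def safe_hash(path):
        # Capture per-file failures so one unreadable file does not stop the scan
        try:
            return path, hash_fn(path), None
        except Exception as exc:
            return path, None, exc

    hashes = defaultdict(list)
    errors = []
    # Threads rather than processes, per the note above about HEIF libs and fork
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for path, digest, exc in pool.map(safe_hash, paths):
            if exc is None:
                hashes[digest].append(path)
            else:
                errors.append((path, exc))
    return dict(hashes), errors
```

In use, something like `source_a_hashes, errors = scan_source(source_a_dir, compute_phash)` would produce the hash-to-paths mapping that the comparison step expects, where `source_a_dir` is a placeholder for the source directory.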
Compare hash sets between sources:
common = set(source_a_hashes.keys()) & set(source_b_hashes.keys())
a_only = set(source_a_hashes.keys()) - set(source_b_hashes.keys())
b_only = set(source_b_hashes.keys()) - set(source_a_hashes.keys())
For internal duplicates within a single source:
internal_dupes = sum(len(v) - 1 for v in hashes.values() if len(v) > 1)
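A worked example of these set operations on made-up data may help. The hash strings and file paths below are hypothetical and shortened for readability; real hashes from hash_size=16 are 64 hex characters.

```python
# Hypothetical scan results: perceptual hash -> files that produced it
source_a_hashes = {
    "ffd8a1": ["apple/IMG_0001.HEIC"],
    "c3e07b": ["apple/IMG_0002.HEIC", "apple/IMG_0002 (1).HEIC"],
    "9a44f0": ["apple/IMG_0003.HEIC"],
}
source_b_hashes = {
    "ffd8a1": ["google/IMG_0001.jpg"],
    "c3e07b": ["google/IMG_0002.jpg"],
    "17b2cc": ["google/PXL_0009.jpg"],
}

common = set(source_a_hashes) & set(source_b_hashes)  # in both sources
a_only = set(source_a_hashes) - set(source_b_hashes)  # unique to A
b_only = set(source_b_hashes) - set(source_a_hashes)  # unique to B

def count_internal_dupes(hashes):
    """Extra copies within one source: every file beyond the first per hash."""
    return sum(len(v) - 1 for v in hashes.values() if len(v) > 1)

# Pair up matching files so the user can review them before any removal
matches = [(source_a_hashes[h], source_b_hashes[h]) for h in sorted(common)]

print(len(common), len(a_only), len(b_only))
print(count_internal_dupes(source_a_hashes), count_internal_dupes(source_b_hashes))
```

Note that iterating a dict yields its keys, so `set(source_a_hashes)` is equivalent to `set(source_a_hashes.keys())`.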
Present findings in a structured report:
DUPLICATE ANALYSIS REPORT
Library: [total] assets ([photos] photos + [videos] videos)
Sources analyzed: [Source A] ([count] files), [Source B] ([count] files)
CROSS-SOURCE DUPLICATES
[Source A] <-> [Source B] visual matches: [count] ([pct]% overlap)
UNIQUE TO EACH SOURCE
[Source A]-only photos: [count]
[Source B]-only photos: [count]
INTERNAL DUPLICATES
Within [Source A]: [count]
Within [Source B]: [count]
TOTAL REMOVABLE
Cross-source duplicates: [count]
Internal duplicates: [count]
TOTAL: [count] files
RECOMMENDATION
Keep: [Source with better metadata/folder structure]
Remove: [Other source] copies where match exists
Review: [count] [other]-only photos are NOT duplicates — keep them
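The arithmetic behind the report totals can be sketched as follows. The report does not say what the overlap percentage is measured against; this sketch assumes the smaller source's file count as the denominator. The function name and the input numbers are made up for illustration.

```python
def report_totals(count_a, count_b, cross_source, internal_a, internal_b):
    """Compute the derived fields of the report from raw counts."""
    smaller = min(count_a, count_b)
    # Assumption: overlap is expressed relative to the smaller source
    overlap_pct = round(100 * cross_source / smaller, 1) if smaller else 0.0
    return {
        "overlap_pct": overlap_pct,
        "internal": internal_a + internal_b,
        "total_removable": cross_source + internal_a + internal_b,
    }

# Illustrative inputs only, not measurements from a real library
totals = report_totals(
    count_a=12000, count_b=8000, cross_source=6000, internal_a=150, internal_b=50
)
print(f"Cross-source duplicates: 6000 ({totals['overlap_pct']}% overlap)")
print(f"Internal duplicates: {totals['internal']}")
print(f"TOTAL: {totals['total_removable']} files")
```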
NEVER auto-remove. Always:
a. delete_assets(asset_ids=[...], force=False): safer, recoverable via restore_assets or restore_trash
b. Physical file removal from disk (os.remove()) only after user confirms trash is correct
c. For permanent deletion (user explicitly requests): delete_assets(asset_ids=[...], force=True), which is irreversible

Batch Immich deletions in groups of 100 assets per call. For ML-detected duplicates, prefer resolve_duplicates which handles them natively in Immich.
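The batching rule can be sketched as below. How the delete_assets tool is actually invoked depends on the MCP client, so this sketch takes it as a callable parameter; `chunked` and `trash_in_batches` are hypothetical helper names. It always passes force=False, so every batch goes to the recoverable trash.

```python
from itertools import islice

def chunked(items, size=100):
    """Yield consecutive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk

def trash_in_batches(asset_ids, delete_assets, batch_size=100):
    """Call delete_assets once per batch with force=False; return the call count."""
    calls = 0
    for chunk in chunked(asset_ids, batch_size):
        delete_assets(asset_ids=chunk, force=False)  # trash, not permanent
        calls += 1
    return calls
```

Run this only after the user has confirmed the removal list; the helper itself performs no confirmation.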
After removal, query Immich statistics to confirm the new count and present before/after comparison.
Uses only the Immich database (checksums, filenames, timestamps). Fast, but misses re-encoded duplicates.
-- Exact checksum duplicates
SELECT checksum, count(*) FROM asset
WHERE "deletedAt" IS NULL
GROUP BY checksum HAVING count(*) > 1;
-- Filename overlap between sources
SELECT count(*) FROM (
SELECT "originalFileName" FROM asset WHERE "originalPath" LIKE '%Source A%'
INTERSECT
SELECT "originalFileName" FROM asset WHERE "originalPath" LIKE '%Source B%'
) t;
Scans actual files on disk. Catches re-encoded duplicates. Requires filesystem access and Python dependencies. Takes 10-20 minutes for ~40K photos on Apple Silicon.
Shows which source dominates each year — helps users understand their photo ecosystem history:
SELECT year, source_a_count, source_b_count,
CASE WHEN source_a_count > source_b_count THEN 'Source A' ELSE 'Source B' END as dominant
FROM (
SELECT extract(year from "localDateTime") as year,
count(*) FILTER (WHERE "originalPath" LIKE '%Source A%') as source_a_count,
count(*) FILTER (WHERE "originalPath" LIKE '%Source B%') as source_b_count
FROM asset WHERE "deletedAt" IS NULL
GROUP BY year
) t ORDER BY year;
Without pillow-heif, Apple Photos libraries will have massive error rates (50%+ of files).